pooch.HTTPDownloader

class pooch.HTTPDownloader(progressbar=False, chunk_size=1024, **kwargs)

Download manager for fetching files over HTTP/HTTPS.

When called, downloads the given file URL into the specified local file. Uses the requests library to manage downloads.

Use with pooch.Pooch.fetch or pooch.retrieve to customize the download of files (for example, to use authentication or print a progress bar).

Parameters:
  • progressbar (bool or an arbitrary progress bar object) – If True, will print a progress bar of the download to standard error (stderr). Requires tqdm to be installed. Alternatively, an arbitrary progress bar object can be passed. See Using custom progress bars for details.

  • chunk_size (int) – Files are streamed chunk_size bytes at a time instead of loading everything into memory at once. Usually doesn’t need to be changed.

  • **kwargs – All keyword arguments given when creating an instance of this class will be passed to requests.get.
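As a rough illustration of the progressbar parameter, the sketch below defines a hypothetical ByteCounter class that could stand in for tqdm. It assumes the downloader only needs a tqdm-like interface: a total attribute plus update, reset, and close methods. The class name and the download_with_counter helper are made up for this example, so check Using custom progress bars for the exact requirements.

```python
import sys


class ByteCounter:
    """Hypothetical tqdm-like progress display (name and layout are illustrative)."""

    def __init__(self, stream=sys.stderr):
        self.total = 0  # assumed to be set by the downloader once the size is known
        self.count = 0
        self.stream = stream

    def update(self, n):
        # Mirror tqdm semantics: advance the counter by n bytes.
        self.count += n
        print(f"\r{self.count}/{self.total} bytes", end="", file=self.stream)

    def reset(self):
        # Start counting from zero again.
        self.count = 0

    def close(self):
        # Finish the progress line with a newline.
        print("", file=self.stream)


def download_with_counter(url, output_file, chunk_size=4096):
    """Sketch: stream url to output_file while reporting progress on stderr."""
    from pooch import HTTPDownloader  # lazy import; requires pooch to be installed

    downloader = HTTPDownloader(progressbar=ByteCounter(), chunk_size=chunk_size)
    downloader(url=url, output_file=output_file, pooch=None)
```

The chunk_size of 4096 is an arbitrary choice for the example; the default of 1024 is usually adequate.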

Examples

Download one of the data files from the Pooch repository:

>>> import os
>>> from pooch import __version__, check_version, HTTPDownloader
>>> url = "https://github.com/fatiando/pooch/raw/{}/data/tiny-data.txt"
>>> url = url.format(check_version(__version__, fallback="main"))
>>> downloader = HTTPDownloader()
>>> # Not using with Pooch.fetch so no need to pass an instance of Pooch
>>> downloader(url=url, output_file="tiny-data.txt", pooch=None)
>>> os.path.exists("tiny-data.txt")
True
>>> with open("tiny-data.txt") as f:
...     print(f.read().strip())
# A tiny data file for test purposes only
1  2  3  4  5  6
>>> os.remove("tiny-data.txt")

Authentication can be handled by passing a user name and password to requests.get. All keyword arguments provided when creating an instance of the class are forwarded to requests.get. We’ll use auth=(username, password) for HTTP basic authentication. The https://httpbin.org website allows us to make a fake login request using whatever username and password we provide to it:

>>> user = "doggo"
>>> password = "goodboy"
>>> # httpbin will ask for the user and password we provide in the URL
>>> url = f"https://httpbin.org/basic-auth/{user}/{password}"
>>> # Trying without the login credentials causes an error
>>> downloader = HTTPDownloader()
>>> try:
...     downloader(url=url, output_file="tiny-data.txt", pooch=None)
... except Exception:
...     print("There was an error!")
There was an error!
>>> # Pass in the credentials to HTTPDownloader
>>> downloader = HTTPDownloader(auth=(user, password))
>>> downloader(url=url, output_file="tiny-data.txt", pooch=None)
>>> with open("tiny-data.txt") as f:
...     for line in f:
...         print(line.rstrip())
{
  "authenticated": true,
  "user": "doggo"
}
>>> os.remove("tiny-data.txt")
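The examples above call the downloader directly, but it can also be handed to pooch.retrieve (or pooch.Pooch.fetch) through the downloader argument. A minimal sketch of that pattern follows. The helper names build_request_kwargs and fetch_with_auth and the 30-second timeout are assumptions chosen for illustration, and known_hash=None skips hash verification, which is only reasonable for throwaway examples.

```python
def build_request_kwargs(user, password, timeout=30):
    """Collect the keyword arguments that should reach requests.get."""
    # auth and timeout are both regular requests.get arguments.
    return {"auth": (user, password), "timeout": timeout}


def fetch_with_auth(url, user, password, known_hash=None):
    """Sketch: download url into the pooch cache using HTTP basic authentication."""
    import pooch  # lazy import; requires pooch to be installed

    downloader = pooch.HTTPDownloader(**build_request_kwargs(user, password))
    # retrieve returns the path of the downloaded file in the local cache.
    return pooch.retrieve(url=url, known_hash=known_hash, downloader=downloader)
```

Keeping the keyword arguments in one place makes it easy to reuse the same settings for several downloads.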

Methods Summary

HTTPDownloader.__call__(url, output_file, pooch)

Download the given URL over HTTP to the given output file.


HTTPDownloader.__call__(url, output_file, pooch, check_only=False)

Download the given URL over HTTP to the given output file.

Uses requests.get.

Parameters:
  • url (str) – The URL to the file you want to download.

  • output_file (str or file-like object) – Path (and file name) to which the file will be downloaded.

  • pooch (Pooch) – The instance of Pooch that is calling this method.

  • check_only (bool) – If True, only checks whether the file exists on the server, without downloading it. Returns True if the file exists and False otherwise.

Returns:

availability (bool or None) – If check_only==True, returns a boolean indicating if the file is available on the server. Otherwise, returns None.
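
The check_only flag can be used to probe a URL before committing to a download. The sketch below wraps that call in a hypothetical is_available helper. It assumes that output_file may be None when check_only=True, since nothing is written in that mode. The optional downloader argument is an addition for this example, so that any compatible callable can be substituted.

```python
def is_available(url, downloader=None):
    """Sketch: return True if the downloader reports that url exists."""
    if downloader is None:
        from pooch import HTTPDownloader  # lazy import; requires pooch

        downloader = HTTPDownloader()
    # No file is written in check-only mode, so no output path is given (assumption).
    return bool(downloader(url=url, output_file=None, pooch=None, check_only=True))
```

A call such as is_available("https://github.com/fatiando/pooch") would then return a boolean without downloading anything.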