DEV Community

Seraph★776

Download PDF Files Using Python

This article discusses how to download a PDF using Python's requests library.

Approach

  1. Import the requests library.
  2. Request the URL and get the response object.
  3. If the request succeeds, write the response body to a local PDF file and return True.
  4. If the PDF cannot be downloaded, return False.

Implementation

The following program downloads a PDF file from the provided URL and saves it in the current working directory.

#!/usr/bin/env python3
import os
import requests


def download_pdf_file(url: str) -> bool:
    """Download PDF from given URL to local directory.

    :param url: The url of the PDF file to be downloaded
    :return: True if PDF file was successfully downloaded, otherwise False.
    """

    # Request URL and get response object
    response = requests.get(url, stream=True)

    # isolate PDF filename from URL
    pdf_file_name = os.path.basename(url)
    if response.status_code == 200:
        # Save in current working directory
        filepath = os.path.join(os.getcwd(), pdf_file_name)
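        with open(filepath, 'wb') as pdf_object:
            # Write in chunks so stream=True avoids loading the whole file into memory
            for chunk in response.iter_content(chunk_size=8192):
                pdf_object.write(chunk)
        print(f'{pdf_file_name} was successfully saved!')
        return True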
    else:
        print(f'Uh oh! Could not download {pdf_file_name},')
        print(f'HTTP response status code: {response.status_code}')
        return False


if __name__ == '__main__':
    # URL of the PDF to download
    URL = 'https://raw.githubusercontent.com/seraph776/DevCommunity/main/PDFDownloader/assests/the_raven.pdf'
    download_pdf_file(URL)


Output

the_raven.pdf was successfully saved!
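
The program above takes the filename straight from the URL with os.path.basename, which works for the example URL but would keep a query string or fragment if the URL had one (for instance, report.pdf?download=1). A minimal sketch of a more forgiving approach is shown below. The helper name filename_from_url, the fallback name download.pdf, and the example.com URLs are assumptions for illustration and are not part of the original program.

```python
import os
from urllib.parse import unquote, urlsplit


def filename_from_url(url: str, default: str = 'download.pdf') -> str:
    """Return the last path segment of a URL, ignoring query and fragment.

    Hypothetical helper: `default` is an assumed fallback used when the
    URL path does not end in a filename.
    """
    # urlsplit separates the path from the query string and fragment
    path = urlsplit(url).path
    # Decode percent-escapes (e.g. %20) before taking the final segment
    name = os.path.basename(unquote(path))
    return name or default


if __name__ == '__main__':
    print(filename_from_url('https://example.com/files/report.pdf?download=1'))
```

The result could be passed to os.path.join in place of the os.path.basename(url) value.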

Conclusion

After reading this article, you should be able to download a PDF using Python's requests library. Remember that some websites might be more difficult than others to get data from. If you are unable to download the PDF file, check the HTTP response status code to help determine what went wrong. Please leave a comment if you found this article helpful.
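
As a starting point for reading status codes, here is a small sketch that turns a code into a short hint using the standard library's http.HTTPStatus. The function name describe_status and the wording of the hints are assumptions made for illustration. They are rough guidance, not an exhaustive diagnosis.

```python
from http import HTTPStatus


def describe_status(status_code: int) -> str:
    """Return a short, human-readable hint for an HTTP status code.

    Hypothetical helper: the hint texts are illustrative only.
    """
    try:
        # Standard reason phrase, e.g. 'Not Found' for 404
        phrase = HTTPStatus(status_code).phrase
    except ValueError:
        phrase = 'Unknown status'

    if 200 <= status_code < 300:
        hint = 'request succeeded'
    elif 300 <= status_code < 400:
        hint = 'resource moved; check the redirect target'
    elif status_code in (401, 403):
        hint = 'access denied; the site may require a login or specific headers'
    elif status_code == 404:
        hint = 'not found; check the URL for typos'
    elif 400 <= status_code < 500:
        hint = 'client-side problem with the request'
    elif 500 <= status_code < 600:
        hint = 'server-side problem; try again later'
    else:
        hint = 'unexpected status code'
    return f'{status_code} {phrase}: {hint}'


if __name__ == '__main__':
    print(describe_status(404))
```

Calling describe_status(response.status_code) in the failure branch of download_pdf_file would print the hint alongside the raw code.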


Code available at GitHub