Python Facebook Posts Scraper with Requests and BeautifulSoup4

Hugo Sandoval (hhsm95) ・3 min read

A few weeks ago I wrote about the importance of using User Agents when we scrape data, and my examples showed the response from Twitter when we used the correct User Agent. This time I want to do the same with Facebook. We are going to scrape the posts in user profiles, Facebook pages and groups.

What are we going to get?

A list of items with the following values:

Param            Description
published        Formatted publication datetime
description      Post text content
images           List of images in the post
post_url         The unique post URL
external_links   External links found in the description
like_url         The Like URL
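
To make the shape of one result concrete, here is a minimal sketch of a single item. The field names come from the values listed above; the `Post` class, the Python types and the placeholder values are my assumptions for illustration and are not taken from the repository.

```python
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Post:
    """Hypothetical container for one scraped post (types are assumed)."""
    published: str    # formatted publication datetime
    description: str  # post text content
    post_url: str     # unique URL of the post
    like_url: str     # URL of the Like action
    images: List[str] = field(default_factory=list)          # image URLs
    external_links: List[str] = field(default_factory=list)  # links in the text


# Placeholder values only, to show the structure of one item.
example_post = Post(
    published="2020-06-27 10:00:00",
    description="Hello world https://example.com",
    post_url="https://example.com/post/1",
    like_url="https://example.com/like/1",
    external_links=["https://example.com"],
)

# A plain dict with the same keys as the list above.
example_item = asdict(example_post)
```

Converting to a plain dict with `asdict` keeps the item easy to print or serialize.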

Let's start

1. Get Python (Python 3.7+ recommended)

2. Clone or download this repository

git clone https://github.com/adeoy/FacebookPostsScraper.git

3. Install the project requirements

pip install -r requirements.txt

Let's explain

First of all, all the code is in my GitHub repository https://github.com/adeoy/FacebookPostsScraper

  1. We create a requests session.

  2. Set the User Agent of an old Nokia C3 phone on the requests session (the Nokia C3 gives me better results during scraping than other phones).

  3. Check whether we have a session cookie saved on our computer. If not, log in to Facebook with email and password and save the session cookie on our computer (we need to log in because our friends' private profiles can't be scraped without authentication).

  4. Request a profile and scrape the posts using BeautifulSoup and CSS selectors.

  5. Return the results.

  6. Have fun :)
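
Steps 1 to 3 can be sketched roughly as follows. This is a minimal sketch and not the code from the repository: the helper names, the cookie file name and the exact User-Agent string are assumptions, and the login request itself is left out because it depends on Facebook's login form.

```python
import pickle
from pathlib import Path

import requests

# Assumed placeholder values: replace with the User-Agent and file you use.
USER_AGENT = "NokiaC3-00/5.0 (07.20) Profile/MIDP-2.1 Configuration/CLDC-1.1"
COOKIE_FILE = Path("session_cookies.pkl")


def build_session(user_agent: str = USER_AGENT) -> requests.Session:
    """Create a session that sends the given User-Agent on every request."""
    session = requests.Session()
    session.headers.update({"User-Agent": user_agent})
    return session


def save_cookies(session: requests.Session, path: Path = COOKIE_FILE) -> None:
    """Persist the session cookies so the next run can skip the login."""
    with path.open("wb") as fh:
        pickle.dump(session.cookies, fh)


def load_cookies(session: requests.Session, path: Path = COOKIE_FILE) -> bool:
    """Load saved cookies into the session; return False if none are saved."""
    if not path.exists():
        return False
    with path.open("rb") as fh:
        session.cookies.update(pickle.load(fh))
    return True
```

A caller would try `load_cookies(session)` first and only log in and call `save_cookies(session)` when it returns `False`.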

I already made a class to manage the whole process. First we need to instantiate an object of FacebookPostsScraper, passing the email and password. Optionally, if your Facebook account isn't in English, you need to set the text of the link that opens a post, which only appears in the Facebook mobile version. Don't worry if you don't understand; ask me in the comments for the language you need and I will reply. By the way, these are the values for English and Spanish:

  • English: 'Full Story'
  • Spanish: 'Historia completa'
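
To show why the link text matters, here is a minimal sketch of how a post URL could be located by that text with BeautifulSoup and a CSS selector. The function name, the base URL and the sample HTML are assumptions made up for illustration; the real mobile markup and the repository's selectors may differ.

```python
from typing import List
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Assumed base URL of the mobile site, used to resolve relative links.
BASE_URL = "https://m.facebook.com"


def find_post_urls(html: str, post_url_text: str = "Full Story",
                   base_url: str = BASE_URL) -> List[str]:
    """Return absolute URLs of all links whose visible text equals post_url_text."""
    soup = BeautifulSoup(html, "html.parser")
    urls = []
    for link in soup.select("a[href]"):
        if link.get_text(strip=True) == post_url_text:
            urls.append(urljoin(base_url, link["href"]))
    return urls


# Made-up markup, only to exercise the function.
SAMPLE_HTML = """
<div><a href="/story.php?story_fbid=1">Full Story</a></div>
<div><a href="/profile.php">Profile</a></div>
<div><a href="/story.php?story_fbid=3">Historia completa</a></div>
"""

english_urls = find_post_urls(SAMPLE_HTML, "Full Story")
spanish_urls = find_post_urls(SAMPLE_HTML, "Historia completa")
```

With an account in another language, the same page would carry a different link text, which is why the text is a parameter.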

Once you instantiate an object, the class automatically logs in to Facebook and prepares the session for the requests. Now you can call the method get_posts_from_profile and pass a Facebook profile URL to get the posts.

Edit June 27th, 2020: Now you can export the scraped posts to CSV, Excel and JSON. See the end of each example to check it out.
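
If you want to see what such an export involves, here is a minimal sketch using only the standard library. It is not the implementation behind the class's export methods: the function names `export_json` and `export_csv`, the automatic file extension and the space-joined list columns are all assumptions.

```python
import csv
import json
from typing import Dict, List

FIELDS = ["published", "description", "images",
          "post_url", "external_links", "like_url"]


def export_json(posts: List[Dict], filename: str) -> str:
    """Write the posts to <filename>.json and return the path."""
    path = f"{filename}.json"
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(posts, fh, ensure_ascii=False, indent=2)
    return path


def export_csv(posts: List[Dict], filename: str) -> str:
    """Write the posts to <filename>.csv, joining list values with spaces."""
    path = f"{filename}.csv"
    with open(path, "w", encoding="utf-8", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=FIELDS)
        writer.writeheader()
        for post in posts:
            row = dict(post)
            # CSV cells are flat text, so list fields are joined into one string.
            row["images"] = " ".join(post["images"])
            row["external_links"] = " ".join(post["external_links"])
            writer.writerow(row)
    return path
```

JSON keeps the lists as they are, while CSV needs them flattened into a single cell.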

Examples

Example with single url

from FacebookPostsScraper import FacebookPostsScraper as Fps
from pprint import pprint as pp

# Enter your Facebook email and password
email = 'YOUR_EMAIL'
password = 'YOUR_PASSWORD'

# Instantiate an object
fps = Fps(email, password, post_url_text='Full Story')

# Example with single profile
single_profile = 'https://www.facebook.com/BillGates'
data = fps.get_posts_from_profile(single_profile)
pp(data)

fps.posts_to_csv('my_posts')  # You can export the posts as CSV document
# fps.posts_to_excel('my_posts')  # You can export the posts as Excel document
# fps.posts_to_json('my_posts')  # You can export the posts as JSON document

Example with multiple urls

from FacebookPostsScraper import FacebookPostsScraper as Fps
from pprint import pprint as pp

# Enter your Facebook email and password
email = 'YOUR_EMAIL'
password = 'YOUR_PASSWORD'

# Instantiate an object
fps = Fps(email, password, post_url_text='Full Story')

# Example with multiple profiles
profiles = [
    'https://www.facebook.com/zuck', # User profile
    'https://www.facebook.com/thepracticaldev', # Facebook page
    'https://www.facebook.com/groups/python' # Facebook group
]
data = fps.get_posts_from_list(profiles)
pp(data)

fps.posts_to_csv('my_posts')  # You can export the posts as CSV document
# fps.posts_to_excel('my_posts')  # You can export the posts as Excel document
# fps.posts_to_json('my_posts')  # You can export the posts as JSON document

Questions

Please feel free to ask anything you want in the comments section.


Discussion

 

Hello Hugo, thanks for sharing this. I am working on scraping Facebook data, but Facebook has blocked my account several times. Please make a video to help us see how it works.
How many posts does it extract?

 

The key to avoid being blocked is to scrape slowly (about 3-5 between profiles) and to use a User Agent like the one above (an old phone) that doesn't require JavaScript, so detection is harder for Facebook.

 

Hugo, this is great. When I run this, as is, I'm only getting about 6 posts. Is this normal? Also, how would you recommend I edit to search for group posts with a specific tag_id? I tried to rewrite a few lines to specify my url with "groups/{group_id}/post_tags/?post_tag_id={tag_ID}" but I was only ever brought to the group's standard timeline.

 

Hi, I had the same problem: when I run it, I only get 6 posts! Did you resolve this?
Thank you.

 
[deleted]
 

Hi, I'm not sure if I can post an email here, but you can ask me about any issues in the GitHub project.
github.com/adeoy/FacebookPostsScraper

 

Hi!

Could you please advise on how to build a scraper that extracts comments on a specific subject on Facebook?
Thanks for assisting.

 

Hi, the key to scraping Facebook is using a User Agent from an old phone; in this case I use a Nokia C3. To extract the comments from the posts, you need to open each post and locate the CSS selectors that identify the comments. I may update the post to add this functionality.

 

Hi, thank you for sharing this, it works well, but when I run the program I get just 6 posts from a Facebook page. How can I get more posts?
Thank you.

 

I'm getting the error TypeError: 'module' object is not callable while running the code. What may be the problem? Thanks in advance.