haha thank you! So it did eventually finish without error, but then I ended up with a list of "Post" objects and couldn't tell how to get the data out of them. From reading the GitHub documentation I tried various methods, but to no avail (this isn't a knock on you, more a knock on my learning curve).
So now, after a few hours of messing around, I tried to run the "joe biden code" for my own account, and even though I am setting login_first=False in the get_posts function, the Chrome driver brings me to a login page. I'm able to log into Instagram, but meanwhile my code says it has finished running without error, and my posts and scraped_posts objects are now just empty lists.
oh I guess I should also mention that my end goal is to collect data similar to the data you analyzed in your Donald Trump post. I saw you published a notebook of the analysis code (thank you!) but didn't see a line-by-line on how you got that data.
scraped Post objects contain the scraped data as instance attributes! Try using the to_dict method on one of the Posts and it should return a dictionary with the data it scraped for that Post. The keys/values of the returned dict will correspond one-to-one with the available instance attributes.
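To make the to_dict suggestion concrete, here's a minimal sketch of turning a list of scraped Posts into a list of plain dictionaries. The helper name posts_to_rows and the FakePost class (with its likes/caption fields) are made up for illustration so the snippet runs on its own; the only thing taken from the thread is that each scraped Post has a to_dict method.

```python
def posts_to_rows(posts):
    """Return one dict per scraped post by calling its to_dict method."""
    return [post.to_dict() for post in posts]


class FakePost:
    """Hypothetical stand-in for a scraped Post; the field names are invented."""

    def __init__(self, likes, caption):
        self.likes = likes
        self.caption = caption

    def to_dict(self):
        # A real Post would return whatever attributes it actually scraped.
        return {"likes": self.likes, "caption": self.caption}


rows = posts_to_rows([FakePost(10, "first"), FakePost(25, "second")])
for row in rows:
    print(row)
```

With real data you'd pass your scraped_posts list instead of the FakePost objects, and a list of dicts like this should be easy to hand to something like pandas.DataFrame for analysis.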
I'll take a look at the login_first bug rn and see if I can replicate it, it might be on the library's end! Instagram has been making a lot of changes over the last month or so and has been making it increasingly hard to scrape.
ahhh okay, so when you set login_first=False, Instagram is still redirecting to the login page automatically but instascrape is trying to start scrolling immediately which results in an empty list since there are no posts rendered on the page
to access dynamically rendered content like posts you're pretty much always gonna have to be logged in so it's best to leave login_first as True unless you're chaining scrapes and your webdriver is already logged in manually
amazing thank you! So I was able to get my first 10 posts no problem by specifying amount=10 but then I tried to do all ~500 pictures and after 232 pictures I came across this error:
ConnectionError: ('Connection aborted.', OSError("(54, 'ECONNRESET')"))
I'm guessing this means Instagram blocked my request? Have you come across this issue?
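Whether or not Instagram actually blocked the request isn't something the traceback alone can confirm, but one common way to cope with dropped connections on a long scrape is to go one post at a time, pause between requests, and retry a few times before giving up on a post. Here's a rough sketch of that idea, not anything built into the library: scrape_with_pauses and its parameters are hypothetical names, and it assumes each post object exposes a scrape() method that raises an OSError-derived exception when the connection drops.

```python
import time


def scrape_with_pauses(posts, pause=5.0, max_retries=3, sleep=time.sleep):
    """Scrape posts one by one, pausing between requests and retrying on
    connection errors. Returns (scraped, failed) so progress is never lost."""
    scraped, failed = [], []
    for post in posts:
        for attempt in range(1, max_retries + 1):
            try:
                post.scrape()  # assumption: each post has a scrape() method
                scraped.append(post)
                break
            except OSError:  # the builtin ConnectionError is a subclass of OSError
                if attempt == max_retries:
                    failed.append(post)  # give up on this post, keep going
                else:
                    sleep(pause * 2 ** attempt)  # wait longer before each retry
        sleep(pause)  # regular pause between posts
    return scraped, failed
```

The point of returning both lists is that the 232 posts that already worked stay in scraped, and the ones in failed can be retried later instead of starting over from zero.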