DEV Community

Discussion on: Scraping every post on an Instagram profile with less than 10 lines of Python

Collapse
gorpcorrespondent profile image
gorpcorrespondent

InvalidArgumentException: invalid argument: 'url' must be a string
Do you know why I might be getting this error?
Code is as follows:
import pandas as pd
from selenium.webdriver import Chrome
from instascrape import Profile, scrape_posts
from webdriver_manager.chrome import ChromeDriverManager

defining path for Google Chrome webdriver;

driver = webdriver.Chrome(ChromeDriverManager().install())

Scraping Joe Biden's profile

SESSIONID = 'session id' #Actual session id excluded on purpose
headers = {"user-agent": "Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Mobile Safari/537.36 Edg/87.0.664.57",
"cookie": f"sessionid={SESSIONID};"}
prof = Profile('instagram.com/username/') #username exlcuded as well
prof.scrape()

Scraping the posts

posts = prof.get_posts(webdriver=driver, login_first=True)
scraped, unscraped = scrape_posts(posts, silent=False, headers=headers, pause=10)

posts_data = [post.to_dict() for post in posts]
posts_df = pd.DataFrame(posts_data)
print(posts_df[['upload_date', 'comments', 'likes']])

The issue seems to be stemming from my 'get_posts' call