DEV Community

Braydon Buckner

How To Build Your Own Instagram Email Scraper

Important note: Please be advised that automatically scraping emails from Instagram is against their terms of service.

This is just an educational resource. If you don’t have the time or resources, or you want to stay 100% on the legal side of things, get in touch with Influencers Club and just buy the Instagram email addresses instead.

Here's how you can build your own Instagram email scraper and get a lot of other data points like phone numbers, bio text, images, and more.

1. Access the unofficial Instagram API

To access the unofficial Instagram API, you need to use the mobile endpoints with Python or PHP.

You can take the code sample from this public GitHub repo.

import json

import requests


# Excerpt: login() is a method of the API client class from the repo.
def login(self, force=False):
    """
    Authenticate this API instance.
    If already logged in (and not later logged out) does nothing (unless forced).
    :param force: if true, will attempt to log in even if already logged in.
    :return: dictionary of responses.
    """
    if not self._isloggedin or force:
        self._session = requests.Session()
        # if you need a proxy, do something like this:
        # self._session.proxies = {"https": "http://proxyip:proxyport"}
        full_response = self._sendrequest(
            'si/fetch_headers/?challenge_type=signup&guid=' + self.generate_uuid(False), login=True)
        data = {
            'phone_id': self.generate_uuid(True),
            '_csrftoken': full_response.cookies['csrftoken'],
            'username': self._username,
            'guid': self._uuid,
            'device_id': self._deviceid,
            'password': self._password,
            'login_attempt_count': '0'}
        try:
            full_response = self._sendrequest(
                'accounts/login/',
                post=self._generatesignature(json.dumps(data)),
                login=True)
        except InstagramAPIBase._2FA_Required as exception:
            # In order to login, need to provide the second factor (i.e. SMS code or backup code).
            # Use call-back to get this string.
            if not self._two_factor_callback:
                raise AuthenticationError("This account requires support for Two-Factor Authentication")
            two_factor_info = exception.two_factor_info
            verification_string = self._two_factor_callback(two_factor_info)
            data = {
                'verification_code': verification_string,
                'two_factor_identifier': two_factor_info['two_factor_identifier'],
                '_csrftoken': full_response.cookies['csrftoken'],
                'username': self._username,
                'device_id': self._deviceid,
                'password': self._password,
            }
            full_response = self._sendrequest(
                'accounts/two_factor_login/',
                post=self._generatesignature(json.dumps(data)),
                login=True)
        # Both login paths end here with a successful response.
        self._isloggedin = True
        decoded_text = json.loads(full_response.text)
        self._loggedinuserid = decoded_text["logged_in_user"]["pk"]
        self._ranktoken = "%s_%s" % (self._loggedinuserid, self._uuid)
        self._csrftoken = full_response.cookies["csrftoken"]
        return decoded_text
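
The login method calls `self._two_factor_callback(two_factor_info)` and expects a verification string back, but the callback itself isn't shown. Here is a minimal sketch of what such a callback could look like. The names `make_two_factor_callback` and `read_code` are hypothetical and not taken from the repo, and the only key read from `two_factor_info` is `two_factor_identifier`, the one the login code above uses.

```python
from typing import Any, Callable, Mapping


def make_two_factor_callback(
    read_code: Callable[[str], str] = input,
) -> Callable[[Mapping[str, Any]], str]:
    """Build a callback that asks a human for the SMS or backup code.

    `read_code` is injectable (defaults to the built-in input) so the
    callback can be exercised without a terminal.
    """

    def callback(two_factor_info: Mapping[str, Any]) -> str:
        identifier = two_factor_info["two_factor_identifier"]
        raw = read_code(f"Enter the verification code for login {identifier}: ")
        # Codes are often typed with spaces ("123 456"); send them without.
        return raw.strip().replace(" ", "")

    return callback
```

How the callback gets attached to the client (constructor argument, attribute, and so on) depends on the repo's class, so that part is left out here.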

Use a proxy from the location you're already at, and complete any captchas if necessary. Then let the profile rest for a couple of days before scraping, because Zucker is now watching you.

2. Get a lot of Instagram profiles

There are two main things to remember here. The first is to always use aged, phone-validated Instagram profiles; otherwise they'll get banned immediately.

The second thing is to never use your personal profiles.

You can buy Instagram accounts from Facebook groups, DMs on Instagram, or online stores.

3. Proxies

Don't put too many accounts on one IP: logging in to more than 5 accounts from the same IP is a huge no-no.

As with Instagram accounts, proxies come with their own problems. Zucker can detect proxy providers, and you can get into deep trouble before you find a good one.

To check whether a provider is on the radar, use this website and simply paste in your proxy's IP.

Building an Instagram email scraper

Here's a GitHub repo with all the code samples you need for creating your own Instagram scraper.

Once you're logged in with an IG profile from a proxy, getting the data is "easy enough".

You only need the API endpoints.

And the one for getting emails is:
/api/v1/users/{{user_id}}/info/

Field                   Description
user.public_email       Email address
user.username           The username
user.is_private         Whether the account is private
user.full_name          User’s full name
user.profile_pic_url    User’s profile photo URL
user.biography          User’s bio
user.external_url       User’s website
user.follower_count     Follower count
user.following_count    Following count
user.media_count        Number of posts
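
To make the field list concrete, here is a minimal sketch of reading those fields out of an already-decoded response. It assumes the endpoint returns JSON with a top-level `user` object, which the `user.` prefixes suggest but this post doesn't confirm. `InstagramProfile` and `parse_user_info` are hypothetical names, and the sample payload is made up for illustration.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass
class InstagramProfile:
    """Hypothetical container for the fields listed above."""
    username: str
    full_name: str
    is_private: bool
    public_email: Optional[str]
    biography: str
    external_url: Optional[str]
    profile_pic_url: Optional[str]
    follower_count: int
    following_count: int
    media_count: int


def parse_user_info(payload: Mapping[str, Any]) -> InstagramProfile:
    """Map a decoded response (assumed shape: {"user": {...}}) to a profile."""
    user = payload["user"]
    return InstagramProfile(
        username=user["username"],
        full_name=user.get("full_name", ""),
        is_private=bool(user.get("is_private", False)),
        # Treat a missing or empty email as "no public email".
        public_email=user.get("public_email") or None,
        biography=user.get("biography", ""),
        external_url=user.get("external_url") or None,
        profile_pic_url=user.get("profile_pic_url") or None,
        follower_count=int(user.get("follower_count", 0)),
        following_count=int(user.get("following_count", 0)),
        media_count=int(user.get("media_count", 0)),
    )


# Made-up payload, only to show the assumed shape.
sample = {
    "user": {
        "username": "example_user",
        "full_name": "Example User",
        "is_private": False,
        "public_email": "hello@example.com",
        "biography": "Just an example.",
        "follower_count": 1200,
        "following_count": 300,
        "media_count": 45,
    }
}
profile = parse_user_info(sample)
print(profile.username, profile.public_email)
```

Using `.get` with defaults keeps the parser from crashing when a field is absent, since not every profile fills in every field.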

For more information check Influencers Club.

Top comments (2)

Hina Batool

Thanks for the post. To explore public sources for scraping email addresses, check out this blog.

Somnath Sharma

Thanks for this.
One small query, does this work on private Instagram accounts as well?