Having lots of LinkedIn connections can be convenient. You and each connection agreed to be connected through the platform, which means sharing some public information, including your email in most cases (you can choose not to share it, though). This is all nice and dandy until you actually want to use the data you have from your connections... depending on what data you want.
Let's say you want to export all of your connections' data from LinkedIn. You can do this by following their instructions found here. The export generates a CSV file containing the following information for each connection:
First Name, Last Name, Email Address, Company, Position, Connected On
So what's the issue here? Well, even though it gives you an Email Address column in the CSV, it doesn't really provide any of your connections' emails! I guess they used to provide it and never updated the export to remove that column. I also checked out their public API and found nothing related to your connections' emails, but I did find this StackOverflow discussion, which indicated that they did in fact use to provide that info, but now they do not. WTF LinkedIn? So I decided to just scrape all of my connections' emails. I mean, I can access them manually, but it would take a shit load of time to go through all of my 2000+ connections.
What did I need the script to do to achieve this? Well, first I needed it to log in, then search for the connection's name, open their profile page, and get the email. Simple... right?
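If you want to confirm the empty column on your own export before bothering with a scraper, here's a rough sketch that counts how many rows actually have something in Email Address. The column names come from the export header above; the function names and the two sample rows are made up for illustration, and the parser assumes no line breaks inside quoted fields.

```javascript
// Splits one CSV line into fields, honoring double-quoted fields and "" escapes.
// Assumption: a quoted field never spans multiple lines.
function parseCsvLine(line) {
  const fields = [];
  let current = '';
  let inQuotes = false;
  for (let i = 0; i < line.length; i++) {
    const ch = line[i];
    if (inQuotes) {
      if (ch === '"' && line[i + 1] === '"') {
        current += '"'; // escaped quote inside a quoted field
        i++;
      } else if (ch === '"') {
        inQuotes = false; // closing quote
      } else {
        current += ch;
      }
    } else if (ch === '"') {
      inQuotes = true; // opening quote
    } else if (ch === ',') {
      fields.push(current);
      current = '';
    } else {
      current += ch;
    }
  }
  fields.push(current);
  return fields;
}

// Counts the data rows and how many of them have a non-empty Email Address.
function countEmails(csvText) {
  const lines = csvText.split(/\r?\n/).filter((l) => l.trim() !== '');
  const header = parseCsvLine(lines[0]);
  const emailIdx = header.indexOf('Email Address');
  if (emailIdx === -1) throw new Error('No "Email Address" column found');
  const rows = lines.slice(1).map(parseCsvLine);
  const withEmail = rows.filter((r) => (r[emailIdx] || '').trim() !== '').length;
  return { total: rows.length, withEmail };
}

// Invented sample rows, shaped like the export header above.
const sample = [
  'First Name,Last Name,Email Address,Company,Position,Connected On',
  'Ada,Example,,"Example, Inc.",Engineer,01 Jan 2018',
  'Bob,Sample,,Sample Co,Designer,02 Feb 2018',
].join('\n');

console.log(countEmails(sample)); // { total: 2, withEmail: 0 }
```

The quote handling is there because company names often contain commas, which would throw off a plain `split(',')`.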
Using LinkedIn's search input, getting the emails was working until they semi-blocked my account for suspicious behavior due to too many search requests. This was about 500 connections in.
Maybe I just had to be more careful with the number of searches in a given period of time. So I added an option to set an interval (default: 1 hour) and another to set the number of emails to look up per interval (default: 50).
LinkedIn are some sneaky bastards, they semi-blocked me again! I searched for information on this semi-blocking and found that it is specifically designed to stop automated bots from doing stuff on the site. Great....
I thought that maybe the search limit only applied to general searches, so I tried clicking directly on the connection when it shows up in the suggestion box that appears after typing in the connection's name.
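The interval and batch-size options boil down to something like the sketch below. This is not the actual script, just a minimal illustration of the idea; `planBatches` and its parameter names are hypothetical, with defaults matching the ones mentioned above (50 lookups per 1-hour interval).

```javascript
// Hypothetical helper: splits the connection names into batches and gives each
// batch a start delay, so at most `batchSize` lookups happen per `intervalMs`.
function planBatches(names, batchSize = 50, intervalMs = 60 * 60 * 1000) {
  if (!Number.isInteger(batchSize) || batchSize <= 0) {
    throw new RangeError('batchSize must be a positive integer');
  }
  const batches = [];
  for (let i = 0; i < names.length; i += batchSize) {
    batches.push({
      // Batch 0 starts right away, batch 1 after one interval, and so on.
      startAfterMs: (i / batchSize) * intervalMs,
      names: names.slice(i, i + batchSize),
    });
  }
  return batches;
}

// Usage with made-up names: 120 connections at the default settings.
const names = Array.from({ length: 120 }, (_, i) => `Connection ${i + 1}`);
const plan = planBatches(names);

// Start time in hours and batch size for each batch.
console.log(plan.map((b) => [b.startAfterMs / 3600000, b.names.length]));
// [ [ 0, 50 ], [ 1, 50 ], [ 2, 20 ] ]
```

At those defaults, 2000 connections work out to 40 batches, so the last batch would not start until 39 hours in.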
Well, turns out the library I'm using to scrape the page (NightmareJS) did not detect that DOM element, so I couldn't do anything with it. sigh....
After some head scratching and some thoughts of just quitting the little project, I finally came up with another approach... going directly to my connections section and using the connections search input, which only searches my connections. And this finally worked with no search limit!!
After all the emails are scraped, I just create an email.txt file with all the emails in there. And that was it!
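The final step could look roughly like this in Node. The one-email-per-line layout, the de-duplication, and the `saveEmails` name are my assumptions for the sketch, not necessarily what the script does.

```javascript
const fs = require('fs');

// Hypothetical helper: writes the scraped emails to a text file, one per line.
// Blank entries are dropped and duplicates removed (an assumption, see above).
function saveEmails(emails, filePath = 'email.txt') {
  const unique = [...new Set(emails.map((e) => e.trim()).filter((e) => e !== ''))];
  const contents = unique.length ? unique.join('\n') + '\n' : '';
  fs.writeFileSync(filePath, contents, 'utf8');
  return unique.length; // number of emails written
}
```

For example, `saveEmails(['a@example.com', ' a@example.com ', ''])` writes a single line to email.txt and returns 1.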
I wanted to get all of my LinkedIn connections' emails. LinkedIn does not offer a way to retrieve them when you export your connections' data, so I created a web scraper to get them.
For anyone interested in checking out the script, you can access it here.
If LinkedIn updates their page and changes the class of an element used in the script, it will stop working. You can check out the source code, verify whether any class has changed on LinkedIn, and update the script to make it work again.
Thanks for reading!