Washington! Oh yes, Washington. Scraping the Washington Secretary of State has turned out to be a gem. I feel they offer the best search options for the average person. Their site is built on some version of AngularJS and it’s well done.
This is the sixth state in the Secretary of State scraping series.
Look at that search. Start and end date. Business status. Business type. Name containing. And it all searches so fast! It’s very good. Good job, Washington.
Also, as another bonus, they allow you to download the search results as CSV! That’s huge. It doesn’t contain address or any other juicy information but it’s still pretty neat.
Investigation
This post may be short because of how easy Washington made it to get the data I wanted. The first thing I started with was the search. Washington allowed me to search with only the date of incorporation. Yes, you heard correctly. I just enter the start date and it gives me a list of all recently registered businesses. Money beets.
It was quickly obvious that the calls were being made with ajax so I just checked with the network tab and there was the network request with the search parameters.
This POST request goes out and gets the list of companies. The form data is extensive but could certainly be reused to call this endpoint directly. The response only contains a summary view of each company instead of all of its details, as I would expect.
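Replaying that POST from code could look roughly like the sketch below. The search URL and the form field names aren't shown in this post, so both are passed in by the caller (copy them from the network tab). I'm also assuming the form data is sent URL-encoded, and I'm using the built-in `fetch` here to keep the example dependency-free. `encodeForm` and `postSearch` are hypothetical helper names, not part of Washington's site.

```typescript
// Hypothetical helpers: the real search URL and form fields are not shown in
// this post, so the caller supplies both (copied from the network tab).
type FormFields = Record<string, string | number | boolean | null>;

// Encode fields the way a browser encodes a urlencoded form POST.
// null values are sent as empty strings.
function encodeForm(fields: FormFields): string {
  const params = new URLSearchParams();
  for (const [key, value] of Object.entries(fields)) {
    params.append(key, value === null ? "" : String(value));
  }
  return params.toString();
}

// POST the form to the search endpoint and return the parsed JSON.
// Assumption: the endpoint accepts application/x-www-form-urlencoded bodies.
async function postSearch(searchUrl: string, fields: FormFields): Promise<unknown> {
  const response = await fetch(searchUrl, {
    method: "POST",
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body: encodeForm(fields),
  });
  if (!response.ok) {
    throw new Error(`Search request failed with status ${response.status}`);
  }
  return response.json();
}
```

If the network tab shows the request going out as JSON instead, swapping `encodeForm(fields)` for `JSON.stringify(fields)` and changing the content type should be the only change needed.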
Also….hold up. What do we have here?
You can see that everything is null here but the fact that these fields exist suddenly makes this whole scrape especially interesting. Let’s take a look at what returns for the details of the business. Clicking on a business fires off a simple request URL with a numeric businessID: https://cfda.sos.wa.gov/api/BusinessSearch/BusinessInformation?businessID=1348199
Looking at the page directly, we see the standard stuff. Business name, registered agent, date of incorporation, address. All of which is valuable information.
Look at the ajax request, though.
Yes. Phone number and email address, while not publicly visible on the site, are returned in the ajax request. Even though this information is totally public and easily accessible to anyone who is reading this, I still didn’t feel fully comfortable exposing the information here. Hence my very craftily censored screenshot.
I should clarify. It doesn’t look like they all have a phone number and email address but from my spot checks, most had them. This was on both a search of recent filings and a search of businesses from 10 years ago. If the business was still active, it almost always had an email address and often had a phone number.
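Since not every business comes back with both fields, it can help to normalize the response before using it. Here is a minimal sketch, using the `PrincipalOffice.PhoneNumber`, `PrincipalOffice.EmailAddress`, and `PrincipalStreetAddress.FullAddress` fields from the response. The field types, the `ContactSummary` shape, and the `clean` and `summarizeContact` names are my own assumptions for illustration.

```typescript
// Only the fields this post reads from the response; the real payload has more.
// Assumption: these values are strings (or null/missing) when present.
interface BusinessInformation {
  BusinessName?: string | null;
  PrincipalOffice?: {
    PhoneNumber?: string | null;
    EmailAddress?: string | null;
    PrincipalStreetAddress?: { FullAddress?: string | null } | null;
  } | null;
}

// Hypothetical flattened shape for downstream use.
interface ContactSummary {
  businessName: string | null;
  phone: string | null;
  email: string | null;
  address: string | null;
}

// Treat missing, null, and blank strings the same way: as "not provided".
function clean(value: string | null | undefined): string | null {
  const trimmed = (value ?? "").trim();
  return trimmed === "" ? null : trimmed;
}

function summarizeContact(data: BusinessInformation): ContactSummary {
  const office = data.PrincipalOffice;
  return {
    businessName: clean(data.BusinessName),
    phone: clean(office?.PhoneNumber),
    email: clean(office?.EmailAddress),
    address: clean(office?.PrincipalStreetAddress?.FullAddress),
  };
}
```

Anything that comes back as `null` in the summary simply wasn't filled in for that business.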
The code
import axios from 'axios';

// Fetch one business by its numeric businessID and log the fields of interest.
async function getBusinessDetails(id: number) {
    const axiosResponse = await axios.get(`https://cfda.sos.wa.gov/api/BusinessSearch/BusinessInformation?businessID=${id}`);
    if (axiosResponse.data) {
        // Not every business has a principal office with contact details.
        if (axiosResponse.data.PrincipalOffice) {
            console.log('principal office PhoneNumber', axiosResponse.data.PrincipalOffice.PhoneNumber);
            console.log('principal office EmailAddress', axiosResponse.data.PrincipalOffice.EmailAddress);
            if (axiosResponse.data.PrincipalOffice.PrincipalStreetAddress) {
                console.log('principal office PrincipalStreetAddress ', axiosResponse.data.PrincipalOffice.PrincipalStreetAddress.FullAddress);
            }
        }
        console.log('BusinessName', axiosResponse.data.BusinessName);
        console.log('BusinessID', axiosResponse.data.BusinessID);
        console.log('DateOfIncorporation', axiosResponse.data.DateOfIncorporation);
    }
    // Pause between requests to go easy on their servers.
    await timeout(1000);
}

// Minimal version of the timeout helper (the original isn't shown in this post).
function timeout(ms: number) {
    return new Promise(resolve => setTimeout(resolve, ms));
}
That’s it. Seriously. I pass in the id I want, do a GET request, and get JSON back with all of the details.
There is a custom timeout function at the bottom because I’m not a monster. I don’t want to put a burden on their servers. The id for the most recently registered business was 1348653 as of April 12, 2020. Date of incorporation was 03-11-2020.
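One way to use that starting id is to count down from it and fetch each business in turn. The sketch below assumes business ids are roughly sequential, which I haven't confirmed, so it just skips any id that fails. `sleep` and `walkIdsDownward` are hypothetical names, and `fetchOne` stands in for whatever function fetches a single business.

```typescript
// Resolve after the given number of milliseconds.
function sleep(ms: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Visit `count` ids, counting down from `startId`, pausing between requests.
// Assumption: ids are roughly sequential; gaps and failures are skipped.
async function walkIdsDownward<T>(
  startId: number,
  count: number,
  fetchOne: (id: number) => Promise<T>,
  delayMs = 1000,
): Promise<Map<number, T>> {
  const results = new Map<number, T>();
  for (let id = startId; id > startId - count && id > 0; id--) {
    try {
      results.set(id, await fetchOne(id));
    } catch (error) {
      // The id may not exist or the request may have failed; keep going.
      console.warn(`Skipping businessID ${id}:`, error);
    }
    await sleep(delayMs);
  }
  return results;
}
```

If the function passed in as `fetchOne` already pauses after each request, like the one above does, pass `0` for `delayMs` so the waits don't stack.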
Conclusion
Washington made it easy to scrape and threw in a little bonus with the extra data it returned.
Looking for business leads?
Using the techniques talked about here at javascriptwebscrapingguy.com, we’ve been able to launch a way to access awesome business leads. Learn more at Cobalt Intelligence!
The post Jordan Scrapes Secretary of States: Washington appeared first on JavaScript Web Scraping Guy.