
Jordan Hansen

Originally published at javascriptwebscrapingguy.com

Jordan Scrapes Secretary of States: Washington

Demo code here

Washington! Oh yes, Washington. Scraping the Washington Secretary of State has turned out to be a gem. I feel they offer the best search options for the average person. Their site is built on some version of AngularJS and it’s well done.

This is the sixth state in the Secretary of State scraping series.

[Image: Washington Secretary of State business search]

Look at that search. Start and end date. Business status. Business type. Name containing. And it all searches so fast! It’s very good. Good job, Washington.

Also, as another bonus, they allow you to download the search results as CSV! That’s huge. It doesn’t contain address or any other juicy information but it’s still pretty neat.

Investigation


This post may be short because of how easy Washington made it to get the data I wanted. First thing I started with was the search. Washington allowed me to search with only date of incorporation. Yes, you heard correctly. I just enter the start date and it gives me a list of all recently registered businesses. Money beets.

It was quickly obvious that the calls were being made with AJAX, so I just checked the network tab and there was the request with the search parameters.

[Image: the search request in the network tab]

This POST request goes out and gets the list of companies. The form data is extensive but certainly could be used to just make a call directly to this list. This list only returns a smaller view instead of all of the details from the company, as I would expect.
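If you did want to call the list directly, the rough idea is to copy the form data from the network tab and replay it. Here is a minimal sketch of the encoding step, assuming the request is a standard URL-encoded form. The field names below are hypothetical placeholders, not the real ones; swap in whatever the network tab shows.

```typescript
// A bag of form fields exactly as captured from the network tab.
type SearchForm = Record<string, string>;

// Encode the fields as an application/x-www-form-urlencoded body.
function encodeForm(fields: SearchForm): string {
    return Object.entries(fields)
        .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(value)}`)
        .join('&');
}

// Hypothetical field names and values, for illustration only.
const capturedForm: SearchForm = {
    StartDateOfIncorporation: '04/01/2020',
    PageID: '1',
    PageCount: '25',
};

const searchBody = encodeForm(capturedForm);
console.log(searchBody);
```

The resulting string would then go out as the body of a POST to the search URL from the network tab, with a Content-Type header of application/x-www-form-urlencoded.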

Also….hold up. What do we have here?

[Image: phone number and email address potential fields returned]

You can see that everything is null here but the fact that these fields exist suddenly makes this whole scrape especially interesting. Let’s take a look at what comes back for the details of the business. Clicking on a business triggers a request to a simple URL with a numeric businessID: https://cfda.sos.wa.gov/api/BusinessSearch/BusinessInformation?businessID=1348199

Looking at the page directly, we see the standard stuff. Business name, registered agent, date of incorporation, address. All of which is valuable information.

[Image: business details from Washington Secretary of State]

Look at the AJAX request, though.

[Image: details of the AJAX request of a business]

Yes. Phone number and email address, while not publicly visible on the site, are returned in the AJAX request. Even though this information is totally public and easily accessible to anyone who is reading this, I still didn’t feel fully comfortable exposing the information here. Hence my very craftily censored screenshot.

I should clarify. It doesn’t look like they all have phone number and email address but from my spot checks, most had them. This was on both a search of recent filings and a search of businesses 10 years ago. If the business was still active, it almost always had an email address and often had a phone number.

The code


import axios from 'axios';

// Fetches the details for a single business and logs the fields of interest.
// timeout() is the custom delay helper defined at the bottom of the demo code.
async function getBusinessDetails(id: number) {
    const axiosResponse = await axios.get(`https://cfda.sos.wa.gov/api/BusinessSearch/BusinessInformation?businessID=${id}`);

    if (axiosResponse.data) {
        // Phone number and email come back in the JSON even though the site doesn't display them.
        if (axiosResponse.data.PrincipalOffice) {
            console.log('principal office PhoneNumber', axiosResponse.data.PrincipalOffice.PhoneNumber);
            console.log('principal office EmailAddress', axiosResponse.data.PrincipalOffice.EmailAddress);

            if (axiosResponse.data.PrincipalOffice.PrincipalStreetAddress) {
                console.log('principal office PrincipalStreetAddress ', axiosResponse.data.PrincipalOffice.PrincipalStreetAddress.FullAddress);
            }
        }
        console.log('BusinessName', axiosResponse.data.BusinessName);
        console.log('BusinessID', axiosResponse.data.BusinessID);
        console.log('DateOfIncorporation', axiosResponse.data.DateOfIncorporation);
    }

    // Pause between requests to go easy on their servers.
    await timeout(1000);
}

That’s it. Seriously. I pass in the id I want, do a GET request, and get JSON back with all of the details.
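If you'd rather keep the data than log it, the same fields can be pulled into a plain object. This is only a sketch: the property names come from the response fields used above, but the types (and which fields can be null or missing) are my assumptions, and toLead and BusinessLead are names made up for this example.

```typescript
// Assumed shape of the details response; only the fields used in this post.
interface BusinessDetails {
    BusinessName?: string | null;
    BusinessID?: number | null;
    DateOfIncorporation?: string | null;
    PrincipalOffice?: {
        PhoneNumber?: string | null;
        EmailAddress?: string | null;
        PrincipalStreetAddress?: { FullAddress?: string | null } | null;
    } | null;
}

// Hypothetical flattened record for downstream use.
interface BusinessLead {
    businessId: number | null;
    name: string | null;
    dateOfIncorporation: string | null;
    phone: string | null;
    email: string | null;
    address: string | null;
}

// Flatten the nested response, falling back to null for anything missing.
function toLead(data: BusinessDetails): BusinessLead {
    const office = data.PrincipalOffice;
    return {
        businessId: data.BusinessID ?? null,
        name: data.BusinessName ?? null,
        dateOfIncorporation: data.DateOfIncorporation ?? null,
        phone: office?.PhoneNumber ?? null,
        email: office?.EmailAddress ?? null,
        address: office?.PrincipalStreetAddress?.FullAddress ?? null,
    };
}
```

Inside getBusinessDetails, this would be called as toLead(axiosResponse.data) in place of the console.log calls.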

There is a custom timeout function at the bottom because I’m not a monster. I don’t want to put a burden on their servers. The id for the most recently registered business was 1348653 as of April 12, 2020. Date of incorporation was 03-11-2020.
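For reference, a delay helper like that is usually a one-line promise wrapper around setTimeout. The sketch below shows one possible version, plus a loop that walks backward from a starting id. It assumes ids are handed out sequentially, which I haven't confirmed, and idsDescending and scrapeRecent are names invented for this example.

```typescript
// Resolve after the given number of milliseconds.
function timeout(ms: number): Promise<void> {
    return new Promise((resolve) => setTimeout(resolve, ms));
}

// List `count` ids counting down from `startId` (assumes sequential ids).
function idsDescending(startId: number, count: number): number[] {
    return Array.from({ length: count }, (_, i) => startId - i);
}

// Fetch each id in turn, pausing between requests to stay polite.
async function scrapeRecent(
    startId: number,
    count: number,
    fetchDetails: (id: number) => Promise<void>,
    delayMs: number = 1000,
): Promise<void> {
    for (const id of idsDescending(startId, count)) {
        await fetchDetails(id);
        await timeout(delayMs);
    }
}
```

Something like scrapeRecent(1348653, 10, getBusinessDetails, 0) would cover the ten most recent ids. The delay is set to 0 there because getBusinessDetails already waits a second on its own.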

Conclusion

Washington made it easy to scrape and threw in a little bonus with the extra data it returned.


Looking for business leads?

Using the techniques talked about here at javascriptwebscrapingguy.com, we’ve been able to launch a way to access awesome business leads. Learn more at Cobalt Intelligence!

The post Jordan Scrapes Secretary of States: Washington appeared first on JavaScript Web Scraping Guy.
