ApiForSeo

Posted on • Originally published at serpdog.io

Scrape Google Search Results With Go

Go is an open-source, procedural programming language launched in 2009 by three Google developers: Robert Griesemer, Rob Pike, and Ken Thompson. It is statically typed and has excellent support for concurrency, making it a useful tool for web scraping.

Go is designed to be simple to learn, and its built-in concurrency support makes it a fast and robust language.

Web scraping, data scraping, or data extraction is the process of extracting specific pieces of data from websites. It can be done manually at a small scale, but the term usually refers to automated extraction using a scraping bot or a crawler.

In this tutorial, we’ll be scraping Google Search Results using Go. We will also discuss why Go is a good alternative to other languages for scraping Google Search Results.

By the end of the article, you will be able to deal with the complex HTML structure of Google Search Results. You can also apply this knowledge to other web scraping tasks.

Why Scrape Google Search Results?

Scraping Google Search Results can provide you with a variety of benefits:

SERP Monitoring: Scraping Google Search Results allows you to monitor your search rankings on Google, and you can use this data to improve your positions on the search engine.

Media Monitoring: With scraped Google data, you can monitor negative coverage or public sentiment that could damage your company's image.

Competition Analysis: With Google Search data, you can monitor your competitors' tactics and plan your own strategy to stay ahead in the market.

Why Go for scraping Google?

Go has gained great popularity in recent years thanks to the following features:

Concurrency: One of Go's great features is built-in support for concurrency. Lightweight goroutines let many tasks run at the same time within a single process, making it possible to scrape multiple pages at once.

Simple Syntax: Go is designed to be easy to learn and read, which keeps the complexity of web scraping code to a minimum.

Compiled Language: Go is a compiled language with an efficient garbage collector, which is why it can offer such strong performance.
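To make the concurrency point concrete, here is a minimal sketch of fetching several pages at once with goroutines and a sync.WaitGroup. The fetchAll function and the stand-in fetcher are hypothetical names made up for this illustration; a real scraper would make an HTTP request inside the callback.

```go
package main

import (
	"fmt"
	"sync"
)

// fetchAll calls fetch once per URL, each in its own goroutine, and
// collects the results in a map keyed by URL. fetch is a hypothetical
// callback; a real scraper would wrap an HTTP request in it.
func fetchAll(urls []string, fetch func(string) string) map[string]string {
	var (
		mu      sync.Mutex
		wg      sync.WaitGroup
		results = make(map[string]string, len(urls))
	)
	for _, u := range urls {
		wg.Add(1)
		go func(u string) {
			defer wg.Done()
			body := fetch(u) // runs concurrently with the other fetches
			mu.Lock()        // the map is shared, so guard each write
			results[u] = body
			mu.Unlock()
		}(u)
	}
	wg.Wait() // block until every goroutine has finished
	return results
}

func main() {
	// Stand-in fetcher so the sketch runs without network access.
	fake := func(u string) string { return "body of " + u }

	pages := fetchAll([]string{"page-1", "page-2", "page-3"}, fake)
	for u, body := range pages {
		fmt.Println(u, "->", body)
	}
}
```

The mutex matters here because Go maps are not safe for concurrent writes. The order of the printed lines is not fixed, since map iteration order in Go is unspecified.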

Let’s start scraping Google Search Results With Go

In this section, we will focus on preparing a basic script to scrape the initial ten Google search results, including their title, description, link, etc.

Set-Up:

If you have not already installed Go, you can watch these videos for the installation.

  1. How to set up GO on Windows?

  2. How to set up GO on MacOS?

Requirements:

For scraping Google search results with Go, we will install one library:

  1. GoQuery: a Go library that offers jQuery-like features for parsing and querying HTML.

You can install this library in your project folder by running the command below.

 go get github.com/PuerkitoBio/goquery

Process:

This section assumes you have already set up your Go project folder. We will begin by fetching the HTML from the target URL and then parse it with GoQuery to extract the required data.

This is the URL we are going to target:

https://www.google.com/search?q=go+tutorial&gl=us&hl=en
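If you want to search for other phrases, the same URL can be assembled from its parts with the standard net/url package, which also takes care of escaping. This is only a sketch: buildSearchURL is a hypothetical helper, and the reading of gl as the country and hl as the interface language is the commonly used interpretation of these parameters rather than something stated in this article.

```go
package main

import (
	"fmt"
	"net/url"
)

// buildSearchURL is a hypothetical helper that assembles the search URL
// from its parts and escapes the query for us.
func buildSearchURL(query, country, language string) string {
	params := url.Values{}
	params.Set("q", query)     // the search phrase
	params.Set("gl", country)  // assumed meaning: country of the search
	params.Set("hl", language) // assumed meaning: interface language

	// Encode sorts the keys alphabetically and escapes the values.
	return "https://www.google.com/search?" + params.Encode()
}

func main() {
	fmt.Println(buildSearchURL("go tutorial", "us", "en"))
	// prints https://www.google.com/search?gl=us&hl=en&q=go+tutorial
}
```

The parameters come out in a different order than in the hand-written URL because Encode sorts them by key. The order of query parameters does not change their meaning.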

So, let us start creating our scraper by importing the libraries we’ll use later.

package main

import (
 "fmt"
 "log"
 "net/http"

 "github.com/PuerkitoBio/goquery"
)

Then, we will define a function to get the data from Google Search results.

func getData() {
 url := "https://www.google.com/search?q=go+tutorial&gl=us&hl=en"
 req, err := http.NewRequest("GET", url, nil)
 if err != nil {
  log.Fatal(err)
 }

 req.Header.Set("User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.54 Safari/537.36")

After defining the URL, we create a request object with http.NewRequest(), which takes three parameters: the request method, the target URL, and the request body (nil in our case). If an error occurs while creating the request, log.Fatal() prints it to the terminal and terminates the program.

We also set the User-Agent header so that the request looks like it comes from a regular browser.
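Because log.Fatal() ends the whole program, a scraper that is meant to be reused may prefer to hand the error back to the caller. The following is a rough sketch of that alternative, not the approach used in the rest of this tutorial; newSearchRequest is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"net/http"
)

// newSearchRequest is a hypothetical helper: it builds the GET request and
// sets the User-Agent header, but returns any error to the caller instead
// of ending the program.
func newSearchRequest(targetURL, userAgent string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, targetURL, nil)
	if err != nil {
		return nil, fmt.Errorf("building request: %w", err)
	}
	req.Header.Set("User-Agent", userAgent)
	return req, nil
}

func main() {
	req, err := newSearchRequest(
		"https://www.google.com/search?q=go+tutorial&gl=us&hl=en",
		"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.54 Safari/537.36",
	)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// No request is sent here; we only inspect what was built.
	fmt.Println(req.Method, req.URL.Host)
	fmt.Println(req.Header.Get("User-Agent"))
}
```

With this shape, the caller decides whether a failed request should stop the program, be retried, or simply be logged.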

 client := &http.Client{}
 res, err := client.Do(req)
 if err != nil {
  log.Fatal(err)
 }
 defer res.Body.Close()

 doc, err := goquery.NewDocumentFromReader(res.Body)
 if err != nil {
  log.Fatal(err)
 }

Step-by-step explanation:

  1. First, we create an HTTP client and call client.Do() to send the request described by the req object to the server.

  2. Then, we use defer to make sure the response body is closed when the function returns.

  3. Finally, we call goquery.NewDocumentFromReader() with res.Body as a parameter to parse the HTML into a document we can query.
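The steps above do not look at the HTTP status code, so an error page would be parsed as if it were a page of results. Here is a small sketch of two optional safeguards: a client timeout and a status check. Both newClient and checkStatus are hypothetical helpers, and the 10-second timeout is an arbitrary choice.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// newClient returns an HTTP client that gives up after 10 seconds instead
// of waiting indefinitely. The timeout value is an arbitrary choice.
func newClient() *http.Client {
	return &http.Client{Timeout: 10 * time.Second}
}

// checkStatus is a hypothetical helper that turns any non-200 status code
// into an error, so an error page is not parsed as a page of results.
func checkStatus(code int) error {
	if code != http.StatusOK {
		return fmt.Errorf("unexpected status: %d %s", code, http.StatusText(code))
	}
	return nil
}

func main() {
	client := newClient()
	fmt.Println("timeout:", client.Timeout)

	// In getData() this would be called as checkStatus(res.StatusCode)
	// right after client.Do(req) succeeds and before parsing the body.
	fmt.Println(checkStatus(http.StatusOK))
	fmt.Println(checkStatus(http.StatusTooManyRequests))
}
```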

That completes the scraping part of this program. Let us now move on to the parsing part by searching the HTML for the required elements.

Inspecting Google Search Results

If you inspect the HTML, you will find that every organic result sits inside a div element with the class g.

So, looping over these div.g elements will let us read the data each one holds.

 c := 0
 doc.Find("div.g").Each(func(i int, result *goquery.Selection) {

The c variable keeps track of the position of each result.

Then, we will extract the title, link, and description from the HTML.

Finding Tags for the Required Elements

If you look inside the div.g container, you will find that the title is in an h3 element, the link is in the a element directly under .yuRUbf (selector .yuRUbf > a), and the description is in the element with the class VwiC3b. The parser below simply takes the first a element inside the container for the link.

This makes our parser look like this:

  title := result.Find("h3").First().Text()
  link, _ := result.Find("a").First().Attr("href")
  snippet := result.Find(".VwiC3b").First().Text()

  fmt.Printf("Title: %s\n", title)
  fmt.Printf("Link: %s\n", link)
  fmt.Printf("Snippet: %s\n", snippet)
  fmt.Printf("Position: %d\n", c+1)
  fmt.Println()

  c++
 })
}

Then, we will call the getData() function to execute our scraper.

func main() {
 getData()
}

Run this code in your terminal. You will get results like this:

Title: Tutorial: Get started with Go
Link: https://go.dev/doc/tutorial/getting-started
Snippet: In this tutorial, you'll get a brief introduction to Go programming. Along the way, you will: Install Go (if you haven't already). Write some simple "Hello, ...
Position: 1

Title: Go Tutorial
Link: https://www.w3schools.com/go/
Snippet: Well organized and easy to understand Web building tutorials with lots of examples of how to use HTML, CSS, JavaScript, SQL, PHP, Python, Bootstrap, ...
Position: 2


Title: Go Tutorial
Link: https://www.tutorialspoint.com/go/index.htm
Snippet: This tutorial is designed for software programmers with a need to understand the Go programming language from scratch. This tutorial will give you enough ...
Position: 4

Congratulations🎉🎉!!! You have successfully created a scraper to extract Google Search Results.

However, this solution can get your IP blocked if you use it to scrape large amounts of data from Google. Instead, you can use one of the Google scraper APIs available on the market, which use a large pool of residential and data center proxies to bypass the anti-scraping mechanisms implemented by Google.

Using Google Search API to Scrape Search Results

Serpdog offers a simple API solution for scraping Google Search Results through its Google SERP APIs. It also manages proxies and CAPTCHAs for a smooth scraping experience, and it returns not only organic results but also many of the other featured snippets found in Google Search Results.

You will also receive 100 free requests upon signing up.

You will get an API key after registering on our website. Put it in place of APIKEY in the code below, and you will be able to scrape Google Search Results at a rapid speed.

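package main

import (
 "fmt"
 "io"
 "net/http"
)

func main() {
 // Replace APIKEY with your own API key.
 url := "https://api.serpdog.io/search?api_key=APIKEY&q=go+lang+tutorial&gl=us"

 client := &http.Client{}
 req, err := http.NewRequest("GET", url, nil)
 if err != nil {
  fmt.Println(err)
  return
 }

 req.Header.Set("Content-Type", "application/json")

 // Send the request to the API.
 res, err := client.Do(req)
 if err != nil {
  fmt.Println(err)
  return
 }
 defer res.Body.Close()

 // io.ReadAll replaces the deprecated ioutil.ReadAll (Go 1.16 and later).
 body, err := io.ReadAll(res.Body)
 if err != nil {
  fmt.Println(err)
  return
 }

 // Print the raw response body.
 fmt.Println(string(body))
}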
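The code above prints the raw response body. Assuming the body is a JSON object (its exact fields are not shown in this article, so none are assumed here), a cautious first step is to decode it generically and look at its top-level keys. topLevelKeys is a hypothetical helper, and the JSON in main is a placeholder, not a real API response.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// topLevelKeys decodes a JSON object without assuming anything about its
// fields and returns the names of its top-level keys in sorted order.
func topLevelKeys(body []byte) ([]string, error) {
	var data map[string]json.RawMessage
	if err := json.Unmarshal(body, &data); err != nil {
		return nil, fmt.Errorf("decoding response: %w", err)
	}
	keys := make([]string, 0, len(data))
	for k := range data {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys, nil
}

func main() {
	// Placeholder JSON, not a real API response.
	body := []byte(`{"example_key": [], "another_key": {}}`)

	keys, err := topLevelKeys(body)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(keys) // prints [another_key example_key]
}
```

Once you know which keys the response contains, you can define a struct with matching json tags and decode straight into it.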

Conclusion:

In this tutorial, we learned to scrape Google Search Results using Go. Feel free to message me if anything needs clarification. Follow me on Twitter. Thanks for reading!

Additional Resources

  1. Web Scraping With Python

  2. Web Scraping With Node JS

  3. Scrape Yelp Business Reviews

  4. Scraping Google News Results

  5. Scrape Google Maps Reviews
