QAProEngineer

Analyzing data accuracy with Puppeteer and Axios (example: book prices)

const puppeteer = require("puppeteer");
const axios = require("axios");

const main = async () => {
  // Launch a new browser instance
  const browser = await puppeteer.launch({
    headless: false,
    defaultViewport: null,
  });

  // Create a new page
  const page = await browser.newPage();

  // Base URL of the paginated top-100 listing; the page number is appended below
  const baseUrl = "https://www.example.com/book-top-100?p=";

  // Define the number of listing pages to iterate through
  const pageCount = 2;

  // Create a new array for the results
  const results = [];

  // Iterate through each listing page (1..pageCount inclusive)
  for (let i = 1; i <= pageCount; i++) {
    // Construct the URL for the current page
    const currentUrl = `${baseUrl}${i}`;
    // Navigate to the page
    await page.goto(currentUrl);

    // get all the product listing urls
    const urls = await page.$$eval("li.item.product > div > a", (links) =>
      links.map((link) => link.href)
    );

    for (const url of urls) {
      await page.goto(url);

      // Get all the swatches for the book
      const swatchElements = await page.$$("p.label");

      // Loop through each swatch and click on it to show the price and SKU
      for (let swatchElement of swatchElements) {
        await swatchElement.click();

        // Extract the price and SKU
        const sku = await page.$eval(".col.data.isbn_13", (elem) =>
          elem.textContent.trim()
        ); // extract the SKU for each selection of format.

        let priceText;
        const swatchElementType = await swatchElement.evaluate((el) =>
          el.textContent.trim()
        );
        if (
          swatchElementType.includes("Audiobook") ||
          swatchElementType.includes("eBook")
        ) {
          priceText = await page.$$eval(
            ".price-swatch span.price, .normal-price",
            (prices) => prices.map((price) => price.textContent.trim())
          );
        } else if (
          swatchElementType.includes("Paperback") ||
          swatchElementType.includes("Hardcover")
        ) {
          priceText = await page.$$eval(
            "p.old-price, span.old-price",
            (prices) => prices.map((price) => price.textContent.trim())
          );
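        } else {
          // Unknown format: no price selector applies, so skip this swatch
          console.warn(
            `Unrecognized format "${swatchElementType}" for SKU ${sku}; skipping`
          );
          continue;
        }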

        const price = priceText.flatMap((price) =>
          price.split("$").filter(Boolean)
        ); // extract the numerical price values for this product

        // Fetch the price for this SKU from the API and compare it with the web price.
        const apiBook = `https://api.example.com/products/${sku}`;
        const response = await axios.get(apiBook);

        // Normalize both sides to numbers. `price` is an array, so read its
        // first entry explicitly instead of relying on array-to-string coercion.
        const apiPrice = Number(response.data[0].price_amount);
        const webPrice = Number.parseFloat(price[0]);

        console.log(`API price=${apiPrice}, web price=${webPrice}`);
        if (webPrice === apiPrice) {
          console.log(`Price for SKU ${sku} matches in web and api price`);
        } else {
          console.log(`Price for SKU ${sku} does not match in web and API`);
          // Add the results to the array
          results.push({
            price,
            apiPrice,
            sku,
          });
        }
      }
    } // end url loop
  }
  // Close the browser instance
  await browser.close();

  // Write the mismatched results to a JSON file
  const fs = require("fs");
  fs.writeFileSync(
    "book-prices-details.json",
    JSON.stringify(results, null, 2)
  );
};

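// Report failures instead of leaving an unhandled promise rejection
main().catch((err) => {
  console.error(err);
  process.exit(1);
});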


Introduction:
Businesses need to stay competitive, especially when it comes to pricing their products. One way to do this is to regularly compare your product prices with a reference source, such as competitors' prices or your own pricing API. This post explains a Puppeteer script that automates extracting book prices from a website and comparing them with the prices returned by an API.

Puppeteer and Axios:
Puppeteer is a Node.js library for browser automation: it lets you control a browser (Chrome by default) and interact with web pages programmatically, and it runs headless unless configured otherwise. Axios is a popular promise-based JavaScript library for making HTTP requests. Together, they cover both halves of this task: scraping rendered pages and querying an API.
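
As a minimal sketch of the Axios half, the helper below looks up the API price for one SKU. The function name `fetchApiPrice` and the injected `client` parameter are illustrative assumptions, not part of the original script; the URL pattern and the `data[0].price_amount` response shape are taken from the script above. Passing the client in keeps the helper testable without network access.

```javascript
// Hypothetical helper (not in the original script): look up the API price for one SKU.
// `client` is anything with an axios-style `get(url)` that resolves to `{ data }`.
async function fetchApiPrice(client, sku) {
  const url = `https://api.example.com/products/${encodeURIComponent(sku)}`;
  const response = await client.get(url);

  // Assumed response shape, as in the script: an array whose first item
  // carries `price_amount`.
  const first = Array.isArray(response.data) ? response.data[0] : undefined;
  if (!first || first.price_amount == null) {
    return null; // no price available for this SKU
  }

  const amount = Number(first.price_amount);
  return Number.isNaN(amount) ? null : amount;
}
```

With Axios this would be called as `await fetchApiPrice(axios, sku)`, since `axios.get(url)` resolves to a response object with a `data` property. Returning `null` for a missing price lets the caller report the SKU instead of crashing on `data[0]`.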

Script Overview:
The script above performs the following steps:

1. Launch a browser using Puppeteer (the script sets headless: false, so the browser window is visible).
2. Navigate to a website that lists the top 100 books.
3. Iterate through multiple pages of book listings.
4. Extract the URLs of individual book product pages.
5. For each book product page, extract the price and SKU information for different book formats (e.g., Audiobook, eBook, Paperback, Hardcover).
6. Make API calls to a price comparison service to fetch the correct prices for the books.
7. Compare the web prices with the API prices to identify any discrepancies.
8. Store the discrepancies in a structured format (JSON file).
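
The price extraction and comparison in steps 5 and 7 can be sketched as pure functions. The names `parsePrices`, `toCents`, and `pricesMatch` are hypothetical, and the sketch assumes US-style prices such as `$1,299.00`. Comparing in integer cents is a swapped-in technique, not what the script does; it sidesteps floating-point surprises and string-versus-number mismatches.

```javascript
// Hypothetical helpers (not in the original script) for parsing and comparing prices.

// Pull every dollar amount out of scraped text, e.g. "$12.99 $9.99" -> [12.99, 9.99].
function parsePrices(text) {
  const matches = String(text).match(/\d[\d,]*(?:\.\d+)?/g) || [];
  return matches.map((m) => Number(m.replace(/,/g, "")));
}

// Convert a dollar amount (number or numeric string) to integer cents.
// Missing values become NaN rather than silently turning into 0.
function toCents(amount) {
  if (amount === null || amount === undefined || amount === "") return NaN;
  return Math.round(Number(amount) * 100);
}

// True only when both prices are valid numbers and equal to the cent.
function pricesMatch(webPrice, apiPrice) {
  const web = toCents(webPrice);
  const api = toCents(apiPrice);
  return Number.isFinite(web) && Number.isFinite(api) && web === api;
}

// Usage: compare the first scraped price with an API value that arrives as a string.
const [webPrice] = parsePrices("$12.99");
console.log(pricesMatch(webPrice, "12.99")); // true
```

A missing or unparseable price on either side counts as a mismatch here, so such SKUs end up in the discrepancy report rather than being skipped.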

Conclusion:
This Puppeteer and Axios script automates the process of comparing book prices between a website and an API, helping businesses make informed pricing decisions. It demonstrates the power of web scraping and data extraction with Puppeteer and the ease of making API requests with Axios. By running such scripts regularly, businesses can catch pricing discrepancies early and keep their prices competitive.
