DEV Community

Scrape the latest stock prices with node.js and puppeteer!

Code_Jedi
Javascript, Node.js, Python, PHP, React and Vue. Coding since 2017
Updated on ・4 min read

Hey, fellow devs 👋

If you're looking into web scraping with JavaScript, I've got a simple project to start you off: in this tutorial, I'll show you how to scrape the latest Tesla stock price using Node.js and Puppeteer.

Let's get started!


First of all, you will need to install Puppeteer by running "npm i puppeteer". If you don't have npm, a package.json and node_modules set up yet, here's a great tutorial on how to do so: https://www.sitepoint.com/npm-guide/.

After you've installed Puppeteer, create a new JavaScript file and require puppeteer on the first line:

const puppeteer = require('puppeteer');

Then create the async function in which we are going to write our main code:

const puppeteer = require('puppeteer');

async function start() {

}
start();
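
One thing worth knowing about this wrapper: "start()" returns a promise, so an error thrown inside it would otherwise surface as an unhandled rejection. Here is a minimal sketch of catching it. The "run" helper is hypothetical and not part of the original code.

```javascript
// Hypothetical helper (not from the original tutorial): runs an async
// entry point and reports failures instead of leaving the promise unhandled.
async function run(main) {
  try {
    await main();
    return 0; // success
  } catch (err) {
    console.error('Scraper failed:', err);
    return 1; // failure
  }
}

async function start() {
  // The scraping code from the following steps goes here.
}

// Use the result as the process exit code.
run(start).then((code) => {
  process.exitCode = code;
});
```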

Now we're ready to start scraping.

First of all, you need to launch a new browser instance and define the URL that your web scraper is going to visit:

const puppeteer = require('puppeteer');

async function start() {
  const url = 'https://finance.yahoo.com/quote/TSLA?p=TSLA&.tsrc=fin-srch';
  const browser = await puppeteer.launch({
    headless: false
  });
}

Next, call "newPage()" to open a new page in the browser, then navigate to the URL we defined using "goto()":

const puppeteer = require('puppeteer');

async function start() {
  const url = 'https://finance.yahoo.com/quote/TSLA?p=TSLA&.tsrc=fin-srch';
  const browser = await puppeteer.launch({
    headless: false
  });
  const page = await browser.newPage();
  await page.goto(url);
}
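
As a hedged variation on the steps so far, the sketch below wraps them so the browser is closed even if navigation or scraping fails. "withPage" is a hypothetical helper. Passing the puppeteer module in as an argument is my own choice, so the sketch can be tried with a stand-in object.

```javascript
// Hypothetical helper: launches a browser, opens `url`, hands the page to
// `task`, and always closes the browser afterwards.
async function withPage(puppeteer, url, task) {
  const browser = await puppeteer.launch({ headless: false });
  try {
    const page = await browser.newPage();
    await page.goto(url);
    return await task(page);
  } finally {
    // Runs on success and on failure, so no browser window is left open.
    await browser.close();
  }
}
```

It could be called as "withPage(require('puppeteer'), url, async (page) => { ... })", with the scraping code placed inside the callback.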

For this next step, go to https://finance.yahoo.com/quote/TSLA?p=TSLA&.tsrc=fin-srch, right-click the current stock price, and click "Inspect".

The browser's developer tools panel will open on the right of your window. Find the stock price element in it.

Next, right-click the stock price element and click "Copy full XPath".
This gives us a way of accessing the stock price element.


Once we have the XPath of the stock price element, we can add these 3 lines of code to our function:

  const element = await page.waitForXPath("put the stock price XPath here");
  const price = await page.evaluate(element => element.textContent, element);
  console.log(price);

The "page.waitForXPath()" function waits until the stock price element appears on the page and returns a handle to it.
Next, the "page.evaluate()" function gets the text content of the stock price element, which is then printed by the "console.log()" function.
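
As a sketch, these two calls can be wrapped in one function. "readText" is a hypothetical name. One caveat that I'm fairly sure of, but please check it against your installed version: newer Puppeteer releases dropped "page.waitForXPath()" in favour of "page.waitForSelector()" with an "xpath/" prefix, so the sketch falls back to that when the older method is missing.

```javascript
// Hypothetical helper: waits for the element at `xpath` and returns its text.
async function readText(page, xpath) {
  const element =
    typeof page.waitForXPath === 'function'
      ? await page.waitForXPath(xpath) // older Puppeteer, as used in this tutorial
      : await page.waitForSelector('xpath/' + xpath); // assumed newer equivalent
  // The callback runs in the page context and returns the element's text.
  return page.evaluate((el) => el.textContent, element);
}
```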


At this point, our code would look something like this:

const puppeteer = require('puppeteer');

async function start() {
  const url = 'https://finance.yahoo.com/quote/TSLA?p=TSLA&.tsrc=fin-srch';
  const browser = await puppeteer.launch({
    headless: false
  });
  const page = await browser.newPage();
  await page.goto(url);
  const element = await page.waitForXPath("/html/body/div[1]/div/div/div[1]/div/div[2]/div/div/div[5]/div/div/div/div[3]/div[1]/div[1]/span[1]");
  const price = await page.evaluate(element => element.textContent, element);
  console.log(price);
}
start();

If you run the code at this point, you will find that a pop-up appears when the browser opens the URL you defined earlier.

To get around this, add these 2 lines of code to your function before defining the "element" variable:

const accept = "#consent-page > div > div > div > form > div.wizard-body > div.actions.couple > button";
await page.click(accept);

This will locate the "Accept All" button and click it to make the popup go away.
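
The pop-up may not be shown to every visitor (an assumption on my part; it seems to depend on region), and "page.click()" throws when the selector matches nothing. Here is a hedged sketch that waits briefly for the button and clicks it only if it appears. "dismissConsent" and the 5-second default are hypothetical.

```javascript
// Hypothetical helper: clicks the consent button if it shows up in time.
// Returns true when the button was clicked, false when it never appeared.
async function dismissConsent(page, selector, timeoutMs = 5000) {
  try {
    await page.waitForSelector(selector, { timeout: timeoutMs });
  } catch (err) {
    // Any failure to find the button is treated as "no pop-up was shown".
    return false;
  }
  await page.click(selector);
  return true;
}
```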

Now you will have a working function which goes to your defined url, scrapes the latest Tesla stock price and prints it in your terminal.
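
The scraped value is a string. If you want to compare or store prices as numbers, a small conversion helper could look like this sketch. "parsePrice" is a hypothetical name, and it assumes prices formatted like "652.81" or "1,234.56".

```javascript
// Hypothetical helper: turns scraped text such as "1,234.56" into a number.
// Returns null when the text does not look like a price.
function parsePrice(text) {
  const cleaned = String(text).trim().replace(/,/g, '');
  if (!/^\d+(\.\d+)?$/.test(cleaned)) {
    return null;
  }
  return Number(cleaned);
}

console.log(parsePrice('652.81')); // 652.81
console.log(parsePrice('1,234.56')); // 1234.56
console.log(parsePrice('N/A')); // null
```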


To go one step further, you can put these lines of code in a for loop:

    for (let k = 1; k < 2000; k++) {
      const element = await page.waitForXPath("/html/body/div[1]/div/div/div[1]/div/div[2]/div/div/div[5]/div/div/div/div[3]/div[1]/div[1]/span[1]");
      const price = await page.evaluate(element => element.textContent, element);
      console.log(price);
      await page.waitForTimeout(1000);
    }

The "page.waitForTimeout(1000)" function waits 1000 milliseconds (1 second) before the for loop repeats.
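
"page.waitForTimeout()" exists in the Puppeteer version used here, but as far as I know it has been removed from newer releases. Below is a version-independent sketch of the same pause that uses only "setTimeout". "sleep" is a hypothetical helper name.

```javascript
// Hypothetical helper: returns a promise that resolves after `ms` milliseconds.
function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Usage inside an async function, in place of page.waitForTimeout(1000):
//   await sleep(1000);
```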

Finally, add a "browser.close()" call after the for loop to close the browser and finish your code execution:

const puppeteer = require('puppeteer');

async function start() {
    const url = 'https://finance.yahoo.com/quote/TSLA?p=TSLA&.tsrc=fin-srch';
    const browser = await puppeteer.launch({
      headless: false
    });  
    const page = await browser.newPage();
    await page.goto(url);
    const accept = "#consent-page > div > div > div > form > div.wizard-body > div.actions.couple > button";
    await page.click(accept); // dismiss the consent pop-up
    for (let k = 1; k < 2000; k++) {
      const element = await page.waitForXPath("/html/body/div[1]/div/div/div[1]/div/div[2]/div/div/div[5]/div/div/div/div[3]/div[1]/div[1]/span[1]");
      const price = await page.evaluate(element => element.textContent, element);
      console.log(price);
      await page.waitForTimeout(1000); // wait 1 second before the next read
    }
    await browser.close();
}
start();
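
Since the price often stays the same between reads, a variation of the loop could print only when the value changes. This is a sketch under my own assumptions: "pollChanges" is a hypothetical helper, and "readPrice" stands for any async function that returns the current price text, such as the XPath lookup in the loop above.

```javascript
// Hypothetical helper: calls `readPrice` up to `iterations` times, waiting
// `intervalMs` between reads, and reports a value only when it differs from
// the previous one. Returns the list of reported values.
async function pollChanges(readPrice, { iterations, intervalMs, onChange }) {
  const reported = [];
  let last;
  for (let i = 0; i < iterations; i++) {
    const price = await readPrice();
    if (price !== last) {
      onChange(price);
      reported.push(price);
      last = price;
    }
    // Pause between reads, but not after the final one.
    if (i < iterations - 1) {
      await new Promise((resolve) => setTimeout(resolve, intervalMs));
    }
  }
  return reported;
}
```

With the tutorial's settings it could be called with "iterations: 1999", "intervalMs: 1000" and "onChange: console.log", so repeated values are skipped instead of being printed every second.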

That's it for this web-scraping tutorial!

If you're having problems with the code, leave a comment and I'll see how I can help.

Byeeeeeee 👋

Discussion (7)

marianpirvan

Can you share a GitHub link or the package.json file that you use? Or is this all of the code? Thank you.

Code_Jedi Author

Hey, here is the source code with the package.json and package-lock.json files: github.com/matveynikon/Stock_scrap...

marianpirvan • Edited

It works if you run node index.js, but why does it show 652.81 multiple times?

It doesn't work on console.log :(
index.js:2 Uncaught ReferenceError: require is not defined
{
  "name": "stocks",
  "version": "1.0.0",
  "description": "",
  "main": "stock.js",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "author": "Me",
  "license": "ISC",
  "dependencies": {
    "puppeteer": "^10.1.0"
  }
}

Code_Jedi Author

that means that the stock price still hasn't changed

marianpirvan

How often does it check the value of the stock: hourly, daily?

Code_Jedi Author

First of all, the price will change only when the stock market is open, so it's useless to run the scraper when the market is closed, and even when it's open, the stock price won't change all the time. With that said, this scraper checks the stock price every second.

marianpirvan

Thanks a lot, I love your work and your help, cheers!