I am 31 years old and I didn't knew this. Everytime I wanted to scrape some data table from a website I had to run a script to do it for me, when in fact it was so easy... like a walk in a park.
Let's have a brief example of what I mean. So how do you go from this...
to this...
... in less than 10 seconds.
All you need is the page you want to get the data from, in my case it's this one, and the Excel app opened (you can use Google drive as well).
We will use the IMPORTXML
function which takes as parameters an url
, an xpath_query
and a locale
. All we need is to pass the url
and the xpath_query
of the table.
Our single line of code will look like this:
=IMPORTXML("https://en.wikipedia.org/wiki/List_of_U.S._states_and_territories_by_population", "//tr")
So basically we are telling the program to fetch any table row from the page and display it in our sheet. Easy, right?
Remember that you can modify the xpath_query
to fetch anything from the page. For example, if I want to fetch the 22nd row of the table I change the function like this:
=IMPORTXML("https://en.wikipedia.org/wiki/List_of_U.S._states_and_territories_by_population", "//table[1]/tbody/tr[24]")
Isn't it awesome? I bet it is. Hope you've learned something! :)
Top comments (3)
Or - even easier - just copy & paste the table (be sure to hold down CMD or Ctrl when selecting - to select the cells rather than the text content) - this will even deal with column spanning cells etc. This table to spreadsheet copy/pasting functionality has been in browsers for as long as I can remember
Yeah that works like a charm..
You can use
IMPORTXML(url, xpath_query)
in Google Sheets as well.