On the web you can find countless tables. Those tables (like any webpage) are defined in HTML. So you need to parse HTML, right?
Not exactly: there's a module called pandas that parses the tables for you. Each table is then stored in a data structure called a DataFrame.
Say you grab the table from https://www.fdic.gov/bank/individual/failed/banklist.html
#!/usr/bin/python3
import pandas as pd
url = 'https://www.fdic.gov/bank/individual/failed/banklist.html'
res2=pd.read_html(url)
print(res2)
print("+"*50)
print(res2[0]["Bank Name"])
So the line
res2=pd.read_html(url)
fetches the page and returns a list of pandas DataFrames, one for each table it finds on the page. That easy!
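If you want to see what read_html hands back without hitting a live site, here is a small sketch that feeds it an inline HTML string instead of a URL. The HTML, the bank names and the numbers are all made up for illustration, and it assumes pandas is installed together with an HTML parser it can use (for example lxml).

```python
from io import StringIO

import pandas as pd

# Made-up HTML with two tables; names and numbers are illustrative only.
HTML = """
<table>
  <tr><th>Bank Name</th><th>City</th></tr>
  <tr><td>Example Bank A</td><td>Springfield</td></tr>
  <tr><td>Example Bank B</td><td>Riverton</td></tr>
</table>
<table>
  <tr><th>Year</th><th>Count</th></tr>
  <tr><td>2020</td><td>4</td></tr>
</table>
"""

# read_html accepts a file-like object as well as a URL.
tables = pd.read_html(StringIO(HTML))

# The result is a plain Python list with one DataFrame per <table>.
print(type(tables).__name__, len(tables))

first = tables[0]
print(first.shape)           # (rows, columns) of the first table
print(list(first.columns))   # column names taken from the <th> cells
```

That list is why the script indexes with [0] before asking for a column.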
This line prints the whole list of tables:
print(res2)
And this one prints a specific column of the first table:
print(res2[0]["Bank Name"])
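Once you have a column you can treat it like any other pandas Series. The sketch below is just one possible way to go on from there: it again uses made-up inline HTML (the bank names and the "State" column are assumptions for illustration, not the real FDIC data) and needs pandas plus an HTML parser such as lxml.

```python
from io import StringIO

import pandas as pd

# Made-up table; the rows do not come from the FDIC page.
HTML = """
<table>
  <tr><th>Bank Name</th><th>State</th></tr>
  <tr><td>Example Bank A</td><td>IL</td></tr>
  <tr><td>Example Bank B</td><td>TX</td></tr>
  <tr><td>Example Bank C</td><td>TX</td></tr>
</table>
"""

# Take the first (and only) table from the list read_html returns.
df = pd.read_html(StringIO(HTML))[0]

# Selecting one column gives a pandas Series.
names = df["Bank Name"]
print(names.tolist())

# Keep only the rows where State is "TX", then pull the names out.
texas = df[df["State"] == "TX"]
texas_names = texas["Bank Name"].tolist()
print(texas_names)
```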
So you can easily grab data from a webpage without having to parse the HTML yourself.