Almost everyone is familiar with spreadsheet software. It's the de-facto standard tool for many industries and the first tool that comes to mind when someone thinks about data entry.
Most programmers when they want to use data from a spreadsheet, such as Excel, they automatically think of CSV (comma-separated value files). These files are quite simple, and as their name implies they are separating each column using a comma delimiter, and that's why they are preferred. Not all Excel files can be converted to CSV through. An Excel spreadsheet with multiple sheets will have to be converted into multiple CSV files, one for each sheet. This makes it extremely less convenient than having a single Excel file with multiple sheets.
What is not widely known, is that in Python at least, there are quite powerful tools that allow you to work with Excel files, both for reading and writing. Let's quickly see how to do that in Python.
Set up
Firstly, you would need to install Pandas, the famous data analytics library, along openpyxl, which actually does the heavy work of reading/writing Excel files, to your virtual environment (or globally, depending on your setup).
pip install pandas
pip install openpyxl
Write
import pandas as pd
writer = pd.ExcelWriter('path/to/output') # 1.
df1 = pd.DataFrame(
[['value1', 'value2',], ['value1', 'value2',]],
columns=['col1', 'col2',],
)
df1.to_excel(writer, sheet_name='Sheet 1', index=False) # 2.
df2 = pd.DataFrame(
[['value1', 'value2',], ['value1', 'value2',]],
columns=['col1', 'col2',],
)
df2.to_excel(writer, sheet_name='Sheet 2', index=False)
writer.save() # 3.
- This is the main object that represents the entire Excel file. Instead of the path, you can provide a stream (such as
BytesIO
) instead. - For each sheet you want your Excel file to have, create a DataFrame with the column and row data. The
to_excel
method appends the sheet to the Excel file. Theindex=False
parameter is for the row labels not to be written on the sheet. - This actually flushes the data to the disk and creates the file.
Read
import pandas as pd
xl = pd.ExcelFile('path/to/input')
df = pd.read_excel(xl,
sheet_name='Sheet 1',
usecols=['col1', 'col2',], ) # 1.
for i in range(len(df)): # 2.
col1 = df.loc[i, 'col1']
if not pd.isnull(col1): # 3.
print(col1)
- For each sheet, you read one DataFrame where you define which columns you want to load.
- Loop through each row of the sheet.
- Check if the cell is empty, otherwise proceed with accessing its contents.
Hopefully, this was a quick and easy example of how to write and read Excel files in Python.
Happy coding!
Top comments (1)
let me know how to read a particular sheet in excel and edit it then back to saving it in the excel without disturbing the other sheets in the excel