DEV Community

Cover image for Automate Excel with Python
Stokry
Stokry

Posted on

Automate Excel with Python

With Python, you can merge or combine Excel workbooks if you have multiple workbooks to analyze or anything similar.

It can be painful to manually retrieve the data. You would have to open every single file and you may end up working in a confusing environment. We can automate that in Python in less than 10-15 lines of code.

We have two files data1 and data2 and we will merge these two files into one.

First of all, we want to import pandas module in our code.
And for those of you who don't know what it is, it's a module used mainly to manipulate data in Python.

import pandas.as pd
Enter fullscreen mode Exit fullscreen mode

After that, we want to specify the location of the Excel files that we want to merge.
This is the list we are going to loop through later to make sure we go over each file.
Also, we need to create a blank DataFrame and stored in a variable merge.

excel_files = ['location_of_your_first_excel_file', 'location_of_your_second_excel_file']
merge = pd.DataFrame()
Enter fullscreen mode Exit fullscreen mode

It's now time to loop through our Excel list and read those files in a DataFrame. I also make sure to not copy the header from the second workbook file by using the skiprows argument. We are telling pandas to ignore the first row of the second workbook, we don't wanna that in our file. We just need the merged data.

for file in excel_files:
   df = pd.read_excel(file, skiprows = 1)
   merge = merge.append(df, ignore_index = True)

merge.to_excel('Merged_Files.xlsx')
Enter fullscreen mode Exit fullscreen mode

We finally append the results to merge and we should be done with logic.
At the end we need to have all in the output file, we can call this "Merged_Files"

Whole code looks like this:

import pandas.as pd
excel_files = ['location_of_your_first_excel_file', 'location_of_your_second_excel_file']
merge = pd.DataFrame()

for file in excel_files:
   df = pd.read_excel(file, skiprows = 1)
   merge = merge.append(df, ignore_index = True)

merge.to_excel('Merged_Files.xlsx')
Enter fullscreen mode Exit fullscreen mode

That is all for today.
Have a nice day.

Top comments (5)

Collapse
 
pratikaambani profile image
Pratik Ambani

I need to send mails via excel on a date mentioned in particular cells. Failing to achieve this. Anybody?

Collapse
 
stokry profile image
Stokry

Hello, you can use smtplib library and openpyxl library that will read data from Excel. After that, you can compose an email that you want to send, if you have code, you can post or send me, then I can help you.

Collapse
 
pratikaambani profile image
Pratik Ambani

Thanks for your response. Well, I need to trigger mail (when meeting certain criteria) when I open excel itself. I'm attempting to make it possible within a single file.

Can we achieve this?

Collapse
 
youhakin profile image
youhakin

This is very useful 👍

Collapse
 
stokry profile image
Stokry

Thank you. Yes, you can search and filter data in workbooks, apply formulas across workbooks, etc. In the next post, I will do something like that.