DEV Community

Sona


How to Automatically Highlight Cells in Excel with Duplicate Values Using Python

In the bustling city of Spreadsheets, where data flowed like streams of numbers and formulas whispered through the air, there lived a diligent analyst named Alice. Alice had a keen eye for detail and a knack for finding patterns in the vast sea of data that surrounded her.

One day, while working on a particularly complex project, Alice stumbled upon a problem that seemed to plague her spreadsheet – duplicate values. These pesky duplicates made it challenging to spot unique data points and often led to errors in her analysis.

Determined to tackle this issue head-on, Alice turned to her trusty companion, Python. With its powerful libraries, Python was like a wizard's wand in her hands. She quickly devised a plan to highlight cells with duplicate values in her Excel spreadsheet automatically.

Using the openpyxl library, Alice wrote a script that would iterate through each cell in her spreadsheet. For each cell, the script would check if the value had been seen before in the same column. If it had, the script would apply a vibrant color to highlight the cell, making it stand out from the rest.

With a flick of her wand – or rather, a press of the "Run" button – Alice set her script in motion. As the code executed, cells with duplicate values lit up like beacons in the darkness, guiding her towards a clearer, more accurate analysis.

Thanks to Python's magic and Alice's ingenuity, the city of Spreadsheets became a little brighter that day, with duplicate values no longer hiding in the shadows. And so, armed with her newfound knowledge, Alice continued her journey through the world of data, ever curious and always ready to uncover the next great insight.

Now let us get into the code. First, we will create an Excel sheet that contains duplicate values. You can also import or read your own Excel sheet instead.

from openpyxl import Workbook

# Create a new workbook
wb = Workbook()
ws = wb.active

# Sample data with some duplicates
data = [
    ['Name', 'Age', 'City', 'Grade'],
    ['John', 25, 'New York', 'A'],
    ['Alice', 30, 'Chicago', 'B'],
    ['Bob', 25, 'New York', 'A'],
    ['Eve', 35, 'Los Angeles', 'C'],
    ['John', 25, 'New York', 'A'],
    ['Alice', 30, 'Chicago', 'B'],
    ['Bob', 25, 'New York', 'A'],
    ['Eve', 35, 'Los Angeles', 'C'],
    ['Alice', 30, 'Chicago', 'B']
]

# Populate the Excel sheet with sample data
for row_data in data:
    ws.append(row_data)

# Save the workbook
wb.save('sample_data.xlsx')
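
The highlighting step from Alice's story (check whether a value has already been seen in the same column, and color the cell if it has) could be sketched roughly as follows. This is a minimal sketch, not necessarily the script Alice used: the function names `duplicate_rows_by_column` and `highlight_duplicates`, the yellow fill color, and the choice to leave the first occurrence of each value unhighlighted are all assumptions made for illustration.

```python
from collections import defaultdict


def duplicate_rows_by_column(columns):
    """Map each column index to the row indexes whose value already
    appeared earlier in that column (first occurrences are excluded)."""
    duplicates = defaultdict(list)
    for col_idx, values in enumerate(columns):
        seen = set()
        for row_idx, value in enumerate(values):
            if value is None:
                continue  # ignore empty cells
            if value in seen:
                duplicates[col_idx].append(row_idx)
            else:
                seen.add(value)
    return dict(duplicates)


def highlight_duplicates(src_path, dst_path, color="FFFF00"):
    """Fill repeated values in each column of the active sheet.

    The hex color and the two-path signature are illustrative assumptions.
    """
    # Imported here so the helper above stays usable without openpyxl.
    from openpyxl import load_workbook
    from openpyxl.styles import PatternFill

    wb = load_workbook(src_path)
    ws = wb.active
    fill = PatternFill(start_color=color, end_color=color, fill_type="solid")

    # Assume row 1 is a header, so the data starts at row 2.
    columns = list(ws.iter_cols(min_row=2, values_only=True))
    for col_idx, row_idxs in duplicate_rows_by_column(columns).items():
        for row_idx in row_idxs:
            # Convert 0-based offsets to openpyxl's 1-based coordinates.
            ws.cell(row=row_idx + 2, column=col_idx + 1).fill = fill

    wb.save(dst_path)
```

Calling `highlight_duplicates('sample_data.xlsx', 'highlighted.xlsx')` (the output filename is hypothetical) would, for the sample data above, fill the Name column from sheet row 6 onward, since those names all appeared earlier in the same column. Keeping the duplicate search in a plain function separate from the openpyxl calls makes the "seen before" logic easy to check without opening a workbook.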

