DEV Community

Cover image for Data cleaning with Excel
DoreenNangira
DoreenNangira

Posted on

Data cleaning with Excel

Data cleaning is one of the most essential steps in data analysis. Whenever we deal with data we are bound to encounter data inconsistencies such as duplicate values, missing values or spelling mistakes. To ensure that you get the desired results out of your data, it is advisable to equip yourself with various skills used for data cleaning. This article will take you through data cleaning methods in excel.

The following are the most common data inconsistencies and ways of getting rid of them.

Spelling mistakes

Spelling mistakes in your data can hinder you from getting the correct results from your data. To get rid of this, select the column you want to modify, click the review tab on excel window then click spelling. This will highlight the word that appears to be wrongly spelt and a suggestion of other words will be shown. Pick your desired suggested word and click change then okay. This will replace the wrongly spelt word with new spelling.

Empty rows

To find empty rows, press CTRL+SHIFT+END tabs to select all data including those with blanks. Click Data tab then click filter. After this unselect the select all option then scroll down and select blank. This will show you a list of rows which are blank. Select the first blank row then press CTRL+SHIFT+END. This will highlight all blank rows. Right click then click delete rows. This will delete all empty rows. Click the clear button on the ribbon, this will take you back to your data that is clean without empty rows.

Duplicate data

Sometimes data analysts face the challenge of having unnecessary repetition of a particular data entered. For example, one can enter the details of one particular employee in the company database more than once. This leads to incorrect results when trying to get conclusions from such data. There are two ways of removing duplicate data in excel. One involves previewing the data before removing the duplicates while the other involves getting rid of the duplicate at once before preview of it. The safest method is always to preview the data before getting rid of duplicates. The following methods are used:
a. Secure method
Select the column you want to modify, click conditional formatting tab on the home screen, click highlight cells rules then click duplicate values option. After this click ok and duplicate values will be shown. Select the rows you want to delete, right click then press delete. This will get rid of duplicate values.
b. Less secure method
Select the whole data in your excel sheet. Click the Data tab then click remove duplicates. Unselect all columns then select the column where you want to remove the duplicates. This gets rid of duplicates in the selected column.

Find and replace

Find and replace feature is another essential tool in excel that comes in handy. Take an instance where a client calls in and says that his or her last name was wrongly spelt. How do you correct this in the shortest time possible? This is where the find and replace feature comes in. Instead of going through all the rows one by one to spot the word to be corrected, we can use find and replace feature to get the word in a second.
Click the find & select tab on the excel ribbon in home screen. Type in the word you want to find, click find next then click find all to get all instances of the word. Click the replace button, write the word you want then click replace all. This will replace all instances with that new word.

Top comments (0)