Duplicate values in your data can be a big problem! They can lead to substantial errors and overestimate your results.
But finding and removing them from your data is actually quite easy in Excel.
In this tutorial, we are going to look at 7 different methods to locate and remove duplicate values from your data.
Duplicate values happen when the same value or set of values appear in your data.
For a given set of data you can define duplicates in many different ways.
In the above example, there is a simple set of data with 3 columns for the Make, Model and Year for a list of cars.
The results from removing duplicates based on a single column versus the entire table can be very different. You should always be aware of which version you want and what Excel is doing.
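To make the difference concrete, here is a minimal Python sketch (outside Excel, for illustration only) of removing duplicates based on one column versus the whole row. The `unique_rows` helper and the sample car rows are hypothetical and are not taken from the workbook in this tutorial. Like Excel's commands, the sketch keeps the first occurrence of each key.

```python
def unique_rows(rows, columns=None):
    """Keep the first row for each distinct key; later repeats are dropped.

    columns=None compares the whole row; otherwise only the listed
    column positions are compared.
    """
    seen = set()
    kept = []
    for row in rows:
        key = tuple(row) if columns is None else tuple(row[c] for c in columns)
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept


# Hypothetical sample data: (Make, Model, Year)
cars = [
    ("Ford", "Focus", 2015),
    ("Ford", "Fiesta", 2015),
    ("Ford", "Focus", 2015),      # exact repeat of the first row
    ("Toyota", "Corolla", 2018),
    ("Ford", "Focus", 2017),      # same Make and Model, different Year
]

by_make = unique_rows(cars, columns=[0])  # duplicates judged on Make only
by_row = unique_rows(cars)                # duplicates judged on the entire row

print(len(by_make), len(by_row))  # 2 4
```

Judging duplicates on Make alone leaves two rows, while judging on the entire row leaves four, so the choice of columns changes the result considerably.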
Removing duplicate values in data is a very common task. It’s so common, there’s a dedicated command to do it in the ribbon.
Select a cell inside the data which you want to remove duplicates from and go to the Data tab and click on the Remove Duplicates command.
Excel will then select the entire set of data and open up the Remove Duplicates window.
When you press OK, Excel will then remove all the duplicate values it finds and give you a summary count of how many values were removed and how many values remain.
This command will alter your data so it’s best to perform the command on a copy of your data to retain the original data intact.
There is also another way to get rid of any duplicate values in your data from the ribbon. This is possible from the advanced filters.
Select a cell inside the data and go to the Data tab and click on the Advanced filter command.
This will open up the Advanced Filter window.
Press OK and you will eliminate the duplicate values.
Advanced filters can be a handy option for getting rid of your duplicate values and creating a copy of your data at the same time. But advanced filters will only be able to perform this on the entire table.
Pivot tables are just for analyzing your data, right?
You can actually use them to remove duplicate data as well!
You won’t actually be removing duplicate values from your data with this method. Instead, you will be using a pivot table to display only the unique values from the data set.
First, create a pivot table based on your data. Select a cell inside your data or the entire range of data ➜ go to the Insert tab ➜ select PivotTable ➜ press OK in the Create PivotTable dialog box.
With the new blank pivot table add all fields into the Rows area of the pivot table.
You will then need to change the layout of the resulting pivot table so it’s in a tabular format. With the pivot table selected, go to the Design tab and select Report Layout. There are two options you will need to change here.
You will also need to remove any subtotals from the pivot table. Go to the Design tab ➜ select Subtotals ➜ select Do Not Show Subtotals.
You now have a pivot table that mimics a tabular set of data!
Pivot tables only list unique values for items in the Rows area, so this pivot table will automatically remove any duplicates in your data.
Power Query is all about data transformation, so you can be sure it has the ability to find and remove duplicate values.
Select the table of values which you want to remove duplicates from ➜ go to the Data tab ➜ choose a From Table/Range query.
With Power Query, you can remove duplicates based on one or more columns in the table.
You need to select which columns to remove duplicates based on. You can hold Ctrl to select multiple columns.
Right click on the selected column heading and choose Remove Duplicates from the menu.
You can also access this command from the Home tab ➜ Remove Rows ➜ Remove Duplicates.
If you look at the formula that’s created, it is using the Table.Distinct function with the second parameter referencing which columns to use.
To remove duplicates based on the entire table, you could select all the columns in the table then remove duplicates. But there is a faster method that doesn’t require selecting all the columns.
There is a button in the top left corner of the data preview with a selection of commands that can be applied to the entire table.
Click on the table button in the top left corner ➜ then choose Remove Duplicates.
If you look at the formula that’s created, it uses the same Table.Distinct function with no second parameter. Without the second parameter, the function will act on the whole table.
In Power Query, there are also commands for keeping duplicates for selected columns or for the entire table.
Follow the same steps as removing duplicates, but use the Keep Rows ➜ Keep Duplicates command instead. This will show you all the data that has a duplicate value.
You can use a formula to help you find duplicate values in your data.
First you will need to add a helper column that combines the data from any columns which you want to base your duplicate definition on.
The above formula will concatenate all three columns into a single column. It uses the ampersand operator to join each column.
If you have a long list of columns to combine, you can use the above formula instead. This way you can simply reference all the columns as a single range.
You will then need to add another column to count the duplicate values. This will be used later to filter out rows of data that appear more than once.
Copy the above formula down the column and it will count the number of times the current value appears in the list of values above.
If the count is 1 then it’s the first time the value is appearing in the data and you will keep this in your set of unique values. If the count is 2 or more then the value has already appeared in the data and it is a duplicate value which can be removed.
Add filters to your data list.
Now you can filter on the Count column. Filtering on 1 will produce all the unique values and remove any duplicates.
You can then select the visible cells from the resulting filter to copy and paste elsewhere. Use the keyboard shortcut Alt + ; to select only the visible cells.
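The helper-column approach above can be sketched in Python to show the logic: build a combined key per row, keep a running count of each key, then keep only the rows whose count is 1. The `running_counts` helper, the sample rows and the `|` separator are assumptions made for illustration. The separator is an addition that the steps above do not use; it guards against different rows producing the same joined text.

```python
from collections import defaultdict


def running_counts(rows, key_columns, sep="|"):
    """Return (combined_key, count_so_far) for each row, in order."""
    counts = defaultdict(int)
    result = []
    for row in rows:
        # Helper column: join the chosen columns into one text key.
        key = sep.join(str(row[c]) for c in key_columns)
        counts[key] += 1
        result.append((key, counts[key]))
    return result


# Hypothetical sample data: (Make, Model, Year)
rows = [
    ("Ford", "Focus", 2015),
    ("Toyota", "Corolla", 2018),
    ("Ford", "Focus", 2015),   # repeat of the first row, count becomes 2
    ("Ford", "Focus", 2017),
]

helper = running_counts(rows, key_columns=[0, 1, 2])

# Equivalent of filtering the Count column on 1.
unique = [row for row, (_, count) in zip(rows, helper) if count == 1]

# Without a separator, ("ab", "c") and ("a", "bc") both join to "abc".
collide = running_counts([("ab", "c"), ("a", "bc")], [0, 1], sep="")
```

In this sketch the second pair in `collide` is counted as a repeat even though the rows differ, which is why a separator between the joined columns can be worth adding when the data allows such collisions.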
With conditional formatting, there’s a way to highlight duplicate values in your data.
Just like the formula method, you need to add a helper column that combines the data from columns. The conditional formatting doesn’t work with data across rows, so you’ll need this combined column if you want to detect duplicates based on more than one column.
Then you need to select the column of combined data.
To create the conditional formatting, go to the Home tab ➜ select Conditional Formatting ➜ Highlight Cells Rules ➜ Duplicate Values.
This will open up the conditional formatting Duplicate Values window.
Warning: The previous methods to find and remove duplicates do not consider the first occurrence of a value to be a duplicate and will leave it intact. However, this method highlights every occurrence, including the first, and makes no distinction between them.
With the values highlighted, you can now filter on either the duplicate or unique values with the filter by color option. Make sure to add filters to your data. Go to the Data tab and select the Filter command or use the keyboard shortcut Ctrl + Shift + L.
You can then select just the visible cells with the keyboard shortcut Alt + ;.
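The difference described in the warning can be illustrated with a short Python sketch. It is only a model of the two behaviours, not Excel's implementation, and the sample values are made up.

```python
from collections import Counter

values = ["A", "B", "A", "C", "A"]  # hypothetical combined-column values

# Conditional-formatting style: flag every value that occurs more than
# once anywhere in the list, including its first occurrence.
totals = Counter(values)
highlighted = [totals[v] > 1 for v in values]

# Remove-duplicates style: flag only the second and later occurrences.
seen = set()
repeat_only = []
for v in values:
    repeat_only.append(v in seen)
    seen.add(v)

print(highlighted)  # [True, False, True, False, True]
print(repeat_only)  # [False, False, True, False, True]
```

If you filter by color and delete everything that is highlighted, the first "A" is deleted too, so check which rows you actually want to keep before deleting.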
There is a built in command in VBA for removing duplicates within list objects.
The above procedure will remove duplicates from an Excel table named CarList.
The above part of the procedure will set which columns to base duplicate detection on. In this case it will be on the entire table since all three columns are listed.
The above part of the procedure tells Excel the first row in our list contains column headings.
You will want to create a copy of your data before running this VBA code, as it can’t be undone after the code runs.
Duplicate values in your data can be a big obstacle to a clean data set.
Thankfully, there are many options in Excel to easily remove those pesky duplicate values.
So, what’s your go to method to remove duplicates?
This page shows you how to remove duplicates in Excel using three different methods.
Note that these methods show how to remove duplicate cells from your spreadsheet. If you want to find and remove entire rows that are duplicated, see the Remove Duplicate Rows in Excel page.
For each of the methods described below, we use the simple example spreadsheet shown above, which has a list of names in column A.
We first show how to use Excel's Remove Duplicates Command to remove duplicates and then we show how to use Excel's Advanced Filter to perform this task. Finally, we show how to remove duplicates using the Excel Countif Function.
The Remove Duplicates command is located in the 'Data Tools' group, within the Data tab of the Excel ribbon.
To remove duplicate cells using this command:
You will be presented with the 'Remove Duplicates' dialog box shown below:
This dialog box allows you to select which columns of your data set you want to check for duplicate entries. In the example spreadsheet above, we only have one column of data (the 'Name' field). Therefore we leave the 'Name' field selected within the dialog box.
Once you have ensured that the required field(s) are checked in the dialog box, click OK.
Excel will then delete the duplicate rows and present you with a message informing you of the number of records removed and the number of unique records remaining (see below).
The resulting example spreadsheet is shown above. As required, the duplicate cell A7 (containing the second occurrence of the name 'Laura CARTER') has been removed.
Note that the Excel Remove Duplicates command can also be used on data sets with multiple columns. An example of this is provided on the Remove Duplicate Rows page.
The Excel advanced filter has an option that allows you to filter unique records in a spreadsheet and copy the resulting filtered list to a new location.
This gives you a list that contains the first occurrence of a duplicated record, but does not contain any further occurrences.
To remove duplicates using the Advanced Filter:
Select the column(s) to be filtered (column A in the example spreadsheet above). (Alternatively, if you select any cell within the current data set, Excel will automatically select the entire range of data when you activate the advanced filter.)
Select the Excel Advanced Filter option from the Data tab at the top of your Excel workbook (or, in Excel 2003, from the Data→Filter menu).
You will be presented with a dialog box showing you the options for the Excel advanced filter (see below).
Within this dialog box:
In the Copy to field, enter the location that you want to copy the new list to. (Note that this location must be in the current worksheet. In this example, cell C1 of the current worksheet 'Sheet1' has been selected as the 'copy to' location.)
The resulting spreadsheet, with the new data list in column C, is shown above.
It can be seen that the duplicate value 'Laura CARTER' has been removed from the list.
You can now delete the columns to the left of your new data list (columns A-B in the example spreadsheet) to return to the original spreadsheet format.
Warning: The Countif method described below will only work if the contents of your cells are less than 256 characters in length, as the Countif function cannot handle text strings that are longer than this.
Another way to remove duplicates in a range of Excel cells is to use the Excel Countif Function.
To illustrate this, we will again use the simple example spreadsheet (repeated above), which has a list of names in column A.
In order to find any duplicates in the list of names, we enter the Countif function in column B of the spreadsheet (see below). This function shows the number of occurrences of each name up to the current row.
As shown in the formula bar of the above spreadsheet, the format of the Countif function in cell B2 is:
=COUNTIF( A$2:A2, A2 )
Note that this function uses a combination of Absolute and Relative Cell References. Due to this combination of reference styles, as the formula is copied down column B, it becomes:
=COUNTIF( A$2:A3, A3 )
=COUNTIF( A$2:A4, A4 )
=COUNTIF( A$2:A5, A5 )
and so on, with the range always starting at row 2 and ending at the current row.
Therefore, the formula in cell B4 returns the value 1 for the first occurrence of the text string 'Laura CARTER', but the formula in cell B7 returns the value 2 for the second occurrence of this text string.
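The effect of the expanding range can be mimicked in a few lines of Python. This is a simplified model, not Excel's implementation: the `countif` helper below only does whole-cell, case-insensitive text matching, and every name other than 'Laura CARTER' is a placeholder, since the full list from the example spreadsheet is not reproduced here.

```python
def countif(cells, criterion):
    """Simplified stand-in for COUNTIF: count exact text matches.

    Case is ignored, as it is when COUNTIF compares text.
    """
    return sum(1 for cell in cells if cell.casefold() == criterion.casefold())


# Cells A2:A7. Only 'Laura CARTER' in A4 and A7 comes from the example;
# the other names are placeholders.
names = [
    "Ann SMITH",     # A2
    "Ben JONES",     # A3
    "Laura CARTER",  # A4
    "Cara WHITE",    # A5
    "Dan BROWN",     # A6
    "Laura CARTER",  # A7
]

# A$2 pins the start of the range, while the relative end grows by one
# row each time the formula is copied down: A$2:A2, A$2:A3, A$2:A4, ...
column_b = [countif(names[: i + 1], names[i]) for i in range(len(names))]

print(column_b)  # [1, 1, 1, 1, 1, 2]
```

The third entry (cell B4) is 1 and the last entry (cell B7) is 2, matching the values described above.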
Now that we have used the Excel Countif function to highlight the duplicates in column A of the example spreadsheet, we need to delete the rows for which the count is greater than 1.
In the simple example spreadsheet, it is easy to see, and to delete, the single duplicate row. However, if you have several duplicates, you might find it faster to use the Excel Autofilter to delete all the duplicate rows at once.
The following steps show how to remove several duplicates at once (after they have been highlighted using the Countif function):
Select the column containing the Countif function (column B in the example spreadsheet). (Alternatively, if you select any cell within the current data set, Excel will automatically select the entire range of data when you activate the autofilter.)
Use the filter at the top of column B to select rows that are not equal to 1, i.e. click on the filter and, from the list of values, uncheck the value 1.
You will be left with a spreadsheet in which the first occurrence of each value is hidden. I.e. only the duplicate values are displayed.
You can delete these rows by highlighting them, then right clicking with the mouse and selecting Delete Rows.
Remove the filter and you will be left with the spreadsheet shown above, in which the duplicate in cell A7 has been removed. You can now delete the column containing the Countif function to return to the original spreadsheet format.