Today we’ll be looking at preventing duplicate values from being entered into your Excel sheet. This is the final entry in a series that looked at several ways to ensure that data being entered into your Excel sheet is valid.
In this case, we’re going to use a custom validation rule, which we’ve already seen briefly when validating phone numbers. We’ll combine that with the “CountIf” function, so let’s start with a quick overview of that one.
The CountIf function allows you to count the number of values that match a given criteria. Let’s see an example:
As you can see, I’ve got a list of students and their favorite colors in columns A and B. In column E, I’m using CountIf to summarize that data. Take a look at the formula (show in the formula bar, above column E). You can see that the CountIf function has two arguments: the list that contains the values; and the criteria. I’m telling Excel to count each cell in B2 through B8 if the value is “Blue”.
Simple enough, right?
So now let’s think about how we can utilize that function to check for duplicate values.
Let’s say I’m going to check for duplicates on column A, to make sure no student is counted more than once. Of course, this is a simplified example, and in reality, I’d need a last name or student ID to make this work correctly. But for the sake of this demonstration, let’s assume that our class does not have any duplicate first names.
Before we add the validation, I’ll add the “CountIf” function in column C, so that we can see how it’s going to work. Here’s what the formula looks like on the first row:
Here I’m using the CountIf function to count values in column A, down to row 60 ($A$2:$A$60). I’m comparing that to the value in cell A2, which is “Kevin”. Since “Kevin” only appears once on the list, I see that the count is 1.
Note the dollar signs in the first argument. Those specify an absolute range, which means that as I copy and past this formula to subsequent rows, the row numbers won’t change. They are locked in.
Compare that the to the second argument, which is A2. No dollar signs on this one. That means this is a relative reference. As I copy and paste the formula to subsequent rows, the row number on will automatically be adjust. Let’s see what it looks like on row 3:
Here you can see that the second argument is A3, which contains “Jenn”. That means that the count in this cell is counting the number of “Jenn” values on the list.
Let me create a duplicate so that you can see the formula in action:
As you can see here, the count has changed to two on both rows that contain “Jenn” (rows 3 and 9). Note: the yellow highlighted was added to the image, and does not appear in Excel.
For our validation, we’re going to say that CountIf should always return a 1. If it returns a 2 (or any number greater than one), we know we have a duplicate value.
I’ll now use this CountIf formula to set up the validation rule. I start by selecting all of the cells that I want to apply validation to, and then click the “Data Validation” button on the Data ribbon:
I’ll change the “Allow” drop-down to “Custom”, and then put in my formula. Remember that we always type in the value that is valid for the formula. I’m saying that the value is valid if CountIf is 1:
If CountIf is not 1, that will throw an error, as we see here:
The validation rule is now telling me that “Kevin” is already on my list. If I’m so inclined, I can go back to the Data Validation dialog box and improve the error message to make it clearer.
I can also now remove column C, since it's not used in the validation. It's purpose was to help you understand the CountIf function. Now that you get it, we can delete the column.
This data validation will help to ensure that your Excel data set is clean and accurate. You could take this validation further using VBA to catch near-duplicates. In my next blog post, I’ll be covering that topic. Subscribe below if you’d like to be notified when that post goes live.
This concludes my series on Excel data validation. There is a lot more that can be done, of course, and I could probably devote many blogs to that subject. But I think this series has shown some of the possibilities that you might want to explore on your own, so I’m going to leave you to that. If you've tried this (or any of the other validations I've covered), let me know how you made out in the comments.
My next Excel topic will be retrieving data into Excel from SQL Server, Microsoft Access, and many other data sources. This is a great option that allows you to tap into databases that are managed elsewhere – maybe in your CRM or ERP software, for example – and apply Excel’s data visualization tools. Think charts, pivot tables, and more. I'll post a series of blogs on that topic, so let me know in the comments if there's something specific you'd like me to cover. To be notified when the first post in that series goes live, subscribe to my mailing list, using the form below.
As always, don’t hesitate to reach out with your questions on Excel, Access, SQL Server, or other software that you might be using. As a long-time data geek, I love to chat with other business people about their data. Let’s set up a call or zoom to talk about yours!
About the Author
Kimberlee Martin is the owner of North Port Solutions and has 30 years of IT experience. She's worked with several programming languages and database tools over the years, with her favorites being Microsoft Excel VBA, SQL Server, and Visual C#. Her passion is helping small businesses gain insights into their business with effective reporting and data management.
Contact Kim at email@example.com.
© North Port Solutions, LLC, 2022. All rights reserved.