Though its not a database tool, Excel is often used by small businesses to store various pieces of data. Some even use Excel as a full-on database. Lots of software experts will tell you that’s a bad idea, but for small businesses, it can often be a cost-effective solution.
The problem is that Excel lacks the data control facilities that a true database product offers. Think about addresses, for example. In most databases, you’ll be allowed to enter one or two lines of street address, a city, a state, and a zip code. A good database will put some controls in place to ensure that the information you enter is valid. State, for example, should always be two upper-case letters for US addresses. No spaces, no punctuation.
In Microsoft Access, you’d define this field to have a maximum length of two in that scenario. Same for SQL Server and most other database tools. You could also employ other features of those apps to make sure it was upper case, and to exclude punctuation.
In Excel, you can get sloppy. Let’s look at an example:
As you can see, I’ve got some rows that look good, while others are not. Row 4 has “Mass” instead of “MA”. Row 6 includes punctuation. Row 7 has the state spelled out.
To keep your data set clean, you can use some Excel features to limit the values in this column. Some of these methods are easier than others. Let’s start with the easiest, and work our way up.
Limit the Number of Characters
We’ll start by adding validation that limits each state to just two characters. To do so, I’ll start with a similar worksheet that does not yet have any states:
I select the cells that I want to apply validation to. I need to make sure I account for new rows, so I’m going to select the entire column. As I add new data, that rule will be applied.
Next, I’ll click on the “Data” ribbon, and then the “Data Validation” button. Depending on where you click, you may see a drop-down menu. If you do, click on “Data Validation”:
On the Data Validation dialog box, I’ll set the following values:
As you can see, I’m adding a “Text Length” rule to this column, and setting the parameter equal to 2. This means that every value entered must be exactly 2 characters long. Notice that I unchecked the “Ignore blank” check box. You might think that this will prevent users from leaving the column blank, but you’d be wrong about that. I’ll explain that one later.
Let’s test our validation. Back on the spreadsheet, I’ll try typing in “Mass” in one of the states. I get this error message:
It’s not a great message, but you can correct that by going back to the Data Validation dialog box, and setting the “Error Alert”:
Now my error message looks like this:
If I try to enter “M” or “N.H.” or “Massachusetts” in this column, I’ll see the error. This is a great start to our validation.
Try adding a new entry to the list, and leave the state blank. Did it let you?
Yes, it did. Remember that we unchecked the “Ignore Blanks” option. If we’re not ignoring blanks, it seems like that should prevent this, right? Wrong.
The data validation rules are only applied when you have typed something into a cell, and then set the value in that cell, either by moving to a different cell, or by using Ctl-Enter. If you type nothing in the cell, the validation rule isn’t checked.
So, what’s the “Allow Blanks” check box all about?
Find the “Data Validation” button once again, which is on the “Data” ribbon. But rather than clicking the button, click the little drop-down arrow that appears on it’s bottom, right corner. A menu will pop up. Click the one that says “Circle Invalid Data”.
Excel will now run down through all cells in column B, and circle each one that violates our rule. Of course, it’s going overboard, since it’s checking all of our blank rows, and also our header row. But you now see how it’s implementing the rule:
Back on the “Data Validation” button, use that drop-down arrow to choose “Clear Validation Circles” when you get sick of looking at those.
Now we’ve got some basic validation turned on, and this is a big improvement. We’ve got a reasonable chance that most rows will end up with a valid state.
List of States
It’s possible to take this a step further, ensuring that ONLY valid state abbreviations are entered. For example, is CO the abbreviation for Colorado, or is it CL? The correct answer in CO, but some users will mistakenly use CL. Since CL is 2 digits long, our validation rule will be satisfied.
Instead of using the “Length” validation rule, we can instead tell the program to compare the state to a list of states, and only allow values that are on that list. Here’s how you’d do that.
First, add a worksheet to your Excel document that lists all valid state abbreviations. It’s easy to find a list online which you can copy and paste into Excel. Mine looks like this:
Notice that the name of the worksheet is “State List”, and the states are listed in column A. You can hide this sheet if you like. The validation that we’re about to implement will still work.
I’ll now change the validation on my vendor list to refer to this state list. I start by selecting column B, once again, and then go to the Data Validation dialog box. I fill in the values, as follows:
The “Source” argument refers to the new worksheet that we created. This tells Excel that every value we type into column B of the vendor list must appear in column A of the state list.
One of the best features of this method is the “In-cell dropdown” check box. As the name implies, I’m now going to have a drop-down in the State cell, making it quick and easy for me to pick a state. Here’s what that looks like:
As you can see, when I click in cell B2, I get a drop-down arrow beside that cell. Clicking the arrow opens this list of states, and I can click one to fill that value into this cell.
I can still type in a value, rather than using the drop-down list. If I do, the value I enter will be checked against the list, and I’ll get an error message if I enter one that doesn’t exist:
As you can see, when I type in “CL” in cell B3, I get a message letting me know that it’s not valid.
Although our states are now validated, Excel doesn’t care if we use upper-case or lower-case letters. You can see here that it allows me to type a lower case “ma” for Massachusetts:
There’s a way to force this cell to be upper case only in the Data Validation dialog box, but to do that, we’d have to shut off the list validation that’s already in place.
Another option is to use a font that appears in all upper case. I’ve got one on my computer called “Montserrat Subrayada” that does this. If you don’t have this one, there are plenty you can download online for free. Note, though, that this method is purely cosmetic. If you refer to this cell in another cell, it may still appear as lower-case if the new cell has a different font:
There are some other techniques you could use to solve this problem, but none of them are particularly easy or intuitive. If this is a problem for you, I’d suggest writing a small bit of Visual Basic code to address this. And, in fact, that’ll be the topic of my next blog post. If you’d like to be notified when that post goes live, join my mailing list by filling out the form at the bottom of this page.
Have you tried this? If so, share your experiences in the comments below.
As always, I’m here to answer your questions. Call (978-582-9550) or email (email@example.com) to discuss your Excel or Access issues. I’m always happy to talk over your issue with you. If I can’t steer you in the right direction with just a short answer, we’ll schedule a one-on-one session to work together on it, or I’ll prepare an estimate for a longer project.
About the Author
Kimberlee Martin is the owner of North Port Solutions and has 30 years of IT experience. She's worked with several programming languages and database tools over the years, with her favorites being Microsoft Excel VBA, SQL Server, and Visual C#. Her passion is helping small businesses gain insights into their business with effective reporting and data management.
Contact Kim at firstname.lastname@example.org.