5 Part Blog Series: Part 3 – Standardize/Normalize Your Data
One of the most daunting tasks to face when building a marketing database is also your greatest opportunity! As a marketer, how often have you had to deal with inconsistent data? Inconsistent data complicates queries and only causes confusion for your marketing staff as they try to segment their promotions.
Standardizing data can be broken down into the following categories:
A. Coded fields
B. Summarized code fields
C. Common output structures
D. Standardizing address data to USPS® requirements
Standardizing Code Fields:
When a database contains multiple sources, it is inevitable that there will be variances in the data that need to be standardized. With standardized code fields, selection, segmentation and reporting become much easier!
Standardizing code fields can be a complex undertaking, as the definitions of each code from the multiple input files need to be reviewed and compared against the definitions from the other files. Very often, the client will need to decide how to standardize the codes.
A simple example would be gender coding. Many files use a “Male, Female, Ambiguous” coding structure while others will use a numeric coding such as “1 = Male, 2 = Female, 3 = Ambiguous”. In this case, it is easy to pick one coding structure over the other and convert all records to the selected coding structure.
There are simple standardizations, such as gender, and more complex ones, such as age ranges or multi-value code fields. As the number of values increases and the consistency of the values decreases, it is sometimes better to create a new coding structure and convert the existing input codes to it.
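The gender example above can be sketched as a simple lookup. This is a minimal illustration, not a prescribed method: the output codes M, F and A, the `GENDER_MAP` table and the `standardize_gender` function are all assumptions made for the example. Values that are not recognized are returned as `None` so they can be reviewed rather than silently guessed.

```python
# Hypothetical lookup: every raw value seen across the input files maps to one
# standard code. The M/F/A output codes are an assumption for illustration.
GENDER_MAP = {
    "male": "M", "m": "M", "1": "M",
    "female": "F", "f": "F", "2": "F",
    "ambiguous": "A", "a": "A", "3": "A",
}


def standardize_gender(raw):
    """Return the standard gender code, or None when the value is unrecognized."""
    if raw is None:
        return None
    # Normalize case and whitespace so "Male", " male " and "MALE" all match.
    key = str(raw).strip().lower()
    return GENDER_MAP.get(key)


# Mixed inputs, as they might arrive from files with different coding structures.
records = ["Male", "2", " female ", 3, "X", None]
standardized = [standardize_gender(value) for value in records]
print(standardized)
```

Keeping the mapping in one table makes the client's decisions visible in one place and easy to change when a new input file arrives.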
Standardizing Summarized Code Fields:
Summarized code fields are not as simple as our previous example. When dealing with code fields such as age ranges, how would you handle the following scenario?

If the actual age is available, you can define new age ranges and recode the data. But what if it’s not? It is easy to convert file 1 codes 3 & 4 to match file 2 code 2 and so on, but what do you do with the discrepancies, such as file 1 covering 18- and 19-year-olds under code 1? The database designer will work with you to come up with a workable solution that meets your objectives.
When you consider the amount of coded data in most marketing databases, you can see why standardizing your data is an important challenge you will face.
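The two cases above can be sketched as follows. The target ranges, the crosswalk codes and both function names are hypothetical and chosen only for illustration; they are not the ranges from either input file. When the actual age is known it is recoded directly. When only a source code is known, a crosswalk is used, and any code that does not fit a single target range is flagged for review instead of being forced into one.

```python
import bisect

# Hypothetical target ranges: the lower bound of each range, in ascending order.
RANGE_STARTS = [18, 25, 35, 45, 55, 65]
RANGE_LABELS = ["18-24", "25-34", "35-44", "45-54", "55-64", "65+"]


def recode_age(age):
    """Map an actual age to a target range label, or None if out of scope."""
    if age is None or age < RANGE_STARTS[0]:
        return None
    # bisect_right finds how many range starts are <= age.
    index = bisect.bisect_right(RANGE_STARTS, age) - 1
    return RANGE_LABELS[index]


def recode_from_source_code(code, crosswalk):
    """Map a source file's range code to a target range.

    Returns (target_range, needs_review). A code missing from the crosswalk,
    or mapped to None because it straddles two target ranges, needs review.
    """
    target = crosswalk.get(code)
    return target, target is None


# Hypothetical crosswalk for one input file. Code "1" is left unmapped because,
# in this made-up example, its range overlaps two target ranges.
FILE1_CROSSWALK = {"1": None, "2": "25-34", "3": "35-44"}

print(recode_age(19), recode_age(34), recode_age(70))
print(recode_from_source_code("2", FILE1_CROSSWALK))
print(recode_from_source_code("1", FILE1_CROSSWALK))
```

How the flagged codes are finally resolved remains a business decision to be made with the database designer.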
Standardizing Data into Common Output Formats:
Another form of standardization involves normalizing name and address data. Many legacy systems did not standardize data collection. While this has improved with newer business systems, the influx of data captured over the web has created new challenges for correctly handling names and addresses. Make sure the vendors you’re interviewing have software available to standardize your name and address data.
Creating a common output layout is a challenge because the software needs to recognize data for what it is, when data is not contained within the appropriate field. If the input file has multiple address fields, which is the primary address? Is the contact name in the company field and vice versa? What other data is appearing in the record?
What if you have a record that looks like this…?

Which address is the primary address? Clearly Address 1 is a company name but does the PO Box or street address take precedence? In the marketing database, you want to standardize the name, company and address data to make sure you have a clean record.
Standardization produces an output that looks like this…

There are benefits to standardizing your name and address data, including:
1. Improved mail deliverability.
2. A better record for matching in all data hygiene processing.
3. Increased likelihood of identifying duplicates.
4. Personalization of promotional pieces.
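The field-recognition problem described in this section can be sketched with a few pattern checks. This is a deliberately simplified illustration and does not represent how commercial name and address software works: the regular expressions, the function names and the sample values are all assumptions. Whether the PO Box or the street address takes precedence is left as a parameter, because that is a decision for the marketer.

```python
import re

# Simplified, assumed patterns. Real address software handles far more variants.
PO_BOX = re.compile(r"^\s*(p\.?\s*o\.?\s*box|post\s+office\s+box)\s+\w+", re.IGNORECASE)
STREET = re.compile(r"^\s*\d+\s+\S+")  # a house number followed by a word


def classify_line(line):
    """Label one address line as 'po_box', 'street' or 'other'."""
    if PO_BOX.match(line):
        return "po_box"
    if STREET.match(line):
        return "street"
    return "other"


def pick_primary_address(lines, prefer="street"):
    """Return (primary, secondary, unclassified) for one record's address lines.

    `prefer` is 'street' or 'po_box' and states which type wins when both exist.
    """
    found = {"po_box": None, "street": None}
    unclassified = []
    for line in lines:
        text = (line or "").strip()
        if not text:
            continue
        kind = classify_line(text)
        if kind in found and found[kind] is None:
            found[kind] = text
        else:
            # Company names, contact names and repeats end up here for review.
            unclassified.append(text)
    other = "po_box" if prefer == "street" else "street"
    primary = found[prefer] or found[other]
    secondary = found[other] if found[prefer] else None
    return primary, secondary, unclassified


# Invented sample values, used only to exercise the functions.
sample = ["Acme Widgets Inc", "PO Box 123", "456 Oak Avenue"]
print(pick_primary_address(sample, prefer="street"))
print(pick_primary_address(sample, prefer="po_box"))
```

Lines that match neither pattern are returned separately so that a later step, or a person, can decide whether they hold a company name, a contact name or something else.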
Standardizing address data to USPS® requirements:
When building the database to support marketing promotional efforts, you should make use of the data hygiene products offered by the USPS®. Each of these products provides major benefits within your marketing database as well as affording you significant savings on your promotional budget.
These products include:
CASS™ – USPS Certification Program for ZIP + 4® Matching Software – Standardizes address data to USPS® guidelines.
NCOALink™ – Change of Address – Provides the latest Change-Of-Address data filed with the United States Postal Service® for the last 48 months.
LACSLink™ – 911 Address Conversion Matching Tool – Converts rural route addresses to city-style addresses, and renumbers and renames existing city-style addresses as required by local authorities.
DSF2™ – Identifies Deliverable Addresses and Specific Address Attributes – Validates addresses in your database by comparing them to every deliverable address in the U.S.A.
In addition to the cost savings from reducing undeliverable mail, the results of CASS™ and NCOALink™ can have a positive impact on the ability of the service bureau to identify duplicate records and merge them into a single surviving record. By doing that, you will have a more complete view of the history and relationship that an individual or company has with your business.
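To show why standardized data helps with duplicate identification, here is a minimal sketch that groups records on an exact match key built from name, address and ZIP. The record layout, field names and sample values are assumptions for illustration, and a service bureau's matching logic is typically far more sophisticated than an exact key comparison.

```python
from collections import defaultdict


def match_key(record):
    """Build a comparison key: uppercase, with runs of whitespace collapsed."""
    parts = (record["name"], record["address"], record["zip"])
    return tuple(" ".join(str(part).upper().split()) for part in parts)


def group_duplicates(records):
    """Return lists of records that share the same match key."""
    groups = defaultdict(list)
    for record in records:
        groups[match_key(record)].append(record)
    return [group for group in groups.values() if len(group) > 1]


# Invented sample records. The first two differ only in case and spacing.
customers = [
    {"name": "Pat Lee", "address": "456 Oak Ave", "zip": "12345"},
    {"name": "PAT  LEE", "address": "456 oak ave ", "zip": "12345"},
    {"name": "Sam Roy", "address": "9 Elm St", "zip": "54321"},
]
duplicates = group_duplicates(customers)
print(len(duplicates), len(duplicates[0]))
```

The more consistently the address fields have been standardized beforehand, the more often true duplicates produce the same key.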
Read Blog Series: Critical Factors in Building a Marketing Database
Part 1 – Begin at the End! Define Your Objectives
Part 2 – KISS (Keep It Simple Stupid)
Part 3 – Standardize/Normalize Your Data
Part 4 – Create An Accurate Budget
Part 5 – Set Measurable Objectives
Summary