A Melissa Data White Paper




Scalable Data Quality:
A Seven Step Plan For Any Size Organization




                             Scalable Data Quality:
                   A Seven Step Plan for Any Size Organization

    The term Data Quality can mean different things depending upon the nature of one’s organization. When
    applied to customer address records, data quality can be summed up by the following requirements:

         • The data is accurate. The address actually exists within the city, state and ZIP Code™ given. In
         addition, if a person or business is associated with the address in the record, that person or business
         is actually located at that address.

         • The data is up to date. The name and address in any given record reflect the most current
         information on that person and business.

         • The data is complete. Each address contains all of the necessary information for mailing, including
         apartment or suite number, ZIP Code™ and, if needed, carrier route and walk sequence.

         • The data is not redundant. There is only one record per contact for every address in a mailing list.

         • The data is standardized. Each record follows a recognized standard for names, punctuation and
         abbreviations.

    Every record that fails to meet the above standards of quality can lead to either lost revenue or
    unnecessary costs. This is true regardless of the size of the enterprise, from a local florist to a
    multi-national conglomerate. In fact, data quality is probably even more crucial for the small to
    medium-sized business or organization than it is for the large corporation.

    Not only does each customer potentially represent a much larger percentage of a small business’s sales
    volume, but smaller businesses are generally expected to deliver a higher degree of personal service.
    Therefore, every misdirected or undelivered piece of mail has a greater impact on that business’s bottom
    line than it would for a larger enterprise.

    Given these factors, organizations of virtually any size can benefit from a strong commitment to a data
    quality initiative, one that addresses immediate needs and provides flexibility to meet changing business
    requirements.

    Figure 1: Problems Due to Poor Data Quality, TDWI (2002). Extra time to reconcile data: 87%;
    loss of credibility in a system: 81%; extra costs (e.g. duplicate mailings): 72%; customer
    dissatisfaction: 67%; delay in deploying a new system: 64%; lost revenue: 54%; compliance
    problems: 38%; other: 5%.


    The Scope of the Problem

    Undeliverable as Addressed Mail

    According to a recent study undertaken by PricewaterhouseCoopers and the United States Postal Service®
    (USPS), on average, approximately 23.6% of all mail is incorrectly addressed and requires correction of
    some kind. An additional 2.7% is completely undeliverable.

    The USPS currently charges a minimum of $0.21 per mail piece for its address correction service. For a
    one-time mailing of 10,000 pieces, this could potentially add another $500 to the cost of the mailing. Manual
    correction more than triples this cost.

    Bulk parcels returned as undeliverable cost nearly $2.00 per item. For items sent via delivery services like
    UPS and FedEx, the cost is $5.00 per item. It isn’t difficult to see how these costs can add up in a very short
    time, diluting profit margins and potentially damaging customer relations.
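The arithmetic behind the $500 figure above can be sketched in a few lines (the 23.6% error rate and the $0.21 correction fee come from the study and USPS charge cited in the text; the 10,000-piece mailing is the example used above):

```python
# Back-of-the-envelope cost of bad addresses for the 10,000-piece example mailing.
PIECES = 10_000
ERROR_RATE = 0.236       # 23.6% of mail incorrectly addressed (USPS/PwC study)
CORRECTION_FEE = 0.21    # USPS minimum per-piece address correction charge

bad_pieces = PIECES * ERROR_RATE               # 2,360 pieces needing correction
correction_cost = bad_pieces * CORRECTION_FEE  # about $495.60, i.e. the ~$500 cited

print(f"{bad_pieces:.0f} pieces, ${correction_cost:.2f} in correction fees")
```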

    If the address data is used for billing, incorrect addresses can not only lead to unnecessary expenses, but
    also delay collections.


                         Figure 2: USPS and PricewaterhouseCoopers Report (2002):
                    23.6% of mail incorrectly addressed, 2.7% completely undeliverable.




    Points of Entry

    Bad address data has multiple points of entry in any organization. If an organization collects sales leads
    over the web, the customer can either mistype their address or deliberately provide false information. Even
    if employees of the organization collect the addresses, the possibility for errors still exists.

    If an organization buys address lists from a vendor or other third party, there are also multiple entry points
    for error. The list may contain errors due to poor quality control by the vendor. One simple component of a
    data quality initiative is to only patronize list vendors that consistently deliver a quality product. But unlike
    wine, address data does not improve with age, and even an error-free list will not stay that way forever:
    customers move, companies merge and go out of business.

    Whatever the source of the data, these risk factors create the need for a data quality firewall: a set of tools
    and regular procedures that protect your data from errors at the point of entry to ensure only valid, usable
    addresses enter the database. This data quality firewall should also scan, correct and update your
    previously acquired data in batch, preventing any existing errors from “escaping” into the real world and
    affecting the organization.
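As a minimal sketch of such a gate (the field list and rules here are hypothetical stand-ins; a real firewall would call a CASS Certified™ verification component at this point), incoming records can be checked before they are stored, and the same checks can be run in batch over existing data:

```python
import re

def validate_address(record):
    """Hypothetical point-of-entry check: reject records that are obviously
    incomplete or malformed before they reach the database."""
    errors = []
    for field in ("name", "street", "city", "state", "zip"):
        if not record.get(field, "").strip():
            errors.append(f"missing {field}")
    # U.S. ZIP Codes are 5 digits, optionally ZIP + 4 (e.g. 92688-2156).
    if record.get("zip") and not re.fullmatch(r"\d{5}(-\d{4})?", record["zip"]):
        errors.append("malformed ZIP Code")
    return errors

def batch_scan(records):
    """Batch mode: flag previously acquired records that fail the same checks."""
    return [(i, errs) for i, rec in enumerate(records)
            if (errs := validate_address(rec))]

good = {"name": "A. Reader", "street": "22342 Avenida Empresa",
        "city": "Rancho Santa Margarita", "state": "CA", "zip": "92688-2156"}
bad = {"name": "A. Reader", "street": "", "city": "Springfield",
       "state": "IL", "zip": "9999"}

print(validate_address(good))   # []
print(batch_scan([good, bad]))  # [(1, ['missing street', 'malformed ZIP Code'])]
```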


    Unnecessary Duplication

    A database that is relatively error-free and up-to-date can still overlap with existing data. This creates
    duplicate records, leading to unnecessary expense when a potential customer is contacted more than once.
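One common way to catch such overlap (a sketch only; production merge/purge tools use fuzzier matching than this exact-key approach) is to normalize each record into a match key and keep one record per key:

```python
def match_key(record):
    """Collapse case, punctuation and spacing so trivially different
    spellings of the same contact/address produce the same key."""
    def norm(s):
        return "".join(ch for ch in s.lower() if ch.isalnum())
    return (norm(record["name"]), norm(record["street"]), record["zip"][:5])

def dedupe(records):
    seen, unique = set(), []
    for rec in records:
        key = match_key(rec)
        if key not in seen:        # first occurrence wins
            seen.add(key)
            unique.append(rec)
    return unique

records = [
    {"name": "John Q. Public", "street": "123 Main St.", "zip": "92688-2156"},
    {"name": "JOHN Q PUBLIC",  "street": "123 Main St",  "zip": "92688"},
]
print(len(dedupe(records)))   # 1 -- the two spellings collapse to one record
```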








    Selecting the Right Approach to Data Quality

    Selecting the right method for ensuring data quality depends on many factors: how the data is acquired,
    how it is to be used and the amount of data that needs to be processed at one time.

    No matter what data quality solution is ultimately chosen, at the very minimum it should accomplish the
    following:

         • Trap inaccurate addresses before they enter your database.

         • Process existing records to flag undeliverable mailing addresses for correction or deletion.

         • Update current records when information changes (address change, etc.).

         • Enhance records by appending related mailing information to the record (e.g., ZIP + 4® codes and
           Carrier Route) for faster processing and discounts.

         • Standardize addresses using preferred USPS® spelling, abbreviation and
           punctuation.


    Using an API

    If your organization acquires individual sales leads or accepts orders via a web site, you would probably
    be best served by an address checking component that can be incorporated into a web application. One
    possible approach is to buy programming tools to be integrated into the application. The advantages of
    this approach include speed of execution as well as data security (the customer’s information never leaves
    your site after it has been entered). To use the address data for mailings, the address checking logic
    should also be CASS Certified™ to qualify for postal discounts.

    Figure 3: An address checking tool.
    US Input: 22342 emprisa ste 100, 92688
    US Output (CASS Certified™): 22342 Avenida Empresa, Rancho Santa Margarita CA 92688-2156;
    Carrier Route: C056; Delivery Point: 821; County Name: Orange; Time Zone: Pacific (and more)
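In code, integrating such a component might look like the sketch below. The `AddressVerifier` class and its tiny canned lookup table are hypothetical stand-ins: a real CASS Certified™ engine consults the full USPS database, but the integration point (call the component in-process, keep the customer's data on your own servers) is the same:

```python
class AddressVerifier:
    """Hypothetical in-process address checking component. The reference
    table below stands in for the USPS database a real engine would use."""
    REFERENCE = {
        ("22342", "92688"): {
            "street": "22342 Avenida Empresa",
            "city": "Rancho Santa Margarita", "state": "CA",
            "zip": "92688-2156", "carrier_route": "C056",
        },
    }

    def verify(self, street, zip_code):
        # Match on house number + 5-digit ZIP, as in Figure 3's example input.
        house_number = street.split()[0]
        hit = self.REFERENCE.get((house_number, zip_code[:5]))
        if hit is None:
            return {"deliverable": False}
        return {"deliverable": True, **hit}

# The customer's data never leaves the application process.
result = AddressVerifier().verify("22342 emprisa ste 100", "92688")
print(result["street"])   # 22342 Avenida Empresa
```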


    Using a Web Service

    Another option is to use a web service that offers the same functionality as an API. This option has two
    advantages: it may be less expensive if you process a relatively low number of addresses, and the web
    service vendor maintains and updates the postal databases for you.
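Integrating with such a service is typically a matter of posting the address data and reading back the corrected fields. A sketch (the endpoint URL, field names and request shape here are hypothetical, not any specific vendor's API; only the request is built and inspected, nothing is sent):

```python
import json
import urllib.request

SERVICE_URL = "https://example.com/address/verify"   # hypothetical endpoint

def build_request(addresses):
    """Package a batch of addresses for a hypothetical verification web service."""
    body = json.dumps({"records": addresses}).encode("utf-8")
    return urllib.request.Request(
        SERVICE_URL, data=body,
        headers={"Content-Type": "application/json"}, method="POST")

req = build_request([{"street": "22342 emprisa ste 100", "zip": "92688"}])
# urllib.request.urlopen(req) would send it; here we just inspect the request.
print(req.get_method())                              # POST
print(json.loads(req.data)["records"][0]["zip"])     # 92688
```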

    Your own data entry personnel are another point of entry, and can also be a source of inadvertent errors.
    The ability to check and standardize an address “on the fly,” before it is even stored, will serve to minimize
    bad addresses. To build address checking logic into your data entry or CRM systems, consult the address
    verification tools mentioned above. If you don’t have the resources for that sort of development, or your
    systems do not allow that level of customization, you can also purchase a standalone application that
    accomplishes a similar function.








    Using Standalone Software

    If you work with lists acquired from an outside source, such as a vendor, or if you choose to process your
    address data in bulk, there is still more than one option available to you.

    The address checking components mentioned earlier can also be used to build in-house applications. This
    option requires that you have the necessary resources on hand to create, develop and maintain this
    application. However, if you are already using the same object to support a web site, your developers’
    familiarity with the component’s interface can be extended to developing other applications as well.

    If your organization is not large enough to support this kind of effort, you have the option of purchasing
    a standalone application that handles the same functions. While this is something of a “one-size-fits-all”
    solution, it will fill many of the same needs and be ready to run almost from the moment you receive the
    software.

    Another important criterion is that the solution scales well with the growth of your organization. If your ap-
    proach to data quality works the same with a hundred thousand or a million records as it does with ten
    thousand, then your operations can grow without worrying about hitting an artificial “ceiling” imposed by a
    data quality system that you have outgrown.



    SEVEN STEPS TO SCALABLE DATA QUALITY
    Data quality does not happen by accident. It requires a plan that addresses any current weak points in
    your data, delivers the desired results and can be accomplished with the available resources. It should
    also attempt to anticipate future needs when possible, and allow the data quality solution to grow with the
    organization.

    The following section outlines a good framework for creating a plan for your own data quality initiative. It is
    not the only approach, of course, but it represents a good starting point.


    Step 1: Acknowledge that there is a problem

    Before anything can happen, your company must be aware that it has a problem with poor data quality and
    needs to find a solution that addresses this problem. This requires a reasonably rigorous audit of your
    mailing operations and the costs incurred by inaccurate data.

    It should include expenses directly related to bad address data (such as USPS or parcel carrier fees), lost
    revenue, and employee hours spent correcting problems that could be prevented by a data quality
    initiative.


    Step 2: Assign a “point person”

    Assign a person or a team that will be responsible for executing your data quality initiative. This person will
    be in charge of identifying the problem and the possible solutions, and bringing these recommendations
    back to management. After an approach is decided upon, the “point person” will also be in charge of
    putting the initiative into effect.








    Step 3: Identify the problem

    The first responsibility of the point person will be to identify the root cause of any data quality problems
    in your organization. This will require a detailed examination of how data quality problems enter the
    “pipeline” and an evaluation of what would be the most effective point to enforce data quality.


    Step 4: Technology assessment

    Deciding upon an approach to data quality requires a thorough and realistic appraisal of the technology
    and resources available within the organization. Ask yourself the following questions:

         1. If data quality is to be enforced via a web site, is it feasible to integrate the technology into the
         web site, or will the site have to be retooled to make it possible?

         2. Is the volume of addresses to be verified small enough to make a web service a more economical
         solution?

         3. Does the organization have the programming resources to integrate a data quality tool into an
         existing application?

         4. Does the software used by the organization allow customization, or is it a closed system? If it’s a
         closed system, will a standalone address verification program fill the needs of the organization?

         5. What would be the best way to ensure that all address data passes through a data quality process
         before it is used in any way? What changes to the network might be necessary to make this happen?


    Step 5: Evaluate vendors

    Once the extent of the problem and the available resources have been identified, the point person then
    identifies possible data quality solution vendors and evaluates their products. This evaluation should
    include downloading and using trial versions of their software (if these are available), as well as possibly
    contacting other users of the same software or organizations similar to your own.

    After evaluating as many possible solutions as is feasible, the point person then makes his or her
    recommendations to management regarding the best solution for the organization.


    CASE STUDY - A: Shari’s Berries

    Sacramento-based Shari’s Berries ships chocolate-covered strawberries nationwide via FedEx,
    guaranteeing on-time delivery of its perishable product line. They accept as many as 200 orders per hour
    during peak holiday times.

    Prior to deploying a data quality solution, the company relied upon their FedEx package scanner to detect
    bad addresses. A misaddressed package would prompt a call to the customer to verify the address before
    shipping. During peak times, this process could easily place a strain on their customer service department.

    After implementing their data quality initiative, the rate of address errors was less than one percent,
    reducing the number of customer service verification calls by more than ninety percent.








    Step 6: Implementation

    At this stage, the recommended data quality solution is put into practice, possibly on a limited basis, for
    a trial period. The point person tracks costs and manpower usage to compare with those of a similar
    period before the data quality initiative.


    Step 7: Validation

    After a trial period, the results are collected and reported back to management. At this point, the
    effectiveness of the data quality initiative can be evaluated and any adjustments, if necessary, can be
    made before putting the solution into full-scale use throughout the organization.


    CASE STUDY - B: Simon & Schuster

    Each week, the corporate communications department of Simon & Schuster mails out thousands of books
    to reviewers. Book reviewers are housed in a database of 100,000 records of journalists, editors,
    educators and others. Review books are sent out daily to individuals or to groups of people. All contacts
    are entered into the system by the publicity department.

    Prior to implementing a data quality solution to clean, verify and standardize addresses, Simon &
    Schuster was relying on UPS to make the corrections. ZIP Code™ errors were the most common address
    problem. But UPS charged $5 per package for corrections in the field, and returns as undeliverable also
    cost $5, not to mention labor time at S&S to process the return.

    All in all, bad data was costing S&S about $250,000 per year. Now Simon & Schuster corrects addresses
    before the packages are handed over to UPS for delivery, preventing any charges for in-the-field
    correction and re-routing, and eliminating approximately ninety percent of the problems.


    Where to Go For More Information

    A well-executed data quality initiative is not difficult to accomplish, but it is crucial to getting the maximum
    value out of your data. While a larger organization could probably absorb the costs of bad data quality,
    there is no reason why it should when, with the proper tools and a well-planned initiative, a solution is
    within easy reach.

    In a smaller organization, for which every sales lead, order or potential customer is proportionately more
    valuable, poor data quality is really not an option at all. If your organization survives on its ability to reach
    its customers, implementing a thorough data quality solution is crucial to your success.

    To learn more about practical and affordable data quality solutions that can scale to your growing
    business needs, visit http://www.MelissaData.com or call 1-800-MELISSA.





A Glance At The Java Performance ToolboxAna-Maria Mihalceanu
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfIngrid Airi González
 
Accelerating Enterprise Software Engineering with Platformless
Accelerating Enterprise Software Engineering with PlatformlessAccelerating Enterprise Software Engineering with Platformless
Accelerating Enterprise Software Engineering with PlatformlessWSO2
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Alkin Tezuysal
 
Kuma Meshes Part I - The basics - A tutorial
Kuma Meshes Part I - The basics - A tutorialKuma Meshes Part I - The basics - A tutorial
Kuma Meshes Part I - The basics - A tutorialJoão Esperancinha
 
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesMuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesManik S Magar
 
QMMS Lesson 2 - Using MS Excel Formula.pdf
QMMS Lesson 2 - Using MS Excel Formula.pdfQMMS Lesson 2 - Using MS Excel Formula.pdf
QMMS Lesson 2 - Using MS Excel Formula.pdfROWELL MARQUINA
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfNeo4j
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesThousandEyes
 

Dernier (20)

A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI Age
 
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
 
Microservices, Docker deploy and Microservices source code in C#
Microservices, Docker deploy and Microservices source code in C#Microservices, Docker deploy and Microservices source code in C#
Microservices, Docker deploy and Microservices source code in C#
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
 
Français Patch Tuesday - Avril
Français Patch Tuesday - AvrilFrançais Patch Tuesday - Avril
Français Patch Tuesday - Avril
 
WomenInAutomation2024: AI and Automation for eveyone
WomenInAutomation2024: AI and Automation for eveyoneWomenInAutomation2024: AI and Automation for eveyone
WomenInAutomation2024: AI and Automation for eveyone
 
How Tech Giants Cut Corners to Harvest Data for A.I.
How Tech Giants Cut Corners to Harvest Data for A.I.How Tech Giants Cut Corners to Harvest Data for A.I.
How Tech Giants Cut Corners to Harvest Data for A.I.
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
 
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
 
Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...
Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...
Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and Insights
 
A Glance At The Java Performance Toolbox
A Glance At The Java Performance ToolboxA Glance At The Java Performance Toolbox
A Glance At The Java Performance Toolbox
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 
Accelerating Enterprise Software Engineering with Platformless
Accelerating Enterprise Software Engineering with PlatformlessAccelerating Enterprise Software Engineering with Platformless
Accelerating Enterprise Software Engineering with Platformless
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
 
Kuma Meshes Part I - The basics - A tutorial
Kuma Meshes Part I - The basics - A tutorialKuma Meshes Part I - The basics - A tutorial
Kuma Meshes Part I - The basics - A tutorial
 
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesMuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
 
QMMS Lesson 2 - Using MS Excel Formula.pdf
QMMS Lesson 2 - Using MS Excel Formula.pdfQMMS Lesson 2 - Using MS Excel Formula.pdf
QMMS Lesson 2 - Using MS Excel Formula.pdf
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyesAssure Ecommerce and Retail Operations Uptime with ThousandEyes
Assure Ecommerce and Retail Operations Uptime with ThousandEyes
 

volume, but smaller businesses are generally expected to deliver a higher degree of personal service. Therefore, every misdirected or undelivered piece of mail has a greater impact on that business's bottom line than it would for a larger enterprise.
Given these factors, organizations of virtually any size can benefit from a strong commitment to a data quality initiative, one that addresses immediate needs and provides flexibility to meet changing business requirements.

The Scope of the Problem

Figure 1: Problems Due to Poor Data Quality, TDWI (2002)

    Extra time to reconcile data             87%
    Loss of credibility in a system          81%
    Extra costs (e.g. duplicate mailings)    72%
    Customer dissatisfaction                 67%
    Delay in deploying a new system          64%
    Lost revenue                             54%
    Compliance problems                      38%
    Other                                     5%

Undeliverable as Addressed Mail

According to a recent study undertaken by PricewaterhouseCoopers and the United States Postal Service® (USPS), on average approximately 23.6% of all mail is incorrectly addressed and requires correction of some kind. An additional 2.7% is completely undeliverable.

The USPS currently charges a minimum of $0.21 per mail piece for its address correction service. For a
one-time mailing of 10,000 pieces, this could potentially add another $500 to the cost of the mailing. Manual correction more than triples this cost. Bulk parcels returned as undeliverable cost nearly $2.00 per item. For items sent via delivery services like UPS and FedEx, the cost is $5.00 per item.

It isn't difficult to see how these costs can add up in a very short time, diluting profit margins and potentially damaging customer relations. If the address data is used for billing, incorrect addresses can not only lead to unnecessary expenses, but also delay collections.

Figure 2: 23.6% incorrectly addressed, 2.7% undeliverable. USPS and PricewaterhouseCoopers Report (2002)

Points of Entry

Bad address data has multiple points of entry in any organization. If an organization collects sales leads over the web, customers can either mistype their addresses or deliberately provide false information. Even if employees of the organization collect the addresses, the possibility for errors still exists.

If an organization buys address lists from a vendor or other third party, there are also multiple entry points for error. The list may contain errors due to poor quality control by the vendor. One simple component of a data quality initiative is to patronize only list vendors that consistently deliver a quality product. But unlike wine, address data does not improve with age, and even an error-free list will not stay that way forever. Customers move; companies merge and go out of business.

Whatever the source of the data, these risk factors create the need for a data quality firewall: a set of tools and regular procedures that protect your data from errors at the point of entry to ensure only valid, usable addresses enter the database.
This data quality firewall should also scan, correct and update your previously acquired data in batch, preventing any existing errors from "escaping" into the real world and affecting the organization.

Unnecessary Duplication

A database that is relatively error-free and up-to-date can still overlap with existing data. This creates duplicate records, leading to unnecessary expense when a potential customer is contacted more than once.
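The duplicate detection described above can be sketched in a few lines. This is an illustrative sketch only, not any vendor's matching algorithm: the record fields and the normalization rule are assumptions, and commercial merge/purge tools use far richer fuzzy matching.

```python
# Illustrative sketch of duplicate detection: collapse records whose
# mailing addresses differ only in case, punctuation, or spacing.
# Field names and the normalization rule are assumptions for this
# example; real merge/purge tools use far more sophisticated matching.

def address_key(record):
    """Normalize an address into a comparison key: lowercase,
    drop punctuation, and collapse runs of whitespace."""
    raw = f'{record.get("address", "")} {record.get("zip", "")}'
    cleaned = "".join(ch for ch in raw.lower() if ch.isalnum() or ch.isspace())
    return " ".join(cleaned.split())

def dedupe(records):
    """Keep only the first record seen for each normalized key."""
    seen, unique = set(), []
    for rec in records:
        key = address_key(rec)
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique

mailing_list = [
    {"name": "J. Smith", "address": "22342 Avenida Empresa, Ste 100", "zip": "92688"},
    {"name": "John Smith", "address": "22342 AVENIDA EMPRESA STE 100", "zip": "92688"},
]
print(len(dedupe(mailing_list)))  # 1
```

Even this crude key catches the most common duplicates (case and punctuation variants at the same address), which is why standardization and deduplication are usually performed together.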
Selecting the Right Approach to Data Quality

Selecting the right method for ensuring data quality depends on many factors: how the data is acquired, how it is to be used and the amount of data that needs to be processed at one time. No matter what data quality solution is ultimately chosen, at the very minimum it should accomplish the following:

    • Trap inaccurate addresses before they enter your database.

    • Process existing records to flag undeliverable mailing addresses for correction or deletion.

    • Update current records when information changes (address change, etc.).

    • Enhance records by appending related mailing information to the record (e.g. ZIP + 4® codes and Carrier Route) for faster processing and discounts.

    • Standardize addresses using preferred USPS® spelling, abbreviation and punctuation.

Figure 3: An address checking tool
    US Input: 22342 emprisa ste 100 92688
    US Output (CASS Certified™): 22342 Avenida Empresa, Rancho Santa Margarita CA 92688-2156
    Carrier Route: C056; Delivery Point: 821; County Name: Orange; Time Zone: Pacific (and more)

Using an API

If your organization acquires individual sales leads or accepts orders via a web site, you would probably be best served by an address checking component that can be incorporated into a web application. One possible approach is to buy programming tools to be integrated into the application. The advantages of this approach include speed of execution as well as data security (the customer's information never leaves your site after it has been entered). To use the address data for mailings, the address checking logic should also be CASS Certified™ to qualify for postal discounts.

Using a Web Service

Another option is to use a web service that offers the same functionality as an API.
This option has two advantages: it may be less expensive if you process a relatively low number of addresses, and the web service vendor maintains and updates the postal databases for you.

Your own data entry personnel are another point of entry, and can also be a source of inadvertent errors. The ability to check and standardize an address "on the fly," before it is even stored, will serve to minimize bad addresses. To build address checking logic into your data entry or CRM systems, consult the address verification tools mentioned above. If you don't have the resources for that sort of development, or your systems do not allow that level of customization, you can also purchase a standalone application that accomplishes a similar function.
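As a toy illustration of the address checking flow shown in Figure 3, the sketch below standardizes an input against a small reference table keyed by ZIP Code. The table, field names and function are hypothetical stand-ins invented for this example; a real CASS Certified™ engine validates street-level data against the full USPS address database rather than a lookup like this.

```python
# Toy illustration of an address check: look up a messy input's ZIP
# Code in a reference table and return the standardized form, or None
# when the ZIP is unknown (the record would be flagged for correction).
# The table and field names are hypothetical stand-ins; a real engine
# validates against the full USPS database, not a dictionary.

REFERENCE = {
    "92688": {
        "street": "22342 Avenida Empresa",
        "city": "Rancho Santa Margarita",
        "state": "CA",
        "zip4": "92688-2156",
    },
}

def standardize(raw_address, zip_code):
    """Return a standardized address string, or None if unverifiable.
    In a real engine raw_address would drive street-level matching;
    here only the ZIP is consulted, to keep the sketch short."""
    match = REFERENCE.get(zip_code)
    if match is None:
        return None  # unknown ZIP: route the record to manual review
    return f'{match["street"]}, {match["city"]} {match["state"]} {match["zip4"]}'

print(standardize("22342 emprisa", "92688"))
# 22342 Avenida Empresa, Rancho Santa Margarita CA 92688-2156
```

The same function works whether it sits behind a web form (the API scenario) or inside a batch job, which is one reason component-based tools scale across both uses.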
Using Standalone Software

If you work with lists acquired from an outside source, such as a vendor, or if you choose to process your address data in bulk, there is still more than one option available to you.

The address checking components mentioned earlier can also be used to build in-house applications. This option requires that you have the necessary resources on hand to create, develop and maintain the application. However, if you are already using the same object to support a web site, your developers' familiarity with the component's interface can be extended to developing other applications as well.

If your organization is not large enough to support this kind of effort, you have the option of purchasing a standalone application that handles the same functions. While this is something of a "one-size-fits-all" solution, it will fill many of the same needs and be ready to run almost from the moment you receive the software.

Another important criterion is that the solution scales well with the growth of your organization. If your approach to data quality works the same with a hundred thousand or a million records as it does with ten thousand, then your operations can grow without worrying about hitting an artificial "ceiling" imposed by a data quality system that you have outgrown.

SEVEN STEPS TO SCALABLE DATA QUALITY

Data quality does not happen by accident. It requires a plan that addresses any current weak points in your data, delivers the desired results and can be accomplished with the available resources. It should also attempt to anticipate future needs when possible, and allow the data quality solution to grow with the organization.

The following section outlines a good framework for creating a plan for your own data quality initiative. It is not the only approach, of course, but it represents a good starting point.
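Before walking through the steps, the bulk screening that standalone tools perform can be sketched minimally: run an acquired list through a validity check and split it into clean and flagged sets. The five-digit ZIP pattern below is a deliberately crude stand-in for a real verification engine, and the column names are assumptions for this example.

```python
# Minimal sketch of bulk screening: partition an acquired list into
# "clean" and "flagged" records. The ZIP-format check is a crude
# stand-in for real verification; column names are assumptions.
import csv
import io
import re

ZIP_RE = re.compile(r"^\d{5}(-\d{4})?$")

def screen(csv_text):
    """Partition rows by whether the ZIP field looks well-formed."""
    clean, flagged = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        target = clean if ZIP_RE.match(row.get("zip") or "") else flagged
        target.append(row)
    return clean, flagged

sample = "name,zip\nAlice,92688\nBob,9268\nCara,92688-2156\n"
clean, flagged = screen(sample)
print(len(clean), len(flagged))  # 2 1
```

Because the screen processes one row at a time, the same code handles ten thousand or a million records, which is exactly the kind of scaling behavior the paragraph above recommends looking for.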
Step 1: Acknowledge that there is a problem

Before anything can happen, your company must be aware that it has a problem with poor data quality and needs to find a solution that addresses this problem. This requires a reasonably rigorous audit of your mailing operations and the costs incurred by inaccurate data. It should include expenses directly related to bad address data (such as USPS or parcel carrier fees), lost revenue, and employee hours spent correcting problems that could be prevented by a data quality initiative.

Step 2: Assign a "point person"

Assign a person or a team that will be responsible for executing your data quality initiative. This person will be in charge of identifying the problem and the possible solutions, and bringing these recommendations back to management. After an approach is decided upon, the "point person" will also be in charge of putting the initiative into effect.
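The audit described in Step 1 lends itself to a back-of-the-envelope calculation. The sketch below combines the cost figures cited earlier in the paper ($0.21 per USPS correction, roughly $2.00 per returned bulk parcel, $5.00 per carrier-corrected package) with the PwC/USPS error rates; the mail and parcel volumes are invented purely for illustration.

```python
# Back-of-the-envelope version of the Step 1 audit, using the cost
# figures cited earlier in the paper and the PwC/USPS error rates.
# The volumes passed in at the bottom are invented for illustration.
USPS_CORRECTION = 0.21   # per corrected mail piece
PARCEL_RETURN = 2.00     # per undeliverable bulk parcel
CARRIER_FIX = 5.00       # per UPS/FedEx in-field correction

def annual_bad_address_cost(mail_volume, parcel_volume,
                            error_rate=0.236, undeliverable_rate=0.027):
    """Estimate the yearly cost of bad addresses for given volumes."""
    corrections = mail_volume * error_rate * USPS_CORRECTION
    returns = mail_volume * undeliverable_rate * PARCEL_RETURN
    carrier = parcel_volume * error_rate * CARRIER_FIX
    return corrections + returns + carrier

print(round(annual_bad_address_cost(100_000, 5_000)))  # roughly 16256
```

Even a rough figure like this gives the point person a concrete number to bring to management, and it makes the later Step 6/Step 7 before-and-after comparison straightforward.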
Step 3: Identify the problem

The first responsibility of the point person will be to identify the root cause of any data quality problems in your organization. This will require a detailed examination of how data quality problems enter the "pipeline" and an evaluation of what would be the most effective point to enforce data quality.

Step 4: Technology assessment

Deciding upon an approach to data quality requires a thorough and realistic appraisal of the technology and resources available within the organization. Ask yourself the following questions:

    1. If data quality is to be enforced via a web site, is it feasible to integrate the technology into the web site, or will the site have to be retooled to make it possible?

    2. Is the volume of addresses to be verified small enough to make a web service a more economical solution?

    3. Does the organization have the programming resources to integrate a data quality tool into an existing application?

    4. Does the software used by the organization allow customization, or is it a closed system? If it's a closed system, will a standalone address verification program fill the needs of the organization?

CASE STUDY A: Shari's Berries

Sacramento-based Shari's Berries ships chocolate-covered strawberries nationwide via FedEx, guaranteeing on-time delivery of its perishable product line. The company accepts as many as 200 orders per hour during peak holiday times.

Prior to deploying a data quality solution, the company relied upon their FedEx package scanner to detect bad addresses. A misaddressed package would prompt a call to the customer to verify the address before shipping. During peak times, this process could easily place a strain on their customer service department.
    5. What would be the best way to ensure that all address data passes through a data quality process before it is used in any way? What changes to the network might be necessary to make this happen?

After implementing their data quality initiative, Shari's Berries' rate of address errors was less than one percent, reducing the number of customer service verification calls by more than ninety percent.

Step 5: Evaluate vendors

Once the extent of the problem and the available resources have been identified, the point person then identifies possible data quality solution vendors and evaluates their products. This evaluation should include downloading and using trial versions of their software (if these are available), as well as possibly contacting other users of the same software or organizations similar to your own.

After evaluating as many possible solutions as is feasible, the point person then makes his or her recommendations to management regarding the best solution for the organization.
Step 6: Implementation

At this stage, the recommended data quality solution is put into practice, possibly on a limited basis, for a trial period. The point person tracks costs and manpower usage to compare with those of a similar period before the data quality initiative.

Step 7: Validation

After a trial period, the results are collected and reported back to management. At this point, the effectiveness of the data quality initiative can be evaluated and any adjustments, if necessary, can be made before putting the solution into full-scale use throughout the organization.

Where to Go For More Information

A well-executed data quality initiative is not difficult to accomplish, but it is crucial to getting the maximum value out of your data. While a larger organization could probably absorb the costs of bad data quality, there is no reason why it should when, with the proper tools and a well-planned initiative, a solution is within easy reach.

In a smaller organization, for which every sales lead, order or potential customer is proportionately more valuable, poor data quality is really not an option at all. If your organization survives on its ability to reach its customers, implementing a thorough data quality solution is crucial to your success.

CASE STUDY B: Simon & Schuster

Each week, the corporate communications department of Simon & Schuster mails out thousands of books to reviewers. Book reviewers are housed in a database of 100,000 records of journalists, editors, educators and others. Review books are sent out daily to individuals or to groups of people. All contacts are entered into the system by the publicity department.

Prior to implementing a data quality solution to clean, verify and standardize addresses, Simon & Schuster was relying on UPS to make the correction. ZIP Code™ errors were
the most common address problem. But UPS charged $5 per package to correct in the field, and returns as undeliverable also cost $5 each, not to mention the labor time at Simon & Schuster to process each return. All in all, bad data was costing Simon & Schuster about $250,000 per year.

Now Simon & Schuster corrects addresses before the packages are handed over to UPS for delivery, preventing any charges for in-the-field correction and re-routing and eliminating approximately ninety percent of the problems.

To learn more about practical and affordable data quality solutions that can scale to your growing business needs, visit http://www.MelissaData.com or call 1-800-MELISSA.