Quality checklist for registers applied to
      online price information and
        offline route information

          Saskia J.L. Ossen, Piet J.H. Daas,
                   and Marco Puts

                Statistics Netherlands
                May 5, 2010, Helsinki, Finland
Overview
 Introduction
 Quality framework for registers
 Checklist for registers
 Application of checklist to other data sources
 • Offline routing information
 • Online (internet) price information
 Results
 Conclusions
 Future work
Introduction
 Statistics Netherlands wants to increase the use of
  data (sources) collected and maintained by others
  • Not only registers and administrative data sources
  • But also other data sources
     – internet
     – route information
     – ….

 As a result, Statistics Netherlands:
  • Becomes more dependent on data sources from others
  • Must be able to monitor the quality of those data sources
     – How?
     – By applying the earlier developed checklist for registers?
Quality framework for registers

 Statistics Netherlands has developed a framework
  for the determination of the quality of registers


 Composed of:
  • 3 high-level views on quality (Hyperdimensions)
  • Each view focuses on a different group of quality
    aspects
Quality framework
3 Different high level views on quality

  SOURCE:
   - Focus on data source as a whole
   - Mainly delivery related aspects
   - and some other things

  METADATA:
   - Focuses on the (availability of the) information required to
     understand and use the data in the data source

  DATA:
   - Technical checks
   - Accuracy related issues
Framework composition

  HYPERDIMENSION (Source, Metadata, Data)
        |  n > 1
  DIMENSION (5 for Source, 4 for Metadata)
        |  n >= 1
  QUALITY INDICATOR
        |  1:n
  MEASUREMENT METHOD
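
The composition above can be sketched as a small data model. This is only an illustration, not the authors' implementation: it reads "n > 1" as "each hyperdimension has more than one dimension", "n >= 1" as "each dimension has at least one quality indicator", and "1:n" as "each indicator has one or more measurement methods". The dimension names come from the tables later in this presentation; the indicator and method names are hypothetical placeholders, because the slides do not list them.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class QualityIndicator:
    name: str
    measurement_methods: List[str]

    def __post_init__(self):
        # Assumed reading of "1:n": one indicator, one or more methods.
        if len(self.measurement_methods) < 1:
            raise ValueError(f"indicator {self.name!r} needs a measurement method")


@dataclass
class Dimension:
    name: str
    indicators: List[QualityIndicator]

    def __post_init__(self):
        # Assumed reading of "n >= 1": at least one indicator per dimension.
        if len(self.indicators) < 1:
            raise ValueError(f"dimension {self.name!r} needs an indicator")


@dataclass
class Hyperdimension:
    name: str
    dimensions: List[Dimension]

    def __post_init__(self):
        # Assumed reading of "n > 1": more than one dimension per hyperdimension.
        if len(self.dimensions) <= 1:
            raise ValueError(f"hyperdimension {self.name!r} needs more than one dimension")


def placeholder_dimension(name: str) -> Dimension:
    """Build a dimension with one hypothetical indicator and method."""
    indicator = QualityIndicator(
        name=f"{name} (placeholder indicator)",
        measurement_methods=["checklist question (placeholder)"],
    )
    return Dimension(name=name, indicators=[indicator])


SOURCE = Hyperdimension(
    "Source",
    [placeholder_dimension(n) for n in (
        "Supplier", "Relevance", "Privacy and security", "Delivery", "Procedures",
    )],
)
METADATA = Hyperdimension(
    "Metadata",
    [placeholder_dimension(n) for n in (
        "Clarity", "Comparability", "Unique keys", "Data treatment",
    )],
)
```

The validation in `__post_init__` keeps the cardinalities from the diagram explicit, so a hyperdimension with a single dimension is rejected when it is constructed.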
Determine Source and Metadata quality

  With a checklist
   • Used for both Source and
     Metadata



  Extensively tested on registers



  What about other data sources?
Apply checklist to other sources

 (1) Offline route information
 • For Transport statistics
      – Check number of km driven
      – Border crossing(s)
 Price information on the internet (www)
 •   (2) Flight ticket prices (manual and automatic)
 •   (3) Supermarket product prices
 •   (4) House prices
 •   (5) Product prices of unmanned filling stations
Approach used for testing checklist

     Applied the checklist to 5 data sources
    1. Looked at the scores obtained
       •   Identify quality issues
    2. Ease of use of checklist
       •   Applicability of questions
    3. Missing quality aspects
       •   Are any indicators missing?
Checklist scores (1) - Source

Table 1 Evaluation results for the Source hyperdimension

                       Offline route   Internet prices
                       information     Supermarket   Houses   Filling    Flight
                                       prices                 stations   tickets

Supplier               +               ?             ?        ?          ?
Relevance              +               +             ?        ?          +
Privacy and security   +               +             +        +          +
Delivery               +               +             +        +          +
Procedures             +/o             o/+           o/+      o          o

+, good; o, reasonable; -, poor; ?, unclear
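
As a small illustration of the first evaluation step (looking at the scores to identify quality issues), the Source scores from Table 1 can be written down as data and queried for cells marked unclear. The variable and function names are hypothetical; only the scores themselves are taken from the table.

```python
# Column order follows Table 1.
SOURCES = [
    "Offline route information",
    "Supermarket prices",
    "Prices of houses",
    "Prices of filling stations",
    "Prices of flight tickets",
]

# Scores per dimension: "+" good, "o" reasonable, "-" poor, "?" unclear.
SOURCE_SCORES = {
    "Supplier":             ["+", "?", "?", "?", "?"],
    "Relevance":            ["+", "+", "?", "?", "+"],
    "Privacy and security": ["+", "+", "+", "+", "+"],
    "Delivery":             ["+", "+", "+", "+", "+"],
    "Procedures":           ["+/o", "o/+", "o/+", "o", "o"],
}


def unclear_cells(scores, sources):
    """Return (dimension, data source) pairs whose score is '?'."""
    return [
        (dimension, source)
        for dimension, row in scores.items()
        for source, score in zip(sources, row)
        if score == "?"
    ]


for dimension, source in unclear_cells(SOURCE_SCORES, SOURCES):
    print(f"{dimension}: unclear for {source}")
```

All unclear cells fall under the internet price sources, in the Supplier and Relevance dimensions.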
Source conclusions

 Route information closely resembles registers; no
  quality issues identified

 Internet data are more difficult
 • Who supplies the price information on a website?
 • Legal issues of collecting data via websites
 • Websites change, often unexpectedly
 • No real deliveries when collecting internet data
Checklist scores (2) - Metadata

Table 1 Evaluation results for the Metadata hyperdimension

                       Offline route   Internet prices
                       information     Supermarket   Houses   Filling    Flight
                                       prices                 stations   tickets

Clarity                +               +/o           +/o      +/o        +/o
Comparability          +               +             ?        ?          +
Unique keys            +               +             +        +          +
Data treatment         o               +             +        +          +

+, good; o, reasonable; -, poor; ?, unclear
Metadata conclusions

 No major issues for the Metadata part of checklist

 Routing information, no problems

 Internet data, somewhat more difficult
 • Clarity of internet population

 • Clarity of time periods to which prices refer
Checklist applicability
 Table 5 Applicability of the quality checklist for the Source hyperdimension
                                    Offline route information         Internet prices
             Supplier                           +                            -
            Relevance                           +                           +
       Privacy and security                     o                            o
             Delivery                           +                            -
           Procedures                           +                            o

 Table 6 Applicability of the quality checklist for the Metadata hyperdimension
                                    Offline route information         Internet prices
             Clarity                            +                           +
          Comparability                         +                           +
           Unique keys                          +                           +
          Data treatment                        +                            o


relevant (+), partly relevant (o), generally not directly applicable (-)
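
To see at a glance where the checklist would need adapting, Tables 5 and 6 can be encoded and filtered for dimensions that are not marked fully relevant. This is a minimal sketch with hypothetical names; the marks are copied from the two tables.

```python
COLUMNS = ("Offline route information", "Internet prices")

# Marks per dimension, in COLUMNS order:
# "+" relevant, "o" partly relevant, "-" generally not directly applicable.
APPLICABILITY = {
    "Source": {
        "Supplier":             ("+", "-"),
        "Relevance":            ("+", "+"),
        "Privacy and security": ("o", "o"),
        "Delivery":             ("+", "-"),
        "Procedures":           ("+", "o"),
    },
    "Metadata": {
        "Clarity":        ("+", "+"),
        "Comparability":  ("+", "+"),
        "Unique keys":    ("+", "+"),
        "Data treatment": ("+", "o"),
    },
}


def not_fully_relevant(column):
    """Map (hyperdimension, dimension) to its mark where the mark is not '+'."""
    idx = COLUMNS.index(column)
    return {
        (hyperdimension, dimension): marks[idx]
        for hyperdimension, dimensions in APPLICABILITY.items()
        for dimension, marks in dimensions.items()
        if marks[idx] != "+"
    }


for column in COLUMNS:
    print(column, not_fully_relevant(column))
```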
Missing quality aspects

 Only for internet data
 •   Availability of the website
 •   Burden on website
 •   Errors in data on website
 •   Representativity of website information
 •   Possibility for automatically collecting data
Overall conclusions
 Source hyperdimension
 • Directly applicable to route information
 • Inherent differences for internet prices
 Metadata hyperdimension
 • Generally applicable

 Future research will focus on:
 • Adapting checklist to internet data
 • Legal issues for internet data
 • Data quality
Thank you for your attention!

 Questions?
