White House Open Data Roundtables Case Studies: Demand-Driven Open Data
Open Data Case Studies
Demand-Driven Open Data Program [1]: A user-focused method for quality improvement
Background
The Demand-Driven Open Data (DDOD) program [2] of the U.S. Department of Health and Human Services (HHS)
provides a method for improving data quality and usability by focusing on the value delivered to the
ultimate users. DDOD aims to maximize the public good and economic value of an agency’s data assets by
taking a “Lean Startup approach to open data.” In Lean Startup terms, DDOD works by finding
“customers” for the data before doing any work, delivering the data as a minimum viable product (MVP) with
iterative improvements, and measuring the value created in a continuous-improvement feedback loop.
How It Works
In the context of open data, a customer is a data user from an external entity, such as industry,
research, media or another government organization. DDOD provides a channel and platform for these data
users to communicate their needs. It enables public engagement in a way that’s more
systematic, ongoing and transparent than prior approaches. Users specify their needs as a “use
case”: a well-defined application of a dataset to a specific purpose, together with technical
specifications and a stated value to the requestor and the public. At HHS, the vast majority of requests were to
improve the quality and usability of existing datasets rather than to release new ones. [3]
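
As a rough illustration of the use case definition above, the parts of a request could be modeled as a small record. This is only a sketch: the class name, field names and example values are assumptions made for illustration and are not taken from the DDOD platform.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class UseCase:
    """One DDOD-style data request (all field names are illustrative assumptions)."""
    title: str
    dataset: str                      # the dataset the request applies to
    purpose: str                      # the specific application of that dataset
    technical_specs: List[str] = field(default_factory=list)
    value_to_requestor: str = ""
    value_to_public: str = ""

    def is_well_defined(self) -> bool:
        """True only when every part of the definition has been filled in."""
        return all([
            self.title.strip(),
            self.dataset.strip(),
            self.purpose.strip(),
            self.technical_specs,
            self.value_to_requestor.strip(),
            self.value_to_public.strip(),
        ])


# Hypothetical request; the wording of each field is invented for this example.
request = UseCase(
    title="Consolidated reporting of medical device recalls",
    dataset="FDA medical device recall data",
    purpose="Reconcile recalls with marketed devices",
    technical_specs=["unique identifier on every device record"],
    value_to_requestor="Removes manual reconciliation work",
    value_to_public="More reliable recall information",
)
print(request.is_well_defined())
```

A request that lacks specifications or a stated value would fail the `is_well_defined` check, which mirrors the idea that a use case is more than a bare request for data.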
Public Engagement
Providing the DDOD platform is not sufficient on its own.
DDOD requires an agency to engage in proactive planning, marketing and outreach in order to connect with
end users and make them aware of this channel. At HHS, for example, the
program reached out to most of the major healthcare accelerators and incubators, participated in industry
conferences and meetups, and published articles on the topic. Additionally, for participation in DDOD to grow,
the program must build public trust over time by demonstrating that the agency will respond to and work on
users’ data requests.
Metrics
DDOD’s ability to measure the value delivered by datasets is crucial to driving data quality.
Previously, the only metric for progress was the number of datasets released, regardless of their quality or usefulness. For each
use case, DDOD categorizes the value delivered, ranging from simply documenting the data provenance and
examples of use, to several types of usability improvements, to new releases. [4]
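
The shift from counting datasets to categorizing value delivered can be sketched as a simple tally per category. The category names below paraphrase the paragraph above and the outcomes are invented; the labels used in the actual DDOD workflow may differ.

```python
from collections import Counter
from enum import Enum


class ValueDelivered(Enum):
    # Labels paraphrase the text; they are assumptions, not DDOD's official categories.
    DOCUMENTATION = "provenance and examples of use documented"
    USABILITY_IMPROVEMENT = "usability improved"
    NEW_RELEASE = "new data released"


def summarize(outcomes):
    """Count completed use cases per value category, rather than counting datasets."""
    counts = Counter(outcomes)
    return {category.name: counts.get(category, 0) for category in ValueDelivered}


# Hypothetical outcomes for four completed use cases.
completed = [
    ValueDelivered.USABILITY_IMPROVEMENT,
    ValueDelivered.DOCUMENTATION,
    ValueDelivered.USABILITY_IMPROVEMENT,
    ValueDelivered.NEW_RELEASE,
]
print(summarize(completed))
```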
[1] Slides from Roundtable talk: http://www.slideshare.net/DavidPortnoy/impact-of-ddod-on-data-quality-white-house-2016;
HHS’s DDOD program website: http://ddod.healthdata.gov/
[2] http://www.hhs.gov/idealab/projects-item/demand-driven-open-data/
[3] http://ddod.healthdata.gov/wiki/Main_Page#Progress_on_Use_Cases
[4] http://ddod.healthdata.gov/wiki/DDOD_Workflow
Examples
There are several illustrative examples of DDOD helping improve data quality.
● A clinical value analytics company was frustrated that it could not reconcile FDA’s
multiple data sources for marketed medical devices and device recalls. The problems were caused by
ambiguous product names, a lack of unique identifiers and unexplained changes to historical data.
With a justified use case and clear specifications, FDA was able to prioritize its work,
deploy a solution that addresses most of the issues, and make a release available via
OpenFDA, its main API portal. [5]
● ASPE (HHS’s Office of the Assistant Secretary for Planning and Evaluation) provides Federal Poverty Level
tables and eligibility calculation guidelines, updated twice a year.
Some nonprofits noted that it’s inefficient and error-prone for each organization to do its own eligibility
calculations and keep them up to date. This use case was solved by a developer outside of government, who
posted a solution and even hosted a prototype API server that eliminates the possibility of transcription
and calculation errors. [6]
● An organization serving the Medicaid population needed to analyze Medicaid enrollment reports across
multiple years in order to identify trends indicating where its services were needed most. Prior to
DDOD, it would have had to transcribe data from 50 PDF files (one for each state) per year for multiple
years. With DDOD, the data owners were able to follow the use case specifications to aggregate
everything into a single machine-readable file, thereby removing both the overhead and the transcription
errors. [7]
● It’s possible to make significant improvements in data quality by observing trends across multiple use
cases and finding common solutions. One such example involved healthcare provider network
directory standards. [8] A review of 7 use cases [9] made it clear that many datasets could benefit from
a standardized provider dimension. DDOD led to the formation of an industry workgroup that provided
input to CMS regulators on machine readability requirements for health insurance marketplaces.
Additionally, it drove the release of a compatible specification on Schema.org [10] that reinforces the
adoption of this new standard. With the Schema.org specification, medical groups can publish their lists of
providers directly on their websites, and the major search engines know exactly how to interpret this
data.
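
The Federal Poverty Level example in the list above turns on having one shared calculation instead of many hand-maintained copies. A minimal sketch of such a calculation follows, assuming the guideline can be expressed as a base amount plus a fixed increment per additional household member. The function names and the dollar figures are placeholders, not published values, and this is not the prototype API mentioned in the use case.

```python
def poverty_guideline(household_size, base, per_additional_person):
    """Annual guideline amount for a household, from caller-supplied figures."""
    if household_size < 1:
        raise ValueError("household_size must be at least 1")
    return base + per_additional_person * (household_size - 1)


def percent_of_guideline(annual_income, household_size, base, per_additional_person):
    """Household income expressed as a percentage of the guideline amount."""
    guideline = poverty_guideline(household_size, base, per_additional_person)
    return 100.0 * annual_income / guideline


# Placeholder figures only; real values must come from the published ASPE tables.
PLACEHOLDER_BASE = 10_000
PLACEHOLDER_INCREMENT = 5_000

pct = percent_of_guideline(30_000, 3, PLACEHOLDER_BASE, PLACEHOLDER_INCREMENT)
print(f"{pct:.0f}% of the guideline")
```

Keeping the figures as parameters means an update to the published tables changes one input rather than every organization's copy of the arithmetic.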
Besides these use case examples, DDOD often addresses problems that are not difficult to fix and thus
provide the greatest potential return on investment. Examples include adding missing data provenance
and data dictionaries, supplying missing fields needed to join to other datasets, and getting commitments from data
owners to continue regular refreshes after an initial data release.
Improved Infrastructure
Finally, as work on use cases progresses, an agency gains insight into missing
technical capabilities that affect quality across the board. For HHS, DDOD resulted in the creation of tools
that monitor the quality of the data catalog itself and analyze day-to-day changes in data availability. HHS is thus
able to quickly zero in on data quality problems and funnel the insights to data owners and system support.
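
One way to picture catalog monitoring of this kind is a comparison of two daily snapshots plus a check for incomplete metadata. The sketch below is an assumption about how such a tool might work, not a description of the HHS tools; the snapshot layout, the required field names and the sample records are all hypothetical.

```python
def catalog_changes(previous, current):
    """Compare two catalog snapshots, each a dict of dataset id -> metadata dict."""
    prev_ids, curr_ids = set(previous), set(current)
    return {
        "added": sorted(curr_ids - prev_ids),
        "removed": sorted(prev_ids - curr_ids),
        # Datasets present on both days whose metadata differs.
        "changed": sorted(
            ds for ds in prev_ids & curr_ids if previous[ds] != current[ds]
        ),
    }


def missing_fields(catalog, required=("title", "description", "url")):
    """List dataset ids whose metadata lacks (or leaves empty) a required field."""
    return sorted(
        ds for ds, meta in catalog.items()
        if any(not meta.get(name) for name in required)
    )


# Hypothetical snapshots from two consecutive days.
yesterday = {
    "ds-1": {"title": "A", "description": "first", "url": "http://example.org/a"},
    "ds-2": {"title": "B", "description": "second", "url": "http://example.org/b"},
}
today = {
    "ds-1": {"title": "A", "description": "first", "url": ""},
    "ds-3": {"title": "C", "description": "third", "url": "http://example.org/c"},
}
print(catalog_changes(yesterday, today))
print(missing_fields(today))
```

The output of both checks is a list of dataset identifiers, which is the kind of insight that could be routed to data owners and system support.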
[5] http://ddod.healthdata.gov/wiki/Use_Case_5:_Consolidated_registry_of_marketed_medical_devices and
http://ddod.healthdata.gov/wiki/Use_Case_6:_Consolidated_reporting_of_medical_device_recalls
[6] http://ddod.healthdata.gov/wiki/Use_Case_49:_API_for_Federal_Poverty_Guidelines and
https://github.com/demand-driven-open-data/ddod-intake/issues/49
[7] http://ddod.healthdata.gov/wiki/Use_Case_46:_Medicaid_MCO_Data
[8] http://ddod.healthdata.gov/wiki/Interoperability:_Provider_network_directories
[9] http://ddod.healthdata.gov/wiki/Concept:_Group_use_cases_for_provider_registry
[10] http://pending.webschemas.org/HealthInsurancePlan
