White House Open Data Roundtables Case Studies: Demand-Driven Open Data
Demand-Driven Open Data Program[1]: A user-focused method for quality improvement
Background
The U.S. Department of Health and Human Services' (HHS) Demand-Driven Open Data (DDOD) program[2] provides a method for improving data quality and usability by focusing on the value delivered to the ultimate users. DDOD aims to maximize the public good and economic value of an agency's data assets by taking a "Lean Startup approach to open data." In Lean Startup terms, it works by finding "customers" for the data before doing work, delivering the data as a minimum viable product (MVP) with iterative improvements, and measuring the value created in a continuous-improvement feedback loop.
How It Works
In the context of open data, a customer is a data user from an external entity, including industry, research, media, or other government organizations. DDOD provides a channel and platform for these data users to communicate their needs via use cases. It enables public engagement in a way that is more systematic, ongoing, and transparent than prior approaches. A user specifies their needs in terms of a "use case": a well-defined application of a dataset to a specific purpose, together with technical specifications and a stated value to the requestor and the public. At HHS, the vast majority of requests were to improve the quality and usability of existing datasets rather than to ask for new releases.[3]
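Conceptually, a use case is a structured record pairing a request with its specifications and stated value. A minimal sketch in Python (the field names here are illustrative, not DDOD's actual schema):

```python
from dataclasses import dataclass

@dataclass
class UseCase:
    """Illustrative sketch of a DDOD use case record (fields are hypothetical)."""
    title: str
    requestor: str        # external entity: industry, research, media, government
    datasets: list        # existing datasets the use case applies to
    specifications: str   # technical requirements: formats, fields, refresh cadence
    value_statement: str  # stated value to the requestor and to the public

uc = UseCase(
    title="Consolidated registry of marketed medical devices",
    requestor="clinical value analytics company",
    datasets=["FDA device listings", "FDA device recalls"],
    specifications="unique device identifiers; stable historical records",
    value_statement="reconcile device data across multiple FDA sources",
)
```

Making the specification and value statement explicit is what lets the data owner prioritize and scope the work.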
Public Engagement
It is not sufficient for an agency simply to provide the DDOD platform. DDOD requires the agency to engage in proactive planning, marketing, and outreach in order to connect with end users and educate them about the availability of this channel. At HHS, for example, the program reached out to most of the major healthcare accelerators and incubators, participated in industry conferences and meetups, and published articles on the topic. Additionally, for participation in DDOD to grow, the program must build public trust over time by demonstrating that the agency will respond to and work on users' data requests.
Metrics
DDOD's ability to measure the value delivered by datasets is crucial to driving data quality. Previously, the only metric for progress was the count of datasets, regardless of their quality or usefulness. For each use case, DDOD categorizes the value delivered, ranging from simply documenting data provenance and examples of use, to several types of usability improvements, to new releases.[4]
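The categorization described above can be reduced to a simple tally across closed use cases; the category names below are hypothetical stand-ins for DDOD's actual labels:

```python
from collections import Counter

# Hypothetical value categories, mirroring the range described in the text:
# documentation fixes at the low end, new releases at the high end.
CATEGORIES = ["documented_provenance", "usability_improvement", "new_release"]

def tally(use_cases):
    """Count closed use cases by the category of value delivered."""
    counts = Counter(uc["value_category"] for uc in use_cases)
    return {c: counts.get(c, 0) for c in CATEGORIES}

report = tally([
    {"id": 5, "value_category": "usability_improvement"},
    {"id": 46, "value_category": "usability_improvement"},
    {"id": 49, "value_category": "new_release"},
])
# report == {"documented_provenance": 0, "usability_improvement": 2, "new_release": 1}
```

Even a tally this simple is a richer progress metric than a raw dataset count, because it records what kind of value each use case produced.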
[1] Slides from the Roundtable talk: http://www.slideshare.net/DavidPortnoy/impact-of-ddod-on-data-quality-white-house-2016; HHS's DDOD program website: http://ddod.healthdata.gov/
[2] http://www.hhs.gov/idealab/projects-item/demand-driven-open-data/
[3] http://ddod.healthdata.gov/wiki/Main_Page#Progress_on_Use_Cases
[4] http://ddod.healthdata.gov/wiki/DDOD_Workflow

Examples
There are several illustrative examples of DDOD helping to improve data quality.
● A clinical value analytics company was frustrated that it could not reconcile the FDA's multiple data sources for marketed medical devices and device recalls. The problems were caused by ambiguous product names, a lack of unique identifiers, and unexplained changes to historical data. With this use case's justification and clear specifications, the FDA was able to prioritize its work, deploy a solution that addresses most of the issues, and make a release available via openFDA, its main API portal.[5]
● ASPE provides Federal Poverty Level tables and eligibility-calculation guidelines, updated twice a year. Some nonprofits noted that it is inefficient and error-prone for each organization to do its own eligibility calculations and keep them updated. This use case was solved by a developer outside of government, who posted a solution and even hosted a prototype API server that eliminates the possibility of transcription and calculation errors.[6]
● An organization serving the Medicaid population needed to analyze Medicaid enrollment reports across multiple years in order to identify trends indicating where its services were needed most. Prior to DDOD, it would have had to transcribe data from 50 PDF files (one for each state) per year, for multiple years. With DDOD, the data owners were able to follow the use case specifications and aggregate everything into a single machine-readable file, thereby removing both the overhead and the transcription errors.[7]
● It is possible to make significant improvements in data quality by observing trends across multiple use cases and finding common solutions. One such example involved healthcare provider network directory standards.[8] After observing seven related use cases,[9] it became obvious that many datasets could benefit from a standardized provider dimension. DDOD led to the formation of an industry workgroup that provided input to CMS regulators on machine-readability requirements for health insurance marketplaces. Additionally, it drove the release of a compatible specification on Schema.org,[10] which reinforces the adoption of this new standard. With the Schema.org adoption, medical groups can publish their lists of providers directly on their websites, and the major search engines know exactly how to interpret this data.
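The ASPE example above hinges on a simple calculation that many organizations were re-implementing independently. A minimal sketch of what such an API's core logic might look like, using the published 2016 poverty guidelines for the 48 contiguous states and D.C. (function names are illustrative, not the actual prototype's interface):

```python
# 2016 HHS poverty guidelines, 48 contiguous states and D.C.
FPL_2016_BASE = 11880        # household of one
FPL_2016_PER_PERSON = 4140   # each additional household member

def poverty_guideline(household_size: int) -> int:
    """Annual poverty guideline in dollars for a given household size."""
    if household_size < 1:
        raise ValueError("household size must be at least 1")
    return FPL_2016_BASE + FPL_2016_PER_PERSON * (household_size - 1)

def percent_of_fpl(income: float, household_size: int) -> float:
    """Income as a percentage of the poverty guideline, as used in eligibility rules."""
    return 100.0 * income / poverty_guideline(household_size)
```

Centralizing this behind one service, updated whenever ASPE publishes new tables, removes the transcription and calculation errors the nonprofits described.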
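The provider-directory example above relies on structured markup that search engines can parse. A hypothetical sketch of the kind of JSON-LD a medical group might embed on its website, built here as a Python dict (the types and fields are simplified for illustration, not the exact specification cited above):

```python
import json

# Illustrative JSON-LD for a medical group's provider list; names are fictional
# and the property set is deliberately minimal.
provider_directory = {
    "@context": "http://schema.org",
    "@type": "MedicalOrganization",
    "name": "Example Medical Group",
    "member": [
        {"@type": "Physician", "name": "Dr. Jane Doe",
         "medicalSpecialty": "Cardiovascular"},
        {"@type": "Physician", "name": "Dr. John Roe",
         "medicalSpecialty": "Pediatric"},
    ],
}

markup = json.dumps(provider_directory, indent=2)
```

Because the vocabulary is shared, any crawler that understands Schema.org can interpret the provider list without bespoke parsing per website.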
Beyond these examples, DDOD often addresses problems that are not difficult to fix and thus offer the greatest potential return on investment. Examples include adding missing data provenance and data dictionaries, supplying missing fields needed to join to other datasets, and getting commitments from data owners to continue regular refreshes after an initial data release.
Improved Infrastructure
Finally, as work on use cases progresses, an agency gains insight into missing technical capabilities that impact quality across the board. For HHS, DDOD resulted in the creation of tools that monitor the quality of the data catalog itself and analyze day-to-day changes in data availability. The agency is thus able to quickly zero in on data quality problems, funneling the insights to data owners and system support.
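Such catalog monitoring boils down to diffing day-to-day snapshots. A minimal sketch, assuming each snapshot has been reduced to a map from dataset identifier to last-modified date (a simplification of a real catalog entry):

```python
def catalog_diff(yesterday, today):
    """Compare two catalog snapshots keyed by dataset identifier.

    Returns datasets that disappeared, newly appeared, or changed between
    the two snapshots, so data owners can be alerted.
    """
    removed = sorted(set(yesterday) - set(today))
    added = sorted(set(today) - set(yesterday))
    changed = sorted(k for k in set(yesterday) & set(today)
                     if yesterday[k] != today[k])
    return {"removed": removed, "added": added, "changed": changed}

diff = catalog_diff(
    {"medicaid-enrollment": "2016-01-02", "device-recalls": "2016-01-01"},
    {"medicaid-enrollment": "2016-01-05", "fpl-guidelines": "2016-01-04"},
)
# diff == {"removed": ["device-recalls"], "added": ["fpl-guidelines"],
#          "changed": ["medicaid-enrollment"]}
```

Running a comparison like this daily turns silent availability regressions into actionable alerts for data owners and system support.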
[5] http://ddod.healthdata.gov/wiki/Use_Case_5:_Consolidated_registry_of_marketed_medical_devices and http://ddod.healthdata.gov/wiki/Use_Case_6:_Consolidated_reporting_of_medical_device_recalls
[6] http://ddod.healthdata.gov/wiki/Use_Case_49:_API_for_Federal_Poverty_Guidelines and https://github.com/demand-driven-open-data/ddod-intake/issues/49
[7] http://ddod.healthdata.gov/wiki/Use_Case_46:_Medicaid_MCO_Data
[8] http://ddod.healthdata.gov/wiki/Interoperability:_Provider_network_directories
[9] http://ddod.healthdata.gov/wiki/Concept:_Group_use_cases_for_provider_registry
[10] http://pending.webschemas.org/HealthInsurancePlan