10. WHAT IS A ZETTABYTE?
1,000,000,000,000 Gigabytes
1,000,000,000 Terabytes
1,000,000 Petabytes
1,000 Exabytes
1 Zettabyte
1 Terabyte holds about as much as 210 DVDs
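The conversions on this slide can be checked with a few lines of arithmetic. This is a minimal sketch using decimal (SI) units; the 4.7 GB capacity of a single-layer DVD is an assumption that is not stated on the slide, and the names `UNITS`, `zettabyte_in`, and `dvds_per_terabyte` are purely illustrative.

```python
# Decimal (SI) storage units, expressed in bytes.
UNITS = {
    "gigabyte": 10**9,
    "terabyte": 10**12,
    "petabyte": 10**15,
    "exabyte": 10**18,
    "zettabyte": 10**21,
}

# Assumption: a single-layer DVD holds about 4.7 GB (not stated on the slide).
DVD_BYTES = 4.7e9


def zettabyte_in(unit: str) -> int:
    """Return how many of the given unit fit in one zettabyte."""
    return UNITS["zettabyte"] // UNITS[unit]


def dvds_per_terabyte() -> float:
    """Return the approximate number of DVDs needed to hold one terabyte."""
    return UNITS["terabyte"] / DVD_BYTES


for unit in ("gigabyte", "terabyte", "petabyte", "exabyte"):
    print(f"1 zettabyte = {zettabyte_in(unit):,} {unit}s")
print(f"1 terabyte is roughly {dvds_per_terabyte():.0f} DVDs")
```

Under that DVD-size assumption the result lands close to the slide's "about 210 DVDs".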
11. 10%
Structured
These are the data that
exist in databases
90%
Unstructured
Sensors, pictures, video, audio. These are the elements people and machines generate
regularly, and are most of the story to be told.
12. [Diagram: big data analytics stack in health care]
Data sources (internal and external data): EMR, Research Info, Org. Info, Quality of Care, Treatment Decisions, Demogr., Health Insurance
Layers, as listed from top to bottom: Knowledge generation; Decisions; Prediction, Visualization, Reporting; ETL, Data Mining, Data Integration; Data Collection and Storage
Source: Sander Klous, KPMG http://www.slideshare.net/sanderklous/big-data-in-healthcare
14. • In 2012, OECD countries spent $59 billion on biomedical research
• Bayer could replicate only 25% of 67 studies
• Amgen could replicate only 11% of 53 studies
15. • Two studies: 1500 BMJ reviewers missed 92% of errors
• Even in RCT studies, reviewers failed to detect important deficiencies in 93 control studies
• Bohannon paper accepted at 157 journals (of 304).
16. • John Ioannidis at Stanford University argues that most published findings are false.
• In February 2014, Regina Nuzzo argued in Nature that P values are highly skewed.
• The more implausible the hypothesis (telepathy, aliens, homeopathy), the greater the chance that an exciting finding is a false alarm, no matter what the P value is.
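The notes for this deck quote Nature as saying that a P value of 0.01 corresponds to a false-alarm probability of at least 11%, and a P value of 0.05 to at least 29%. One way to arrive at figures of that size is sketched below. It assumes the Sellke-Bayarri-Berger lower bound on the Bayes factor and, by default, 50:50 prior odds that the effect is real; both are assumptions made for illustration, not necessarily the method behind the quoted numbers. The function names are hypothetical.

```python
import math


def min_bayes_factor(p: float) -> float:
    """Lower bound on the Bayes factor in favor of the null hypothesis.

    Uses the Sellke-Bayarri-Berger calibration, -e * p * ln(p),
    which is only valid for p < 1/e.
    """
    if not 0 < p < 1 / math.e:
        raise ValueError("the bound is only defined for 0 < p < 1/e")
    return -math.e * p * math.log(p)


def min_false_alarm_probability(p: float, prior_true: float = 0.5) -> float:
    """Smallest posterior probability that a 'significant' finding is a false alarm.

    prior_true is the assumed prior probability that the effect is real.
    """
    prior_odds_null = (1 - prior_true) / prior_true
    posterior_odds_null = min_bayes_factor(p) * prior_odds_null
    return posterior_odds_null / (1 + posterior_odds_null)


for p in (0.05, 0.01):
    print(f"P = {p}: false-alarm probability of at least "
          f"{min_false_alarm_probability(p):.0%}")

# A long-shot hypothesis (assumed 5% prior chance of being true) fares far worse.
print(f"{min_false_alarm_probability(0.05, prior_true=0.05):.0%}")
```

Lowering `prior_true` shows the point of the last bullet: the less plausible the hypothesis, the higher the false-alarm probability for the same P value.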
17. Question: If our “n’s” continue to get larger and larger, and the volume of data
is ever increasing, what happens to the role of probability and the nature of
modeling?
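One facet of this question can be made concrete: with a large enough n, even a negligible effect produces a tiny P value. The sketch below assumes a one-sample, two-sided z-test on a standardized effect size; the function name and the 0.01 effect size are illustrative choices, not figures from the talk.

```python
import math


def two_sided_p_value(effect_size: float, n: int) -> float:
    """Two-sided P value of a one-sample z-test.

    effect_size is the mean difference in standard-deviation units,
    so the test statistic is z = effect_size * sqrt(n).
    """
    z = abs(effect_size) * math.sqrt(n)
    # For a standard normal Z, P(|Z| > z) = erfc(z / sqrt(2)).
    return math.erfc(z / math.sqrt(2))


# The same negligible effect (1% of a standard deviation) at growing sample sizes.
for n in (100, 10_000, 1_000_000):
    print(f"n = {n:>9,}: P = {two_sided_p_value(0.01, n):.3g}")
```

The effect never changes, only the sample size does, which is why statistical significance alone says less and less as data volumes grow.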
19. CONFRONTING A CHANGING PARADIGM
The Evolution of Incentives for Providers
Fee for Service
DRG / Quality Cost
Incentives
Accountable Care
Patient Volume
Length of Stay
Ancillary Testing
Health Care Environmental
Paradigm
• Volume driven primary & specialty
care
• Emergence of quality & safety
processes & metrics
• Increased transparency on pricing
& outcomes
The “Triple Aim” (Value)
• Improve the experience of care
• Improve the health of populations
• Reduce the per capita costs of
health care
• Two-way risk sharing
• Appropriate utilization
20. GLOBAL PAYMENT IMPLICATIONS
EXAMPLES of “Re-Thinking” Care Delivery Systems Under Global Payment
Models
“Rapid Access Care”
ER use change
Diagnostic Testing
In system vs. out of system perspectives
Chronic Illness – Behavioral Health Impact
Implications for primary care delivery systems
21. Improving Population Health is Challenging
Better the Experience of Care
Lower Per Capita Health Costs
Improve Population Health
Better Value
Transforming Health Care Delivery System
Improving Community Conditions for Health
25. HIPAA
• Covered: Data for clinical care is covered
• Not Covered:
• Data collected by a pharmaceutical manufacturer in a clinical trial
• Searches that people do online for health information
• Social media or mobile health apps that collect, store, and use data.
26. MEANINGFUL USE
• Stage 1
• 91% attested
• Stage 2 progress is beginning
• 8 hospitals and 252 providers have attested as of May 2014
• Stage 3 measures are being finalized
ONC Presentation: http://www.nursing.umn.edu/prod/groups/nurs/@pub/@nurs/documents/content/nurs_content_482406.pdf
28. READINESS: 2013 CIO SURVEY
• 82% of CIOs indicated that bi-directional sharing of clinical
and/or patient data with local healthcare organizations is
important to their organization.
BUT…..
• Only 18% indicated having sufficient trained staff
• And 34% noted that senior leadership had not prioritized
analytics as a key area for staffing needs.
eHealth Initiative, Key Findings from eHealth Initiative Survey on Data and Analytics. August, 2013. http://www.ehidc.org/resource-center/publications/view_document/26
31. NH AND APCD DATA: A CASE OF VISUALIZATION
UNH Institute for Health Policy helped the Accountable Care Project across the state with the goal of:
Creating and sustaining a payment reform/clinical/quality improvement learning network.
- APCD (all payer claims data)
32. METHODS – PROVIDER IDENTIFICATION, NPI REVIEW
COPYRIGHT, 2014. UNIVERSITY OF NEW HAMPSHIRE. ALL RIGHTS RESERVED.
Category | % of claims in category | “Fix”
1. Consistent NPI (no concerns) | 46.09% | None needed
2. Consistent NPI when populated, but sometimes missing | 9.62% | Most prevalent NPI was assigned to the Service Provider ID
3. Always missing NPI | 3.90% | No fix attempted
4. Multiple, inconsistent NPI (could include some missing) | 40.39%; 5.15% changed NPI | Most prevalent NPI was assigned to the Service Provider ID
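The review categories and the "most prevalent NPI" fix in the table can be sketched as follows. This is an illustration under stated assumptions, not the project's actual code: the function name `review_npi`, the input shape (one list of NPI values per Service Provider ID, with `None` or an empty string meaning missing), and the tie-breaking behavior are all hypothetical.

```python
from collections import Counter
from typing import Optional, Sequence, Tuple


def review_npi(npis: Sequence[Optional[str]]) -> Tuple[int, Optional[str]]:
    """Classify one Service Provider ID and pick the NPI to assign to it.

    npis holds the NPI value found on each of the provider's claims;
    None or "" marks a missing value. Returns (category, assigned_npi),
    where category follows the numbering in the table above.
    """
    populated = [npi for npi in npis if npi]
    has_missing = len(populated) < len(npis)

    if not populated:
        return 3, None  # always missing: no fix attempted

    if len(set(populated)) == 1:
        # Category 1 needs no fix; category 2 fills gaps with the one NPI seen.
        return (2 if has_missing else 1), populated[0]

    # Category 4: assign the most prevalent NPI. The source does not say how
    # ties are broken; Counter keeps the value encountered first.
    most_prevalent, _count = Counter(populated).most_common(1)[0]
    return 4, most_prevalent


print(review_npi(["1234567890", None, "1234567890"]))
print(review_npi(["1234567890", "1111111111", "1111111111", None]))
```

The NPI strings in the usage lines are made-up placeholders, not real provider identifiers.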
33. NH ACO / APCD
• Other Issues:
• Defining Primary Care?
• Patient attribution
• Geographic representation (so as not to attribute costs by MSA)
• Report design (big challenge…)
• Site- and region-level reporting (2 sets)
• 11 clinical measures
• Reporting by 19 geographic regions
• Reporting by 21 site designations
• Reporting for 3 types of data (commercial, Medicaid, Medicare)
• 2 years of data
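The reporting dimensions above multiply quickly. The sketch below counts the measure-level cells, assuming every measure is reported for every geography, data type, and year; that full cross-product is an assumption, since the slide lists the dimensions but not which combinations were actually produced.

```python
# Dimensions as listed on the slide.
MEASURES = 11
GEOGRAPHIES = {"region": 19, "site": 21}
DATA_TYPES = ("commercial", "Medicaid", "Medicare")
YEARS = 2


def report_cells() -> dict:
    """Count measure results per reporting level, assuming a full cross-product."""
    return {
        level: MEASURES * count * len(DATA_TYPES) * YEARS
        for level, count in GEOGRAPHIES.items()
    }


cells = report_cells()
print(cells)
print("total:", sum(cells.values()))
```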
34. REPORT DESIGN: TOO MUCH INFO!
• All in PDF output
• More than 2,000 pages of reports across full report suite
35. SOLUTION
• SAS Visual Analytics
• Provides online secure portal
• Ability to drag and drop variables for a variety of cuts
• On the fly graphics and visualizations
36. PREDICTIVE ANALYTICS
UC Irvine Health:
• Had millions of data points across 1.2 million patients over 22 years in Excel files and on paper
• Needed to migrate to a single data warehouse on a single platform (Hortonworks) that fed both the medical center and the research center.
• The Key…HADOOP.
• Allowed for semi-structured data migration in real time
37. PREDICTIVE ANALYTICS
UC Irvine Health:
• One outcome is clinical nursing. Patients wearing vital sign sensors transmit every minute
• 4,320 per patient per day
• Using predictive algorithms, nurses get signals for near-term health risk outcomes.
• Those vitals can then be combined with other data on that patient or on historical patients with similar risk factors, etc.
• Let the data uncover what was once hidden.
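The sensor volumes above are easy to sanity-check. A day has 1,440 minutes, so 4,320 readings per patient per day works out to three per minute. That is consistent with, for example, three values per one-minute transmission or one reading every 20 seconds; the slide does not say which, so treat both as assumptions. The function names and the 100-patient example are hypothetical.

```python
MINUTES_PER_DAY = 24 * 60
READINGS_PER_PATIENT_PER_DAY = 4_320  # figure quoted on the slide


def readings_per_minute() -> float:
    """Readings per minute implied by the slide's daily figure."""
    return READINGS_PER_PATIENT_PER_DAY / MINUTES_PER_DAY


def daily_readings(patients: int) -> int:
    """Total readings per day for a given number of monitored patients."""
    return patients * READINGS_PER_PATIENT_PER_DAY


print(readings_per_minute())
print(f"{daily_readings(100):,} readings per day for 100 monitored patients")
```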
39. Start-Up Funding by digital health companies in 2014 http://www.washingtonpost.com/blogs/wonkblog/wp/2014/10/02/digital-health-firms-are-making-a-ton-of-money-in-the-obamacare-era/
40. OPEN STANDARDS
• Strategies that advance the adoption of standardized terminologies for clinical
documentation in electronic health records.
• Standards for genomic data
• Federal input on big data
41. THESE ANALYTICS ARE GREAT….NOW LET’S….
• Analytics success often leads to the desire to do more
• Manage your expectations and capabilities
• What’s your enterprise / cloud strategy?
• How flexible is your warehouse?
• What can you buy vs. make?
• How diverse is your analytic talent?
• Do you have / need a big data strategy?
42. LOOK TOWARD THE FUTURE
Not just a health care data analyst….
……think about a health care data scientist
43. Business and clinical acumen
Health conditions, delivery, and the business of
healthcare
47. ANALYTICS PROS WORK IN SMALL GROUPS
• Companies in the study had between 3 and 300,000 employees
• 80% work in groups of < 10 colleagues
• 25% work alone or with 1 other analytics colleague
NIH ¾ of studies likely replicatable: http://www.economist.com/news/briefing/21588057-scientists-think-science-self-correcting-alarming-degree-it-not-trouble
BMJ 2014; 349 doi: http://dx.doi.org/10.1136/bmj.g4145 (Published 01 July 2014)
Bohannon: http://www.sciencemag.org/content/342/6154/60.full
Ioannidis, J. Why Most Published Research Findings Are False. PLOS Medicine: August 30, 2005
DOI: 10.1371/journal.pmed.0020124
http://www.nature.com/news/scientific-method-statistical-errors-1.14700#/b2: a P value of 0.01 corresponds to a false-alarm probability of at least 11%, depending on the underlying probability that there is a true effect; a P value of 0.05 raises that chance to at least 29%
And trying to make this
In total, 102 organizations responded to the survey, representing an array of stakeholders including hospitals (37%), integrated delivery networks (33%), academic medical centers (13%), multi-provider practices (3%), health information exchange organizations (2%), community health centers or clinics (1%), and others.
I wanted to begin by showing what all 4 clusters have in common. This slide shows a graph type called a Density Plot. Along the bottom (or X axis) we are measuring CURIOSITY. As a point of reference, a BELL CURVE is a DENSITY plot as well. What you can see is that all 4 clusters are extremely curious. Every single position in our study showed people working in the role who were deeply curious, eager to learn, research oriented: people who are motivated by solving very sophisticated problems. NOTE: WHAT WE ARE MEASURING HERE IS CALLED RAW TALENT. THIS IS NOT SOMETHING YOU CAN TRAIN.
Skills can be learned by good people
Tools constantly changing
Languages not a barrier
Ability to learn more important
Mindset can't be learned
Attitude, "Raw Talent", "It Factor”
It is measurable
Bad Strategy:
Hire 30 top Analytics Professionals
Task: go "do something interesting”
Give them carte blanche
Year of undirected activity
Lay off whole team because they “didn't do anything”