The Role of Data Lakes in Healthcare

About Perficient
Perficient is the leading digital transformation
consulting firm serving Global 2000 and enterprise
customers throughout North America.
With unparalleled information technology, management consulting,
and creative capabilities, Perficient and its Perficient Digital agency
deliver vision, execution, and value with outstanding digital
experience, business optimization, and industry solutions.

Perficient Profile
Founded in 1997
Public, NASDAQ: PRFT
2016 revenue $487 million
Major market locations:
Allentown, Atlanta, Ann Arbor, Boston, Charlotte,
Chattanooga, Chicago, Cincinnati, Columbus, Dallas,
Denver, Detroit, Fairfax, Houston, Indianapolis, Lafayette,
Milwaukee, Minneapolis, New York City, Northern California,
Oxford (UK), Southern California, St. Louis, Toronto
Global delivery centers in China and India
3,000+ colleagues
Dedicated solution practices
~95% repeat business rate
Alliance partnerships with major technology vendors
Multiple vendor/industry technology and growth awards

Speaker Introductions
Juliet Silver, Director, Enterprise Strategy, Healthcare
Juliet provides strategic thought leadership and leverages her more than 20 years
of healthcare industry, management consulting and technology experience to
support healthcare clients in the realization of their strategic vision.
Jill Corcoran, Senior Technical Architect, Healthcare
Jill has more than 20 years of consulting experience focused on helping clients
solve complex business challenges by providing enterprise, data and business
intelligence architectural solutions that transform the way they think about,
organize, and leverage their data.

Healthcare Data Lake Concepts

Data Lakes in Healthcare
What
A Data Lake, as the term was originally coined, is designed to
hold raw data assets of varied types as they are
received from their sources. Typically the lake is
stored in a Hadoop ecosystem with minimal (if any)
change to the original format and no content
integration or enhancement of the source data.
Why
Healthcare organizations are attracted to the
concept of a data lake as it allows for in-depth
analysis of patient outcomes, fraud, waste and
abuse, R&D for drugs and DME, and clinical trials.
How
A Data Lake offers schema-on-read access to large
amounts of widely varied information that can be
loaded and accessed rapidly. This allows skilled
data scientists to uncover hidden correlations,
obscure patterns, disease trends, and more.
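
As a rough illustration of schema-on-read, the sketch below keeps records exactly as received and applies a schema only when an analysis reads them. The record contents, field names, and the `read_with_schema` helper are hypothetical and chosen for illustration; a real lake would typically do this through a query engine on top of Hadoop rather than hand-written Python.

```python
import json

# Hypothetical raw records, stored exactly as received from two sources.
RAW_LINES = [
    '{"patient_id": "p1", "lvef": "55", "note": "former smoker"}',
    '{"patient_id": "p2", "bp": "120/80"}',
]

def read_with_schema(lines, schema):
    """Apply a schema at read time: select fields and cast types.

    Fields missing from a raw record come back as None.
    """
    for line in lines:
        record = json.loads(line)
        yield {
            field: (cast(record[field]) if field in record else None)
            for field, cast in schema.items()
        }

# Each analysis supplies its own schema; the stored data is never rewritten.
cardiology_schema = {"patient_id": str, "lvef": float}
rows = list(read_with_schema(RAW_LINES, cardiology_schema))
```

A different analysis could read the same raw lines with a different schema, which is the flexibility the slide describes.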

The Need for a Data Lake in Healthcare
“Do we need an enterprise data warehouse, a data lake,
or both as part of our overall data architecture?”
• A Data Lake provides the ability to manage the fluid data requirements of
contemporary healthcare organizations as they attempt to rapidly analyze large
volumes of data in batch or real-time from an extensive range of sources in a
variety of formats.
• An enterprise data warehouse provides the strategy-driven, non-volatile
transformed data used to run day-to-day operations and make informed business
decisions based on known processes and thoroughly vetted data leveraging more
traditional reporting, visualization, and analytics.

Data Lake Traits
• Time to value in data delivery is accelerated
• Uses various tools that apply “schema-on-read”
• Introduces and reuses tools and processes that
improve search and general knowledge of the data
content
• Designed for low-cost storage for large data
volumes
• Is highly agile and reconfigurable

Healthcare Data Lake
Use Cases
• Genomic analytics used by health plans
• Improved clinical trials
• Predictive healthcare costs
• Member/Patient 360° view
• Billing opportunities in unstructured text
• Psychographic prescriptive modeling

Use Case: Genomic Analytics Used by Insurers
The Genetic Information Nondiscrimination Act of 2008 (GINA) protects Americans from
discrimination based on their genetic information in both health insurance and employment.
But we can, and do, have access to the largest-ever collection of human protein-coding genetic
variants (over 10 million variants), from the Exome Aggregation Consortium (ExAC). The challenge
for healthcare is not how to use genomics data but how to deal with the massive amount of data.

Use Case: Improved Clinical Trials
The analysis and design of clinical trials can discover drug combinations with significant
improvements for overall survival and toxicity. Using these statistical models we can develop
optimization models that select treatment regimens that can be tested in clinical trials, based
on the totality of data available on existing.
Existing models can be expanded upon by using published research as an external source
of data during clinical trials.

Use Case: Predictive Healthcare Costs
The data you thought would be useful … was not
• 113 candidate predictors from structured and
unstructured data sources
• Structured data was less reliable than unstructured data
– increased the reliance on unstructured data
Unexpected indicators emerged from unstructured content
• Increased the value of the Predictive Model
• 18 accurate indicators or predictors
Predictor Analysis | % Encounters Structured Data | % Encounters Unstructured Data
Ejection Fraction (LVEF) | 2% | 74%
Smoking Indicator | 35% (65% Accurate) | 81% (95% Accurate)
Living Arrangements | <1% | 73% (100% Accurate)
Drug and Alcohol Abuse | 16% | 81%
Assisted Living | 0% | 13%
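
A minimal sketch of how a "% Encounters" figure like those above could be computed, assuming it means the share of encounters in which a predictor was found in a given source type. That reading, the `coverage` helper, and the toy encounters are all assumptions for illustration and are not the data behind the table.

```python
def coverage(encounters, source, predictor):
    """Percentage of encounters in which `predictor` is present in `source`."""
    if not encounters:
        return 0.0
    hits = sum(1 for e in encounters if e[source].get(predictor) is not None)
    return 100.0 * hits / len(encounters)

# Toy encounters (hypothetical): each has values pulled from structured fields
# and values extracted from unstructured text such as physician notes.
encounters = [
    {"structured": {"smoking": None}, "unstructured": {"smoking": "current"}},
    {"structured": {"smoking": "never"}, "unstructured": {"smoking": "never"}},
    {"structured": {}, "unstructured": {"smoking": "former"}},
    {"structured": {}, "unstructured": {}},
]

structured_pct = coverage(encounters, "structured", "smoking")
unstructured_pct = coverage(encounters, "unstructured", "smoking")
```

With these toy records the predictor is found in one of four encounters from structured data and three of four from unstructured data.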

Use Case: Member/Patient 360° View
Member/Patient 360°
• Improve decision making
• Enhance patient experience
• Provide a greater opportunity
for improved outcomes
• Improve profitability for both the
provider and the health plan
• Reduce unnecessary and
inefficient processes and procedures
When applied across a large
population of patients you can:
• Predict disease outbreaks
• Identify preventative care
• Develop cures for diseases that touch
specific demographics or patient population segments

Use Case: Billing Opportunities in Unstructured Text
• The analysis of unstructured data can provide
significant opportunities for more complete and
fairer billing practices. This information is held
by providers and payers but rarely reviewed as
the amount of detail is overwhelming. Using
keyword searches across vast amounts of data
quickly produces meaningful insight.
• Transcripts of physicians’ notes show pre- and
post-procedure exam tests, labs, and related
minor procedures that were performed but not billed
• A large U.S. health plan compensated on a
per-patient basis discovered co-morbidities,
allowing it to apply risk adjustments to
segments of its patient population
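
The keyword-search idea above can be sketched as follows, assuming notes are available as plain text keyed by encounter and billed items are known per encounter. The keyword list, the sample notes, and the function names are hypothetical; real clinical text would need far more careful matching than simple keywords.

```python
import re

# Hypothetical keyword list, for illustration only.
KEYWORDS = ["ekg", "x-ray", "blood panel"]

def find_mentions(note_text, keywords=KEYWORDS):
    """Return the keywords mentioned in a note (case-insensitive phrase match)."""
    found = []
    for kw in keywords:
        pattern = r"\b" + re.escape(kw) + r"\b"
        if re.search(pattern, note_text, flags=re.IGNORECASE):
            found.append(kw)
    return found

def unbilled_candidates(notes, billed):
    """Map encounter id -> keywords found in the note but not in billed items."""
    result = {}
    for encounter_id, text in notes.items():
        already_billed = billed.get(encounter_id, set())
        missing = [kw for kw in find_mentions(text) if kw not in already_billed]
        if missing:
            result[encounter_id] = missing
    return result

# Toy data (hypothetical).
notes = {
    "e1": "Pre-procedure EKG performed; blood panel ordered.",
    "e2": "Routine follow-up, no tests.",
}
billed = {"e1": {"ekg"}}
candidates = unbilled_candidates(notes, billed)
```

The output is only a list of candidates for human review, since a mention in a note does not by itself establish that a billable service took place.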

Use Case: Psychographic Prescriptive Modeling
Adding psychographic data from patient healthcare records (PHR) can provide considerable insight into
additional disease risk factors. One example is the Framingham Heart Study. With more than 1,000
published medical papers related to it, it is one of the most widely known evidence-based studies. One of
its key discoveries was that heart disease is affected not only by measurable factors (such as blood pressure
and cholesterol) but also by demographic factors (age, gender, and race) and psychographic factors (values,
attitudes, and lifestyles).
[Figure: Basic Framingham Analysis Predictor Importance]

Designing and Developing
the Data Lake

Provider Data Lake Healthcare Sources
[Diagram: sources feeding the Provider Data Lake]
• Provider Sources: Patient Records, Physician Notes, Digital Images, Medical Device, Financials, Health Info Sys
• External Sources: Health Plan, Gov’t Agencies, Accountable Care Orgs, Geo-Political, Wearables, Research
• System Sources: Security, Log Data, Metadata
• Web Sources: Social Media, Email & Chat, Web Content

Payer Data Lake Healthcare Sources
[Diagram: sources feeding the Payer Data Lake]
• Provider Sources: Provider Network, Financials
• Health Plan: Claim, Encounter, Member, Marketing, Rx Claim
• System Sources: Security, Log Data, Metadata
• Web Sources: Social Media, Email & Chat, Web Content
• External Sources: Gov’t Agencies, Accountable Care Orgs, Geo-Political, Wearables, Genomic, PHR / PGHD, Research, Survey, Standard Codes

Big Data Landscape Components

The Enterprise Data Landscape

Introducing Hadoop to the Enterprise Data Landscape

Best Practices
Assessment
• Genuine need based on the 4 Vs
• Understanding the ‘Big’ Picture
• Mature metadata procedures in place
• Active governance with majority participation
Planning
• Executive suite backing and participation
• Fully vetted use cases
• Staffing and training plan
– Infrastructure Architect
– Big Data Architect
– Data Scientist
Implementation
• Start slow in digestible portions (usable POC)
• Employ technical project management
• Maintain strong scope management
• Small set of very skilled users for initial deployment
• Bring all data that will answer the questions

Summary
• Data Lakes deliver the power to share data and rapidly explore, discover, and
predict patterns of risk, cost, and improved outcomes and engagement
• Provide the foundation for research and ad-hoc data science to occur across
a variety of large-volume data sets
• Integral to evidence-based care and clinical genetics programs
– Need for genomics data
– Pedigree data
– Personal health information
– Geo data sets
– Psychographic data
• Require advanced data management and data science skill sets
• Should be governed through an information and data governance structure that:
– Sets use case and data priorities
– Oversees data risk, security, and compliance

Questions
Type your question into the chat box

Next up:
[Webinar] Harness the Power of Cloud to Drive
Business Innovation – Tuesday, April 25th
[Webinar] Modernize Core Technology to Accelerate
Digital Transformation – Tuesday, May 23rd
Follow Us Online
• Perficient.com/SocialMedia
• Facebook.com/Perficient
• Twitter.com/Perficient_HC
• Blogs.perficient.com/healthcare
