1. Translational
Bioinformatics (TBI)
Big Data Analytics
in Health Care
Sushil K. Meher
MCA (NIT, RKL), MBA (Hospital Management), M.Phil (CS), (Ph.D (eHealth)).
Computer Facility
ALL INDIA INSTITUTE OF MEDICAL SCIENCES
NEW DELHI
4. “Why big data
is a big deal”
InfoWorld – 9/1/11
“Keeping Afloat
in a Sea of 'Big
Data'”
ITBusinessEdge – 9/6/11
“The challenge–
and opportunity–
of big data”
McKinsey Quarterly—5/11
“Getting a Handle
on Big Data with
Hadoop”
Businessweek-9/7/11
“Ten reasons why
Big Data will
change the travel
industry”
Tnooz -8/15/11
“The promise of
Big Data in
Health Care”
Intelligent Utility-8/28/11
Big Data Buzz
5. Our Journey To The Cloud/Big Data
OLTP: Online Transaction Processing (DBMSs)
OLAP: Online Analytical Processing (Data Warehousing)
RTAP: Real-Time Analytics Processing (Big Data Architecture & Technology)
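As a rough illustration of the first two stages, the sketch below uses Python's built-in sqlite3 module to contrast OLTP-style work (small transactional writes) with OLAP-style work (one aggregate query over many rows). The table name, column names, and values are hypothetical, and a single in-memory database stands in for what would normally be separate systems.

```python
import sqlite3

# In-memory database; the table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE admissions (patient_id INTEGER, ward TEXT, cost REAL)"
)

# OLTP-style work: small writes, committed together as one transaction.
rows = [
    (1, "cardiology", 1200.0),
    (2, "cardiology", 800.0),
    (3, "oncology", 3000.0),
]
with conn:  # commits on success, rolls back on error
    conn.executemany("INSERT INTO admissions VALUES (?, ?, ?)", rows)

# OLAP-style work: one read-only query that aggregates over many rows.
summary = conn.execute(
    "SELECT ward, COUNT(*), SUM(cost) FROM admissions "
    "GROUP BY ward ORDER BY ward"
).fetchall()

print(summary)
```

RTAP would apply the same kind of aggregation continuously to data as it arrives, which this sketch does not attempt to show.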
6. So What is Big Data?
Big Data refers to datasets that grow so large that they are difficult to
capture, store, manage, share, analyze and visualize with
typical database software tools.
How much is Big?
It is not a single number but a set of
parameters
7. THE ERA OF
BIG DATA
IS HERE
“Big Data Is Less About
Size, And
More About Freedom”
―Techcrunch
“Findings: ‘Big Data’
Is More Extreme
Than Volume”
― Gartner
“Big Data! It’s Real, It’s
Real-time, and It’s
Already Changing Your
World”
―IDC
“Total data: ‘bigger’
than big data”
― 451 Group
9. Big Data in Healthcare
SOCIAL
BLOG
SMART
METER
HEALTH
VOLUME VELOCITY VARIETY VERACITY
10. Data Measurement Units
• In 2011 alone, 1.8 zettabytes of data were created globally. To put this into
perspective, that volume equals 200 billion two-hour HD movies, which
one person would need 47 million years to watch in their entirety.
• U.S. health care data alone reached 150 exabytes in 2011. Five exabytes (1 exabyte = 10^18
bytes) of data would contain all the words ever spoken by human beings on earth.
At this rate, big data for U.S. health care will soon reach zettabyte (10^21 bytes)
scale and yottabyte (10^24 bytes) scale not long after.
theregister.co.uk
“Transforming Health Care Through Big Data: Strategies for
leveraging big data in the health care industry” - Institute
for Health Technology Transformation
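The figures on this slide can be sanity-checked with a little arithmetic. The sketch below assumes decimal (SI) prefixes, so one exabyte is 10^18 bytes, and assumes the movies are watched back to back around the clock; the variable names are illustrative only.

```python
# Decimal (SI) prefixes, expressed in bytes.
GIGABYTE = 10**9
EXABYTE = 10**18
ZETTABYTE = 10**21
YOTTABYTE = 10**24

# Figures quoted on the slide.
global_data_2011 = 18 * ZETTABYTE // 10   # 1.8 zettabytes
hd_movies = 200 * 10**9                   # 200 billion movies
us_health_data_2011 = 150 * EXABYTE       # 150 exabytes

# Implied size of one two-hour HD movie.
bytes_per_movie = global_data_2011 // hd_movies
gb_per_movie = bytes_per_movie / GIGABYTE

# Continuous viewing time for all the movies, in years.
hours_per_year = 24 * 365.25
viewing_years = hd_movies * 2 / hours_per_year

# U.S. health care data as a fraction of one zettabyte.
us_share_of_zettabyte = us_health_data_2011 / ZETTABYTE

print(gb_per_movie, round(viewing_years / 1e6, 1), us_share_of_zettabyte)
```

Under these assumptions each movie works out to 9 GB, and the viewing time lands in the mid-40-millions of years, the same order of magnitude as the figure quoted above.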
11. The Model Has Changed…
The Model of Generating/Consuming Data has Changed
Old Model: A few hospitals generate data; all others consume it
New Model: All of us generate data, and all of us consume data
13. Innovate With Big Data Analytics
Big Data Analytics Accelerates Health Care 2.0 for Evidence-Based Care Providers
Delivering 10 Years Of
Data In Seconds
TRADITIONAL DATA LEVERAGED
HIGH
Quality of Care
LOW
Legacy
System
Treatment
Pathways on
Summary Data
Database
BI Reporting
Treatment
Pathways on
All the Data
BIG DATA LEVERAGED
Big Data Analytics
In-Database
Analytics
Association Rule Mining and User Clustering
Improves Pathways
External Data Sources Enable Personalized
Medicine
USE CASE
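The rule mining named on this slide can be sketched in a few lines. The example below is a minimal illustration, not the method used in the use case: it computes support and confidence for pairs of items only, and the pathway data, item names, and thresholds are all hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical treatment pathways: each set lists the steps one patient received.
pathways = [
    {"ecg", "statin", "stent"},
    {"ecg", "statin"},
    {"ecg", "stent"},
    {"statin", "stent", "ecg"},
    {"ecg"},
]


def pair_rules(transactions, min_support=0.4, min_confidence=0.7):
    """Return rules (antecedent, consequent, support, confidence) for item pairs."""
    n = len(transactions)
    item_counts = Counter(item for t in transactions for item in t)
    pair_counts = Counter(
        pair for t in transactions for pair in combinations(sorted(t), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # Each frequent pair yields two candidate rules: a -> b and b -> a.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


rules = pair_rules(pathways)
for lhs, rhs, support, confidence in rules:
    print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")
```

Support is the share of pathways containing both items; confidence is the share of pathways with the antecedent that also contain the consequent. Production tools extend the same idea to larger item sets.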
14. Big Data Key Drivers
Population
Health
Patient
Experience
Per Capita
Cost
• New Delivery Models
• Meaningful Use
• ICD-10 / SNOMED-CT
• Better Data = Improved
Outcomes
• Shift from volume-based care
to value-based care
• Fraud Detection
• Cost Savings
15. What’s driving Big Data
- Optimizations and predictive analytics
- Complex statistical analysis
- All types of data, and many sources
- Very large datasets
- Closer to real-time
- Ad-hoc querying and reporting
- Data mining techniques
- Structured data, typical sources
- Small to mid-size datasets
17. Where Does the Data Come From?
Clinical and HIM | Administrative | Supply Chain and Revenue Cycle
• Structured
─ EHR
─ HIS
• Unstructured
─ Image-based: PACS
and radiology, EKGs,
monitor data
─ Insurance card, patient
photo, consent forms,
orders
─ Paper-based patient
information
• Semi-Structured
─ DNA, RNA, protein
(genomics)
• Human Resources
– HR Management
Systems
– Documents such as
new hire paperwork,
employee records,
credentialing, etc.
• Legal
– Documents include
contracts and
agreements,
correspondence,
compliance
• Finance
– Statements
• Business Office
– Back Office
• Supply Chain
– Materials Management
– Documents such as
requisitions, purchase
orders, invoices, packing
slips, receiving paperwork
• Revenue Cycle
– Pre-registration
– Denials Management
– Documents include EOBs,
correspondence
18. Definition of Translational
Bioinformatics (TBI)
• Development of storage, analytic, and
visualization methods.
Bergman, 2010
20. Personalizing Health & Care (PHC)
1. Better understanding health, ageing & disease
2. Effective health promotion, prediction, screening
and disease prevention
3. Early diagnosis (detection)
4. Innovative treatments & technologies
5. Advancing active & healthy ageing
6. Integrated, sustainable, citizen-centered care
7. Improving health information, data exploitation
& knowledge translation
21. Approach for 4P
Basic
Biomedical
Research
Clinical
Knowledge
& Research
Public Health
Population
Health
Personal
Health
Translational Research
Text mining, BioPatch, CAMA, DzMap, CKD, PWAS, Drug repositioning
Reverse translational research
22. Data Interaction Model for Translational
Bioinformatics Research
Patient
Profile
Age, sex, allergy, weight,
height, blood type, body
temperature, …etc.
Diagnosis
/Problem
Lab/Exam
Medication Procedures
Current and/or
chronic dz,
malignancy,
Pregnancy…etc.
Surgery,
transfusion,
endoscopy,
angiogram, PTCA,
rehabilitation…etc.
YC (Jack) Li et al., 2004
CBC, D/C, LFT, hCG,
PT, APTT, INR…etc.
e.g. Fluorouracil vs
thrombocytopenia
Fluorouracil vs Theophylline,
Doxorubicin vs
Methotrexate, …etc.
e.g. Warfarin vs
colonoscopy
e.g. Tamoxifen vs
Nausea
e.g. Valproic
acid vs
pregnancy
Gene
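One way to read this model is as a lookup of pairwise interactions checked against a patient record. The sketch below is only an illustration under stated assumptions: the pairs come from the examples on the slide, but the category label attached to each pair is a guess (the figure layout did not survive), and the `PatientRecord` class and `find_interactions` function are hypothetical names. It is not clinical guidance.

```python
from dataclasses import dataclass, field

# Example pairs taken from the slide; the category label on each pair is
# this sketch's assumption, since the original figure layout is lost.
KNOWN_INTERACTIONS = {
    ("fluorouracil", "theophylline"): "medication-medication",
    ("doxorubicin", "methotrexate"): "medication-medication",
    ("fluorouracil", "thrombocytopenia"): "medication-lab",
    ("warfarin", "colonoscopy"): "medication-procedure",
    ("tamoxifen", "nausea"): "medication-diagnosis",
    ("valproic acid", "pregnancy"): "medication-diagnosis",
}


@dataclass
class PatientRecord:
    """Hypothetical record; fields mirror some of the boxes named on the slide."""
    medications: set = field(default_factory=set)
    diagnoses: set = field(default_factory=set)
    labs: set = field(default_factory=set)
    procedures: set = field(default_factory=set)

    def all_items(self):
        return self.medications | self.diagnoses | self.labs | self.procedures


def find_interactions(record):
    """Return the known pairs whose two members both appear in the record."""
    items = record.all_items()
    return sorted(
        (a, b, kind)
        for (a, b), kind in KNOWN_INTERACTIONS.items()
        if a in items and b in items
    )


record = PatientRecord(
    medications={"warfarin", "tamoxifen"},
    procedures={"colonoscopy"},
)
flags = find_interactions(record)
print(flags)
```

A fuller model would also cover the patient profile and gene boxes and would draw its pairs from a curated knowledge base rather than a hard-coded table.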
28. • Formulate new questions and
become much more agile
• Make evidence-based decisions
• Democratize your data
• Visualize invisible knowledge
• Big data is here – now
• Data breaches
• Intrusion of privacy
• Unfair use of Data
Big Data in Health Care
30. What Technology Do We Have
For Big Data?
Hadoop
• Low cost, reliable
scale-out architecture
• Distributed computing
• Proven success in
Fortune 500
companies
• Exploding interest
NoSQL Databases
• Huge horizontal scaling
and high availability
• Highly optimized for
retrieval and appending
• Types
• Document stores
• Key Value stores
• Graph databases
Analytic RDBMS
• Optimized for bulk-load
and fast aggregate
query workloads
• Types
• Column-oriented
• MPP
• In-memory
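Hadoop's distributed computing is built around the MapReduce programming model. The sketch below imitates its three steps (map, shuffle, reduce) in a single Python process, so it shows the shape of the computation rather than Hadoop's actual API; the input records and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical input: each string is one log line, "patient_id,diagnosis_code".
records = ["p1,I10", "p2,E11", "p3,I10", "p4,J45", "p5,I10"]


def map_phase(line):
    """Emit (key, 1) for the diagnosis code found on one input line."""
    _, code = line.split(",")
    yield code, 1


def shuffle(pairs):
    """Group emitted values by key, as the framework does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values for one key into a single count."""
    return key, sum(values)


mapped = [pair for line in records for pair in map_phase(line)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
print(counts)
```

In a real cluster the map and reduce functions run in parallel on many machines, each working on the portion of data stored locally, which is what makes the scale-out architecture economical.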
31. Major Hadoop Utilities
Apache Hive: SQL-like language and metadata repository
Apache Pig: High-level language for expressing data analysis programs
Apache HBase: The Hadoop database. Random, real-time read/write access
Sqoop: Integrating Hadoop with RDBMS
Hue: Browser-based desktop interface for interacting with Hadoop
Oozie: Server-based workflow engine for Hadoop activities
Flume: Distributed service for collecting and aggregating log and event data
Apache Whirr: Library for running Hadoop in the cloud
Apache Zookeeper: Highly reliable distributed coordination service