MACHINE LEARNING
APPLIED IN HEALTH
WHO AM I?
Javier Samir Rey
Systems engineer
Machine learning engineer - Direktio
Co-organizer meetup Big Data Colombia
jreyro@gmail.com
javier-samir-rey-7104195
github/jasam
“Work on Stuff that Matters”
Tim O'Reilly
Source: United Nations - Sustainable Development Goals
SUSTAINABLE DEVELOPMENT GOALS
3 - GOOD HEALTH AND WELL-BEING
“Ensure healthy lives and promote
well-being for all at all ages.”
● Reproductive, maternal and child
health.
● Communicable, non-communicable
and environmental diseases.
● Health risk reduction and
management.
● Universal health coverage.
COMMUNICABLE AND NONCOMMUNICABLE DISEASES
The incidence of major infectious
diseases: HIV, tuberculosis and
malaria.
Almost half the world’s
population is at risk of malaria.
889,000 people died from
infectious diseases caused largely by
faecal contamination of water.
40 million global deaths were due to NCDs.
48% of the deaths were premature.
75% of premature deaths were caused by
cardiovascular disease,
cancer, diabetes and chronic
respiratory disease.
80% of heart disease, stroke and diabetes
can be prevented.
Source: United Nations
CDs NCDs
Noncommunicable diseases (NCDs), also known as chronic diseases, tend
to be of long duration and are the result of a combination of genetic,
physiological, environmental and behavioural factors.
Detection, screening and treatment of NCDs, as well as palliative care, are
key components of the response to NCDs.
An important way to control NCDs is to focus on reducing the risk factors
associated with these diseases. Low-cost solutions exist for
governments and other stakeholders to reduce the common modifiable
risk factors. Monitoring progress and trends of NCDs and their risk is
important for guiding policy and priorities.
NONCOMMUNICABLE DISEASES
Decreased quality
of life of the
human being.
IMPACT
In low-resource settings,
health-care costs for NCDs
quickly drain
household resources.
The exorbitant costs of
NCDs, including often
lengthy and expensive
treatment and loss of
breadwinners, force millions
of people into poverty
annually and stifle development.
COLOMBIA - NONCOMMUNICABLE DISEASES
Hypertension and diabetes mellitus are major precursors of:
- Ischemic cardiovascular disease
- Cerebrovascular events
- End-stage renal disease
- Death
Prevalence:
- Hypertension: 6.5%
- Diabetes: 1.9%
20% of the population consumes 80% of the resources.
Source: Cuenta de Alto Costo
SOME REVIEW
Data is quickly emerging as the greatest asset of the
healthcare industry. The trend in our industry is to drive many
decisions supported by data. It is a walk of maturity with the real gold
nuggets coming in Analytics 3.0 and beyond. This will not be solved
with a product or purchased off the shelf. Big Data needs to be
part of the DNA of an organization.
-- Chris Belmont, MBA
Vice President and Chief Information Officer
MD Anderson Cancer Center
“I know that 50% of my
advertising is wasted, I just
don’t know which half.”
WANAMAKER’S QUESTION
The healthcare industry is now awash in data in a way that it has never
been before: biological, gene expression, sensor, DNA sequence, EHR,
drug and medical data. We have entered a new era in which we can work on
massive datasets and combine them effectively. We can start asking
the important questions, the Wanamaker questions!
The opportunities are huge!
Source: Wikipedia
THE HERO’S JOURNEY
Source: Wikipedia
BIG PLAYERS
BUSINESS UNDERSTANDING DATA VALUE
PYRAMID
Source: datasyndrome
AGILE DATA SCIENCE MANIFESTO
Source: Agile Data Science 2.0
1. Iterate, iterate, iterate: tables, charts, reports, predictions
- roadmap projects.
2. Ship intermediate output. Even failed experiments have
output.
3. Prototype experiments over implementing tasks.
4. Integrate the tyrannical opinion of data in product
management.
AGILE DATA SCIENCE MANIFESTO
Source: Agile Data Science 2.0
5. Climb up and down the data-value pyramid as we work.
6. Discover and pursue the critical path to a killer product.
7. Get meta. Describe the process, not just the end-state.
CRISP-DM METHODOLOGY
Source: Wikipedia
Before defining your framework (agile is a possibility),
first define your culture and team.
Cross-Industry Standard Process for Data Mining
BUSINESS UNDERSTANDING
It is one of the most
important concepts of
data science!
1. It is vital to understand the problem to be solved and its context.
2. Often recasting the problem and designing a solution is an
iterative process of discovery.
3. The Business Understanding stage represents a part of the
craft where the analysts’ creativity plays a large role.
BUSINESS UNDERSTANDING
It is one of the most
important concepts of
data science!
4. The key to great success is creative problem formulation: how
to cast the business problem as one or more data science
problems (subproblems).
5. What is the expected value?
6. The team’s help is really important; we are not alone.
BUSINESS UNDERSTANDING - HEALTH
Source: McKinsey & Company
Big data has a higher
potential in 3 ways:
● Precision medicine
● Diagnose diseases
● Optimize clinical
trials
BUSINESS UNDERSTANDING - HEALTH
ACTORS
● Clinicians, domain experts and financial
analysts
● Managers, IT developers, consultants and
vendors
● Policy makers
● Patients and consumers
● Executives and lines-of-business leaders
● Researchers and academia
● Health institutions
● Society
Build your strategy together!
BUSINESS UNDERSTANDING - HEALTH
CHRONIC CONDITIONS CARE MODEL
Source: Cuidado das Condições Crônicas na Atenção Primária à Saúde
Inspired by
the pyramid
of Kaiser
Permanente!
DATA UNDERSTANDING
1. Solving the business problem is the goal.
2. It is important to understand the strengths and
limitations of the data because there is rarely an exact
match with the problem.
3. Some data will be available virtually for free, while other data
will require effort to obtain.
4. Cleaning and matching different sources into a single record
can itself be a complicated analytics problem.
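The record-matching point above can be illustrated with a minimal sketch. It assumes two hypothetical sources (`primary_care` and `pharmacy`) that write the same patient identifier inconsistently; every field name and value below is invented for illustration and does not come from a real system.

```python
from collections import defaultdict


def normalize_id(raw_id: str) -> str:
    """Normalize an identifier: trim spaces, drop dashes and leading zeros."""
    cleaned = raw_id.strip().replace("-", "")
    return cleaned.lstrip("0") or "0"


def merge_sources(*sources):
    """Merge rows from several sources into one record per normalized ID.

    A field is filled by the first source that has a non-empty value for it;
    later sources never overwrite a value that is already present.
    """
    merged = defaultdict(dict)
    for rows in sources:
        for row in rows:
            record = merged[normalize_id(row["patient_id"])]
            for field, value in row.items():
                if field == "patient_id":
                    continue
                if value not in (None, "") and field not in record:
                    record[field] = value
    return dict(merged)


# Hypothetical toy data: the same two patients seen by two systems.
primary_care = [
    {"patient_id": "0012", "age": 45, "weight_kg": 79},
    {"patient_id": "7-31", "age": 65, "weight_kg": None},
]
pharmacy = [
    {"patient_id": " 12 ", "on_medication": True},
    {"patient_id": "731", "on_medication": False, "weight_kg": 90},
]

records = merge_sources(primary_care, pharmacy)
```

Real record linkage is usually much harder than this (typos, missing keys, probabilistic matching), which is why the slide treats it as an analytics problem in its own right.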
DATA UNDERSTANDING
5. Remember all the V’s of data: volume, velocity, variety,
variability, veracity, visualization and value.
6. Design and build a data engineering team that supports
your data requirements.
7. Data governance: DAMA (Data Management Association
International).
DATA UNDERSTANDING - HEALTH
SOURCES FOR DATA IN HEALTHCARE
Healthcare data: Examples
Images: Radiographic images, MRIs, ultrasounds and nuclear imaging
Un-/semi-structured: Clinical narratives, physician notes, Level 2, 3 OMICS, summaries, pathology reports
Streaming: Bedside, remote monitors, implants, fitness bands, smart watches and smartphones
Social media: Facebook, Twitter, web forums and communities
Structured data: All claims, EHR, ERP and other information systems
Dark data: Server logs, application error logs, account information, emails and documents
DATA UNDERSTANDING - HEALTH
Source: The Rise of Consumer Health Wearables
DATA UNDERSTANDING - HEALTH
Source: McKinsey & Company
DATA UNDERSTANDING - HEALTH
Source: McKinsey & Company
DATA PREPARATION
1. Analytic technologies can be powerful, but they impose
certain requirements on the data they use (data table).
2. Typical examples of data preparation include converting data
to tabular format.
3. Feature engineering.
4. Technology is important, but it is not the main point.
DATA PREPARATION
5. Defining the variables is one of the main
points at which human creativity, common sense,
and business knowledge come into play.
6. Document your time process.
7. Think about process optimization - Big O.
8. Little blocks of processing - plan for scale.
DATA PREPARATION - COMPUTING BOUND
Source: hadoop in the enterprise: architecture
DATA PREPARATION - DATA ENGINEERING
Pair review
Modularize
your project
Create professional
projects - world
class solutions
using: versioning,
standards, right
tools, unit tests.
DATA PREPARATION - TABULAR FORM - THE GOAL
Primary care
Secondary care
Medication
Other data… a lot of
types
ID age med height weight BMI diet
1 15 Y 168 60 21.3 Y
2 20 Y 185 80 23.4 Y
3 65 N 192 90 24.4 N
4 48 N 172 85 28.7 N
5 45 Y 185 79 23.1 N
6 79 N 182 71 21.4 Y
7 22 Y 186 79 22.8 Y
Feature engineering
Data points: this is the key (N*M)! They come after a very
expensive process.
Putting data together is challenging.
Data engineering
N features
M observations
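As a small illustration of feature engineering on the table above, the sketch below derives the BMI column from height and weight. It assumes heights are in centimetres and weights in kilograms (the slide does not state units, but the BMI values in the table are consistent with that reading). The function and key names are illustrative choices.

```python
def bmi(height_cm: float, weight_kg: float) -> float:
    """Body mass index: weight (kg) divided by height (m) squared."""
    height_m = height_cm / 100
    return round(weight_kg / height_m ** 2, 1)


# Raw observations copied from the slide's table (units assumed: cm, kg).
rows = [
    {"id": 1, "age": 15, "height": 168, "weight": 60},
    {"id": 2, "age": 20, "height": 185, "weight": 80},
    {"id": 3, "age": 65, "height": 192, "weight": 90},
    {"id": 4, "age": 48, "height": 172, "weight": 85},
    {"id": 5, "age": 45, "height": 185, "weight": 79},
    {"id": 6, "age": 79, "height": 182, "weight": 71},
    {"id": 7, "age": 22, "height": 186, "weight": 79},
]

# Feature engineering: add a derived column to every observation.
for row in rows:
    row["bmi"] = bmi(row["height"], row["weight"])

n_observations = len(rows)
n_features = len(rows[0]) - 1  # every column except the identifier
```

The derived values reproduce the BMI column shown in the table, which is a quick sanity check that the raw columns were read correctly.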
DATA MODELING
The creation of models from data is known as model induction.
Induction is a term from philosophy that refers to generalizing from
specific cases to general rules (or laws, or truths).
Source: Data science for business
Generally speaking, a model is a simplified representation of
reality created to serve a purpose.
In data science, a predictive model is a formula for estimating the
unknown value of interest: the target. The formula could be
mathematical, or it could be a logical statement such as a rule. Often it is
a hybrid of the two.
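To make the three forms concrete, here is a minimal sketch of a rule, a formula, and a hybrid that each estimate a target from two features. The thresholds and coefficients are invented purely for illustration; they are not fitted to data and are not clinically meaningful.

```python
import math


def rule_model(age: float, bmi: float) -> bool:
    """A logical statement: flag high risk when both thresholds are met."""
    return age >= 45 and bmi >= 25.0  # hypothetical thresholds


def formula_model(age: float, bmi: float) -> float:
    """A mathematical formula: logistic function of a weighted sum."""
    score = -8.0 + 0.05 * age + 0.15 * bmi  # hypothetical coefficients
    return 1.0 / (1.0 + math.exp(-score))


def hybrid_model(age: float, bmi: float) -> float:
    """A hybrid: a rule handles one segment, the formula handles the rest."""
    if age < 18:  # hypothetical rule: this model is not applied to minors
        return 0.0
    return formula_model(age, bmi)


# Estimated target for one observation (age 48, BMI 28.7).
flag = rule_model(48, 28.7)
probability = formula_model(48, 28.7)
```

In practice the coefficients would be induced from data rather than written by hand, which is exactly the model induction step described above.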
Many Names for the
Same Things!
DATA MODELING - BEST PRACTICES
1. Ask a specific question. Remember you are solving a
business problem, not a math problem.
2. Start simple; start with the minimal set of data.
3. Try many algorithms, but remember that data is more
important than the exact algorithm: improve your features.
4. Treat your data with suspicion; understand its
idiosyncrasies.
5. Normalize your inputs.
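One common reading of "normalize your inputs" is z-score standardization, sketched below with only the standard library. The age and height values are taken from the tabular example earlier in the deck; the function name is an illustrative choice, and other scalings (for example min-max) may suit a given model better.

```python
from statistics import mean, pstdev


def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


ages = [15, 20, 65, 48, 45, 79, 22]
heights = [168, 185, 192, 172, 185, 182, 186]

z_ages = standardize(ages)
z_heights = standardize(heights)
```

After scaling, age and height are on comparable ranges, so a distance-based or gradient-based model is less likely to be dominated by whichever feature happens to have the larger raw numbers.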
DATA MODELING - BEST PRACTICES
● Validate your model (validation set and clinical validation)
● Set a benchmark first; don’t be afraid to launch your product
without ML
● Set up a feedback loop
● Healthcare doesn’t trust black boxes
● Correlation is not causation
● Monitor ongoing performance
● Don’t be fooled by “accuracy”
● Labeled data
● Use medical support libraries, e.g. PubMed, Cochrane, American
Heart Association, Diabetes UK and so on.
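The warning about accuracy can be shown with a small sketch. It assumes a hypothetical cohort of 1,000 patients in which 19 have diabetes, mirroring the 1.9% prevalence quoted earlier in the deck, and evaluates a baseline that never predicts the disease.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def recall(y_true, y_pred):
    """Fraction of true positive cases that the model actually finds."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    actual_pos = sum(1 for t in y_true if t == 1)
    return true_pos / actual_pos if actual_pos else 0.0


# Hypothetical cohort: 19 positives out of 1,000 (1.9% prevalence).
y_true = [1] * 19 + [0] * 981

# Benchmark without ML: always predict "no disease".
baseline_pred = [0] * 1000

baseline_accuracy = accuracy(y_true, baseline_pred)
baseline_recall = recall(y_true, baseline_pred)
```

The baseline scores about 98% accuracy while detecting none of the patients who have the disease, so on imbalanced clinical data a metric such as recall gives a more honest picture, and any ML model should be compared against this kind of benchmark.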
DATA MODELING - BLUEPRINT
Source: scikit-learn
DATA MODELING - TRADE OFF
Source: O'Reilly Strata 2013
DATA MODELING - TECHNOLOGY
DATA MODELING - TOOLS
Reproducible research
is great!
DATA MODELING - END OF THE HERO’S JOURNEY!
DATA MODELING - USE CASE - ELSEVIER
RISK PREDICTIONS: WHICH DISEASE WILL YOU
LIKELY GET WITHIN 4 YEARS?
1600+ models
integrated into the
same
information
system.
Source: Elsevier Medical Graph - slideshare
DATA MODELING - USE CASE - ELSEVIER
Source: Elsevier Medical Graph - slideshare
Physicians want
explanations.
Otherwise they
will not trust
the predictions.
Typical best-in-class
classification methods
(deep learning, random
forest) do not yet deliver
explainable models.
In practice, you
need to save the
users’ processing
time, not add to it.
Visualization is
key.
Building a classification
model using open source
tools is simple. Scaling
input data size is also
manageable. Building
1000+ models is complex.
Open source tools have
failures (as have proprietary
tools). Debugging can be a
nightmare.
Implementing, applying
and maintaining a
security framework to
keep personal health
information secure is a
substantial effort.
THANKS!

Contenu connexe

Tendances

Prediction of Diabetes using Probability Approach
Prediction of Diabetes using Probability ApproachPrediction of Diabetes using Probability Approach
Prediction of Diabetes using Probability ApproachIRJET Journal
 
AN ALGORITHM FOR PREDICTIVE DATA MINING APPROACH IN MEDICAL DIAGNOSIS
AN ALGORITHM FOR PREDICTIVE DATA MINING APPROACH IN MEDICAL DIAGNOSISAN ALGORITHM FOR PREDICTIVE DATA MINING APPROACH IN MEDICAL DIAGNOSIS
AN ALGORITHM FOR PREDICTIVE DATA MINING APPROACH IN MEDICAL DIAGNOSISijcsit
 
Prediction of Heart Disease using Machine Learning Algorithms: A Survey
Prediction of Heart Disease using Machine Learning Algorithms: A SurveyPrediction of Heart Disease using Machine Learning Algorithms: A Survey
Prediction of Heart Disease using Machine Learning Algorithms: A Surveyrahulmonikasharma
 
Machine learning in disease diagnosis
Machine learning in disease diagnosisMachine learning in disease diagnosis
Machine learning in disease diagnosisSushrutaMishra1
 
IRJET- Diabetes Prediction using Machine Learning
IRJET- Diabetes Prediction using Machine LearningIRJET- Diabetes Prediction using Machine Learning
IRJET- Diabetes Prediction using Machine LearningIRJET Journal
 
Comparing Data Mining Techniques used for Heart Disease Prediction
Comparing Data Mining Techniques used for Heart Disease PredictionComparing Data Mining Techniques used for Heart Disease Prediction
Comparing Data Mining Techniques used for Heart Disease PredictionIRJET Journal
 
IRJET- Heart Disease Prediction and Recommendation
IRJET-  	  Heart Disease Prediction and RecommendationIRJET-  	  Heart Disease Prediction and Recommendation
IRJET- Heart Disease Prediction and RecommendationIRJET Journal
 
Diabetes Data Science
Diabetes Data ScienceDiabetes Data Science
Diabetes Data SciencePhilip Bourne
 
Heart disease prediction system
Heart disease prediction systemHeart disease prediction system
Heart disease prediction systemSWAMI06
 
Predictive Analytics and Machine Learning for Healthcare - Diabetes
Predictive Analytics and Machine Learning for Healthcare - DiabetesPredictive Analytics and Machine Learning for Healthcare - Diabetes
Predictive Analytics and Machine Learning for Healthcare - DiabetesDr Purnendu Sekhar Das
 
PREDICTION OF DIABETES MELLITUS USING MACHINE LEARNING TECHNIQUES
PREDICTION OF DIABETES MELLITUS USING MACHINE LEARNING TECHNIQUESPREDICTION OF DIABETES MELLITUS USING MACHINE LEARNING TECHNIQUES
PREDICTION OF DIABETES MELLITUS USING MACHINE LEARNING TECHNIQUESIAEME Publication
 
Deep-learning-or-health-informatics-recent-trends-and-future-directions By Ra...
Deep-learning-or-health-informatics-recent-trends-and-future-directions By Ra...Deep-learning-or-health-informatics-recent-trends-and-future-directions By Ra...
Deep-learning-or-health-informatics-recent-trends-and-future-directions By Ra...raihansikdar
 
Using AI to Predict Strokes
Using AI to Predict StrokesUsing AI to Predict Strokes
Using AI to Predict StrokesEMMAIntl
 
Diabetes prediction with r(using knn)
Diabetes prediction with r(using knn)Diabetes prediction with r(using knn)
Diabetes prediction with r(using knn)tanujoshi98
 
dkNET Webinar: Multi-Omics Data Integration for Phenotype Prediction of Type-...
dkNET Webinar: Multi-Omics Data Integration for Phenotype Prediction of Type-...dkNET Webinar: Multi-Omics Data Integration for Phenotype Prediction of Type-...
dkNET Webinar: Multi-Omics Data Integration for Phenotype Prediction of Type-...dkNET
 
Predicting diabetes using a machine learning approach linked in
Predicting diabetes using a machine learning approach   linked inPredicting diabetes using a machine learning approach   linked in
Predicting diabetes using a machine learning approach linked invenkatvajradhar1
 
Various Data Mining Techniques for Diabetes Prognosis: A Review
Various Data Mining Techniques for Diabetes Prognosis: A ReviewVarious Data Mining Techniques for Diabetes Prognosis: A Review
Various Data Mining Techniques for Diabetes Prognosis: A Reviewijtsrd
 
IRJET - Machine Learning for Diagnosis of Diabetes
IRJET - Machine Learning for Diagnosis of DiabetesIRJET - Machine Learning for Diagnosis of Diabetes
IRJET - Machine Learning for Diagnosis of DiabetesIRJET Journal
 
Machine learning in health data analytics and pharmacovigilance
Machine learning in health data analytics and pharmacovigilanceMachine learning in health data analytics and pharmacovigilance
Machine learning in health data analytics and pharmacovigilanceRevathi Boyina
 

Tendances (20)

Prediction of Diabetes using Probability Approach
Prediction of Diabetes using Probability ApproachPrediction of Diabetes using Probability Approach
Prediction of Diabetes using Probability Approach
 
AN ALGORITHM FOR PREDICTIVE DATA MINING APPROACH IN MEDICAL DIAGNOSIS
AN ALGORITHM FOR PREDICTIVE DATA MINING APPROACH IN MEDICAL DIAGNOSISAN ALGORITHM FOR PREDICTIVE DATA MINING APPROACH IN MEDICAL DIAGNOSIS
AN ALGORITHM FOR PREDICTIVE DATA MINING APPROACH IN MEDICAL DIAGNOSIS
 
Prediction of Heart Disease using Machine Learning Algorithms: A Survey
Prediction of Heart Disease using Machine Learning Algorithms: A SurveyPrediction of Heart Disease using Machine Learning Algorithms: A Survey
Prediction of Heart Disease using Machine Learning Algorithms: A Survey
 
Machine learning in disease diagnosis
Machine learning in disease diagnosisMachine learning in disease diagnosis
Machine learning in disease diagnosis
 
IRJET- Diabetes Prediction using Machine Learning
IRJET- Diabetes Prediction using Machine LearningIRJET- Diabetes Prediction using Machine Learning
IRJET- Diabetes Prediction using Machine Learning
 
Comparing Data Mining Techniques used for Heart Disease Prediction
Comparing Data Mining Techniques used for Heart Disease PredictionComparing Data Mining Techniques used for Heart Disease Prediction
Comparing Data Mining Techniques used for Heart Disease Prediction
 
IRJET- Heart Disease Prediction and Recommendation
IRJET-  	  Heart Disease Prediction and RecommendationIRJET-  	  Heart Disease Prediction and Recommendation
IRJET- Heart Disease Prediction and Recommendation
 
Final ppt
Final pptFinal ppt
Final ppt
 
Diabetes Data Science
Diabetes Data ScienceDiabetes Data Science
Diabetes Data Science
 
Heart disease prediction system
Heart disease prediction systemHeart disease prediction system
Heart disease prediction system
 
Predictive Analytics and Machine Learning for Healthcare - Diabetes
Predictive Analytics and Machine Learning for Healthcare - DiabetesPredictive Analytics and Machine Learning for Healthcare - Diabetes
Predictive Analytics and Machine Learning for Healthcare - Diabetes
 
PREDICTION OF DIABETES MELLITUS USING MACHINE LEARNING TECHNIQUES
PREDICTION OF DIABETES MELLITUS USING MACHINE LEARNING TECHNIQUESPREDICTION OF DIABETES MELLITUS USING MACHINE LEARNING TECHNIQUES
PREDICTION OF DIABETES MELLITUS USING MACHINE LEARNING TECHNIQUES
 
Deep-learning-or-health-informatics-recent-trends-and-future-directions By Ra...
Deep-learning-or-health-informatics-recent-trends-and-future-directions By Ra...Deep-learning-or-health-informatics-recent-trends-and-future-directions By Ra...
Deep-learning-or-health-informatics-recent-trends-and-future-directions By Ra...
 
Using AI to Predict Strokes
Using AI to Predict StrokesUsing AI to Predict Strokes
Using AI to Predict Strokes
 
Diabetes prediction with r(using knn)
Diabetes prediction with r(using knn)Diabetes prediction with r(using knn)
Diabetes prediction with r(using knn)
 
dkNET Webinar: Multi-Omics Data Integration for Phenotype Prediction of Type-...
dkNET Webinar: Multi-Omics Data Integration for Phenotype Prediction of Type-...dkNET Webinar: Multi-Omics Data Integration for Phenotype Prediction of Type-...
dkNET Webinar: Multi-Omics Data Integration for Phenotype Prediction of Type-...
 
Predicting diabetes using a machine learning approach linked in
Predicting diabetes using a machine learning approach   linked inPredicting diabetes using a machine learning approach   linked in
Predicting diabetes using a machine learning approach linked in
 
Various Data Mining Techniques for Diabetes Prognosis: A Review
Various Data Mining Techniques for Diabetes Prognosis: A ReviewVarious Data Mining Techniques for Diabetes Prognosis: A Review
Various Data Mining Techniques for Diabetes Prognosis: A Review
 
IRJET - Machine Learning for Diagnosis of Diabetes
IRJET - Machine Learning for Diagnosis of DiabetesIRJET - Machine Learning for Diagnosis of Diabetes
IRJET - Machine Learning for Diagnosis of Diabetes
 
Machine learning in health data analytics and pharmacovigilance
Machine learning in health data analytics and pharmacovigilanceMachine learning in health data analytics and pharmacovigilance
Machine learning in health data analytics and pharmacovigilance
 

Similaire à Machine Learning in Healthcare: Detecting Diseases & Optimizing Clinical Trials

Data science in healthcare-Assignment 2.pptx
Data science in healthcare-Assignment 2.pptxData science in healthcare-Assignment 2.pptx
Data science in healthcare-Assignment 2.pptxArpitaDebnath20
 
Sun==big data analytics for health care
Sun==big data analytics for health careSun==big data analytics for health care
Sun==big data analytics for health careAravindharamanan S
 
ppt for data science slideshare.pptx
ppt for data science slideshare.pptxppt for data science slideshare.pptx
ppt for data science slideshare.pptxMangeshPatil358834
 
Big Data Provides Opportunities, Challenges and a Better Future in Health and...
Big Data Provides Opportunities, Challenges and a Better Future in Health and...Big Data Provides Opportunities, Challenges and a Better Future in Health and...
Big Data Provides Opportunities, Challenges and a Better Future in Health and...Cirdan
 
Data science in healthcare-Assignment 2.pptx
Data science in healthcare-Assignment 2.pptxData science in healthcare-Assignment 2.pptx
Data science in healthcare-Assignment 2.pptxArpitaDebnath20
 
Meeting healthcare challenges: what are the challenges and what is the role o...
Meeting healthcare challenges: what are the challenges and what is the role o...Meeting healthcare challenges: what are the challenges and what is the role o...
Meeting healthcare challenges: what are the challenges and what is the role o...Mohammad Al-Ubaydli
 
A Systematic Review Of Type-2 Diabetes By Hadoop Map-Reduce
A Systematic Review Of Type-2 Diabetes By Hadoop Map-ReduceA Systematic Review Of Type-2 Diabetes By Hadoop Map-Reduce
A Systematic Review Of Type-2 Diabetes By Hadoop Map-ReduceFinni Rice
 
The Philosophy, Psychology, and Technology of Data in Healthcare
The Philosophy, Psychology, and  Technology of Data in HealthcareThe Philosophy, Psychology, and  Technology of Data in Healthcare
The Philosophy, Psychology, and Technology of Data in HealthcareDale Sanders
 
HealthPanel Concept pitchdeck
HealthPanel Concept pitchdeckHealthPanel Concept pitchdeck
HealthPanel Concept pitchdecksmworth
 
The future interface of mental health with information technology: high touch...
The future interface of mental health with information technology: high touch...The future interface of mental health with information technology: high touch...
The future interface of mental health with information technology: high touch...HealthXn
 
Benefits of Big Data in Health Care A Revolution
Benefits of Big Data in Health Care A RevolutionBenefits of Big Data in Health Care A Revolution
Benefits of Big Data in Health Care A Revolutionijtsrd
 
A BIG DATA REVOLUTION IN HEALTH CARE SECTOR: OPPORTUNITIES, CHALLENGES AND TE...
A BIG DATA REVOLUTION IN HEALTH CARE SECTOR: OPPORTUNITIES, CHALLENGES AND TE...A BIG DATA REVOLUTION IN HEALTH CARE SECTOR: OPPORTUNITIES, CHALLENGES AND TE...
A BIG DATA REVOLUTION IN HEALTH CARE SECTOR: OPPORTUNITIES, CHALLENGES AND TE...ijistjournal
 
From personal health data to a personalized advice
From personal health data to a personalized adviceFrom personal health data to a personalized advice
From personal health data to a personalized adviceWessel Kraaij
 
Health IT Summit Austin 2013 - Presentation "The Impact of All Data on Health...
Health IT Summit Austin 2013 - Presentation "The Impact of All Data on Health...Health IT Summit Austin 2013 - Presentation "The Impact of All Data on Health...
Health IT Summit Austin 2013 - Presentation "The Impact of All Data on Health...Health IT Conference – iHT2
 
Patient-Focused Data Science: Machine Learning for Complex Diseases (AIM203-S...
Patient-Focused Data Science: Machine Learning for Complex Diseases (AIM203-S...Patient-Focused Data Science: Machine Learning for Complex Diseases (AIM203-S...
Patient-Focused Data Science: Machine Learning for Complex Diseases (AIM203-S...Amazon Web Services
 

Similaire à Machine Learning in Healthcare: Detecting Diseases & Optimizing Clinical Trials (20)

Ai applied in healthcare
Ai applied in healthcareAi applied in healthcare
Ai applied in healthcare
 
Data science in healthcare-Assignment 2.pptx
Data science in healthcare-Assignment 2.pptxData science in healthcare-Assignment 2.pptx
Data science in healthcare-Assignment 2.pptx
 
Sun==big data analytics for health care
Sun==big data analytics for health careSun==big data analytics for health care
Sun==big data analytics for health care
 
ppt for data science slideshare.pptx
ppt for data science slideshare.pptxppt for data science slideshare.pptx
ppt for data science slideshare.pptx
 
Nurses and Data Science
Nurses and Data ScienceNurses and Data Science
Nurses and Data Science
 
Big Data Provides Opportunities, Challenges and a Better Future in Health and...
Big Data Provides Opportunities, Challenges and a Better Future in Health and...Big Data Provides Opportunities, Challenges and a Better Future in Health and...
Big Data Provides Opportunities, Challenges and a Better Future in Health and...
 
Data science in healthcare-Assignment 2.pptx
Data science in healthcare-Assignment 2.pptxData science in healthcare-Assignment 2.pptx
Data science in healthcare-Assignment 2.pptx
 
Meeting healthcare challenges: what are the challenges and what is the role o...
Meeting healthcare challenges: what are the challenges and what is the role o...Meeting healthcare challenges: what are the challenges and what is the role o...
Meeting healthcare challenges: what are the challenges and what is the role o...
 
A Systematic Review Of Type-2 Diabetes By Hadoop Map-Reduce
A Systematic Review Of Type-2 Diabetes By Hadoop Map-ReduceA Systematic Review Of Type-2 Diabetes By Hadoop Map-Reduce
A Systematic Review Of Type-2 Diabetes By Hadoop Map-Reduce
 
The Philosophy, Psychology, and Technology of Data in Healthcare
The Philosophy, Psychology, and  Technology of Data in HealthcareThe Philosophy, Psychology, and  Technology of Data in Healthcare
The Philosophy, Psychology, and Technology of Data in Healthcare
 
HealthPanel Concept pitchdeck
HealthPanel Concept pitchdeckHealthPanel Concept pitchdeck
HealthPanel Concept pitchdeck
 
The future interface of mental health with information technology: high touch...
The future interface of mental health with information technology: high touch...The future interface of mental health with information technology: high touch...
The future interface of mental health with information technology: high touch...
 
Benefits of Big Data in Health Care A Revolution
Benefits of Big Data in Health Care A RevolutionBenefits of Big Data in Health Care A Revolution
Benefits of Big Data in Health Care A Revolution
 
A BIG DATA REVOLUTION IN HEALTH CARE SECTOR: OPPORTUNITIES, CHALLENGES AND TE...
A BIG DATA REVOLUTION IN HEALTH CARE SECTOR: OPPORTUNITIES, CHALLENGES AND TE...A BIG DATA REVOLUTION IN HEALTH CARE SECTOR: OPPORTUNITIES, CHALLENGES AND TE...
A BIG DATA REVOLUTION IN HEALTH CARE SECTOR: OPPORTUNITIES, CHALLENGES AND TE...
 
From personal health data to a personalized advice
From personal health data to a personalized adviceFrom personal health data to a personalized advice
From personal health data to a personalized advice
 
THE LARGE DATA DEMO - ONE MODEL
THE LARGE DATA DEMO - ONE MODELTHE LARGE DATA DEMO - ONE MODEL
THE LARGE DATA DEMO - ONE MODEL
 
Health IT Summit Austin 2013 - Presentation "The Impact of All Data on Health...
Health IT Summit Austin 2013 - Presentation "The Impact of All Data on Health...Health IT Summit Austin 2013 - Presentation "The Impact of All Data on Health...
Health IT Summit Austin 2013 - Presentation "The Impact of All Data on Health...
 
Predictive Health Population Analytics
Predictive Health Population AnalyticsPredictive Health Population Analytics
Predictive Health Population Analytics
 
Patient-Focused Data Science: Machine Learning for Complex Diseases (AIM203-S...
Patient-Focused Data Science: Machine Learning for Complex Diseases (AIM203-S...Patient-Focused Data Science: Machine Learning for Complex Diseases (AIM203-S...
Patient-Focused Data Science: Machine Learning for Complex Diseases (AIM203-S...
 
Pavia wsp october 2011
Pavia wsp october 2011Pavia wsp october 2011
Pavia wsp october 2011
 

Plus de Big Data Colombia

An introduction to deep reinforcement learning
An introduction to deep reinforcement learningAn introduction to deep reinforcement learning
An introduction to deep reinforcement learningBig Data Colombia
 
Whose Balance Sheet is this? Neural Networks for Banks’ Pattern Recognition
Whose Balance Sheet is this? Neural Networks for Banks’ Pattern RecognitionWhose Balance Sheet is this? Neural Networks for Banks’ Pattern Recognition
Whose Balance Sheet is this? Neural Networks for Banks’ Pattern RecognitionBig Data Colombia
 
Analysis of your own Facebook friends’ data structure through graphs
Analysis of your own Facebook friends’ data structure through graphsAnalysis of your own Facebook friends’ data structure through graphs
Analysis of your own Facebook friends’ data structure through graphsBig Data Colombia
 
Lo datos cuentan su historia
Lo datos cuentan su historiaLo datos cuentan su historia
Lo datos cuentan su historiaBig Data Colombia
 
Entornos Naturalmente Inteligentes
Entornos Naturalmente InteligentesEntornos Naturalmente Inteligentes
Entornos Naturalmente InteligentesBig Data Colombia
 
Modelamiento predictivo y medicina
Modelamiento predictivo y medicinaModelamiento predictivo y medicina
Modelamiento predictivo y medicinaBig Data Colombia
 
Ayudando a los Viajeros usando 500 millones de Reseñas Hoteleras al Mes
Ayudando a los Viajeros usando 500 millones de Reseñas Hoteleras al MesAyudando a los Viajeros usando 500 millones de Reseñas Hoteleras al Mes
Ayudando a los Viajeros usando 500 millones de Reseñas Hoteleras al MesBig Data Colombia
 
Deep learning: el renacimiento de las redes neuronales
Deep learning: el renacimiento de las redes neuronalesDeep learning: el renacimiento de las redes neuronales
Deep learning: el renacimiento de las redes neuronalesBig Data Colombia
 
Cloud computing: Trends and Challenges
Cloud computing: Trends and ChallengesCloud computing: Trends and Challenges
Cloud computing: Trends and ChallengesBig Data Colombia
 
Kaggle: Coupon Purchase Prediction
Kaggle: Coupon Purchase PredictionKaggle: Coupon Purchase Prediction
Kaggle: Coupon Purchase PredictionBig Data Colombia
 
Introducción al Datawarehousing
Introducción al DatawarehousingIntroducción al Datawarehousing
Introducción al DatawarehousingBig Data Colombia
 
Análisis Explotatorio de Datos: Dejad que la data hable.
Análisis Explotatorio de Datos: Dejad que la data hable.Análisis Explotatorio de Datos: Dejad que la data hable.
Análisis Explotatorio de Datos: Dejad que la data hable.Big Data Colombia
 
Salud, dinero, amor y big data
Salud, dinero, amor y big dataSalud, dinero, amor y big data
Salud, dinero, amor y big dataBig Data Colombia
 
Business Analytics: ¡La culpa es del BIG data!
Business Analytics: ¡La culpa es del BIG data!Business Analytics: ¡La culpa es del BIG data!
Business Analytics: ¡La culpa es del BIG data!Big Data Colombia
 

Machine Learning in Healthcare: Detecting Diseases & Optimizing Clinical Trials

  • 2. WHO AM I? Javier Samir Rey. Systems engineer. Machine learning engineer at Direktio. Co-organizer of the Big Data Colombia meetup. jreyro@gmail.com, javier-samir-rey-7104195, github/jasam
  • 3. “Work on Stuff that Matters” Tim O'Reilly
  • 4. Source: United Nations - Sustainable Development Goals SUSTAINABLE DEVELOPMENT GOALS
  • 5. 3 - GOOD HEALTH AND WELL-BEING “Ensure healthy lives and promote well-being for all at all ages.” ● Reproductive, maternal and child health. ● Communicable, non-communicable and environmental diseases. ● Health risk reduction and management. ● Universal health coverage.
  • 6. COMMUNICABLE AND NONCOMMUNICABLE DISEASES CDs: The incidence of major infectious diseases: HIV, tuberculosis and malaria. Almost half the world’s population is at risk of malaria. 889,000 people died from infectious diseases caused largely by faecal contamination of water. NCDs: 40 million global deaths were due to NCDs. 48% of deaths were premature. 75% of premature deaths were caused by cardiovascular disease, cancer, diabetes and chronic respiratory disease. 80% of heart disease, stroke and diabetes can be prevented. Source: United Nations
  • 7. NONCOMMUNICABLE DISEASES Noncommunicable diseases (NCDs), also known as chronic diseases, tend to be of long duration and are the result of a combination of genetic, physiological, environmental and behavioural factors. Detection, screening and treatment of NCDs, as well as palliative care, are key components of the response to NCDs. An important way to control NCDs is to focus on reducing the risk factors associated with these diseases. Low-cost solutions exist for governments and other stakeholders to reduce the common modifiable risk factors. Monitoring progress and trends of NCDs and their risk is important for guiding policy and priorities.
  • 8. IMPACT Decreased quality of life of the human being. In low-resource settings, health-care costs for NCDs quickly drain household resources. The exorbitant costs of NCDs, including often lengthy and expensive treatment and loss of breadwinners, force millions of people into poverty annually and stifle.
  • 9. COLOMBIA - NONCOMMUNICABLE DISEASES Hypertension and diabetes mellitus are major precursors of: ischemic cardiovascular disease, cerebrovascular events, end-stage renal disease and death. Prevalence: hypertension 6.5%, diabetes 1.9%. 20% of the population consumes 80% of the resources. Source: Cuenta de Alto Costo
  • 10. SOME REVIEW Data is quickly emerging as the greatest asset of the healthcare industry. The trend in our industry is to drive many decisions supported by data. It is a walk of maturity with the real gold nuggets coming in Analytics 3.0 and beyond. This will not be solved with a product or purchased off the shelf. Big Data needs to be part of the DNA of an organization. -- Chris Belmont, MBA, Vice President and Chief Information Officer, MD Anderson Cancer Center
  • 11. WANAMAKER’S QUESTION “I know that 50% of my advertising is wasted, I just don’t know which half.” The healthcare industry is now awash in data in a way that it has never been before: biological, gene expression, sensors, DNA sequences, EHRs, drug and medical data. We have entered a new era in which we can work on massive datasets and combine them effectively. We can start asking the important questions, the Wanamaker questions! The opportunities are huge! Source: Wikipedia
  • 14. BUSINESS UNDERSTANDING DATA VALUE PYRAMID Source: datasyndrome
  • 15. AGILE DATA SCIENCE MANIFESTO Source: agile data science 2.0 1. Iterate, iterate, iterate: tables, charts, reports, predictions - roadmap projects. 2. Ship intermediate output. Even failed experiments have output. 3. Prototype experiments over implementing tasks. 4. Integrate the tyrannical opinion of data in product management.
  • 16. AGILE DATA SCIENCE MANIFESTO Source: agile data science 2.0 5. Climb up and down the data-value pyramid as we work. 6. Discover and pursue the critical path to a killer product. 7. Get meta. Describe the process, not just the end-state.
  • 17. CRISP-DM METHODOLOGY (Cross Industry Standard Process for Data Mining) Source: Wikipedia Before defining your framework (agile is one possibility), first define your culture and team.
  • 18. BUSINESS UNDERSTANDING It is one of the most important concepts of data science! 1. It is vital to understand the problem to be solved and its context. 2. Often recasting the problem and designing a solution is an iterative process of discovery. 3. The Business Understanding stage represents a part of the craft where the analysts’ creativity plays a large role.
  • 19. BUSINESS UNDERSTANDING It is one of the most important concepts of data science! 4. The key to great success is creative problem formulation: how to cast the business problem as one or more data science problems (subproblems). 5. What is the expected value? 6. The team’s help is really important; we are not alone.
  • 20. BUSINESS UNDERSTANDING - HEALTH Source: mckinsey and company Big data has the highest potential in 3 areas: ● Precision medicine ● Diagnosing diseases ● Optimizing clinical trials
  • 21. BUSINESS UNDERSTANDING - HEALTH ACTORS ● Clinicians, domain experts and financial analysts ● Managers, IT developers, consultants and vendors ● Policy makers ● Patients and consumers ● Executives and line-of-business leaders ● Researchers and academia ● Health institutions ● Society Build your strategy together!
  • 22. BUSINESS UNDERSTANDING - HEALTH CHRONIC CONDITIONS CARE MODEL Source: Cuidado das Condições Crônicas na Atenção Primária à Saúde Inspired by the pyramid of Kaiser Permanente!
  • 23. DATA UNDERSTANDING 1. Solving the business problem is the goal. 2. It is important to understand the strengths and limitations of the data because rarely is there an exact match with the problem. 3. Some data will be available virtually for free while other data will require effort to obtain. 4. Cleaning different sources and matching them into a single record can itself be a complicated analytics problem.
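Point 4 of slide 23 (matching several sources into one record) can be illustrated with a minimal sketch. It assumes, optimistically, that every source carries a numeric patient ID that differs only in formatting; the source names `primary_care` and `medication`, the field names and the helper `merge_sources` are hypothetical and not taken from the talk.

```python
from collections import defaultdict


def merge_sources(*sources):
    """Combine (source_name, records) pairs into one record per patient ID.

    Assumption: IDs are numeric and differ only in formatting
    (e.g. "001" versus 1). Real record linkage is usually much harder.
    """
    merged = defaultdict(dict)
    for name, records in sources:
        for rec in records:
            key = str(int(rec["id"]))  # normalize "001" and 1 to "1"
            for field, value in rec.items():
                if field != "id":
                    # Prefix each field with its source to avoid name clashes.
                    merged[key][f"{name}.{field}"] = value
    return dict(merged)


# Hypothetical toy data for illustration only.
primary_care = [{"id": "001", "age": 15}, {"id": "002", "age": 20}]
medication = [{"id": 1, "med": "Y"}, {"id": 3, "med": "N"}]

patients = merge_sources(("primary_care", primary_care), ("medication", medication))
```

Patients that appear in only one source still get a record, which makes gaps between sources visible instead of silently dropping them.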
  • 24. DATA UNDERSTANDING 5. Remember all the V’s of data: volume, velocity, variety, variability, veracity, visualization and value. 6. Design and build a data engineering team that supports your data requirements. 7. Data governance: DAMA (Data Management Association International).
  • 25. DATA UNDERSTANDING - HEALTH SOURCES FOR DATA IN HEALTHCARE (healthcare data: examples) ● Images: radiographic images, MRIs, ultrasounds and nuclear imaging ● Un-/semi-structured: clinical narratives, physician notes, Level 2,3 OMICS, summaries, pathology reports ● Streaming: bedside and remote monitors, implants, fitness bands, smart watches and smart phones ● Social media: Facebook, Twitter, web forums and communities ● Structured data: all claims, EHR, ERP and other information systems ● Dark data: server logs, application error logs, account information, emails and documents
  • 26. DATA UNDERSTANDING - HEALTH Source: The Rise of Consumer Health Wearables
  • 27. DATA UNDERSTANDING - HEALTH Source: mckinsey and company
  • 28. DATA UNDERSTANDING - HEALTH Source: mckinsey and company
  • 29. DATA PREPARATION 1. The analytic technologies can be powerful, but they impose certain requirements on the data they use (data table). 2. A typical example of data preparation is converting data to tabular format. 3. Feature engineering. 4. Technology is important, but it is not the main point.
  • 30. DATA PREPARATION 4. Defining the variables is one of the main points at which human creativity, common sense and business knowledge come into play. 5. Document your time process. 6. Think about process optimization (Big O). 7. Little blocks of processing: plan for scale.
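The "little blocks of processing" idea on slide 30 can be sketched as follows. This is only one possible reading of the slide: the helper names `chunks` and `mean_weight`, the `weight` field and the chunk size are assumptions made for illustration.

```python
from itertools import islice


def chunks(iterable, size):
    """Yield successive lists of at most `size` items from any iterable."""
    it = iter(iterable)
    while True:
        block = list(islice(it, size))
        if not block:
            return
        yield block


def mean_weight(rows, chunk_size=1000):
    """Compute a mean in one pass, holding only one block in memory at a time."""
    total, count = 0.0, 0
    for block in chunks(rows, chunk_size):
        total += sum(row["weight"] for row in block)
        count += len(block)
    return total / count if count else float("nan")
```

Because each block only updates running totals, the work stays linear in the number of rows and the same function can be fed from a file reader or a database cursor.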
  • 31. DATA PREPARATION - COMPUTING BOUND Source: hadoop in the enterprise: architecture
  • 32. DATA PREPARATION - DATA ENGINEERING ● Pair review ● Modularize your project ● Create professional projects (world-class solutions) using versioning, standards, the right tools and unit tests.
  • 33. DATA PREPARATION - TABULAR FORM - THE GOAL Sources: primary care, secondary care, medication and other data (a lot of types). Putting the data together is challenging (data engineering). After a very expensive process: feature engineering. Data points are the key (N*M)! N features, M observations.
        ID  age  med  height  weight  BMI   diet
        1   15   Y    168     60      21.3  Y
        2   20   Y    185     80      23.4  Y
        3   65   N    192     90      24.4  N
        4   48   N    172     85      28.7  N
        5   45   Y    185     79      23.1  N
        6   79   N    182     71      21.4  Y
        7   22   Y    186     79      22.8  Y
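The BMI column in the table on slide 33 is a small example of feature engineering: it is derived from height and weight. The sketch below assumes height is in centimetres and weight in kilograms (the slide does not state units, but the BMI values agree with that reading); the function name `bmi` and the dictionary keys are illustrative.

```python
def bmi(weight_kg, height_cm):
    """Body mass index: weight in kg divided by height in metres squared."""
    height_m = height_cm / 100
    return round(weight_kg / height_m ** 2, 1)


# (ID, height, weight) taken from the slide's table; units are assumed.
raw = [(1, 168, 60), (2, 185, 80), (3, 192, 90), (4, 172, 85),
       (5, 185, 79), (6, 182, 71), (7, 186, 79)]

# Build one row per observation and add the engineered feature.
rows = [{"id": i, "height_cm": h, "weight_kg": w, "bmi": bmi(w, h)}
        for i, h, w in raw]
```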
  • 34. DATA MODELING Generally speaking, a model is a simplified representation of reality created to serve a purpose. In data science, a predictive model is a formula for estimating the unknown value of interest: the target. The formula could be mathematical, or it could be a logical statement such as a rule. Often it is a hybrid of the two. The creation of models from data is known as model induction. Induction is a term from philosophy that refers to generalizing from specific cases to general rules (or laws, or truths). Many names for the same things! Source: Data Science for Business
  • 35. DATA MODELING - BEST PRACTICES 1. Ask a specific question. Remember you are solving a business problem, not a math problem. 2. Start simple; start with the minimal set of data. 3. Try many algorithms, but remember that data is more important than the exact algorithm: improve your features. 4. Treat your data with suspicion; understand its idiosyncrasies. 5. Normalize your inputs.
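"Normalize your inputs" on slide 35 does not say which scheme to use. One common choice is z-score standardization, sketched here with the standard library only; the function name `standardize` is an assumption, and min-max scaling would be an equally valid reading of the slide.

```python
from statistics import mean, pstdev


def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Heights from the tabular example, assumed to be in centimetres.
heights = [168, 185, 192, 172, 185, 182, 186]
scaled_heights = standardize(heights)
```

In a real pipeline the mean and standard deviation would normally be estimated on the training data only and then reused for validation and production inputs.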
  • 36. DATA MODELING - BEST PRACTICES ● Validate your model (validation set and clinical validation) ● Build a baseline benchmark; don’t be afraid to launch your product without ML ● Set up a feedback loop ● Healthcare doesn’t trust black boxes ● Correlation is not causation ● Monitor ongoing performance ● Don’t be fooled by “accuracy” ● Labeled data ● Use medical support libraries, e.g. PubMed, Cochrane, American Heart Association, Diabetes UK and so on.
  • 37. DATA MODELING - BLUEPRINT Source: scikit-learn
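The warning "Don’t be fooled by accuracy" on slide 36 matters most when the condition is rare. The sketch below uses made-up labels (2 positive cases out of 100, not data from the talk) and a trivial model that always predicts "healthy"; the function name `metrics` is illustrative.

```python
def metrics(y_true, y_pred):
    """Return accuracy, recall and precision for binary labels (1 = disease)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # By convention here, undefined ratios are reported as 0.0.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
    }


# Hypothetical, heavily imbalanced data: 2 sick patients, 98 healthy ones.
y_true = [1] * 2 + [0] * 98
y_pred = [0] * 100  # a "model" that never predicts disease

result = metrics(y_true, y_pred)
```

Here the accuracy is 0.98 while the recall is 0.0: the model looks excellent on paper yet misses every patient it was built to find.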
  • 38. DATA MODELING - TRADE OFF Source: oreilly strata 2013
  • 39. DATA MODELING - TECHNOLOGY
  • 40. DATA MODELING - TOOLS Reproducible research is great!
  • 41. DATA MODELING - END OF THE HERO’S JOURNEY!
  • 42. DATA MODELING - USE CASE - ELSEVIER RISK PREDICTIONS: WHICH DISEASE WILL YOU LIKELY GET WITHIN 4 YEARS? 1600+ models integrated into a single information system. Source: Elsevier Medical Graph - slideshare
  • 43. DATA MODELING - USE CASE - ELSEVIER Source: Elsevier Medical Graph - slideshare Physicians want explanations; otherwise they will not trust the predictions. Typical best-in-class classification methods (deep learning, random forest) do not yet deliver explainable models. In practice, you need to save the users processing time, not add to it. Visualization is key. Building a classification model using open source tools is simple. Scaling input data size is also manageable. Building 1000+ models is complex. Open source tools have failures (as have proprietary tools). Debugging can be a nightmare. Implementing, applying and maintaining a security framework to keep personal health information secure is a substantial effort.