Yves Caseau - Machine Learning for Self Tracking – February 2019
Machine Learning Heuristics for Short Time
Series Forecasting with Quantified Self Data
Yves Caseau
National Academy of Technologies
Self-Tracking and Knomee Mobile App
 Knomee is a self-tracking mobile app for iOS (one of many
thousands)
 Knomee motto: « self-tracking with sense »
 Data science applied to self tracking
 Self-tracking apps generate time series
 One or many (up to 4) data points collected over a period of
time
 Data is either self-declared (the user picks a value in a preset
range) or automatically imported from a connected device
(iPhone sensors, Apple Watch, or any HealthKit-compatible
device such as a Withings scale)
 Data files are accessible on:
https://github.com/ycaseau/KnomeeQuest/tree/master/data
 20 samples
 Ranging from 40 to 220 measures (x 4)
Quests : Causal Diagrams are proposed by the user
 Self-tracking is organized around causal diagrams
 A quest is made of a target tracker and up to three
factor trackers
 The user makes the hypothesis that the factors may
contribute to the target
 Using Judea Pearl’s notation, we look for:
 P(X | do(Y)) : impact of doing Y on X
 Detect causality through active experiments
 Correlation is not enough
 A quest is a hypothesis; not all quests are meaningful
 Factor causality is tricky (e.g. coffee as a symptom)
 How to tell if the effort on factors is « worth it » ?
Impact on the target
 Key property of self-tracking data:
some input is purely random
{quest:ENERGY, icloud:true,
energy:{
type:2, more:true,
min:1, max:6, target:4,
labels:[crisis, sleepy, lapses,
normal, energetic, hyper],},
sleep:{
type:7, more:true,
min:4, max:9, target:7,},
steps:{
type:4, more:true,
min:0, max:19000,
target:7000,},
weight:{
type:5, more:false,
min:75, max:82, target:78,},
}
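The quest structure described above (one target tracker plus up to three factor trackers, each with a preset range and a target value) can be sketched as a small data model. This is only an illustration: the class names `Tracker` and `Quest`, the field names `lo` and `hi`, and the `normalize` helper are hypothetical and are not taken from the Knomee code; the `type` and `more` fields of the configuration are left out because their meaning is not stated here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Tracker:
    """One tracked quantity with its preset range and target (hypothetical model)."""
    name: str
    lo: float      # "min" in the quest configuration
    hi: float      # "max" in the quest configuration
    target: float

    def normalize(self, value: float) -> float:
        """Map a raw measure to [0, 1] using the preset range (an assumption)."""
        clipped = max(self.lo, min(self.hi, value))
        return (clipped - self.lo) / (self.hi - self.lo)


@dataclass
class Quest:
    """A causal hypothesis: the factors may contribute to the target."""
    name: str
    target: Tracker
    factors: List[Tracker] = field(default_factory=list)

    def __post_init__(self) -> None:
        # The slides state that a quest has up to three factor trackers.
        if len(self.factors) > 3:
            raise ValueError("a quest has at most three factor trackers")


# The ENERGY quest from the configuration above.
energy_quest = Quest(
    name="ENERGY",
    target=Tracker("energy", lo=1, hi=6, target=4),
    factors=[
        Tracker("sleep", lo=4, hi=9, target=7),
        Tracker("steps", lo=0, hi=19000, target=7000),
        Tracker("weight", lo=75, hi=82, target=78),
    ],
)
```

Normalizing every tracker to a common [0, 1] scale is one plausible way to make distances comparable across trackers; whether the app does this is not stated in the slides.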
Short Time-Series Forecasting
 Our goal in this talk : how to forecast values from self-tracking data ?
 Forecasting gives a possible clue about the value of the causal hypothesis
(Granger causality)
 We search for a robust method that does not break with random noise
 Measuring success: iterative training protocol
 For i in (2N/3 .. N), forecast TS[i] from (TS[1], …, TS[i - 1])
– Apply forecast to time[i]
– Measure average distance to real value TS[i]
– Compare to « average » performance
 Realistic simulation of what happens in the app
 Why it is hard:
 short samples (small data)
 mixed random inputs
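The iterative training protocol above can be sketched as a walk-forward evaluation. This is a minimal illustration under stated assumptions, not the author's code: the function names are hypothetical, the series is 0-indexed, and "distance" is taken to be the absolute difference between the forecast and the real value. The `mean_forecaster` plays the role of the « average » baseline that forecasts are compared to.

```python
from statistics import mean
from typing import Callable, Sequence

# A forecaster maps the history TS[0..i-1] to a forecast for TS[i].
Forecaster = Callable[[Sequence[float]], float]


def mean_forecaster(history: Sequence[float]) -> float:
    """Baseline: forecast the average of everything seen so far."""
    return mean(history)


def last_value_forecaster(history: Sequence[float]) -> float:
    """Naive alternative: repeat the most recent measure."""
    return history[-1]


def walk_forward_error(ts: Sequence[float], forecaster: Forecaster) -> float:
    """Average absolute forecast error over the last third of the series.

    For each i in the last third, the forecaster only sees ts[:i],
    which mimics what the app knows when a new measure arrives.
    """
    n = len(ts)
    if n < 2:
        raise ValueError("need at least two measures")
    start = max(1, (2 * n) // 3)  # keep at least one point of history
    errors = [abs(forecaster(ts[:i]) - ts[i]) for i in range(start, n)]
    return mean(errors)
```

A candidate method is then judged by comparing `walk_forward_error(ts, candidate)` with `walk_forward_error(ts, mean_forecaster)` on the same series.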
Classical Methods yield poor results
 Three classical ML algorithms, trained to
minimize distance, using implicit time
features and factors
 Linear Regression
 K-means Clustering (10 – 15 groups)
 ARMA (AutoRegressive Moving Average)
 Forecasting results are disappointing
 The difficulty is not a surprise: we are
looking to extract a small amount of
information, only when present
 Improving a few % over average is the best
we can expect
 Overfitting very easily offsets the forecasting
gain
                     Linear Regression   K-means   ARMA
forecasting          18.34%              19.5%     18.9%
average              17.5%               17.5%     17.5%
Distance (squares)   0.655               0.81      0.525
[Figure: variation of the target for a “good quest” versus a “poor quest”; segment labels: linked to factors, linked to non-collected factors, random noise]
A Term-Algebra of Heuristics Combinations
 Heuristic toolbox
 MovingAverage – MA(k,discount)
 Trend (time linear regression)
 Weekly and Hourly patterns
 Factor regression with explicit delay
 CumSum (cumulative sum of differences to average)
 Threshold regression with delay
 Combined through a linear algebra
 Each term is a weighted combination of a few heuristics
 Some other heuristics provide improvement with some quests but are left aside for lack
of robustness
 Cycle analysis (detecting “biorhythms”)
 Split (constant until date X, then T)
useful when something changed.
 And(t1,t2) : Boolean conjunction of two factors
Mix[0.97](
T[2.25-2.02/-1.00],
wAvg["target"](10, 1.00))
+ Cor[0.04]("track2" +16)
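Three of the toolbox heuristics can be sketched in a few lines to make the idea concrete. This is an illustration only: the function names are hypothetical, and the geometric discount used in `moving_average` (weight `discount ** age`) is an assumption about what the `discount` parameter of MA(k, discount) means, not a detail confirmed by the slides.

```python
from typing import List, Sequence


def moving_average(values: Sequence[float], k: int, discount: float = 1.0) -> float:
    """Weighted average of the last k values; older values weigh discount**age."""
    if k < 1:
        raise ValueError("k must be at least 1")
    recent = values[-k:]
    # The newest value has age 0, the oldest has age len(recent) - 1.
    weights = [discount ** age for age in range(len(recent) - 1, -1, -1)]
    return sum(w * v for w, v in zip(weights, recent)) / sum(weights)


def trend(times: Sequence[float], values: Sequence[float], t_next: float) -> float:
    """Least-squares line over (time, value), evaluated at t_next."""
    n = len(times)
    t_mean = sum(times) / n
    v_mean = sum(values) / n
    var_t = sum((t - t_mean) ** 2 for t in times)
    if var_t == 0:
        return v_mean  # all measures share one timestamp: no slope to fit
    slope = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values)) / var_t
    return v_mean + slope * (t_next - t_mean)


def cum_sum(values: Sequence[float]) -> List[float]:
    """Cumulative sum of the differences to the series average."""
    avg = sum(values) / len(values)
    total = 0.0
    out = []
    for v in values:
        total += v - avg
        out.append(total)
    return out


def combine(weighted_forecasts: Sequence[tuple]) -> float:
    """Linear combination of (weight, forecast) pairs, as in an algebra term."""
    return sum(w * f for w, f in weighted_forecasts)
```

For example, `combine([(0.7, moving_average(ts, 10)), (0.3, trend(times, ts, t_next))])` mirrors the idea of a term being a weighted combination of a few heuristics.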
Distances and Regularization
Time-series operations are weighted
 The weight of each measure is proportional to the
distance to its next neighbor
 Spaced measures are more important than repeated
ones
« Triangular distance »
 The distance between two time series is the area
between the two curves
Regularization to avoid overfitting
 Principle: add a penalty to the distance that reduces
the overall standard deviation
 best formula for this data set
wDist(a,t) + max(0.0, stdev(a) – 0.02)
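The weighting and the regularized objective can be sketched as follows. Several points are assumptions rather than facts from the slides: `a` is read as the forecast series produced by an algebra term and `t` as the target series; the last measure, which has no next neighbor, reuses the previous gap; the weighted absolute difference is used as a simple stand-in for the area between the two curves; and the population standard deviation is used for `stdev`. All function names are hypothetical.

```python
from statistics import pstdev
from typing import List, Sequence


def neighbor_weights(times: Sequence[float]) -> List[float]:
    """Weights proportional to the time gap to the next measure, summing to 1."""
    if len(times) < 2:
        raise ValueError("need at least two measures")
    gaps = [t2 - t1 for t1, t2 in zip(times, times[1:])]
    gaps.append(gaps[-1])  # assumption: the last measure reuses the previous gap
    total = sum(gaps)
    return [g / total for g in gaps]


def weighted_distance(forecast: Sequence[float],
                      actual: Sequence[float],
                      times: Sequence[float]) -> float:
    """Gap-weighted mean absolute difference between two series."""
    weights = neighbor_weights(times)
    return sum(w * abs(f - a) for w, f, a in zip(weights, forecast, actual))


def regularized_distance(forecast: Sequence[float],
                         actual: Sequence[float],
                         times: Sequence[float],
                         slack: float = 0.02) -> float:
    """wDist(a, t) + max(0.0, stdev(a) - slack), with a = forecast."""
    penalty = max(0.0, pstdev(forecast) - slack)
    return weighted_distance(forecast, actual, times) + penalty
```

With this penalty, a term whose forecasts vary a lot is only preferred if the extra variation reduces the distance by more than it costs, which pushes the search toward flatter, more conservative forecasts.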
Randomized Incremental Algorithms
 Main algorithm is “Randomized Optimization” (RandOpt)
 Create n random algebra terms
 Combination of greedy heuristics (create the best possible term)
 And randomization (coefficients / which sub-term to pick)
 Depth is controlled with a global parameter
 Optimized through local optimization
 Each parameter of the algebra sub-terms (i.e., coefficients, delays, etc.) is optimized one by one
 Hill-climbing local meta heuristics
 Three successive rounds
 This is used in an “incremental” mode:
 For each new measure
 Reuse previous best term, and improve through local optimization
 Run “RandOpt” (100 iterations)
 Keep best term
 What has not worked out so far
 Evolutionary (genetic algorithm with cross-over)
 Mutation (large neighborhood local optimization)
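The overall loop (random candidates, one-parameter-at-a-time hill climbing in three rounds, reuse of the previous best in incremental mode) can be sketched on a plain parameter vector. This is a deliberate simplification and not the author's RandOpt: real terms are trees of heuristics, whereas here a "term" is just a list of floats; the function names, the step-halving schedule, the sampling range [-1, 1], and the `max_steps` cap are all assumptions.

```python
import random
from typing import Callable, List, Optional, Sequence, Tuple

Loss = Callable[[Sequence[float]], float]


def hill_climb(params: Sequence[float], loss: Loss, step: float = 0.1,
               rounds: int = 3, max_steps: int = 1000) -> Tuple[List[float], float]:
    """Optimize parameters one by one; the step is halved after each round."""
    best = list(params)
    best_loss = loss(best)
    for _ in range(rounds):
        for i in range(len(best)):
            for delta in (step, -step):
                # Keep moving parameter i in this direction while it helps.
                for _ in range(max_steps):
                    candidate = list(best)
                    candidate[i] += delta
                    candidate_loss = loss(candidate)
                    if candidate_loss >= best_loss:
                        break
                    best, best_loss = candidate, candidate_loss
        step /= 2
    return best, best_loss


def rand_opt(loss: Loss, n_params: int, iterations: int = 100, seed: int = 0,
             start: Optional[Sequence[float]] = None) -> Tuple[List[float], float]:
    """Random candidates refined by hill climbing; keep the best one.

    In incremental mode, `start` is the best parameter vector found for the
    previous measure; it is improved first and then challenged by new
    random candidates.
    """
    rng = random.Random(seed)
    best: Optional[List[float]] = None
    best_loss = float("inf")
    if start is not None:
        best, best_loss = hill_climb(start, loss)
    for _ in range(iterations):
        candidate = [rng.uniform(-1.0, 1.0) for _ in range(n_params)]
        candidate, candidate_loss = hill_climb(candidate, loss)
        if candidate_loss < best_loss:
            best, best_loss = candidate, candidate_loss
    return best, best_loss
```

In the setting of the slides, `loss` would be the regularized distance of the term's forecasts to the target series, evaluated on the measures collected so far.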
Computational results
 Average forecast is 16.88% (control = average is 17.5%)
 Average square distance is 1.03 (worse than LR, ARMA or k-means) because of regularization
 Strong measures against overfitting (regularization, depth, # local opt loops + techniques)
Conclusion
Forecasting for self-tracking data is hard
We presented a reinforcement generative
machine learning approach that performs better than
most classical techniques
This is due to the complex nature of the data
 On (classical) sales time series, ARMA does better than the proposed approach
(close to LR)
 Open question : how to detect the “intrinsic quality” of the quest and change the
forecasting method / regularization parameters accordingly ?
 You can download the data and try your own approaches
Forecasting is used for two purposes in our mobile app:
 User experience : forecasting makes data entry faster + gives a sense of playfulness
 Granger Causality : when the forecasting score is “good”, this gives a sense of
plausibility to the causal diagram hypothesis (represented by the “quest”)
Contenu connexe

Similaire à Machine Learning for Self-Tracking

Implementation of Naive Bayesian Classifier and Ada-Boost Algorithm Using Mai...
Implementation of Naive Bayesian Classifier and Ada-Boost Algorithm Using Mai...Implementation of Naive Bayesian Classifier and Ada-Boost Algorithm Using Mai...
Implementation of Naive Bayesian Classifier and Ada-Boost Algorithm Using Mai...ijistjournal
 
Implementation of Naive Bayesian Classifier and Ada-Boost Algorithm Using Mai...
Implementation of Naive Bayesian Classifier and Ada-Boost Algorithm Using Mai...Implementation of Naive Bayesian Classifier and Ada-Boost Algorithm Using Mai...
Implementation of Naive Bayesian Classifier and Ada-Boost Algorithm Using Mai...ijistjournal
 
Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401butest
 
Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401butest
 
MEME – An Integrated Tool For Advanced Computational Experiments
MEME – An Integrated Tool For Advanced Computational ExperimentsMEME – An Integrated Tool For Advanced Computational Experiments
MEME – An Integrated Tool For Advanced Computational ExperimentsGIScRG
 
Classification of Machine Learning Algorithms
Classification of Machine Learning AlgorithmsClassification of Machine Learning Algorithms
Classification of Machine Learning AlgorithmsAM Publications
 
Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401butest
 
Sentiment Analysis using Naïve Bayes, CNN, SVM
Sentiment Analysis using Naïve Bayes, CNN, SVMSentiment Analysis using Naïve Bayes, CNN, SVM
Sentiment Analysis using Naïve Bayes, CNN, SVMIRJET Journal
 
Search Engines
Search EnginesSearch Engines
Search Enginesbutest
 
IRJET- Evaluation of Classification Algorithms with Solutions to Class Imbala...
IRJET- Evaluation of Classification Algorithms with Solutions to Class Imbala...IRJET- Evaluation of Classification Algorithms with Solutions to Class Imbala...
IRJET- Evaluation of Classification Algorithms with Solutions to Class Imbala...IRJET Journal
 
Cse 7th-sem-machine-learning-laboratory-csml1819
Cse 7th-sem-machine-learning-laboratory-csml1819Cse 7th-sem-machine-learning-laboratory-csml1819
Cse 7th-sem-machine-learning-laboratory-csml1819HODCSE21
 
Brief Tour of Machine Learning
Brief Tour of Machine LearningBrief Tour of Machine Learning
Brief Tour of Machine Learningbutest
 
IRJET- Stabilization of Black Cotton Soil using Rice Husk Ash and Lime
IRJET- Stabilization of Black Cotton Soil using Rice Husk Ash and LimeIRJET- Stabilization of Black Cotton Soil using Rice Husk Ash and Lime
IRJET- Stabilization of Black Cotton Soil using Rice Husk Ash and LimeIRJET Journal
 
IRJET- Student Placement Prediction using Machine Learning
IRJET- Student Placement Prediction using Machine LearningIRJET- Student Placement Prediction using Machine Learning
IRJET- Student Placement Prediction using Machine LearningIRJET Journal
 
Analytical study of feature extraction techniques in opinion mining
Analytical study of feature extraction techniques in opinion miningAnalytical study of feature extraction techniques in opinion mining
Analytical study of feature extraction techniques in opinion miningcsandit
 
ANALYTICAL STUDY OF FEATURE EXTRACTION TECHNIQUES IN OPINION MINING
ANALYTICAL STUDY OF FEATURE EXTRACTION TECHNIQUES IN OPINION MININGANALYTICAL STUDY OF FEATURE EXTRACTION TECHNIQUES IN OPINION MINING
ANALYTICAL STUDY OF FEATURE EXTRACTION TECHNIQUES IN OPINION MININGcsandit
 
Radial Basis Function Neural Network (RBFNN), Induction Motor, Vector control...
Radial Basis Function Neural Network (RBFNN), Induction Motor, Vector control...Radial Basis Function Neural Network (RBFNN), Induction Motor, Vector control...
Radial Basis Function Neural Network (RBFNN), Induction Motor, Vector control...cscpconf
 
The Development of Financial Information System and Business Intelligence Usi...
The Development of Financial Information System and Business Intelligence Usi...The Development of Financial Information System and Business Intelligence Usi...
The Development of Financial Information System and Business Intelligence Usi...IJERA Editor
 
Opinion mining framework using proposed RB-bayes model for text classication
Opinion mining framework using proposed RB-bayes model for text classicationOpinion mining framework using proposed RB-bayes model for text classication
Opinion mining framework using proposed RB-bayes model for text classicationIJECEIAES
 

Similaire à Machine Learning for Self-Tracking (20)

Implementation of Naive Bayesian Classifier and Ada-Boost Algorithm Using Mai...
Implementation of Naive Bayesian Classifier and Ada-Boost Algorithm Using Mai...Implementation of Naive Bayesian Classifier and Ada-Boost Algorithm Using Mai...
Implementation of Naive Bayesian Classifier and Ada-Boost Algorithm Using Mai...
 
Implementation of Naive Bayesian Classifier and Ada-Boost Algorithm Using Mai...
Implementation of Naive Bayesian Classifier and Ada-Boost Algorithm Using Mai...Implementation of Naive Bayesian Classifier and Ada-Boost Algorithm Using Mai...
Implementation of Naive Bayesian Classifier and Ada-Boost Algorithm Using Mai...
 
Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401
 
Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401
 
MEME – An Integrated Tool For Advanced Computational Experiments
MEME – An Integrated Tool For Advanced Computational ExperimentsMEME – An Integrated Tool For Advanced Computational Experiments
MEME – An Integrated Tool For Advanced Computational Experiments
 
presentationIDC - 14MAY2015
presentationIDC - 14MAY2015presentationIDC - 14MAY2015
presentationIDC - 14MAY2015
 
Classification of Machine Learning Algorithms
Classification of Machine Learning AlgorithmsClassification of Machine Learning Algorithms
Classification of Machine Learning Algorithms
 
Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401Machine Learning: Foundations Course Number 0368403401
Machine Learning: Foundations Course Number 0368403401
 
Sentiment Analysis using Naïve Bayes, CNN, SVM
Sentiment Analysis using Naïve Bayes, CNN, SVMSentiment Analysis using Naïve Bayes, CNN, SVM
Sentiment Analysis using Naïve Bayes, CNN, SVM
 
Search Engines
Search EnginesSearch Engines
Search Engines
 
IRJET- Evaluation of Classification Algorithms with Solutions to Class Imbala...
IRJET- Evaluation of Classification Algorithms with Solutions to Class Imbala...IRJET- Evaluation of Classification Algorithms with Solutions to Class Imbala...
IRJET- Evaluation of Classification Algorithms with Solutions to Class Imbala...
 
Cse 7th-sem-machine-learning-laboratory-csml1819
Cse 7th-sem-machine-learning-laboratory-csml1819Cse 7th-sem-machine-learning-laboratory-csml1819
Cse 7th-sem-machine-learning-laboratory-csml1819
 
Brief Tour of Machine Learning
Brief Tour of Machine LearningBrief Tour of Machine Learning
Brief Tour of Machine Learning
 
IRJET- Stabilization of Black Cotton Soil using Rice Husk Ash and Lime
IRJET- Stabilization of Black Cotton Soil using Rice Husk Ash and LimeIRJET- Stabilization of Black Cotton Soil using Rice Husk Ash and Lime
IRJET- Stabilization of Black Cotton Soil using Rice Husk Ash and Lime
 
IRJET- Student Placement Prediction using Machine Learning
IRJET- Student Placement Prediction using Machine LearningIRJET- Student Placement Prediction using Machine Learning
IRJET- Student Placement Prediction using Machine Learning
 
Analytical study of feature extraction techniques in opinion mining
Analytical study of feature extraction techniques in opinion miningAnalytical study of feature extraction techniques in opinion mining
Analytical study of feature extraction techniques in opinion mining
 
ANALYTICAL STUDY OF FEATURE EXTRACTION TECHNIQUES IN OPINION MINING
ANALYTICAL STUDY OF FEATURE EXTRACTION TECHNIQUES IN OPINION MININGANALYTICAL STUDY OF FEATURE EXTRACTION TECHNIQUES IN OPINION MINING
ANALYTICAL STUDY OF FEATURE EXTRACTION TECHNIQUES IN OPINION MINING
 
Radial Basis Function Neural Network (RBFNN), Induction Motor, Vector control...
Radial Basis Function Neural Network (RBFNN), Induction Motor, Vector control...Radial Basis Function Neural Network (RBFNN), Induction Motor, Vector control...
Radial Basis Function Neural Network (RBFNN), Induction Motor, Vector control...
 
The Development of Financial Information System and Business Intelligence Usi...
The Development of Financial Information System and Business Intelligence Usi...The Development of Financial Information System and Business Intelligence Usi...
The Development of Financial Information System and Business Intelligence Usi...
 
Opinion mining framework using proposed RB-bayes model for text classication
Opinion mining framework using proposed RB-bayes model for text classicationOpinion mining framework using proposed RB-bayes model for text classication
Opinion mining framework using proposed RB-bayes model for text classication
 

Plus de Yves Caseau

DataAquitaine February 2022
DataAquitaine February 2022DataAquitaine February 2022
DataAquitaine February 2022Yves Caseau
 
Global warming dynamic gamesv0.3
Global warming dynamic gamesv0.3Global warming dynamic gamesv0.3
Global warming dynamic gamesv0.3Yves Caseau
 
Information Systems for Digital Transformation
Information Systems for Digital TransformationInformation Systems for Digital Transformation
Information Systems for Digital TransformationYves Caseau
 
Lean from the guts
Lean from the gutsLean from the guts
Lean from the gutsYves Caseau
 
Taking advantageofai july2018
Taking advantageofai july2018Taking advantageofai july2018
Taking advantageofai july2018Yves Caseau
 
Software Pitch 2018
Software Pitch 2018Software Pitch 2018
Software Pitch 2018Yves Caseau
 
Intelligence Artificielle - Journée MEDEF & AFIA
Intelligence Artificielle - Journée MEDEF & AFIAIntelligence Artificielle - Journée MEDEF & AFIA
Intelligence Artificielle - Journée MEDEF & AFIAYves Caseau
 
Big data, Behavioral Change and IOT Architecture
Big data, Behavioral Change and IOT ArchitectureBig data, Behavioral Change and IOT Architecture
Big data, Behavioral Change and IOT ArchitectureYves Caseau
 
XEBICON Public November 2015
XEBICON Public November 2015XEBICON Public November 2015
XEBICON Public November 2015Yves Caseau
 
Smart selfnovember2013
Smart selfnovember2013Smart selfnovember2013
Smart selfnovember2013Yves Caseau
 
Management socialnetworksfeb2012
Management socialnetworksfeb2012Management socialnetworksfeb2012
Management socialnetworksfeb2012Yves Caseau
 
Google socialnetworksmarch08
Google socialnetworksmarch08Google socialnetworksmarch08
Google socialnetworksmarch08Yves Caseau
 
Managing Business Processes Communication and Performance
Managing Business Processes Communication and Performance Managing Business Processes Communication and Performance
Managing Business Processes Communication and Performance Yves Caseau
 
Smart homeamsterdamoctober2013
Smart homeamsterdamoctober2013Smart homeamsterdamoctober2013
Smart homeamsterdamoctober2013Yves Caseau
 
Entreprise troispointzeropublicjan2015
Entreprise troispointzeropublicjan2015Entreprise troispointzeropublicjan2015
Entreprise troispointzeropublicjan2015Yves Caseau
 
The European CIO Conference - November 27th, 2014
The European CIO Conference - November 27th, 2014The European CIO Conference - November 27th, 2014
The European CIO Conference - November 27th, 2014Yves Caseau
 
Lean entreprisetwodotzerodauphinefev2014
Lean entreprisetwodotzerodauphinefev2014Lean entreprisetwodotzerodauphinefev2014
Lean entreprisetwodotzerodauphinefev2014Yves Caseau
 

Plus de Yves Caseau (20)

CCEM2023.pptx
CCEM2023.pptxCCEM2023.pptx
CCEM2023.pptx
 
DataAquitaine February 2022
DataAquitaine February 2022DataAquitaine February 2022
DataAquitaine February 2022
 
Global warming dynamic gamesv0.3
Global warming dynamic gamesv0.3Global warming dynamic gamesv0.3
Global warming dynamic gamesv0.3
 
Information Systems for Digital Transformation
Information Systems for Digital TransformationInformation Systems for Digital Transformation
Information Systems for Digital Transformation
 
Lean from the guts
Lean from the gutsLean from the guts
Lean from the guts
 
Taking advantageofai july2018
Taking advantageofai july2018Taking advantageofai july2018
Taking advantageofai july2018
 
Software Pitch 2018
Software Pitch 2018Software Pitch 2018
Software Pitch 2018
 
Intelligence Artificielle - Journée MEDEF & AFIA
Intelligence Artificielle - Journée MEDEF & AFIAIntelligence Artificielle - Journée MEDEF & AFIA
Intelligence Artificielle - Journée MEDEF & AFIA
 
Big data, Behavioral Change and IOT Architecture
Big data, Behavioral Change and IOT ArchitectureBig data, Behavioral Change and IOT Architecture
Big data, Behavioral Change and IOT Architecture
 
XEBICON Public November 2015
XEBICON Public November 2015XEBICON Public November 2015
XEBICON Public November 2015
 
Smart selfnovember2013
Smart selfnovember2013Smart selfnovember2013
Smart selfnovember2013
 
Management socialnetworksfeb2012
Management socialnetworksfeb2012Management socialnetworksfeb2012
Management socialnetworksfeb2012
 
Google socialnetworksmarch08
Google socialnetworksmarch08Google socialnetworksmarch08
Google socialnetworksmarch08
 
Managing Business Processes Communication and Performance
Managing Business Processes Communication and Performance Managing Business Processes Communication and Performance
Managing Business Processes Communication and Performance
 
Smart homeamsterdamoctober2013
Smart homeamsterdamoctober2013Smart homeamsterdamoctober2013
Smart homeamsterdamoctober2013
 
Entreprise troispointzeropublicjan2015
Entreprise troispointzeropublicjan2015Entreprise troispointzeropublicjan2015
Entreprise troispointzeropublicjan2015
 
GTES UTC 2014
GTES  UTC 2014GTES  UTC 2014
GTES UTC 2014
 
The European CIO Conference - November 27th, 2014
The European CIO Conference - November 27th, 2014The European CIO Conference - November 27th, 2014
The European CIO Conference - November 27th, 2014
 
Disic mars2014
Disic mars2014Disic mars2014
Disic mars2014
 
Lean entreprisetwodotzerodauphinefev2014
Lean entreprisetwodotzerodauphinefev2014Lean entreprisetwodotzerodauphinefev2014
Lean entreprisetwodotzerodauphinefev2014
 

Dernier

(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...
(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...
(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...Scintica Instrumentation
 
FAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical ScienceFAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical ScienceAlex Henderson
 
Thyroid Physiology_Dr.E. Muralinath_ Associate Professor
Thyroid Physiology_Dr.E. Muralinath_ Associate ProfessorThyroid Physiology_Dr.E. Muralinath_ Associate Professor
Thyroid Physiology_Dr.E. Muralinath_ Associate Professormuralinath2
 
The Mariana Trench remarkable geological features on Earth.pptx
The Mariana Trench remarkable geological features on Earth.pptxThe Mariana Trench remarkable geological features on Earth.pptx
The Mariana Trench remarkable geological features on Earth.pptxseri bangash
 
PSYCHOSOCIAL NEEDS. in nursing II sem pptx
PSYCHOSOCIAL NEEDS. in nursing II sem pptxPSYCHOSOCIAL NEEDS. in nursing II sem pptx
PSYCHOSOCIAL NEEDS. in nursing II sem pptxSuji236384
 
Molecular markers- RFLP, RAPD, AFLP, SNP etc.
Molecular markers- RFLP, RAPD, AFLP, SNP etc.Molecular markers- RFLP, RAPD, AFLP, SNP etc.
Molecular markers- RFLP, RAPD, AFLP, SNP etc.Silpa
 
POGONATUM : morphology, anatomy, reproduction etc.
POGONATUM : morphology, anatomy, reproduction etc.POGONATUM : morphology, anatomy, reproduction etc.
POGONATUM : morphology, anatomy, reproduction etc.Silpa
 
Porella : features, morphology, anatomy, reproduction etc.
Porella : features, morphology, anatomy, reproduction etc.Porella : features, morphology, anatomy, reproduction etc.
Porella : features, morphology, anatomy, reproduction etc.Silpa
 
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....muralinath2
 
Use of mutants in understanding seedling development.pptx
Use of mutants in understanding seedling development.pptxUse of mutants in understanding seedling development.pptx
Use of mutants in understanding seedling development.pptxRenuJangid3
 
Human genetics..........................pptx
Human genetics..........................pptxHuman genetics..........................pptx
Human genetics..........................pptxSilpa
 
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bAsymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bSérgio Sacani
 
Chemistry 5th semester paper 1st Notes.pdf
Chemistry 5th semester paper 1st Notes.pdfChemistry 5th semester paper 1st Notes.pdf
Chemistry 5th semester paper 1st Notes.pdfSumit Kumar yadav
 
biology HL practice questions IB BIOLOGY
biology HL practice questions IB BIOLOGYbiology HL practice questions IB BIOLOGY
biology HL practice questions IB BIOLOGY1301aanya
 
GBSN - Microbiology (Unit 3)
GBSN - Microbiology (Unit 3)GBSN - Microbiology (Unit 3)
GBSN - Microbiology (Unit 3)Areesha Ahmad
 
Dr. E. Muralinath_ Blood indices_clinical aspects
Dr. E. Muralinath_ Blood indices_clinical  aspectsDr. E. Muralinath_ Blood indices_clinical  aspects
Dr. E. Muralinath_ Blood indices_clinical aspectsmuralinath2
 
Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS ESCORT SERVICE In Bhiwan...
Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS  ESCORT SERVICE In Bhiwan...Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS  ESCORT SERVICE In Bhiwan...
Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS ESCORT SERVICE In Bhiwan...Monika Rani
 
FAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
FAIRSpectra - Enabling the FAIRification of Spectroscopy and SpectrometryFAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
FAIRSpectra - Enabling the FAIRification of Spectroscopy and SpectrometryAlex Henderson
 

Dernier (20)

(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...
(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...
(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...
 
FAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical ScienceFAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical Science
 
Thyroid Physiology_Dr.E. Muralinath_ Associate Professor
Thyroid Physiology_Dr.E. Muralinath_ Associate ProfessorThyroid Physiology_Dr.E. Muralinath_ Associate Professor
Thyroid Physiology_Dr.E. Muralinath_ Associate Professor
 
The Mariana Trench remarkable geological features on Earth.pptx
The Mariana Trench remarkable geological features on Earth.pptxThe Mariana Trench remarkable geological features on Earth.pptx
The Mariana Trench remarkable geological features on Earth.pptx
 
PSYCHOSOCIAL NEEDS. in nursing II sem pptx
PSYCHOSOCIAL NEEDS. in nursing II sem pptxPSYCHOSOCIAL NEEDS. in nursing II sem pptx
PSYCHOSOCIAL NEEDS. in nursing II sem pptx
 
Site Acceptance Test .
Site Acceptance Test                    .Site Acceptance Test                    .
Site Acceptance Test .
 
Molecular markers- RFLP, RAPD, AFLP, SNP etc.
Molecular markers- RFLP, RAPD, AFLP, SNP etc.Molecular markers- RFLP, RAPD, AFLP, SNP etc.
Molecular markers- RFLP, RAPD, AFLP, SNP etc.
 
POGONATUM : morphology, anatomy, reproduction etc.
POGONATUM : morphology, anatomy, reproduction etc.POGONATUM : morphology, anatomy, reproduction etc.
POGONATUM : morphology, anatomy, reproduction etc.
 
Clean In Place(CIP).pptx .
Clean In Place(CIP).pptx                 .Clean In Place(CIP).pptx                 .
Clean In Place(CIP).pptx .
 
Porella : features, morphology, anatomy, reproduction etc.
Porella : features, morphology, anatomy, reproduction etc.Porella : features, morphology, anatomy, reproduction etc.
Porella : features, morphology, anatomy, reproduction etc.
 
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....
 
Use of mutants in understanding seedling development.pptx
Use of mutants in understanding seedling development.pptxUse of mutants in understanding seedling development.pptx
Use of mutants in understanding seedling development.pptx
 
Human genetics..........................pptx
Human genetics..........................pptxHuman genetics..........................pptx
Human genetics..........................pptx
 
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bAsymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
 
Chemistry 5th semester paper 1st Notes.pdf
Chemistry 5th semester paper 1st Notes.pdfChemistry 5th semester paper 1st Notes.pdf
Chemistry 5th semester paper 1st Notes.pdf
 
biology HL practice questions IB BIOLOGY
biology HL practice questions IB BIOLOGYbiology HL practice questions IB BIOLOGY
biology HL practice questions IB BIOLOGY
 
GBSN - Microbiology (Unit 3)
GBSN - Microbiology (Unit 3)GBSN - Microbiology (Unit 3)
GBSN - Microbiology (Unit 3)
 
Dr. E. Muralinath_ Blood indices_clinical aspects
Dr. E. Muralinath_ Blood indices_clinical  aspectsDr. E. Muralinath_ Blood indices_clinical  aspects
Dr. E. Muralinath_ Blood indices_clinical aspects
 
Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS ESCORT SERVICE In Bhiwan...
Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS  ESCORT SERVICE In Bhiwan...Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS  ESCORT SERVICE In Bhiwan...
Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS ESCORT SERVICE In Bhiwan...
 
FAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
FAIRSpectra - Enabling the FAIRification of Spectroscopy and SpectrometryFAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
FAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry
 

Machine Learning for Self-Tracking

  • 1. Yves Caseau - Machine Learning for Self Tracking – February 2019 1/10 Machine Learning Heuristics for Short TimeMachine Learning Heuristics for Short Time Series Forecasting with Quantified Self DataSeries Forecasting with Quantified Self Data Yves Caseau National Academy of Technologies
  • 2. Yves Caseau - Machine Learning for Self Tracking – February 2019 2/10 Self-Tracking and Knomee Mobile AppSelf-Tracking and Knomee Mobile App  Knomee is a self-tracking mobile app for iOS (one of many thousands)  Knomee motto: « self-tracking with sense »  Data science applied to self tracking  Self-tracking apps generate time series  One or many (up to 4) data points collected over a period of time  Data is either self-declared (the user picks a value in a preset range) or automatically imported from a a connected device (iPhone’s sensors, Apple watch or any HealthKit compatible device like a a Withings scale)  Data files are accessible on: https://github.com/ycaseau/KnomeeQuest/tree/master/data  20 samples  Ranging from 40 to 220 measures (x 4)
  • 3. Quests: Causal Diagrams are proposed by the user
      - Self-tracking is organized around causal diagrams
        - A quest is made of a target tracker and up to three factor trackers
        - The user makes the hypothesis that the factors may contribute to the target
      - Using Judea Pearl's notation, we look for:
        - P(X | do(Y)): impact of doing Y on X
        - Detect causality through active experiments
        - Correlation is not enough
      - A quest is a hypothesis; not all quests are meaningful
        - Factor causality is tricky (e.g. coffee as a symptom)
        - How to tell if the effort on factors is « worth it »? Impact on the target
      - Key property of self-tracking data: some input is purely random
      Example quest definition:
        {quest:ENERGY, icloud:true,
         energy:{ type:2, more:true, min:1, max:6, target:4,
                  labels:[crisis, sleepy, lapses, normal, energetic, hyper],},
         sleep:{ type:7, more:true, min:4, max:9, target:7,},
         steps:{ type:4, more:true, min:0, max:19000, target:7000,},
         weight:{ type:5, more:false, min:75, max:82, target:78,},
        }
  • 4. Short Time-Series Forecasting
      - Our goal in this talk: how to forecast values from self-tracking data?
        - Forecasting gives a possible clue about the value of the causal hypothesis (Granger causality)
        - We search for a robust method that does not break with random noise
      - Measuring success: iterative training protocol
        - For i in (2N/3 .. N), forecast TS[i] from (TS[1], …, TS[i - 1])
          - Apply forecast to time[i]
          - Measure average distance to real value TS[i]
          - Compare to « average » performance
        - Realistic simulation of what happens in the app
      - Why it is hard:
        - short samples (small data)
        - mixed random inputs
  • 5. Classical Methods yield poor results
      - Three classical ML algorithms, trained to minimize distance, using implicit time features and factors
        - Linear Regression
        - K-means Clustering (10 to 15 groups)
        - ARMA (AutoRegressive Moving Average)
      - Forecasting results are disappointing
        - The difficulty is not a surprise: we are looking to extract a small amount of information, and only when it is present
        - Improving a few % over average is the best we can expect
        - Overfitting very easily offsets the forecasting gain

                            Linear Regression   K-means   ARMA
      forecasting           18.34%              19.5%     18.9%
      average               17.5%               17.5%     17.5%
      Distance (squares)    0.655               0.81      0.525

      [Figure: variation of a “good quest” vs. a “poor quest”, split into parts linked to factors, linked to non-collected factors, and random noise]
  • 6. A Term-Algebra of Heuristics Combinations
      - Heuristic toolbox
        - MovingAverage: MA(k, discount)
        - Trend (time linear regression)
        - Weekly and Hourly patterns
        - Factor regression with explicit delay
        - CumSum (cumulative sum of differences to average)
        - Threshold regression with delay
      - Combined through a linear algebra
        - Each term is a weighted combination of a few heuristics
      - Some other heuristics provide improvement with some quests but are left aside for lack of robustness
        - Cycle analysis (detecting “biorhythms”)
        - Split (constant until date X, then T), useful when something changed
        - And(t1, t2): Boolean conjunction of two factors
      Example term:
        Mix[0.97](T[2.25-2.02/-1.00], wAvg["target"](10, 1.00)) + Cor[0.04]("track2" + 16)
  • 7. Distances and Regularization
      - Time-series operations are weighted
        - The weight of each measure is proportional to the distance to its next neighbor
        - Spaced measures are more important than repeated ones
      - « Triangular distance »
        - The distance between two time series is the area between the two curves
      - Regularization to avoid overfitting
        - Principle: add a penalty to the distance that reduces the overall standard deviation
        - Best formula for this data set: wDist(a,t) + max(0.0, stdev(a) - 0.02)
  • 8. Randomized Incremental Algorithms
      - Main algorithm is “Randomized Optimization” (RandOpt)
        - Create n random algebra terms
        - Combination of greedy heuristics (create the best possible term)
        - And randomization (coefficients / which sub-term to pick)
        - Depth is controlled with a global parameter
      - Optimized through local optimization
        - Each parameter of the algebra sub-terms (i.e., coefficients, delays, etc.) is optimized one by one
        - Hill-climbing local meta-heuristics
        - Three successive rounds
      - This is used in an “incremental” mode:
        - For each new measure:
        - Reuse previous best term, and improve through local optimization
        - Run “RandOpt” (100 iterations)
        - Keep best term
      - What has not worked out so far
        - Evolutionary (genetic algorithm with cross-over)
        - Mutation (large neighborhood local optimization)
  • 9. Computational results
      - Average forecast is 16.88% (control = average is 17.5%)
      - Average square distance is 1.03 (worse than LR, ARMA or k-means) because of regularization
      - Strong measures against overfitting (regularization, depth, # local opt loops + techniques)
  • 10. Conclusion
      - Forecasting for self-tracking data is hard
      - We presented a reinforcement generative machine learning approach that performs better than most classical techniques
      - This is due to the complex nature of the data
        - On (classical) sales time series, ARMA does better than the proposed approach (close to LR)
        - Open question: how to detect the “intrinsic quality” of the quest and change the forecasting method / regularization parameters accordingly?
        - You can download the data and try your own approaches
      - Forecasting is used for two purposes in our mobile app:
        - User experience: forecasting makes data entry faster and gives a sense of playfulness
        - Granger causality: when the forecasting score is “good”, this gives a sense of plausibility to the causal diagram hypothesis (represented by the “quest”)
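The iterative training protocol of slide 4 can be illustrated with a minimal Python sketch. It is not the author's implementation: the function names, the use of mean absolute error as the "distance", and the default values of `k` and `discount` are all assumptions made for illustration. The forecaster shown is a discounted moving average in the spirit of MA(k, discount) from slide 6, compared against the plain "average" control.

```python
from statistics import mean


def moving_average(history, k=5, discount=0.8):
    """Discounted average of the last k values (hypothetical MA(k, discount)).

    The most recent value gets weight 1, the one before it `discount`, and so on.
    """
    recent = history[-k:]
    weights = [discount ** (len(recent) - 1 - j) for j in range(len(recent))]
    return sum(w * v for w, v in zip(weights, recent)) / sum(weights)


def average_baseline(history):
    """Control forecaster: predict the mean of everything seen so far."""
    return mean(history)


def walk_forward_error(ts, forecaster):
    """For each i in the last third of ts, forecast ts[i] from ts[:i] only.

    Returns the mean absolute error (an assumed choice of distance).
    """
    n = len(ts)
    start = max(1, (2 * n) // 3)  # integer arithmetic avoids float rounding
    errors = [abs(forecaster(ts[:i]) - ts[i]) for i in range(start, n)]
    return mean(errors)


if __name__ == "__main__":
    series = [4, 5, 4, 3, 4, 5, 5, 4, 3, 4, 5, 4]  # made-up example values
    print("moving average:", walk_forward_error(series, moving_average))
    print("average control:", walk_forward_error(series, average_baseline))
```

Because each forecast only sees values before index i, the score mimics what the app would experience as measures arrive one at a time.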

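The weighting and regularization ideas of slide 7 can be sketched in the same way. This is a simplified reading, not the deck's exact "triangular distance": it assumes values are normalized to [0, 1], that timestamps are strictly increasing, that `a` in `wDist(a,t) + max(0.0, stdev(a) - 0.02)` denotes the forecast series, and that the last measure reuses the previous gap as its weight. All function names are hypothetical.

```python
from statistics import pstdev


def gap_weights(times):
    """Weight each measure by the time gap to its next neighbor (sums to 1)."""
    gaps = [t_next - t for t, t_next in zip(times, times[1:])]
    # Assumption: the last measure has no next neighbor, so reuse the previous gap.
    gaps.append(gaps[-1] if gaps else 1.0)
    total = sum(gaps)
    return [g / total for g in gaps]


def weighted_distance(forecast, target, times):
    """Gap-weighted mean absolute difference between two series."""
    weights = gap_weights(times)
    return sum(w * abs(f - y) for w, f, y in zip(weights, forecast, target))


def regularized_distance(forecast, target, times, threshold=0.02):
    """Weighted distance plus a penalty on forecast spread above `threshold`."""
    penalty = max(0.0, pstdev(forecast) - threshold)
    return weighted_distance(forecast, target, times) + penalty
```

With this scoring, a forecast that fits the training points perfectly but swings widely is penalized relative to a flatter one, which is the stated purpose of the regularization term.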
Editor's notes

  1. CRITICAL: print the version with notes!