Machine Learning Heuristics for Short Time Series Forecasting with Quantified Self Data
Yves Caseau
National Academy of Technologies
February 2019
Self-Tracking and Knomee Mobile App
 Knomee is a self-tracking mobile app for iOS (one of many
thousands)
 Knomee motto: « self-tracking with sense »
 Data science applied to self tracking
 Self-tracking apps generate time series
 One or many (up to 4) data points collected over a period of time
 Data is either self-declared (the user picks a value in a preset range) or
automatically imported from a connected device (iPhone sensors, Apple Watch, or any
HealthKit-compatible device such as a Withings scale)
 Data files are accessible on:
https://github.com/ycaseau/KnomeeQuest/tree/master/data
 20 samples
 Ranging from 40 to 220 measures (x 4)
Quests: Causal Diagrams are proposed by the user
 Self-tracking is organized around causal diagrams
 A quest is made of a target tracker and up to three
factor trackers
 The user makes the hypothesis that the factors may
contribute to the target
 Using Judea Pearl’s notation, we look for:
 P(X | do(Y)) : impact of doing Y on X
 Detect causality through active experiments
 Correlation is not enough
 A quest is a hypothesis; not all quests are meaningful
 Factor causality is tricky (e.g. coffee as a symptom)
 How to tell if the effort on factors is « worth it » ?
Impact on the target
 Key property of self-tracking data:
some input is purely random
Example quest definition (ENERGY target with sleep, steps and weight factors):
{quest:ENERGY, icloud:true,
 energy:{
   type:2, more:true,
   min:1, max:6, target:4,
   labels:[crisis, sleepy, lapses, normal, energetic, hyper],},
 sleep:{
   type:7, more:true,
   min:4, max:9, target:7,},
 steps:{
   type:4, more:true,
   min:0, max:19000, target:7000,},
 weight:{
   type:5, more:false,
   min:75, max:82, target:78,},
}
Short Time-Series Forecasting
 Our goal in this talk : how to forecast values from self-tracking data ?
 Forecasting gives a possible clue about the value of the causal hypothesis
(Granger causality)
 We search for a robust method that does not break with random noise
 Measuring success: iterative training protocol (a minimal sketch in code is given below)
 For i in (2N/3 .. N), forecast TS[i] from (TS[1], …, TS[i-1])
– Apply the forecast at time[i]
– Measure the average distance to the real value TS[i]
– Compare to the « average » baseline performance
 Realistic simulation of what happens in the app
 Why it is hard:
 short samples (small data)
 mixed random inputs
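 A minimal sketch of the protocol above (not the app’s actual code), assuming the series
is a 1-D numpy array, the error is reported as a mean absolute error normalized by the
value range (the exact metric is an assumption), and the control forecaster predicts the
running average:

import numpy as np

def walk_forward_error(ts, forecaster):
    # Forecast ts[i] from ts[:i] for i in [2N/3, N) and return the mean
    # normalized absolute error, in %.
    ts = np.asarray(ts, dtype=float)
    n = len(ts)
    start = (2 * n) // 3
    span = float(ts.max() - ts.min()) or 1.0
    errors = []
    for i in range(start, n):
        prediction = forecaster(ts[:i])        # train on ts[1..i-1], forecast ts[i]
        errors.append(abs(prediction - ts[i]) / span)
    return 100.0 * float(np.mean(errors))

def average_baseline(history):
    # The « average » control: predict the mean of the values seen so far.
    return float(np.mean(history))

# Example (hypothetical file name):
# sample = np.loadtxt("energy_quest.csv")
# print(walk_forward_error(sample, average_baseline))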
Classical Methods yield poor results
 Three classical ML algorithms, trained to
minimize distance, using implicit time
features and factors
 Linear Regression
 K-means Clustering (10 – 15 groups)
 ARMA (AutoRegressive Moving Average)
 Forecasting results are disappointing (a sketch of an ARMA baseline under the
walk-forward protocol is given at the end of this slide)
 The difficulty is not a surprise: we are trying to extract a small amount of
information, which is present only in some quests
 Improving a few % over average is the best
we can expect
 Overfitting very easily offsets the forecasting
gain
                     Linear Regression   K-means   ARMA
forecasting          18.34%              19.5%     18.9%
average              17.5%               17.5%     17.5%
Distance (squares)   0.655               0.81      0.525
[Figure: the variation of the target decomposes into random noise, variation linked to
the collected factors, and variation linked to non-collected factors; a “good quest” has
a large factor-linked share, while a “poor quest” is dominated by random noise and
non-collected factors.]
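 A sketch of the ARMA baseline under the same walk-forward protocol, using statsmodels;
the model order (2, 0, 1) is an illustrative assumption, not the setting used in the talk:

import numpy as np
from statsmodels.tsa.arima.model import ARIMA

def arma_forecaster(history, order=(2, 0, 1)):
    # One-step-ahead ARMA forecast; falls back to the mean when the fit fails
    # (estimation is fragile on very short histories).
    try:
        fit = ARIMA(np.asarray(history, dtype=float), order=order).fit()
        return float(fit.forecast(steps=1)[0])
    except Exception:
        return float(np.mean(history))

# Reuses the walk_forward_error helper sketched earlier:
# print(walk_forward_error(sample, arma_forecaster))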
A Term-Algebra of Heuristics Combinations
 Heuristic toolbox
 MovingAverage – MA(k,discount)
 Trend (time linear regression)
 Weekly and Hourly patterns
 Factor regression with explicit delay
 CumSum (cumulative sum of differences to average)
 Threshold regression with delay
 Combined through a linear term algebra
 Each term is a weighted combination of a few heuristics
 Some other heuristics provide improvement with some quests but are left aside for lack
of robustness
 Cycle analysis (detecting “biorhythms”)
 Split (constant until date X, then T)
useful when something changed.
 And(t1,t2) : Boolean conjunction of two factors
Example term (a Python rendering of this algebra is sketched below):
Mix[0.97](
  T[2.25-2.02/-1.00],
  wAvg["target"](10, 1.00))
+ Cor[0.04]("track2"+16)
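 A hypothetical Python rendering of this algebra (class names and signatures mirror the
slide, not the actual Knomee code): each heuristic forecasts the next value from the
history, and Mix combines two terms linearly.

import numpy as np

class MovingAverage:                         # MA(k, discount)
    def __init__(self, k=10, discount=1.0):
        self.k, self.discount = k, discount
    def forecast(self, history):
        window = np.asarray(history[-self.k:], dtype=float)
        weights = self.discount ** np.arange(len(window) - 1, -1, -1)
        return float(np.average(window, weights=weights))

class Trend:                                 # time linear regression
    def forecast(self, history):
        y = np.asarray(history, dtype=float)
        slope, intercept = np.polyfit(np.arange(len(y)), y, 1)
        return float(slope * len(y) + intercept)

class Mix:                                   # Mix[w](t1, t2) = w*t1 + (1-w)*t2
    def __init__(self, weight, term1, term2):
        self.weight, self.term1, self.term2 = weight, term1, term2
    def forecast(self, history):
        return (self.weight * self.term1.forecast(history)
                + (1.0 - self.weight) * self.term2.forecast(history))

# Example term, loosely mirroring the one shown above:
# term = Mix(0.97, Trend(), MovingAverage(k=10, discount=1.0))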
Distances and Regularization
Time-series operations are weighted
 The weight of each measure is proportional to the time
gap to its next neighbor
 Spaced measures are more important than repeated
ones
« Triangular distance »
 The distance between two time series is the area
between the two curves
Regularization to avoid overfitting
 Principle: add a penalty to the distance that reduces
the overall standard deviation
 best formula for this data set (sketched in code below):
wDist(a,t) + max(0.0, stdev(a) – 0.02)
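 A sketch of these two ideas (assumptions: measures come as (time, value) pairs, the area
between curves is approximated with trapezoids so that large time gaps weigh more, and
stdev(a) is the standard deviation of the forecaster’s outputs); this is not the app’s
exact formula:

import numpy as np

def triangular_distance(times, values_a, values_b):
    # Area between two curves sampled at the same (irregular) time stamps:
    # each gap is weighted by the duration to the next measure.
    t = np.asarray(times, dtype=float)
    gap = np.abs(np.asarray(values_a, float) - np.asarray(values_b, float))
    return float(np.sum(0.5 * (gap[:-1] + gap[1:]) * np.diff(t)))

def regularized_distance(times, forecasts, truth, threshold=0.02):
    # wDist(a,t) + max(0.0, stdev(a) - 0.02): penalize high-variance forecasters.
    dist = triangular_distance(times, forecasts, truth)
    return dist + max(0.0, float(np.std(forecasts)) - threshold)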
Randomized Incremental Algorithms
 Main algorithm is “Randomized Optimization” (RandOpt, sketched in the code below)
 Create n random algebra terms
 Combination of greedy heuristics (create the best possible term)
 And randomization (coefficients / which sub-term to pick)
 Depth is controlled with a global parameter
 Optimized through local optimization
 Each parameter of the algebra sub-terms (i.e., coefficients, delays, etc.) is optimized one by one
 Hill-climbing local meta-heuristic
 Three successive rounds
 This is used in an “incremental” mode:
 For each new measure
 Reuse previous best term, and improve through local optimization
 Run ”RandOpt” (100 iterations)
 Keep best term
 What has not worked out so far
 Evolutionary (genetic algorithm with cross-over)
 Mutation (large neighborhood local optimization)
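 A compact sketch of the RandOpt loop (the term interface, i.e. random_term(),
parameters(), with_param() and get(), is hypothetical; score() stands for the regularized
walk-forward distance of the previous slides):

def rand_opt(ts, score, random_term, n_terms=100, rounds=3):
    # Create n random algebra terms, hill-climb each numeric parameter one by
    # one over a few rounds, and return the best-scoring term (lower is better).
    best_term, best_score = None, float("inf")
    for _ in range(n_terms):
        term = random_term()
        for _ in range(rounds):                 # three successive rounds
            for param in term.parameters():     # one-by-one local optimization
                for factor in (0.8, 1.25):      # simple hill-climbing move
                    candidate = term.with_param(param, term.get(param) * factor)
                    if score(candidate, ts) < score(term, ts):
                        term = candidate
        if score(term, ts) < best_score:
            best_term, best_score = term, score(term, ts)
    return best_term

# Incremental mode: when a new measure arrives, reuse the previous best term, improve it
# through local optimization, run RandOpt (100 iterations), and keep the best term.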
Computational results
 Average forecast error is 16.88% (control: the « average » baseline is at 17.5%)
 Average square distance is 1.03 (worse than LR, ARMA or k-means) because of regularization
 Strong measures against overfitting (regularization, limited depth, limited number of
local optimization loops, and other techniques)
Conclusion
Forecasting for self-tracking data is hard
We presented a reinforcement / generative machine learning approach that performs
better than most classical techniques
This is due to the complex nature of the data
 On (classical) sales time series, ARMA does better than the proposed approach
(close to LR)
 Open question : how to detect the “intrinsic quality” of the quest and change the
forecasting method / regularization parameters accordingly ?
 You can download the data and try your own approaches
Forecasting is used for two purposes in our mobile app:
 User experience : forecasting makes data entry faster + gives a sense of playfulness
 Granger Causality : when the forecasting score is ”good”, this gives a sense of
plausibility to the causal diagram hypothesis (represented by the “quest”)