Adyen enables integrating companies to accept payments from their customers using any payment method over any sales channel. We have designed and implemented a time series forecasting algorithm that allows us to predict the volume for each integration with confidence and thus flag anomalies such as traffic drops or abnormally low traffic. We use Apache Spark as our computational engine, both to make this data available to the training process and to train over years of data in a scalable way. Prediction performance is benchmarked, and the models are served in production through custom real-time monitoring and alerting infrastructure that uses Elasticsearch as hot storage. With this state-of-the-art solution, Adyen knows whether a problem happened and can alert the operational teams accordingly in record time.
This presentation will cover the journey we took, with a focus on the mathematical concepts, the present time constraints, the prediction performance, and the architecture needed to make this happen. We’ll go over lessons learned, pitfalls, and best practices discovered while modeling time series datasets with Apache Spark. Data Scientists will gain insights on applying effective, real-life seasonality modeling techniques. We’ll also share the approaches we used for sub-millisecond model serving, which may inspire Data Engineers who work on related problems.
11. OK, but
What is an anomaly?
No luxury of a labelled dataset, and opinions diverge.
Connecting to a live platform without ML deployment hooks ready.
We were working on MLflow but were not there yet.
No standard for time series forecasting at scale.
With Spark, several choices.
12. Considerations when dealing with Big Data
Big Technology
Leverage mature tech to solve the problem (hello Spark).
Big diversity
Many different topologies for our merchants and yet one algorithm to track them all.
Big consequences
1000 merchants, checked every 10 min at 95% accuracy = 50400 emails/week
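The figure on this slide can be reproduced with a few lines of arithmetic. This is a minimal sketch, assuming that every merchant is checked once per 10-minute interval, that "accuracy" is the fraction of checks decided correctly, and that every wrong decision sends one email; the function name is hypothetical.

```python
MINUTES_PER_WEEK = 7 * 24 * 60  # 10080 minutes in a week


def weekly_false_alarms(merchants, interval_minutes, accuracy):
    """Expected number of wrong alerts per week (assumption: one email per wrong check)."""
    checks_per_merchant = MINUTES_PER_WEEK // interval_minutes  # 1008 checks at 10 min
    total_checks = merchants * checks_per_merchant
    return round(total_checks * (1 - accuracy))


print(weekly_false_alarms(1000, 10, 0.95))  # 50400
```

Under the same assumptions, raising accuracy to 99% still leaves about ten thousand emails a week, which is why the alarm rate has to be profiled per merchant.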
17. Scoring in Java
While working on a fully functional engine to deploy ML models based on MLflow.
Launch fast and iterate!
Transporting the model
The model transported for tens of thousands of accounts needs to be lightweight.
Harness the maths
No black-box models; the equations need to be understood and replicated in Java.
Needs to perform fast
Score and decide whether the traffic seen in Elasticsearch is actually anomalous, on the ms scale.
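To illustrate why a linear model is cheap to transport and fast to score, here is a minimal sketch of the scoring arithmetic, written in Python for readability; the production scorer described on the slide is in Java and is not shown in the source. The class, the field names, and the rule "flag when observed traffic falls below prediction plus a low residual quantile" are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class LinearModel:
    """Hypothetical transport format: a handful of floats per account."""
    intercept: float
    coefficients: Sequence[float]
    lower_residual: float  # assumed: a low quantile of training residuals (negative)


def predict(model, features):
    """Plain dot product, trivial to replicate in any language."""
    if len(features) != len(model.coefficients):
        raise ValueError("feature vector and coefficients differ in length")
    return model.intercept + sum(c * x for c, x in zip(model.coefficients, features))


def is_anomalously_low(model, features, observed):
    """Flag traffic that falls below the lower confidence bound."""
    lower_bound = predict(model, features) + model.lower_residual
    return observed < lower_bound


model = LinearModel(intercept=10.0, coefficients=(2.0, -1.0), lower_residual=-3.0)
print(is_anomalously_low(model, (4.0, 1.0), observed=13.0))
```

Scoring costs one multiply-add per coefficient, so the whole decision is a few dozen floating-point operations per account.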
20. Fourier components
Would not optimise the business cycles.
ARIMA
Not perfect for picking up seasonality.
Isolation Forests
Great for multidimensional data, not so much for time series.
Autoencoders
Good luck transporting the model for each merchant.
XGBM
Noice, but score that in Java.
Research stage
Understand a problem and build a solution, decide what’s best.
21. Ridge Regression
Makes scoring in Java nice and kinda easy.
Residuals
Confidence intervals modelled through quantile regression of observed values.
Events
Recurrent or one-off events are shown to the model.
Piece-wise linear trends
Break the signal down into pieces and learn the latest trends.
Gaussian Basis Functions
Allow us to teach the model to understand business cycles.
The model
Discover anomalous behaviour based on a probability p.
Pre-sampling
Allows us to sample and bucketize the merchants into adequate intervals.
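The two feature families named on this slide can be sketched as follows. This is an illustration under assumptions, not the speakers' code: the Gaussian bumps are made periodic so that the end of a cycle wraps to its start, the piece-wise linear trend is built from hinge terms max(0, t - knot), and the centers, width, and knots are hypothetical hyperparameters.

```python
import math


def gaussian_basis(t, centers, width, period):
    """One Gaussian bump per center, using wrap-around distance within the period."""
    features = []
    for center in centers:
        distance = abs(t - center) % period
        distance = min(distance, period - distance)  # shortest way around the cycle
        features.append(math.exp(-0.5 * (distance / width) ** 2))
    return features


def hinge_features(t, knots):
    """Each hinge is zero before its knot and grows linearly after it."""
    return [max(0.0, float(t) - knot) for knot in knots]


# Feature vector for hour 23 of a 24-hour cycle, with a trend knot at t = 10.
row = gaussian_basis(23, centers=[1, 7, 13, 19], width=2.0, period=24) + hinge_features(23, [10])
print(len(row))  # 5
```

A ridge regression fitted on such rows learns one weight per bump and one slope change per knot, which keeps the model a short list of coefficients.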
32. The implementation on Spark
How did we get there, on the Spark side.
Reusability
Overloads of scikit-learn and pandas allow us to ensure reusability.
Cross-validation
Ensure the best fit through hyperparameter tuning.
Scalability
Using Spark’s map-reduce paradigm, we fully control computational performance.
34. Input daily time series -> {t:[…], v:[…]}
Collect to list -> [{t:[…], v:[…]}]
Hinges and Hyperparameters
Distribute UDF
Making it happen at scale
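The data flow on this slide can be mimicked in plain Python to show the shapes involved. This is a stand-in, not Spark code: in Spark the grouping would typically be a groupBy with collect_list and the per-merchant fit would run inside a UDF, but the source does not show that code. The row layout (merchant, t, v), the function names, and the placeholder fit are assumptions.

```python
from collections import defaultdict


def collect_series(rows):
    """Group (merchant, t, v) rows into one {t: [...], v: [...]} record per merchant."""
    series = defaultdict(lambda: {"t": [], "v": []})
    for merchant, t, v in sorted(rows, key=lambda r: (r[0], r[1])):
        series[merchant]["t"].append(t)
        series[merchant]["v"].append(v)
    return dict(series)


def fit_placeholder(series, hyperparameters):
    """Stand-in for the real per-merchant fit: returns only the mean volume."""
    mean = sum(series["v"]) / len(series["v"])
    return {"mean": mean, **hyperparameters}


def fit_all(rows, hyperparameters):
    """Apply the same fit independently to every merchant, as a distributed UDF would."""
    return {m: fit_placeholder(s, hyperparameters) for m, s in collect_series(rows).items()}


rows = [("a", 2, 30.0), ("a", 1, 10.0), ("b", 1, 5.0)]
print(fit_all(rows, {"alpha": 1.0}))
```

Because each merchant's record is self-contained, the fits are independent of each other and can be spread across executors.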
35. Cross-validation
F4-sampling score: favours higher sampling while accounting for classical precision and recall.
Custom CV fold splitter TimeSeriesWeekSplit captures the sense of the business cycle.
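Neither the score nor the splitter is defined in the slides, so the following is only a sketch of what they might look like. It assumes the "F4" part refers to the standard F-beta score with beta = 4, which weights recall more heavily than precision, and that a week-aware split trains on whole past weeks and tests on the following whole week. The name week_splits is hypothetical and is not the TimeSeriesWeekSplit class from the talk.

```python
def f_beta(precision, recall, beta=4.0):
    """Standard F-beta score; beta > 1 favours recall over precision."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def week_splits(n_days, n_test_weeks=1, min_train_weeks=2):
    """Yield (train, test) day indices: expanding train window, test on the next whole weeks."""
    n_weeks = n_days // 7  # a trailing partial week is ignored
    for w in range(min_train_weeks, n_weeks - n_test_weeks + 1, n_test_weeks):
        train = list(range(0, w * 7))
        test = list(range(w * 7, (w + n_test_weeks) * 7))
        yield train, test


for train, test in week_splits(28):
    print(len(train), test[0], test[-1])
```

Splitting on week boundaries means every test fold contains each weekday exactly once, so a fold is not biased towards weekends or weekdays.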
39. Overcoming unsupervised learning
Alarm rate and synthetic recall tell us, for each case, how many alarms would have been captured and raised, even without a labelled dataset.
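The slides do not say how these two metrics are computed, so this is one plausible reading offered as a sketch. Alarm rate is taken to be the share of real observations that fall below the model's lower bound, and synthetic recall the share of artificially injected traffic drops that the same bound catches; the function names and the drop_fraction parameter are assumptions.

```python
def alarm_rate(observed, lower_bounds):
    """Fraction of real observations that would have raised an alarm."""
    flags = [o < lb for o, lb in zip(observed, lower_bounds)]
    return sum(flags) / len(flags)


def synthetic_recall(observed, lower_bounds, drop_fraction):
    """Fraction of injected drops that are flagged, using only points that were normal."""
    injected = [
        (o * (1 - drop_fraction), lb)
        for o, lb in zip(observed, lower_bounds)
        if o >= lb  # skip points that already alarm without any injected drop
    ]
    if not injected:
        return 0.0
    return sum(value < lb for value, lb in injected) / len(injected)


observed = [100, 100, 100, 50]
lower = [80, 80, 80, 80]
print(alarm_rate(observed, lower), synthetic_recall(observed, lower, 0.5))
```

Evaluating both numbers at each confidence level gives a per-merchant picture of how noisy and how sensitive the alerts would be, without any labels.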
40. Trade-off between alarm rate and recall
We provide a number of choices (95%, 97%, 99% probability) and fully profile what to expect in terms of anomalies.