2. Data is part of the fabric of the applications
Front-end and UX
Mobile
Back-end and operations
Data and analytics
5. Three types of data-driven development
Retrospective analysis and reporting: Amazon Redshift, Amazon RDS, Amazon S3, Amazon EMR
Here-and-now real-time processing and dashboards: Amazon Kinesis, Amazon EC2, AWS Lambda
Predictions to enable smart applications
7. Machine learning and smart applications
Machine learning is the technology that automatically finds patterns in your data and uses them to make predictions for new data points as they become available.
Your data + machine learning = smart applications
10. Smart applications by example
Based on what you know about the user: Will they use/buy your product?
Based on what you know about an order: Is this order fraudulent?
Based on what you know about a news article: What other articles are interesting?
11. And a few more examples…
Fraud detection: Detecting fraudulent transactions, filtering spam emails, flagging suspicious reviews, …
Personalization: Recommending content, predictive content loading, improving user experience, …
Targeted marketing: Matching customers and offers, choosing marketing campaigns, cross-selling and up-selling, …
Content classification: Categorizing documents, matching hiring managers and resumes, …
Churn prediction: Finding customers who are likely to stop using the service, free-tier upgrade targeting, …
Customer support: Predictive routing of customer emails, social media listening, …
13. 1. You cannot code the rules
Use machine learning to learn your business rules from data.
14. 2. You cannot scale
Use machine learning to build new scalable value propositions.
15. Building a Smart Application
1. Define the problem
2. Collect data
3. Shape data
4. Train a model
5. Use model to predict values based on new input
Iterate!
23. Batch Predictions
Access data that is stored in S3, Amazon Redshift, or MySQL databases in RDS
Output predictions to S3 for easy integration with your data flows
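A batch prediction request could look roughly like the sketch below. It assumes `client` behaves like `boto3.client("machinelearning")` and that the model and the datasource to score already exist. The function name, the ID prefix and the prediction name are hypothetical.

```python
import uuid


def request_batch_prediction(client, model_id, datasource_id, output_uri):
    """Ask Amazon ML to score every record in an existing datasource.

    `client` is assumed to behave like boto3.client("machinelearning").
    """
    batch_id = "bp-" + uuid.uuid4().hex  # hypothetical ID scheme
    client.create_batch_prediction(
        BatchPredictionId=batch_id,
        BatchPredictionName="batch scoring (hypothetical name)",
        MLModelId=model_id,
        BatchPredictionDataSourceId=datasource_id,
        OutputUri=output_uri,  # S3 location where the results are written
    )
    return batch_id
```

The call only queues the job. The results show up under the S3 output location once the job has finished.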
25. Automate and Iterate
Monitor performance of model
Gather more data
Continuously improve the model
Automate using Amazon ML APIs
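One way to automate the "monitor and improve" loop is to compare the evaluation of a retrained model with the evaluation of the model currently in use. The sketch below assumes a multiclass model and a `client` that behaves like `boto3.client("machinelearning")`. The function names and the `min_gain` threshold are hypothetical.

```python
def average_f1(client, evaluation_id):
    """Read the average F1 score that Amazon ML stores for a multiclass evaluation."""
    response = client.get_evaluation(EvaluationId=evaluation_id)
    # Metric values are returned as strings, so convert before comparing.
    return float(response["PerformanceMetrics"]["Properties"]["MulticlassAvgFScore"])


def should_promote(client, current_eval_id, candidate_eval_id, min_gain=0.01):
    """Return True when the retrained model beats the current one by min_gain."""
    gain = average_f1(client, candidate_eval_id) - average_f1(client, current_eval_id)
    return gain >= min_gain
```

A scheduled job could retrain on newly gathered data, evaluate the result, and switch models only when `should_promote` returns `True`.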
26. Co-founder & CTO @ KRY
Joachim Hedenius
27. • Meet a doctor on your phone
• Founded 2014
• 35 employees & 100+ doctors in Sweden, Norway and Spain
• More than 100 000 users
• Currently handling 1% of Sweden's primary healthcare
• ~40% monthly growth
28. Challenge: Specific vs. “catch all” symptom forms
Specific form has symptom-specific follow-up questions and info texts.
Catch-all form only has general questions and info texts.
29. New flow - Suggesting symptom form
30. Use Machine Learning?
• Hand-coding rules is hard and error-prone
• Text classification is a typical ML problem
• ML allows continuous automated improvement
31. Why AWS Machine Learning?
• Familiarity - Already running everything on AWS
• No heavy lifting - Amazon ML is fully managed
• Simplicity - Able to get results quickly
32. Training data
~15K usable examples mapped to 27 categories
Category      Free text
headache      “My head hurts”
cold_and_flu  “My nose is blocked and i have a fever”
allergy       “Several times after I had milk i get pains in my stomach”
rash          “Little Gustaf has some red spots on both his lower arms.”
headache      “I get terrible migraines from time to time. It often happens when …”
…             …
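Category/free-text pairs like these could be serialised into a CSV file before being uploaded to S3. The sketch below is one possible layout, not the format KRY used: the column names `category` and `free_text` and the whitespace normalisation are assumptions.

```python
import csv
import io

# Illustrative rows copied from the slide; the real set had ~15K examples.
EXAMPLES = [
    ("headache", "My head hurts"),
    ("cold_and_flu", "My nose is blocked and i have a fever"),
    ("rash", "Little Gustaf has some red spots on both his lower arms."),
]


def to_training_csv(examples):
    """Serialise (category, free_text) pairs as CSV with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerow(["category", "free_text"])  # hypothetical column names
    for category, text in examples:
        # Collapse whitespace so each example stays on a single CSV line.
        writer.writerow([category, " ".join(text.split())])
    return buffer.getvalue()


csv_text = to_training_csv(EXAMPLES)
```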
33. Manual benchmark
• How well can a human perform categorisation?
• Needed 3 attempts or fewer for 95% of the descriptions
• What is a good result for a machine?
• Is that good enough?
Success rate
Attempts  Human  Machine
1         76 %   ?
2         92 %   ?
3         95 %   ?
34. Create ML model
1. Uploaded training data to S3
2. Configured Amazon ML model
3. Training + evaluation
4. Ready to do real-time predictions
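Steps 2 to 4 might be scripted along these lines. This is a sketch under stated assumptions: `client` behaves like `boto3.client("machinelearning")`, the data and its schema file are already in S3, and the split is 70/30. The helper names, the ID scheme and the `free_text` attribute name are hypothetical.

```python
import json


def create_split_datasource(client, ds_id, data_uri, schema_uri, begin, end):
    """Register a percentage slice of the S3 file as a datasource."""
    client.create_data_source_from_s3(
        DataSourceId=ds_id,
        DataSpec={
            "DataLocationS3": data_uri,
            "DataSchemaLocationS3": schema_uri,
            "DataRearrangement": json.dumps(
                {"splitting": {"percentBegin": begin, "percentEnd": end}}
            ),
        },
        ComputeStatistics=True,  # needed for datasources used in training
    )


def train_and_evaluate(client, prefix, data_uri, schema_uri):
    """Steps 2 and 3: configure a multiclass model and evaluate it."""
    ids = {name: f"{prefix}-{name}" for name in ("train", "eval", "model", "evaluation")}
    create_split_datasource(client, ids["train"], data_uri, schema_uri, 0, 70)
    create_split_datasource(client, ids["eval"], data_uri, schema_uri, 70, 100)
    client.create_ml_model(
        MLModelId=ids["model"],
        MLModelType="MULTICLASS",
        TrainingDataSourceId=ids["train"],
    )
    client.create_evaluation(
        EvaluationId=ids["evaluation"],
        MLModelId=ids["model"],
        EvaluationDataSourceId=ids["eval"],
    )
    return ids


def classify(client, model_id, endpoint_url, free_text):
    """Step 4: real-time prediction; returns the most likely category."""
    response = client.predict(
        MLModelId=model_id,
        Record={"free_text": free_text},  # attribute name is hypothetical
        PredictEndpoint=endpoint_url,
    )
    return response["Prediction"]["predictedLabel"]
```

The create calls are asynchronous, so real code would wait for the model to finish training before requesting an endpoint with `create_realtime_endpoint` and passing its URL to `classify`.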
35. Evaluation - Human vs Computer
• Categorised same data with ML endpoint
• Almost on par!
Success rate
Attempts  Human  Machine
1         76 %   67 %
2         92 %   87 %
3         95 %   93 %
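The "success within N attempts" figures can be read as top-N accuracy: a description counts as a success if the correct category is among the first N guesses. The sketch below shows that calculation on made-up data. It assumes the model returns a score per category, which is then ranked; the function names are hypothetical.

```python
def rank_categories(scores):
    """Order category names from most to least likely given a score per category."""
    return sorted(scores, key=scores.get, reverse=True)


def success_within(ranked_guesses, truths, max_attempts):
    """Fraction of examples whose true category is among the first max_attempts guesses."""
    hits = sum(
        truth in guesses[:max_attempts]
        for guesses, truth in zip(ranked_guesses, truths)
    )
    return hits / len(truths)


# Made-up example: four descriptions, three ranked guesses each.
truths = ["headache", "rash", "allergy", "cold_and_flu"]
ranked = [
    ["headache", "cold_and_flu", "rash"],
    ["allergy", "rash", "headache"],
    ["cold_and_flu", "headache", "rash"],
    ["cold_and_flu", "allergy", "rash"],
]
rates = {k: success_within(ranked, truths, k) for k in (1, 2, 3)}
```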
36. Evaluation drill down
• Guesses are generally correct
• Most “invalid” guesses are due to overlapping categories
• F1 score of 0.52 (baseline 0.01)
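For a multiclass problem like this one, an overall F1 score is commonly obtained by computing F1 per category and averaging the results (macro average). The slide does not say how the 0.52 was averaged, so the macro average in this sketch is an assumption. The function name is hypothetical.

```python
from collections import Counter


def macro_f1(truths, predictions):
    """Unweighted mean of the per-category F1 scores."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, predicted in zip(truths, predictions):
        if truth == predicted:
            tp[truth] += 1
        else:
            fp[predicted] += 1  # predicted category received a wrong example
            fn[truth] += 1      # true category missed one of its examples
    labels = set(truths) | set(predictions)
    per_label = []
    for label in labels:
        denominator = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2TP / (2TP + FP + FN), i.e. the harmonic mean of precision and recall.
        per_label.append(2 * tp[label] / denominator if denominator else 0.0)
    return sum(per_label) / len(per_label)
```

Because every category weighs the same in a macro average, rare categories with few examples can pull the overall score down even when most guesses are correct.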
37. Lessons learnt
• Language support - What about Spanish & Norwegian?
• Automated tools - Saved time
• Language preprocessing - Made model 7% better
• Useful to get initial advice from experts - Then implemented easily
38. Next steps
• Release feature with A/B testing
• Pipeline for training data
• Doctor decision support
• Proactive health care - “KRY tells you when you are sick”