4. Traditional ML - Challenges
Repetitive & time-
consuming
Resource-intensive Requires deep
statistical knowledge
5. Automated Machine Learning (AutoML)
Automation of time-
consuming processes
Incorporation of
best practices
Democratisation of
machine learning
6. Demo –
Titanic Dataset
Objective:
Predict if a passenger survived
Titanic’s maiden voyage based
on known parameters
Reference: https://www.kaggle.com/c/titanic/
Feature Definition Key
survival Survival 0 = No, 1 = Yes
pclass Ticket class 1 = 1st, 2 = 2nd, 3 = 3rd
sex Sex
Age Age in years
sibsp # of siblings / spouses
aboard the Titanic
parch # of parents / children
aboard the Titanic
ticket Ticket number
fare Passenger fare
cabin Cabin number
embarked Port of Embarkation C = Cherbourg, Q =
Queenstown, S =
Southampton
7. Data Pre-processing
• SQL
• Python
• R
• Etc.
Languages
• ‘Unknown’ added for missing values
• Age converted to a range
• Fare converted to a range
Steps taken in Titanic Dataset
9. Iterations
• Runs consist of multiple
combinations of ML
algorithms & hyperparameters
• Stops when exit criteria
reached
• Deploy best model as web
service
10. Measurements
Metric Purpose Equation
Accuracy Indication of success-rate for true positives & true negatives (Tp + Tn) / (Tp + Tn + Fp + Fn)
Precision Indication of success-rate for true positives Tp / (Tp + Fp)
Recall Indication of impact of false negatives Tp / (Tp + Fn)
(Tn) (Fp)
(Tp)(Fn)
12. Hosting
• Test in an Azure Container Instance
(ACI) container
• Productionise using an Azure
Kubernetes Services (AKS) cluster
• Consume REST web-service in
applications
13. Drift Detection
• Data drift causes model
performance degradation
• Detect drift on timeseries data
• Set-up alerts in Application
Insights
Reference - https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-monitor-datasets
15. MLOps
Reference - https://github.com/Microsoft/MLOps
• Integrate Azure Machine
Learning with Azure DevOps
(AML extension)
• Enable collaboration between
Data Scientists and other
teams
• Set-up CI/CD pipelines using
an effecting branching
strategy
16. Conclusions
• AutoML brings data science activities to a wider audience
• It reduces repetitive processing time thereby freeing up time for
higher-level activities
• Microsoft Azure Machine Learning is constantly evolving