Forget becoming a Data Scientist, become a Machine Learning Engineer instead

Data Con LA 2020
Description
Machine learning is an essential skill in today's job market. But when it comes to learning machine learning, beginners get a lot of conflicting advice. I have been teaching ML to software engineers for years. In this talk:

*I will dispel some of the myths surrounding machine learning

*give you a solid, tangible plan for how to go about learning ML

*and give you good pointers to start from

*and steer you away from common mistakes

Speaker
Sujee Maniyam, Elephant Scale, Founder, Principal instructor

Published in: Data & Analytics

Forget becoming a Data Scientist, become a Machine Learning Engineer instead

  1. Copyright (c) 2020 Elephant Scale Inc. All rights reserved. Becoming an ML Engineer
  2. Hi, I am Sujee Maniyam, Founder / Principal @ ElephantScale. Consult & teach AI, Data Science, Big Data and Cloud technologies. Author: 'Guided Machine Learning' (open source book for learning ML); 'Hadoop illuminated' (open source book); 'HBase Design Patterns' (Packt Publishing, 2015); 'Data Analytics With Spark And Hadoop' (O'Reilly video course). Contact: sujee@elephantscale.com, github.com/sujee, ElephantScale.com, https://www.linkedin.com/in/sujeemaniyam
  3. About This Talk: We will discuss what ML Engineering is and how to become an ML Engineer. More: download slides and sign up for a FREE MLEng class at tinyurl.com/yydcn48b
  4. Machine Learning Engineering
  5. What is Machine Learning: "The field of study that gives computers the ability to learn without being explicitly programmed." -- Arthur Samuel
  6. Traditional Programming vs. Machine Learning: Here is an example of a spam-detection rule engine. The rules are coded by developers. There could be 100s of 1000s of rules! if (email.from_ip.one_of("ip1", "ip2", "ip3")) { result = "no-spam" } else if (email.text.contains("free loans", "cheap degrees")) { result = "spam" }
  7. Traditional Programming vs. Machine Learning: Here is how we detect spam using ML. We don't explicitly write rules. Instead, we show the algorithm spam and non-spam emails. The algorithm 'learns' which attributes are indicative of spam. Then the algorithm predicts spam/no-spam on new email.
  8. Machine Learning Process: Machine learning is focused on building models: build the model, test/evaluate the model, rinse/repeat. Data Scientists focus on this. A lot of this is done on a laptop (small scale).
  9. Productionizing Models
  10. What is Machine Learning Engineering: Machine Learning Engineering is the process of taking machine learning models to production. It includes good software engineering practices, data analytics, and DevOps.
  11. Demand for ML Engineer
  12. ML Engineer Skill Set
  13. ML Engineer Skill Set: AI. A good ML engineer needs a good understanding of machine learning and deep learning algorithms (see next slide for explanation). What if I don't know enough math? Even though ML and DL are built on advanced math, we don't need a deep understanding of the mathematical theories to use the algorithms, because the tools and algorithms have gotten so much better and easier to use. Practical use of algorithms is recommended.
  14. AI vs. Machine Learning :-)
  15. AI / Machine Learning / Deep Learning. Artificial Intelligence (AI): the broader concept of "making machines smart". Machine Learning: a current application of AI in which machines learn from data using mathematical and statistical models. Deep Learning (hot!): using neural networks to solve some hard problems.
  16. From Laptop to Cloud: Data Scientists might develop their model on their laptop, with small-scale data and a smaller model. Training the model at large scale is typically done in a cloud environment. The ML Engineer will handle this.
  17. ML Engineer Skill Set: Cloud. Nowadays large-scale training and deployment happen on the cloud. Advantages of cloud: easy to get started; flexible; pay-as-you-use pricing; almost unlimited scale.
  18. Which Cloud? Three major cloud vendors: Google, Amazon, Microsoft. All of them have pretty good ML capabilities. Choose the one that best suits your needs, partnership deals, and team expertise.
  19. Deciding ML Services: Decide the spectrum of the service you'd like, based on desired control, flexibility and agility. Renting infrastructure: get a virtual machine with a GPU and train our own model. Renting an ML service: use a pre-built model, say a 'computer vision' model that is offered by a cloud vendor. This is basically 'ML as a Service'.
  20. ML Engineer Skill Set: Big Data & Distributed Computing. Training large-scale models may use large amounts of data, and training can be computationally intensive. For example, let's say working with 1 GB of data on a laptop takes 1 hr. What if we have 1 TB of data? It will definitely not fit into the laptop's memory. We would need to distribute the work across a cluster of machines.
  21. Distributed Computing: In distributed computing, data and computing are distributed across many nodes. Tools for distributed computing: Apache Spark (open source, very popular, cloud neutral); AWS Lambda (serverless compute); Google BigQuery (SQL at scale).
  22. Model Serving: Here is an example of model serving at scale. The system has to scale up and down based on load. If some nodes or applications crash, they need to be restarted automatically. The application is packaged as containers.
  23. ML Engineer Skill Set: DevOps. Deploying applications that are fault tolerant and work at scale requires modern DevOps. Tools of the trade: Docker (package applications as containers); Kubernetes (deploy and manage containers, especially in the cloud); Kubeflow (Kubernetes for Machine Learning); Monitoring and Logging (various tools).
  24. ML Engineer Learning Path
  25. Some Resources To Get You Started: 'Guided Machine Learning', a self-study guide for learning ML. Sign up, we meet every Saturday 11am PST. Download slides and sign up for a FREE MLEng class at tinyurl.com/yydcn48b
  26. Further Reading. Books: 'Data Science on AWS' by Chris Fregly, Antje Barth; 'Practical Deep Learning for Cloud, Mobile, and Edge' by Anirudh Koul, Siddha Ganju, Meher Kasam; 'Kubeflow for Machine Learning' by Trevor Grant, Holden Karau, Boris Lublinsky, Richard Liu, Ilan Filonenko. Websites / Blogs: www.datascienceonaws.com/
  27. Q&A & Thanks! Any questions? tinyurl.com/yydcn48b
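
Slides 6 to 8 contrast hand-coded rules with a model that learns from labelled emails and is then built, tested, and evaluated. The following is a minimal Python sketch of that contrast, not the speaker's code. The slides do not name an algorithm, so a small from-scratch Naive Bayes classifier is assumed here, and `rule_based`, `NaiveBayes`, and the sample emails are all hypothetical.

```python
import math
from collections import Counter


def rule_based(email_text, from_ip, trusted_ips=("ip1", "ip2", "ip3")):
    """Hand-written rules, mirroring the pseudocode on slide 6."""
    if from_ip in trusted_ips:
        return "no-spam"
    text = email_text.lower()
    if any(phrase in text for phrase in ("free loans", "cheap degrees")):
        return "spam"
    # Assumption: the slide does not say what happens when no rule fires.
    return "no-spam"


class NaiveBayes:
    """Tiny multinomial Naive Bayes with add-one smoothing (illustrative)."""

    def fit(self, texts, labels):
        self.label_counts = Counter(labels)
        self.word_counts = {}
        self.vocab = set()
        for text, label in zip(texts, labels):
            words = text.lower().split()
            self.word_counts.setdefault(label, Counter()).update(words)
            self.vocab.update(words)
        return self

    def predict(self, text):
        words = text.lower().split()
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus the log likelihood of each word under this label.
            score = math.log(n / total)
            for word in words:
                score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up training and held-out emails, for illustration only.
train = [
    ("free loans available now", "spam"),
    ("cheap degrees online free", "spam"),
    ("win free money now", "spam"),
    ("meeting agenda for monday", "no-spam"),
    ("lunch tomorrow with the team", "no-spam"),
    ("project status report attached", "no-spam"),
]
test = [("free money loans", "spam"), ("team meeting report", "no-spam")]

# Build the model, then evaluate it on emails it has not seen.
model = NaiveBayes().fit([t for t, _ in train], [y for _, y in train])
correct = sum(model.predict(text) == label for text, label in test)
accuracy = correct / len(test)
print(f"accuracy on held-out emails: {accuracy:.2f}")
```

The rule engine only catches the phrases a developer thought of, while the classifier picks up word statistics from the examples it is shown. In practice a library such as scikit-learn would replace the hand-rolled class.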
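
The 1 GB versus 1 TB example on slide 20 can be turned into a back-of-the-envelope estimate. This sketch assumes run time grows linearly with data size and divides evenly across nodes, which real clusters only approximate. The function `estimated_hours` and the 100-node cluster are illustrative and not from the talk.

```python
def estimated_hours(data_gb, hours_per_gb=1.0, nodes=1):
    """Idealized run time: linear in data size, perfectly split across nodes."""
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    return data_gb * hours_per_gb / nodes


# 1 TB taken as 1000 GB, at the slide's rate of 1 hour per GB.
single_machine = estimated_hours(1000)
cluster = estimated_hours(1000, nodes=100)  # hypothetical 100-node cluster

print(f"one machine: {single_machine:.0f} h (about {single_machine / 24:.1f} days)")
print(f"100 nodes:   {cluster:.0f} h")
```

Even under these generous assumptions, a single machine would need weeks, which is the motivation for spreading the work over a cluster.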
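
Slide 21 describes distributed computing as spreading both data and computation across many nodes. The sketch below imitates that pattern on one machine: the data is split into partitions, each partition is processed independently, and the partial results are merged. Threads stand in for cluster nodes here, which is an assumption made to keep the example self-contained. A tool such as Apache Spark applies the same idea across real machines, and all function names are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partition(items, n_parts):
    """Split items into n_parts chunks, round-robin."""
    return [items[i::n_parts] for i in range(n_parts)]


def count_words(lines):
    """Work done independently on one partition (the 'map' step)."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def distributed_word_count(lines, n_workers=3):
    parts = partition(lines, n_workers)
    # Each worker thread plays the role of one node in a cluster.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(count_words, parts))
    # Merge the per-partition results (the 'reduce' step).
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


emails = [
    "free loans today",
    "meeting at noon",
    "free degrees",
    "loans and more loans",
]
totals = distributed_word_count(emails)
print(totals.most_common(2))
```

The result is the same as counting on a single machine. Only the placement of the work changes.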
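
Slide 22 says the serving system has to scale up and down with load. One way to make that concrete is the proportional rule documented for the Kubernetes Horizontal Pod Autoscaler: multiply the current replica count by the ratio of observed load to target load, and round up. The sketch assumes a single load metric per replica (for example, requests per second). The function name, the bounds, and the numbers are illustrative and not from the talk.

```python
import math


def desired_replicas(current_replicas, current_load, target_load,
                     min_replicas=1, max_replicas=10):
    """Proportional scaling rule, clamped to [min_replicas, max_replicas]."""
    if target_load <= 0:
        raise ValueError("target_load must be positive")
    raw = math.ceil(current_replicas * current_load / target_load)
    return max(min_replicas, min(max_replicas, raw))


# Load per replica is above target: scale up.
print(desired_replicas(4, current_load=90, target_load=60))
# Load per replica is below target: scale down.
print(desired_replicas(4, current_load=30, target_load=60))
```

In a real deployment this decision is made by the orchestrator rather than by application code. The sketch only shows the arithmetic behind "scale based on load".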
