#DATACAREER
FIRST AND SECOND #DATAROLES IN THE INDUSTRY
AND STARTUPS
SPECIAL EDITION WITH ADAM GREEN
FROM AIGUILD.EVENTBRITE.COM
#DATACAREER
“No matter who you are, self-improvement is
one of the most important and most
overlooked attributes of young AI talent. It
only takes four years of experience to become
a senior AI researcher, or five years of
experience to lead an entire institute. The
determination and discipline to improve both
the hard and soft skills continually will be the
deciding factor in an AI researcher’s career.”
Jean-François Gagné
Dânia Meira
Founding member, AI Guild
ML models for predictive analytics
Former bootcamp teacher
#datacareer since 2012
LinkedIn
Adam Green
Founding member, AI Guild
Senior data scientist
Former bootcamp director
Focus on energy industry
LinkedIn
Chris Armbruster
Founding member, AI Guild
10,000 Data Scientists for Europe
Former bootcamp director
#datacareer coaching since 2017
LinkedIn
#DATACAREER
WORKSHOP
OUTLINE
AI Guild career
coaching
#dataroles
specialization
#dataroles
upgrading
#datacareer
orientation
AI GUILD CAREER
COACHING
Running for junior and for senior
practitioners since early 2019
Runs monthly for AI Guild members
Coaching capacity per year: 240
participants
INSIGHTS FROM
CAREER
COACHING
Search for the 1st as well as the 2nd role may
take >6 months
Upgrading inside a company may be easier
Job advertisements may be misleading and
confusing
The role ‘in real life’ may not match the talent’s
expectations
OBSERVING THE
MARKET
Specialization and differentiation of roles
Rising value of domain expertise
Experimental phase with PoC plays ending
Increasing focus on deployment
ANECDOTAL EVIDENCE
FOR DIGITAL ADOPTION
AND BEHAVIOR
INCREASING 10X
… but labor market
admittedly very
difficult
#DATACAREER
WORKSHOP
OUTLINE #dataroles
specialization
#dataroles
upgrading
#datacareer
orientation
PRODUCTIONIZING MACHINE LEARNING
Diagram components: ML Models, Data Collection, Data Quality, Infrastructure, Process Management Tools, Machine Resource Management, Monitoring, Configuration, Feature Extraction, Analysis, Data Preprocessing, Parameter Configuration, Offline Validation, Business Logic, A/B Testing
Roles shown: Data Engineer, Data Scientist, Data Analyst, ML Engineer, AI Researcher
#dataroles
See also: “Hidden Technical Debt in Machine Learning Systems” by Sculley et al., Google Inc., 2015
#DATAROLES
Data Scientist
Task: Understand business case, build features to train predictive models to address such use cases
Skill: Statistics, SQL, programming (e.g. Python, R), ML & DL techniques
Data Analyst
Task: Business and data understanding to report on what happens
Skill: Descriptive analytics, SQL, statistics, dashboarding and visualization tools
Data Engineer
Task: Build and maintain infrastructure and pipeline to collect, clean and pre-process data
Skill: Distributed systems, databases, software engineering
Machine Learning Engineer
Task: Optimize, deploy and maintain machine learning models in production
Skill: Software engineering, DevOps and systems architecture
AI Researcher
Task: Build new machine learning algorithms, find custom scientific solutions
Skill: Research, presenting at conferences, writing publications
‘COOKING’ DATA: EXPLAINING SPECIALIZATION
Diagram components: ML Models, Data Collection, Data Quality, Infrastructure, Process Management Tools, Machine Resource Management, Monitoring, Configuration, Feature Extraction, Analysis, Data Preprocessing, Parameter Configuration, Offline Validation, Business Logic, A/B Testing
See also: Understanding a Machine Learning Workflow Through Food by Daniel Godoy
Sowing Harvesting Choose recipe
Prepare ingredients
Customers tasting
Kitchen Tasting
Use utensils
Try combinations of
appliances and recipes
Kitchen space and available appliances
UNDERSTANDING #DATAROLES
Build Kitchen Appliances
Create and use recipes to cook
Check quality of ingredients and recipes
Process ingredients at scale
Turn a recipe into many dishes served efficiently
Data Engineer
Data Scientist
Data Analyst
ML Engineer
AI Researcher
ADAM’S CASE: COMING INTO DATA FROM INDUSTRY
Chemical engineering and black
box modelling
Working as energy engineer with
spreadsheets and linear
programming
From bootcamp graduate to
director
ADAM’S TAKE ON #DATAROLES
THE DATA ANALYST
ENRICHES DATA
THE DATA SCIENTIST MAKES
PREDICTIONS
THE DATA ENGINEER
ENABLES ACCESS TO DATA
WHAT IS GREAT
ABOUT BEING A
DATA SCIENTIST
• A never-ending story of learning
• Tooling is free
• Lots of freely accessible data
• Leverage of the technology
• The variety of non-traditional and interdisciplinary routes into the field
• Future proof
• People are excited and interested in what you do
• Many interesting life lessons
WHAT IS GETTING
EASIER
• Tooling
• Putting code into production
• Differentiation of roles
WHAT IS STILL
DIFFICULT
• Knowing where to stop learning
• Mastering new algorithms
• Keeping up with research
• Dealing with impostor syndrome
• Access to simulators
• APIs and libraries for Reinforcement Learning
#DATACAREER
WORKSHOP
OUTLINE
#dataroles upgrading #datacareer orientation
LET’S START FROM
YOUR ‘USERS’ AND
‘CUSTOMERS’
• Hiring managers
• Human resources
• Recruiters
• Network of friends and colleagues
• Company leaders
DATA ENGINEER
SQL, Bash, Java, Scala, Python
Hadoop: Hive, Pig, Spark
Databases: e.g. Microsoft SQL Server, PostgreSQL, MongoDB
Platforms: AWS, Google Cloud Platform, Microsoft Azure, Linux
Tools: git, docker, airflow, Jenkins
Language-specific skills are important, also for ETL and databases. Certifications from AWS, Google, or
Cloudera may be relevant.
Key topics include data pipelines, algorithms and data structures, and an understanding of system
design.
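As a rough illustration of the "collect, clean and pre-process" pipeline work attributed to the data engineer, here is a minimal Python sketch using only the standard library. The sample readings, the column names (`meter_id`, `kwh`) and the function names are hypothetical and do not come from the workshop; a production pipeline would more likely rely on the tools listed on this slide.

```python
import csv
import io

# Hypothetical raw export; in practice this would come from a file, an API or a database.
RAW_CSV = """meter_id,kwh
A1,12.5
A2,
A3,not_a_number
A1,7.5
"""


def collect(raw_text):
    """Collect: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def clean(rows):
    """Clean: drop rows whose kwh value is missing or not numeric."""
    cleaned = []
    for row in rows:
        try:
            cleaned.append({"meter_id": row["meter_id"], "kwh": float(row["kwh"])})
        except (TypeError, ValueError):
            continue  # skip unusable records
    return cleaned


def preprocess(rows):
    """Pre-process: aggregate total kwh per meter."""
    totals = {}
    for row in rows:
        totals[row["meter_id"]] = totals.get(row["meter_id"], 0.0) + row["kwh"]
    return totals


totals = preprocess(clean(collect(RAW_CSV)))
print(totals)
```

Keeping the three stages as separate functions is one possible design; it makes each step testable on its own before the pipeline is scheduled with a tool such as airflow.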
DATA ANALYST
SQL, Excel
Visualization tools like Tableau
Python/R packages like matplotlib, seaborn, ggplot2
Key topics include statistical knowledge, data analysis, data interpretation, and a logical approach.
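To make "descriptive analytics" a little more concrete, the following is a small sketch of a summary report in Python, using only the standard library. The weekly order counts and the `describe` helper are hypothetical examples, not material from the workshop.

```python
import statistics

# Hypothetical weekly order counts; real work would usually start from SQL or Excel data.
weekly_orders = [120, 135, 128, 160, 142, 138]


def describe(values):
    """Return basic descriptive statistics for a list of numbers."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }


summary = describe(weekly_orders)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

A summary like this reports on what happened; it makes no prediction, which is where the data scientist role begins.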
DATA SCIENTIST
SQL, Bash
R: dplyr, sqldf, tidyr, lubridate, shiny, ggplot2, MLR, ranger, xgboost
Python: numpy, pandas, matplotlib, scikit-learn, keras
Hadoop: Hive, Pig, Spark
Databases: Microsoft SQL Server, PostgreSQL, MongoDB
Tools: git, jupyter notebook, docker
Models & algorithms: Statistical models and distributions, linear and logistic regression, random forest,
backpropagation, ARIMA, Natural Language Processing, Computer Vision
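Linear regression is the simplest of the models listed above. The sketch below fits one by ordinary least squares in plain Python, assuming a single input feature; the temperature and demand figures are invented for illustration, and in practice a library such as scikit-learn would usually be used instead.

```python
# Hypothetical training data: outside temperature (°C) vs. heating demand (MWh).
temperature = [0.0, 5.0, 10.0, 15.0, 20.0]
demand = [50.0, 40.0, 30.0, 20.0, 10.0]


def fit_linear(x, y):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x_new, slope, intercept):
    """Predict the target for a new input using the fitted line."""
    return slope * x_new + intercept


slope, intercept = fit_linear(temperature, demand)
print(predict(12.0, slope, intercept))
```

The same fit-then-predict pattern carries over to the other models on the list, although each has its own training procedure.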
WHERE ARE WE TODAY?
ML IS WIDELY DEPLOYED AND THE
PRACTICE DEVELOPING CREATIVELY
MORE AND MORE INDUSTRIES ARE
PROGRESSING FROM DIGITAL TO
DATA AND ARTIFICIAL INTELLIGENCE
VALIDATING THE BUSINESS CASE IS
KEY
#DATACAREER
WORKSHOP
OUTLINE
#datacareer orientation
MARKET ORIENTATION
WINNING BIG
AND SMALL
KEY INDUSTRY
CHALLENGES*
• Data volume, accessibility, and quality
• Trust of customers, stakeholders, and employees, including governance, compliance, and reputation
• Competence of employees, management, and company
*Based on the 2019 PwC report “Künstliche
Intelligenz in Unternehmen”, p. 12
SOME STARTUP
CHALLENGES
• Data volume, accessibility, and
quality
• Company funding and runway
• Expertise levels and team size
STARTUP
MARKET MAP
EMPLOYER SKILLS GAPS
…AMONG
EARLY AI
ADOPTERS IN
THE UNITED
STATES
…IN
CORPORATE
EUROPE
…IN GERMAN
INDUSTRY
…AMONG AI
PLAYERS IN
GERMANY
PROMISING #AIUSECASES
…AMONG EARLY
ADOPTERS IN
THE USA
…AMONG AI
PLAYERS IN
GERMANY
…PROSPECTIVE
USE CASES IN
GERMANY
WRAPPING UP
Keep observing the market
Look for matches between employers’
needs and your skills profile
Scan the industry and startups for the
most promising #aiusecase
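One simple way to look for matches between employers' needs and a skills profile is to treat both as sets and measure the overlap. The sketch below does this in Python; the role skill sets are abridged from the role slides earlier in the deck, while the candidate profile, the `coverage` measure and the function names are hypothetical choices made for illustration.

```python
# Skill sets abridged from the role slides; the candidate profile below is hypothetical.
ROLE_SKILLS = {
    "Data Engineer": {"sql", "bash", "java", "scala", "python", "spark", "docker", "airflow", "git"},
    "Data Analyst": {"sql", "excel", "tableau", "matplotlib", "seaborn", "ggplot2"},
    "Data Scientist": {"sql", "bash", "python", "pandas", "scikit-learn", "keras", "git", "docker"},
}


def coverage(profile, required):
    """Share of a role's listed skills that the profile already covers."""
    return len(profile & required) / len(required)


def rank_roles(profile):
    """Return (role, coverage, missing skills) tuples, best coverage first."""
    results = []
    for role, required in ROLE_SKILLS.items():
        results.append((role, coverage(profile, required), sorted(required - profile)))
    return sorted(results, key=lambda item: item[1], reverse=True)


my_profile = {"sql", "python", "pandas", "scikit-learn", "matplotlib", "git", "excel"}
for role, score, missing in rank_roles(my_profile):
    print(f"{role}: {score:.0%} covered, missing {missing}")
```

The list of missing skills per role is arguably the more useful output, since it suggests what to learn next for a given target role.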
THANK YOU
Join at
theguild.ai/community

#Datacareer - AI Guild workshop on data roles in industry with Adam Green
