Building up a Data Science Team
Amadeus Magrabi
Lead Data Scientist, commercetools
Plan for today
All Rights Reserved @2019 2
• Background:
o Me
o commercetools
o Data Science @ commercetools
• What I learned about building up a data science team
o Hiring
o Communication
o Choosing projects
o Choosing tools
• Questions
Background
Me:
• Studied cognitive science
• Research in neuroscience/ML for 3 years
• Working in data science for 3 years
commercetools:
• Offers e-commerce software via cloud-based APIs
• Founded in 2006
• 150+ employees
• Main offices in Berlin, Munich, Durham (US)
Data Science @ commercetools:
Image Similarity Search
[Figure: a search image and three predictions (Prediction 1, Prediction 2, Prediction 3)]
Data Science @ commercetools:
Category Recommendations
Data Science @ commercetools
Team structure:
• 3 data scientists
• 2 software engineers
• 1 product owner
• 1 working student
Team output:
• For merchants:
o APIs that make it easier to manage data and improve data quality.
• For customers:
o APIs that enable innovative data-based features.
• For colleagues:
o Make internal company processes more data-driven, more efficient and more accurate.
Lessons learned from building up the data science team at commercetools
Lesson 1:
Job titles in data science are vague and unreliable.
• Data Scientist:
Vague combination of knowledge about statistics, machine learning, software engineering and business domains.
• Machine Learning Engineer:
Focus on building ML-based software products.
• Data Analyst:
Focus on statistics, visualizations and business insights.
• Data Engineer:
Manages ETL pipelines to make data accessible and maintain data quality.
• Software Engineer:
Ensures code is efficient, scalable and maintainable.
• Machine Learning Researcher:
Focus on publishing research papers.
• Machine Learning Product Owner:
Communicates with stakeholders to define requirements and priorities.
[Figure: diagram relating Software Engineering, Statistics/Machine Learning and Business Knowledge to Data Science]
Lesson 2:
Do not try to find someone who can do everything.
Build a balanced and autonomous team instead.
[Figure: diagram of Software Engineering, Statistics/Machine Learning, Business Knowledge and Data Science, marking generalist, specialist and unicorn profiles]
Lesson 3:
Prioritize simplicity over business impact for the first projects.
Lesson 4:
Talk to everyone and set the right expectations.
• Talk to people across the whole organization to identify the most valuable data projects.
• Find data enthusiasts who support your initiatives, and convince skeptics.
• Over-communication is better than under-communication.
• Avoid setting overly optimistic expectations (better to under-promise and over-deliver).
Lesson 5:
Find a good balance between flexibility and consistency of tools.
Flexibility makes prototyping and hiring easier.
Consistency makes collaboration and productionizing easier.
Lesson 6:
Data science is a long-term investment.
[Figure labels: rare, more common]
Thanks for listening!
Summary
1. Job titles in data science are vague and unreliable.
2. Build a balanced and autonomous team.
3. Prioritize simplicity over business impact for the first projects.
4. Talk to everyone and set the right expectations.
5. Find a good balance between flexibility and consistency of tools.
6. Data science is a long-term investment.
Social media
twitter.com/AmadeusMagrabi
linkedin.com/in/amadeusmagrabi
medium.com/@amadeus.magrabi

Capstone Project on IBM Data Analytics Program
 
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdf
 
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
 
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptx
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 
Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptx
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptx
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptx
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptx
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptx
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
 

Building up a Data Science Team from Scratch

  • 1. Building up a Data Science Team. Amadeus Magrabi, Lead Data Scientist, commercetools.
  • 2. Plan for today. Background: me; commercetools; Data Science @ commercetools. What I learned about building up a data science team: hiring; communication; choosing projects; choosing tools. Questions. All Rights Reserved @2019.
  • 3. Background. Me: studied cognitive science; research in neuroscience/ML for 3 years; working in data science for 3 years. commercetools: offers e-commerce software via cloud-based APIs; founded in 2006; 150+ employees; main offices in Berlin, Munich and Durham (US).
  • 4. Data Science @ commercetools: Image Similarity Search. [Figure labels: Search Image, Prediction 1, Prediction 2, Prediction 3.]
  • 5. Data Science @ commercetools: Category Recommendations.
  • 6. Data Science @ commercetools. Team structure: 3 data scientists; 2 software engineers; 1 product owner; 1 working student. Team output: for merchants, APIs that make it easier to manage data and improve data quality; for customers, APIs that enable innovative data-based features; for colleagues, making internal company processes more data-driven, more efficient and more accurate.
  • 7. Lessons learned from building up the data science team at commercetools.
  • 8. Lesson 1: Job titles in data science are vague and unreliable. Data Scientist: vague combination of knowledge about statistics, machine learning, software engineering and business domains. Machine Learning Engineer: focus on building ML-based software products. Data Analyst: focus on statistics, visualizations and business insights. Data Engineer: manages ETL pipelines to make data accessible and maintain data quality. Software Engineer: ensures code is efficient, scalable and maintainable. Machine Learning Researcher: focus on publishing research papers. Machine Learning Product Owner: communicates with stakeholders to define requirements and priorities. [Diagram labels: Software Engineering, Statistics/Machine Learning, Business Knowledge, Data Science.]
  • 9. Lesson 2: Do not try to find someone who can do everything. Build a balanced and autonomous team instead. [Diagram labels: Software Engineering, Statistics/Machine Learning, Business Knowledge, Data Science; generalist, specialist, unicorn.]
  • 10. Lesson 3: Prioritize simplicity over business impact for the first projects. [Figure: two images contrasted with "vs."; images not preserved.]
  • 11. Lesson 4: Talk to everyone and set the right expectations. Talk to people across the whole organization to identify the most valuable data projects. Find data enthusiasts who support your initiatives, and convince skeptics. Over-communication is better than under-communication. Avoid setting overly optimistic expectations (better to under-promise and over-deliver).
  • 12. Lesson 5: Find a good balance between flexibility and consistency of tools. Flexibility makes prototyping and hiring easier. Consistency makes collaboration and productionizing easier.
  • 13. Lesson 6: Data science is a long-term investment. [Figure labels: rare, more common.]
  • 14. Thanks for listening! Summary: 1. Job titles in data science are vague and unreliable. 2. Build a balanced and autonomous team. 3. Prioritize simplicity over business impact for the first projects. 4. Talk to everyone and set the right expectations. 5. Find a good balance between flexibility and consistency of tools. 6. Data science is a long-term investment. Social media: twitter.com/AmadeusMagrabi, linkedin.com/in/amadeusmagrabi, medium.com/@amadeus.magrabi