Demystifying Data Science
Venkatesh
Data Science Expert and Machine Learning Researcher
What is Data Science?
Data science is an interdisciplinary field that uses
scientific methods, processes, algorithms and systems
to extract knowledge and insights from data in
various forms, both structured and
unstructured,[1][2]
similar to data mining.
Well, tell me in layman's terms
Data
Domain Expertise
Algorithms
Insights
Data products
Automation / Optimization
Business value
Intelligent Systems - A simple definition
Systems that perform actions that,
if performed by humans, would be
considered intelligent
Tasks of Intelligence
Sensing
Language Understanding
Planning
Problem Solving
Knowledge
Decision Making
Learning
Inference
Language Generation
Robotics Control
Companies have AI issues
Engineering wants to get its hands on Machine Learning
The C-Suite needs an “AI” strategy
Marketing wants to include “AI” in product descriptions
Product is afraid of falling behind
Everyone is pitching you technology
Wait... What happened to data warehousing?
First, What is data warehousing?
● Integrated: Constructed by combining data from heterogeneous sources such as relational databases, flat files, etc.
● Time-Variant: Provides information with respect to a particular time period.
● Non-volatile: Data once entered into the warehouse should not change
However, it does not provide:
1. Automatic discovery of patterns
2. Prediction of likely outcomes
3. Creation of actionable information
Courtesy: https://www.educba.com/data-warehousing-vs-data-mining/
What about Business intelligence? Reporting?
● Summarizes the factual/historical data
● Delivers reports, KPIs and trends in a visually
pleasing manner
● Allows the organisation to see the big picture
● Helps them make better decisions to support
the mission.
● BI systems are designed to look backwards
based on real data from real events.
“What happened and what needs to change?”
● Data Science looks forward, interpreting the
information to predict what might happen in the future.
“Why it happened and how to change it?”
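The contrast between backward-looking BI and forward-looking data science can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the monthly sales figures are invented, the function names are hypothetical, and a straight-line least-squares fit stands in for whatever predictive model a real team would choose.

```python
# Hypothetical monthly sales figures, invented purely for illustration.
monthly_sales = [100.0, 110.0, 120.0, 130.0, 140.0, 150.0]


def bi_summary(values):
    """Backward-looking (BI style): describe what already happened."""
    return {
        "total": sum(values),
        "average": sum(values) / len(values),
        "last_change": values[-1] - values[-2],
    }


def fit_line(values):
    """Ordinary least-squares fit of value = intercept + slope * index."""
    n = len(values)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def forecast_next(values):
    """Forward-looking (data science style): extrapolate one step ahead."""
    intercept, slope = fit_line(values)
    return intercept + slope * len(values)


summary = bi_summary(monthly_sales)
next_month = forecast_next(monthly_sales)
print("What happened:", summary)
print("What might happen next month:", round(next_month, 2))
```

The summary only restates the past; the forecast makes a claim about a month that has not happened yet, which is where the modelling assumptions (here, a linear trend) start to matter.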
STATISTICAL MACHINE LEARNING
DEEP LEARNING
EVIDENCE-BASED REASONING
RECOMMENDATION SYSTEMS
NATURAL LANGUAGE GENERATION
CHAT/CONVERSATIONAL INTERFACES
ROBOTIC PROCESS AUTOMATION
TEXT ANALYSIS
What makes a Data Science Team?
Research
Courtesy: https://www.business-science.io/business/2018/09/18/data-science-team.html
Who are the members?
Data Engineers
Data Scientists
Full Stack Developers
Product Managers
Research
Data Engineer: do they only do ETL?
● Industry has shifted from drag-and-drop ETL tools towards a more
programmatic approach
● Nature of data that needs to be processed is changing day by day
(Processing Files/Batches --> Real time stream data)
Expected Skill Sets:
● Should not stick with a set of tools for building data pipelines
● Has to be a good software engineer
● Comfortable in working with open source platforms
● Adaptable to constantly evolving open source tools
● Employ a variety of tools and languages to marry systems together
Courtesy: http://podcast.freecodecamp.org/ep-37-the-rise-of-the-data-engineer
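The shift from processing files or batches to processing real-time streams can be sketched as follows. This is a minimal, library-free illustration: the `Event` record shape and both function names are assumptions made for the example, and a production pipeline would normally rely on a dedicated batch or stream processing framework rather than plain Python.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

# (sensor_id, reading): a hypothetical record shape used only in this sketch.
Event = Tuple[str, float]


def batch_totals(events: List[Event]) -> Dict[str, float]:
    """Batch style: the whole file is available before processing starts."""
    totals: Dict[str, float] = {}
    for key, value in events:
        totals[key] = totals.get(key, 0.0) + value
    return totals


def streaming_totals(events: Iterable[Event]) -> Iterator[Dict[str, float]]:
    """Streaming style: emit an updated snapshot after every single event."""
    totals: Dict[str, float] = {}
    for key, value in events:
        totals[key] = totals.get(key, 0.0) + value
        yield dict(totals)  # copy so earlier snapshots are not mutated later


events = [("a", 1.0), ("b", 2.0), ("a", 3.0)]

final_batch = batch_totals(events)
snapshots = list(streaming_totals(iter(events)))

print("Batch result:", final_batch)
for step, snapshot in enumerate(snapshots, start=1):
    print(f"Stream after event {step}:", snapshot)
```

Both versions compute the same totals; the difference is that the streaming version has a usable answer after each event instead of only at the end of the file.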
Why does a DS team need a Full Stack Developer?
● Development of Pilots and MVP Applications
○ Productize the data science work so it can serve
an internal stakeholder
○ Interactive display of results/stats/insights
● Responsible for bringing a Software Engineering culture into the Data Science process
○ Build Infrastructure as Code - Automation of the Data Science team infrastructure and testing
○ Continuous Integration and Version Control
○ Development of APIs to help integrate data products and source into applications
○ Building tools for internal use, such as tools for data collection and data labelling
Courtesy: https://towardsdatascience.com/what-is-the-role-of-an-ai-software-engineer-in-a-data-science-team-eec987203ceb
Data Scientists come in many types
Type A
● High understanding of domain knowledge
● Uses ready-made tools instead of developing algorithms
● Has little or no hands-on experience in building software applications
● Insight oriented
● Focuses on better understanding of the business
Type B
● Has basic theoretical knowledge in data science
● Has good hands-on experience in building software applications
● Capable of building an end-to-end prototype or MVP
Type C
● Deep understanding of data science algorithms
● Has great hands-on product development skills
Domain Experts
● Experts both by education and experience in that domain
● Aware of what data is available and able to judge how good it is
● Major contribution in Feature Engineering and Modeling
● Use and apply the deliverables of a data science project
in the real world
● Communicate with the intended users of the project’s outcome
● Define the framework for a data science project as they would know
○ What are the current challenges
○ How they must be answered to be practically useful
● Can learn enough data science to make a reasonable model using standardized tools
Courtesy: https://www.linkedin.com/pulse/role-domain-knowledge-data-science-patrick-bangert/
Cutting edge Research
● Seek to understand and develop systems by advancing the
longer-term academic problems surrounding AI
● Actively engage with the research community through
○ publications
○ participation in technical conferences and workshops
● Has the skills to craft customized data science and
machine learning algorithms
● Their focus will be to do research, not solve a business problem
● Data science researchers should not be an early hire
Building a team for Startup
Courtesy: https://thinkgrowth.org/the-startup-founders-guide-to-analytics-1d2176f20ac1
Building a team for an Enterprise
Courtesy: https://www.mckinsey.com/business-functions/mckinsey-analytics/our-insights/breaking-away-the-secrets-to-scaling-analytics
Who should lead the DS Team
Courtesy: https://www.altexsoft.com/blog/datascience/how-to-structure-data-science-team-key-models-and-roles/
Spotify Case Study
‘Center of Excellence’ Model
Keys to Excellence
Courtesy: https://www.slideshare.net/productschool/the-why-how-of-enterprise-analytics-w-spotify-data-scientist-79046775
AirBnB
● Interview process focused on practical data skills
● ‘Data challenges’ - Airbnb data + real question
● Lightning talks
● Support community
● Multi-stage screening:
○ Recruiter screen
○ Take home data challenge
○ Onsite challenge
○ Trained graders
○ Two graders for each test to ensure consistency
○ 1:1s with hiring manager, business partner, CV
Courtesy: https://www.slideshare.net/Work-Bench/scaling-data-science-at-airbnb
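Using two graders per test only ensures consistency if their agreement is actually checked. The slides do not say how this was measured; one common choice, assumed here, is Cohen's kappa, which corrects raw agreement for agreement expected by chance. The grades below are invented for illustration.

```python
from collections import Counter


def cohens_kappa(grades_a, grades_b):
    """Cohen's kappa for two graders who rated the same submissions."""
    if len(grades_a) != len(grades_b) or not grades_a:
        raise ValueError("Both graders must rate the same non-empty set")
    n = len(grades_a)
    # Observed agreement: share of submissions with identical grades.
    observed = sum(a == b for a, b in zip(grades_a, grades_b)) / n
    # Expected agreement if each grader assigned labels independently
    # at their own observed rates.
    counts_a = Counter(grades_a)
    counts_b = Counter(grades_b)
    labels = set(counts_a) | set(counts_b)
    expected = sum((counts_a[l] / n) * (counts_b[l] / n) for l in labels)
    if expected == 1.0:
        return 1.0  # both graders used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical pass/fail grades for ten take-home challenges.
grader_a = ["P", "P", "F", "P", "F", "F", "P", "P", "F", "P"]
grader_b = ["P", "P", "F", "F", "F", "F", "P", "P", "P", "P"]

kappa = cohens_kappa(grader_a, grader_b)
print(f"Cohen's kappa: {kappa:.3f}")
```

A kappa near 1 suggests the rubric is being applied consistently; a value near 0 means the graders agree about as often as chance would predict, which is a signal to revisit the rubric or the grader training.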
Evolution Of AirBnB’s DS Team
Year | Team size | Work structure | Specialisation
2012 | 7 | Centralised (work closely within team) | All generalists (Data Engineers + Data Scientists + Data Analysts)
2013 | 14 | Started working with other team members | Hired first Data Engineer
2014 | 28 | Embedded with other teams | Separate team for Data Engineering
2015 | 55 | Embedded with other teams | Data Science Infrastructure team, specialised roles for NLP, CV
Hiring (2012 to 2015, in slide order):
● Take home data challenge followed by 1:1 interview with whole team and founders
● Onsite data challenge
● Created rubrics and grading criteria
● Started hiring interns
● Started focusing on diversity and specialised roles for NLP, CV
Courtesy: https://vimeo.com/148942395
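The team sizes quoted for 2012 to 2015 (7, 14, 28, 55) amount to roughly doubling every year. The arithmetic can be checked with a short sketch; the variable names are illustrative, and the only inputs are the four figures from the slide.

```python
# Team sizes as quoted on the slide.
team_size = {2012: 7, 2013: 14, 2014: 28, 2015: 55}

years = sorted(team_size)

# Year-over-year growth factor: size this year divided by size last year.
growth = {
    later: team_size[later] / team_size[earlier]
    for earlier, later in zip(years, years[1:])
}

# Compound annual growth rate over the whole period.
periods = years[-1] - years[0]
cagr = (team_size[years[-1]] / team_size[years[0]]) ** (1 / periods) - 1

for year, factor in growth.items():
    print(f"{year}: x{factor:.2f} versus previous year")
print(f"Compound annual growth rate: {cagr:.1%}")
```

Growth at this pace helps explain why the hiring process and specialisation had to change year by year: a process that works for 7 generalists is unlikely to hold up for 55 people.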
Facebook
● On-boards infra data scientists and engineers through the Bootcamp program
● Provides broad exposure to engineering systems in a supportive learning environment.
● Encourages engineering teams to identify mentors to guide new data scientists as they ramp up in their first projects.
● New data scientists receive mentoring on the ways to communicate the results of their complex analyses.
● Data Scientists are presented with the following options:
○ develop deep domain expertise in an area and spend several years embedded with a team
○ move across partner teams every 12 to 18 months in order to develop a broad understanding
● Provides opportunities to learn and master state-of-the-art skills:
○ Internal training sessions and chalk talks
○ Invite external speakers to cover important developments in the field
○ Closely connected to the academic community
○ Attend and present at major conferences such as INFORMS, KDD, and NIPS
Courtesy: https://code.fb.com/core-data/building-data-science-teams-to-have-an-impact-at-scale
Apple’s Acqui-hiring Strategy
● Apple acqui-hires startups to make its technology smarter and faster
● It buys a whole company to get the team and/or technology
● Hoping to compete with Google’s search service, Apple bought Siri in 2010
● Pandora, Spotify, and Google Music started to predict songs a user will like.
● Apple saw this, which likely prompted the company to purchase Beats Music
(a streaming music service that has a similar algorithm)
● Recently Apple has hired at least 18 people from an enterprise consulting startup
called Silicon Valley Data Science, including at least two co-founders,
one of whom is the CEO
Where should the focus be
Don’t focus on the technology
Focus on the functionality
The functionality is driven by business needs
The functionality is supported by algorithms & data
The algorithms are instrumental to business
Courtesy: Kristian Hammond, Northwestern University
What you need to ask when considering AI
Data: Do you have the data that support it?
Task: Is your task genuinely data driven?
Scale: Do you need the scale automation provides?
THANK YOU FOR YOUR ATTENTION
DO YOU HAVE ANY QUESTIONS ?

Reconciling Conflicting Data Curation Actions: Transparency Through Argument...
 
High Profile Call Girls Service in Jalore { 9332606886 } VVIP NISHA Call Girl...
High Profile Call Girls Service in Jalore { 9332606886 } VVIP NISHA Call Girl...High Profile Call Girls Service in Jalore { 9332606886 } VVIP NISHA Call Girl...
High Profile Call Girls Service in Jalore { 9332606886 } VVIP NISHA Call Girl...
 
Top Call Girls in Balaghat 9332606886Call Girls Advance Cash On Delivery Ser...
Top Call Girls in Balaghat  9332606886Call Girls Advance Cash On Delivery Ser...Top Call Girls in Balaghat  9332606886Call Girls Advance Cash On Delivery Ser...
Top Call Girls in Balaghat 9332606886Call Girls Advance Cash On Delivery Ser...
 
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book nowVadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
 
Jodhpur Park | Call Girls in Kolkata Phone No 8005736733 Elite Escort Service...
Jodhpur Park | Call Girls in Kolkata Phone No 8005736733 Elite Escort Service...Jodhpur Park | Call Girls in Kolkata Phone No 8005736733 Elite Escort Service...
Jodhpur Park | Call Girls in Kolkata Phone No 8005736733 Elite Escort Service...
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
 
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi ArabiaIn Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
In Riyadh ((+919101817206)) Cytotec kit @ Abortion Pills Saudi Arabia
 
Gartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptxGartner's Data Analytics Maturity Model.pptx
Gartner's Data Analytics Maturity Model.pptx
 
Computer science Sql cheat sheet.pdf.pdf
Computer science Sql cheat sheet.pdf.pdfComputer science Sql cheat sheet.pdf.pdf
Computer science Sql cheat sheet.pdf.pdf
 

Building successful data science teams

  • 1. Demystifying Data Science Venkatesh Data Science Expert and Machine Learning Researcher
  • 2. What is Data Science? Data science is an interdisciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from data in various forms, both structured and unstructured,[1][2] similar to data mining.
  • 3. Well, tell me in layman's terms: Data; Domain Expertise; Algorithms; Insights; Data products; Automation / Optimization; Business value
  • 4. Intelligent Systems - A simple definition Systems that perform actions that, if performed by humans, would be considered intelligent
  • 6. Companies have AI issues: Engineering wants to get its hands on Machine Learning. The C-Suite needs an “AI” strategy. Marketing wants to include “AI” in product descriptions. Product is afraid of falling behind. Everyone is pitching you technology.
  • 7. Wait... What happened to Data warehousing? First, what is data warehousing? ● Integrated: Constructed by combining data from heterogeneous sources such as relational databases, flat files, etc. ● Time-Variant: Provides information with respect to a particular time period. ● Non-volatile: Data once entered into the warehouse should not change. However, it does not provide: 1. Automatic discovery of patterns 2. Prediction of likely outcomes 3. Creation of actionable information Courtesy: https://www.educba.com/data-warehousing-vs-data-mining/
  • 8. What about Business intelligence? Reporting? ● Summarizes the factual/historical data ● Delivers reports, KPIs and trends in a visually pleasing manner ● Allows the organisation to see the big picture ● Assists it in making better decisions to support the mission. ● BI systems are designed to look backwards based on real data from real events. “What happened and what needs to change?” ● Data Science looks forward, interpreting the information to predict what might happen in the future. “Why it happened and how to change it?”
  • 9. STATISTICAL MACHINE LEARNING; DEEP LEARNING; EVIDENCE-BASED REASONING; RECOMMENDATION SYSTEMS; NATURAL LANGUAGE GENERATION; CHAT/CONVERSATIONAL INTERFACES; ROBOTIC PROCESS AUTOMATION; TEXT ANALYSIS
  • 10. What makes a Data Science Team? Research Courtesy: https://www.business-science.io/business/2018/09/18/data-science-team.html
  • 11. Who are the members? Data Engineers, Data Scientists, Full Stack Developers, Product Managers, Research
  • 12. Data Engineer. Does he only do ETL? ● Industry has shifted from drag-and-drop ETL tools towards a more programmatic approach ● Nature of data that needs to be processed is changing day by day (Processing Files/Batches --> Real time stream data) Expected Skill Sets: ● Should not stick with a set of tools for building data pipelines ● Has to be a good software engineer ● Comfortable in working with open source platforms ● Adaptable to constantly evolving open source tools ● Employ a variety of tools and languages to marry systems together Courtesy: http://podcast.freecodecamp.org/ep-37-the-rise-of-the-data-engineer
  • 13. Why does a DS team need a Full Stack Developer? ● Development of Pilots and MVP Applications ○ Productize the data science work so it can serve an internal stakeholder ○ Interactive display of results/stats/insights ● Responsible for bringing a Software Engineering culture into the Data Science process ○ Build Infrastructure as Code: automation of the Data Science team infrastructure and testing ○ Continuous Integration and Version Control ○ Development of APIs to help integrate data products and sources into applications ○ Building tools for internal use, such as tools for data collection and data labelling Courtesy: https://towardsdatascience.com/what-is-the-role-of-an-ai-software-engineer-in-a-data-science-team-eec987203ceb
  • 14. Data Scientists come in many types: Type A, Type B, Type C ● High understanding of domain knowledge ● Uses ready-made tools instead of developing algorithms ● Has little or no hands-on experience in building software applications ● Insight oriented ● Focuses on better understanding of the business ● Has basic theoretical knowledge in data science ● Has good hands-on experience in building software applications ● Capable of building an end-to-end prototype or MVP ● Deep understanding of data science algorithms ● Has great hands-on product development skills
  • 15. Domain Experts ● Experts both by education and experience in that domain ● Aware of what data is available and able to judge how good it is ● Major contribution in Feature Engineering and Modeling ● Use and apply the deliverables of a data science project in the real world ● Communicate with the intended users of the project’s outcome ● Define the framework for a data science project, as they would know ○ What the current challenges are ○ How they must be answered to be practically useful ● Can learn enough data science to make a reasonable model using standardized tools Courtesy: https://www.linkedin.com/pulse/role-domain-knowledge-data-science-patrick-bangert/
  • 16. Cutting edge Research ● Seek to understand and develop systems by advancing the longer-term academic problems surrounding AI ● Actively engage with the research community through ○ publications ○ participation in technical conferences and workshops ● Has the skills to craft customized data science and machine learning algorithms ● Their focus will be to do research, not solve a business problem ● Data science researchers should not be an early hire
  • 17. Building a team for a Startup Courtesy: https://thinkgrowth.org/the-startup-founders-guide-to-analytics-1d2176f20ac1
  • 18. Building a team for an Enterprise Courtesy: https://www.mckinsey.com/business-functions/mckinsey-analytics/our-insights/breaking-away-the-secrets-to-scaling-analytics
  • 19. Who should lead the DS Team? Courtesy: https://www.altexsoft.com/blog/datascience/how-to-structure-data-science-team-key-models-and-roles/
  • 20. Spotify Case Study ‘Center of Excellence’ Model Keys to Excellence Courtesy: https://www.slideshare.net/productschool/the-why-how-of-enterprise-analytics-w-spotify-data-scientist-79046775
  • 21. AirBnB ● Interview process focused on practical data skills ● ‘Data challenges’ - Airbnb data + real question ● Lightning talks ● Support community ● Multi-stage screening: ○ Recruiter screen ○ Take-home data challenge ○ Onsite challenge ○ Trained graders ○ Two graders for each test to ensure consistency ○ 1:1s with hiring manager, business partner, CV Courtesy: https://www.slideshare.net/Work-Bench/scaling-data-science-at-airbnb
  • 22. Evolution Of AirBnB’s DS Team (columns: 2012, 2013, 2014, 2015) ● Work Structure: Centralised (work closely within team); Started working with other team members; Embedded with other teams; Embedded with other teams ● Team size: 7; 14; 28; 55 ● Specialisation: All generalists (Data Engineers + Data Scientists + Data Analysts); Hired first Data Engineer; Separate team for Data Engineering; Data Science Infrastructure team, specialised roles for NLP, CV ● Hiring: Take-home data challenge followed by 1:1 interview with whole team and founders; Onsite data challenge; Created rubrics and grading criteria; Started hiring interns; Started focusing on diversity and specialised roles for NLP, CV Courtesy: https://vimeo.com/148942395
  • 23. Facebook ● On-boards infra data scientists and engineers through the Bootcamp program ● Provides broad exposure to engineering systems in a supportive learning environment. ● Encourages engineering teams to identify mentors to guide new data scientists as they ramp up in their first projects. ● New data scientists receive mentoring on ways to communicate the results of their complex analyses. ● Data Scientists are presented with the following options: ○ develop deep domain expertise in an area and spend several years embedded with a team ○ move across partner teams every 12 to 18 months in order to develop a broad understanding ● Provides opportunities to learn and master state-of-the-art skills: ○ Internal training sessions and chalk talks ○ Invite external speakers to cover important developments in the field ○ Closely connected to the academic community ○ Attend and present at major conferences such as INFORMS, KDD, and NIPS Courtesy: https://code.fb.com/core-data/building-data-science-teams-to-have-an-impact-at-scale
  • 24. Apple’s Acqui-hiring Strategy ● Apple acqui-hires startups to make its technology smarter and faster ● It buys a whole company to get the team and/or technology ● Hoping to compete with Google’s search service, Apple bought Siri in 2010 ● Pandora, Spotify, and Google Music started to predict songs a user will like. ● Apple saw this, which likely prompted the company to purchase Beats Music (a streaming music service that has a similar algorithm) ● Recently Apple has hired at least 18 people, including at least two co-founders, one of whom is the CEO from an enterprise consulting startup called Silicon Valley Data Science
  • 25. Where should the focus be? Don’t focus on the technology. Focus on the functionality. The functionality is driven by business needs. The functionality is supported by algorithms & data. The algorithms are instrumental to business. Courtesy: Kristian Hammond, Northwestern University
  • 26. What you need to ask when considering AI ● Data: Do you have the data that support it? ● Task: Is your task genuinely data driven? ● Scale: Do you need the scale automation provides?
  • 27. THANK YOU FOR YOUR ATTENTION. DO YOU HAVE ANY QUESTIONS?