
Practical Considerations of Data Science Consulting in Large Organizations - Oct 12 2017


The "notes" tab below this section contains the approximate speaker text as delivered at the 2017 Rice Data Science Conference. Unlike the typical data science talk on math, models, or frameworks, this talk discusses the need to manage relationships with people successfully when doing data science consulting and prototyping in a large organization. It covers common traps to avoid, key questions to answer early, how organizational procurement patterns influence tool selection, and the importance of having a good local partner close to the data. The in-person presenter of this talk at Rice Data Science Day was Yulan Lin (https://www.linkedin.com/in/yulanlin/); Justin's slides were recorded in advance.

Published in: Data & Analytics


  1. Practical Considerations for Data Science Consulting and Innovation in a Large Organization. Yulan Lin @y3l2n, Justin Gosses @JustinGosses. Data Science & Software Engineering, Valador Inc., supporting NASA OCIO. Rice Data Science Conference, Oct. 2017
  2. Why practical considerations?
  3. This talk is NOT from the perspective of: ● Startup ● Built around a software product ● Small companies
  4. This talk IS from the perspective of: ● Established organization (> 10 years old) ● Large organization (> 10,000 people) ● Core function is not software/technical. Clarification: not just NASA! Oil & Gas, Banking, Health, etc.
  5. What is Data Science? A grab bag at the intersection of math + code that includes: ● Machine Learning ● Deep Learning ● Statistics ● Data Visualization
  6. Roadmap of our talk: ● Data access ● How data science adds value ● Influence of procurement constraints ● Communication & narratives ● How data scientists are distributed
  7. Data Access
  8. Does the data exist? ● Is it actually useful for your problem? ● If you’re going to collect it or make it: welcome to data engineering
  9. Is the data programmatically accessible?
  10. How “clean” is the data? (or: how much data translation do you require from subject matter experts?)
  11. Compliance: all. of. the. rules. ● Data Access ● Data Transfer ● Data Storage ● Data Anonymization ● Data Sharing
  12. Data is currency: ● Power & Politics (at some level) ● Empathy is useful
  13. A good partner helps you navigate: ● If data exists ● Data access ● Data oddities ● Compliance processes ● Data politics
  14. How Data Scientists Add Value
  15. Products
  16. Awareness: Spreading knowledge of what is possible
  17. Capability: Building skills & bringing in new tools
  18. Procurement Constraints
  19. Does the proposed product fit the organization now and in the future? Consider: ● Skill development ● Workflow ● Tech stack
  20. What is the official process?
  21. What is the culture?
  22. Open-source vs. proprietary
  23. Communication and Design
  24. Data science: What does that even mean? (or why managing expectations is important) Credit: https://xkcd.com
  25. Effective Narratives: Don’t let the buzzwords + math + programming get in the way of the business value + project schedule + uncertainty story
  26. Understand as early as possible: ● What’s the real problem? ● Does the data exist? ● Can you access the data? ● How clean is the data? ● What is the business value? ● What is the organizational context?
  27. When delivering something that will be used by people: Consider user-centered design
  28. Data Visualization: You’re likely undervaluing it
  29. The distribution of data scientists in an organization
  30. Distribution of Data Scientists (diagram labels: Executives; Org 1; Org 2; Org 3; Org 4; Team A; Team B; Data Science Team)
  31. Distribution of Data Scientists (diagram labels: Executives; Org 1; Org 2; Org 3; Org 4; Team A; Team B; Data Science Team; Organizational fence)
  32. Distribution of Data Scientists (diagram labels: Executives; Org 1; Org 2; Org 3; Org 4; Team A; Team B; Data Science Team; Organizational fence; Data Problems; Finished Product; Training; Best Practices)
  33. Innovation Lab (diagram labels: Executives; Org 1; Org 2; Org 3; Org 4; Team A; Team B; Data Science Team; Data Problems; Training; Best Practices)
  34. Innovation Lab (diagram labels: Executives; Org 1; Org 2; Org 3; Org 4; Team A; Team B; Data Science Team; Data Problems; Training; Best Practices; Finished Product)
  35. Embedded + Rotations (diagram labels: Executives; Org 1; Org 2; Org 3; Org 4; Team A; Team B; Data Science Team; Training; Best Practices)
  36. Embedded + Rotations (diagram labels: Executives; Org 1; Org 2; Org 3; Org 4; Team A; Team B; Data Science Team; Training; Best Practices; Projects; Learning; Training; Best Practices)
  37. Centralized Consultancy (diagram labels: Executives; Org 1; Org 2; Org 3; Org 4; Team A; Team B; Data Science Team; Project Working Group; Data Problems; Training; Training; Best Practices)
  38. Centralized Consultancy (diagram labels: Executives; Org 1; Org 2; Org 3; Org 4; Team A; Team B; Data Science Team; Project Working Group; Data Problems; Training; Training; Best Practices)
  39. Centralized Consultancy (diagram labels: Executives; Org 1; Org 2; Org 3; Org 4; Team A; Team B; Data Science Team; Project Working Group; Data Problems; Training; Training; Best Practices; Finished Product)
  40. How to grow data science in an org? Top-down vs. grassroots. Data / Systems Skills / Culture
  41. Wrap-up
  42. Data scientists need to manage “outward” into many parts of an organization
  43. All of these can make-or-break a project: ● Data access: Will you need to navigate legacy systems and/or data owners? ● Value of data science: Is the project’s business value well defined? ● Procurement constraints: Can a project operationalize/grow within the org? ● Communication & design: Is the right information flowing effectively? ● Organizational structure: What are the pros/cons of your structures/workflow?
  44. Thanks, and keep in touch! Justin Gosses @JustinGosses, Yulan Lin @y3l2n
