Data science as a professional career

Data Scientist & Cross-Disciplinary Technology Leader at Independent Contractor Developer
25 Jan 2016


  1. DATA SCIENCE as a new professional career
  2. What we will talk about • Data Science: • What is Data Science? • What kind of work do Data Scientists do? • Employment demand for Data Science jobs • What kind of education is required? • What is Data Engineering, and how does it differ from Data Science? • What is the difference between Data Science and Business Intelligence?
  3. Who am I? • David Rostcheck • I’m a consulting Data Scientist • I have worked in various software roles (software engineer, enterprise architect, etc.) • My degree is in Physics • I write articles on Data Science on
  4. What is Data Science? • Data Science is industrial research on a company’s own data • Goal: develop advanced algorithms that produce a competitive advantage • Often works with unstructured data, which may be large • “The qualifications for the job include the strength to tunnel through mountains of information and the vision to discern patterns where others see none” - Bloomberg Businessweek
  5. Is Data Science really science? (Academic Science vs. Industrial Research) • Teams: Academic Science: PhDs, graduate students; Industrial Research: PhDs, technologists • Setting: Academic Science: University; Industrial Research: Company • Publication: Academic Science: Formal (academic publications, conferences); Industrial Research: Less formal (blogs, white papers, open source) • Funding: Academic Science: Public grants; Industrial Research: Corporate • Goal: Academic Science: Advance human knowledge; Industrial Research: Create competitive advantage • Data Science is industrial science • It shares some attributes with academic science, but has other differences
  6. What kind of work do Data Scientists do? • Create Artificially Intelligent systems (“narrow AI”) • Examples: • Recommender systems • Self-driving cars • AI agents • Smart energy management • Medical diagnosis • Machine vision
  7. Data Science is in Demand • “The hot job of the decade… Data scientists today are akin to Wall Street “quants” of the 1980s and 1990s” - Harvard Business Review • “18.7% projected growth 2010-2020” - VentureBeat • “McKinsey projects […] ‘50 percent to 60 percent gap between supply and requisite demand’” - Bloomberg Businessweek
  8. On the other hand… • Some people believe Data Science itself will be automated • “New Teradata Platform Reduces Demand For Data Scientists” - Forbes • “Automating the Data Scientist” - MIT Technology Review
  9. What do I think? • Yes, advanced tools will automate some data exploration • But: research and communication (the fundamental skills) are always in demand when the world is changing • Data will continue to explode (Internet of Things) • We will see more change and faster change
  10. What is Data Engineering? • Specialized type of software engineering • Requires additional training in: • Data (SQL, NoSQL, data visualization) and Big Data (Hadoop, Apache Spark/Storm/Flink, cloud) • Machine Learning algorithms and platforms (e.g., Dato) • Predictive APIs (e.g., Watson) • Linear Algebra & Calculus really help in understanding Machine Learning
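  11. Data Engineering vs. Data Science • Approach: Data Science: Scientific (Exploration); Data Engineering: Engineering (Development) • Problems: Data Science: Unbounded; Data Engineering: Bounded • Path to Solution: Data Science: Iterative, exploratory, nonlinear; Data Engineering: Mostly linear • Education: Data Science: More is better (PhDs common); Data Engineering: BS and/or self-trained • Presentation Skills: Data Science: Important; Data Engineering: Not as important • Research experience: Data Science: Important; Data Engineering: Not as important • Programming skills: Data Science: Not as important; Data Engineering: Important • Data skills: Data Science: Important; Data Engineering: Important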
  12. Data Science vs. Business Intelligence • Data analysis: Business Intelligence (BI): Yes; Data Science: Yes • Statistics: BI: Yes; Data Science: Yes • Visualization: BI: Yes; Data Science: Yes • Data Sources: BI: Usually SQL, often Data Warehouse; Data Science: Less structured (logs, cloud data, SQL, NoSQL, text) • Tools: BI: Statistics, Visualization; Data Science: Statistics, Machine Learning, Graph Analysis, NLP • Focus: BI: Present and past; Data Science: Future • Approach: BI: Analytic; Data Science: Scientific • Goal: BI: Better strategic decisions; Data Science: Advanced functionality • The two fields are closely related. In some ways Data Science is an evolution of BI.
  13. What industries use Data Science? • Now: Technology (employs over 50%), Education, Finance, Consulting, Health Care • But: “Technology” companies like Uber, Amazon, AirBnB compete in other industries (transportation, retail, hotels) • “Software is eating the world” – Andreessen Horowitz • What industries will AI change? Ultimately, all of them. • Incorporating AI == large business opportunity
  14. Education for Data Science/Engineering • Academic programs • Boot camps • Online classes (Coursera & Udacity) • For Data Engineering: • Documentation and webinars (self-education) • Focus on data manipulation tools and Machine Learning • For Data Science: • The more academic science and research expertise, the better • Focus on projects that solve unknown problems • Work with more experienced Data Scientists
  15. Questions? Contact me:, twitter: @davidrostcheck Articles:
  16. “Big Data” • Specialized technologies and techniques for working with very large data sets • Often too big to process with one computer – need clusters and/or cloud computing • Data may change rapidly • Specialized tools: Map/Reduce, Apache Hadoop/Spark/Storm/Flink, Elastic, etc. • Large demand, but keep perspective: • Big Data tools can be more awkward • It is often easier to solve problems at small scale, then scale up, if possible

Editor's notes

  1. • Statistically model human behavior • Predict and respond to humans • Understand natural language and the natural world • Understand subtle patterns in big data
  2. - Machine Learning is here to stay
  3. On a large team, Data Science and Data Engineering are separate roles. On a small team, a Data Scientist must do at least some of his/her own Data Engineering. The roles are new and not strictly defined. Today, one role is often called by the other’s name.