How to become a data scientist

This talk, "How to become a data scientist", was given at the 2nd Annual Event of the Pune Developer's Community. It focuses on the skill set required to become a data scientist and on which role suits you based on your current profile.

  1. 1. How to become a Data Scientist? Manjunath Sindagi Pune Developer’s Community Annual Event 20.01.2018
  2. 2. Co-Founder @Hyperdata.io and Data Science Advisor
  3. 3. Focus on: Apply AI to Business Problems.
  4. 4. Agenda ● Why Data Scientist? ● Artificial Intelligence ● Being a Data Scientist ● Broad Skill Sets ● Skill Sets based on your Profile ● Typical Team in Data Science ● Conclusion & Final Remarks ● References to start
  5. 5. Why data scientist?
  6. 6. Why data scientist? ● Buzzword ● Fancy, Cool ● Demand ● Data Growth ● Passion
  7. 7. No. 1 Job in Glassdoor for 2016 and 2017
  8. 8. Average Salary is Higher
  9. 9. Growth of Data Scientist - IBM
  10. 10. In any industry that has digitized data, people are needed to support the ecosystem and find insights from the data.
  11. 11. Challenges of Data 4 Vs of Data ➢ Volume ➢ Velocity ➢ Variety ➢ Veracity
  12. 12. Data Science and Analytics (DSA) jobs remain open an average of 45 days, five days longer than the market average.
  13. 13. Big Data and Data Science Skills are most challenging to recruit for and potentially can create the greatest disruption.
  14. 14. Some 59% of all DSA job demand currently is in finance and insurance, professional services and IT sector. (according to IBM)
  15. 15. Job Trends from Indeed.com
  16. 16. Google Trends
  17. 17. Artificial Intelligence
  18. 18. Artificial Intelligence: Artificial Intelligence is defined as the science of making computers do things that require intelligence when done by humans. Making sense out of data.
  19. 19. AI And Related Fields
  20. 20. Being a Data Scientist
  21. 21. Who is a Data Scientist?
  22. 22. Who is a Data Scientist? One who has a wide breadth of abilities: ● Academic curiosity ● Storytelling ● Product sense ● Engineering experience ● Cleverness And above all ● Deep domain expertise in Mathematics, Statistics and Machine Learning
  23. 23. T - Shaped Skill Set
  24. 24. What is Data Science? Data science, also known as data-driven science, is an interdisciplinary field about scientific methods, processes, and systems to extract knowledge or insights from data in various forms, either structured or unstructured, similar to data mining. - Wikipedia : https://en.wikipedia.org/wiki/Data_science
  25. 25. Data Science Interdisciplinary Field
  26. 26. Being a Data Scientist is Jack of All, Master of Everything!
  27. 27. Broad Skills - Knowledge Prerequisites
  28. 28. Programming ● Strong Programming Skills ● Strong with Python, R or Java ● Fundamentally Strong Data Structure Knowledge ● Debugging Skills ● Exceptional Problem Solving Skills
  29. 29. Mathematics ● Algebra ● Statistics ● Differentiation ● Calculus
  30. 30. Core Areas Pick One ● Information Retrieval ● Natural Language Processing ● Linguistics ● Machine Learning ● Image Processing ● Video Processing ● Speech Processing ● Then pick Neural Networks and Deep Learning
  31. 31. Tools and Technology If not all at least 3-4 ● Excel ● R, Python ● Spark ● Hadoop ● Scala ● AWS ● Solr, Elastic Search ● New ML Libraries - Tensorflow, Caffe ● Queueing System
  32. 32. Top Tools used in 2015-17 : Kdnuggets
  33. 33. Application of Algorithms ● Practical Implementations ● Follow Kaggle, KDNuggets and solve problems ● Ability to quickly suggest algorithms to apply and also to implement the same ● Working with a Mentor will help
  34. 34. Data Savvy ● Data Oriented Mindset ● Quickly understand the problem and give solutions in a short span of time ● Ability to think about how data can add value to the business and what insights can be derived.
  35. 35. As a data scientist, if you know nothing else, you need to know how to take some data, munge it, clean it, filter it, mine it, visualize it and then validate it. It’s a very long process.
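The chain of steps on slide 35 (munge, clean, filter, mine, visualize, validate) can be sketched as a sequence of small functions. This is only an illustration using the Python standard library: the records, field names and thresholds below are made up for the example and are not from the talk.

```python
import statistics

# Hypothetical raw records: strings as they might arrive from a CSV export.
RAW_ROWS = [
    {"city": " Pune ", "sales": "120"},
    {"city": "pune", "sales": "80"},
    {"city": "Mumbai", "sales": "n/a"},   # unparseable value
    {"city": "Mumbai", "sales": "200"},
    {"city": "", "sales": "50"},          # missing city
    {"city": "Delhi", "sales": "-10"},    # implausible negative value
]


def munge(rows):
    """Normalise field formats: trim whitespace and title-case city names."""
    return [{"city": r["city"].strip().title(), "sales": r["sales"].strip()} for r in rows]


def clean(rows):
    """Drop rows with a missing city or a sales value that is not a number."""
    cleaned = []
    for r in rows:
        try:
            sales = float(r["sales"])
        except ValueError:
            continue
        if r["city"]:
            cleaned.append({"city": r["city"], "sales": sales})
    return cleaned


def filter_rows(rows):
    """Keep only plausible records (non-negative sales)."""
    return [r for r in rows if r["sales"] >= 0]


def mine(rows):
    """Aggregate: mean sales per city."""
    by_city = {}
    for r in rows:
        by_city.setdefault(r["city"], []).append(r["sales"])
    return {city: statistics.mean(values) for city, values in by_city.items()}


def visualize(summary):
    """Render a crude text bar chart, one '#' per 20 units of mean sales."""
    return [f"{city:<8}{'#' * int(mean // 20)} {mean:.1f}" for city, mean in sorted(summary.items())]


def validate(summary, rows):
    """Sanity check: the summary covers exactly the cities in the filtered rows."""
    return set(summary) == {r["city"] for r in rows}


rows = filter_rows(clean(munge(RAW_ROWS)))
summary = mine(rows)
for line in visualize(summary):
    print(line)
print("valid:", validate(summary, rows))
```

In practice each stage is far larger than shown here, which is the point of the slide: most of the effort goes into getting the data into shape before any modelling starts.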
  36. 36. Learning PathWays ● MS/MTech/PhD ● Self Study ● Boot Camps ● Online Courses
  37. 37. Skill Set based on Current Profile
  38. 38. Skill Sets Focus: Freshers ● Exceptionally Strong Programming Skills ● Strong Data Structure Knowledge ● Master Python, R, Java ● Github Profile ● Then, Work in Companies to Solve Problems
  39. 39. Skill Sets Focus: Programmers (> 2 Years Experience) ● Mathematics ● Course - Take Up a Course Online. ● Pick an Area - ML, NLP, Linguistics etc. ● Apply and Solve Problems in Kaggle
  40. 40. Skill Sets Focus: Programmers (> 2 Years Experience) ● Strong AWS Knowledge ● Knowledge of ML/DL Libraries and Tools ● Photographer’s Mind ● Be a Data Engineer rather than a Scientist ● Practice, Practice, Practice
  41. 41. Skill Set Focus Programmers with over 10 Years Experience ● Their curiosity helps to find the problem on their own and they solve it themselves. ● Can take a course and Talk to people with experience in these areas. ● Inability to admit the lack of knowledge ● Understand Scale Challenges with Data
  42. 42. Skill Set Focus Business Analyst/Managers ● Take a Course ○ Understand how things are built ○ Not necessary to know mathematics or programming ● Understand the steps in ML ○ Data Collection ○ Data Preparation ○ Model Selection ○ Training ○ Evaluation ○ Parameter Tuning ○ Prediction
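The seven ML steps listed on slide 42 can be walked through end to end in a few lines. The sketch below is an assumption-laden toy, not the speaker's method: it uses synthetic data, a hand-written k-nearest-neighbours classifier picked only for brevity, and the standard library alone. A real project would normally use a library such as scikit-learn.

```python
import random
from collections import Counter

random.seed(0)  # fixed seed so the sketch is repeatable


# 1. Data collection: synthetic 2-D points from two made-up clusters.
def collect(n_per_class=60):
    data = []
    for label, (cx, cy) in enumerate([(0.0, 0.0), (3.0, 3.0)]):
        for _ in range(n_per_class):
            data.append(((random.gauss(cx, 1.0), random.gauss(cy, 1.0)), label))
    return data


# 2. Data preparation: shuffle and split into train / validation / test sets.
def prepare(data):
    random.shuffle(data)
    n = len(data)
    return data[: n // 2], data[n // 2 : 3 * n // 4], data[3 * n // 4 :]


# 3. Model selection: k-nearest neighbours (an illustrative choice).
# 4. Training: k-NN only stores the training set, so "training" is trivial.
def predict(train, point, k):
    def squared_distance(item):
        (x, y), _ = item
        return (x - point[0]) ** 2 + (y - point[1]) ** 2

    nearest = sorted(train, key=squared_distance)[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# 5. Evaluation: fraction of correctly classified points.
def accuracy(train, samples, k):
    hits = sum(predict(train, x, k) == y for x, y in samples)
    return hits / len(samples)


train, valid, test = prepare(collect())

# 6. Parameter tuning: pick the k that scores best on the validation set.
best_k = max([1, 3, 5, 7], key=lambda k: accuracy(train, valid, k))

# 7. Prediction: apply the tuned model to unseen data.
test_accuracy = accuracy(train, test, best_k)
print("best k:", best_k, "test accuracy:", round(test_accuracy, 2))
print("prediction for (2.5, 2.5):", predict(train, (2.5, 2.5), best_k))
```

Keeping a separate validation set for tuning and a test set for the final score is the design choice worth noticing: tuning on the test set would make the reported accuracy look better than it is.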
  43. 43. Skill Set Focus Business Analyst/Managers ● Incremental Systems ● Accuracy Models ● View AI Videos applied to Business. ● Log the data properly in your applications. ● Ability to convey problems to Solutions Architects
  44. 44. Skill Set Focus Database Admins ● Understand different types of Data ○ Text, Images, Numbers, Files etc. ● Learn all about storage mechanisms, advantages, disadvantages of different databases ○ NoSQL - Mongo, Cassandra, GraphDB (Neo4J), CouchDB ○ SQL
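To make the SQL versus NoSQL contrast on slide 44 concrete, here is a minimal sketch that models one nested record both ways. It assumes nothing beyond the Python standard library: `sqlite3` stands in for a relational database and a JSON string stands in for what a document store such as MongoDB would hold. The customer record and table names are hypothetical.

```python
import json
import sqlite3

# Hypothetical record used only for illustration.
customer = {
    "id": 1,
    "name": "Asha",
    "orders": [
        {"item": "laptop", "amount": 55000},
        {"item": "mouse", "amount": 700},
    ],
}

# Relational (SQL) modelling: normalise into two tables linked by a key.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE orders (customer_id INTEGER REFERENCES customers(id), item TEXT, amount INTEGER)"
)
conn.execute("INSERT INTO customers VALUES (?, ?)", (customer["id"], customer["name"]))
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(customer["id"], o["item"], o["amount"]) for o in customer["orders"]],
)

# Aggregations across rows are a natural fit for SQL.
total = conn.execute(
    "SELECT SUM(amount) FROM orders WHERE customer_id = ?", (customer["id"],)
).fetchone()[0]
print("total spent (SQL):", total)

# Document (NoSQL-style) modelling: keep the whole nested record as one
# JSON document, so it can be read back in a single lookup without joins.
document = json.dumps(customer)
restored = json.loads(document)
print("items (document):", [o["item"] for o in restored["orders"]])
```

The trade-off this hints at: the relational form makes cross-record queries easy, while the document form keeps everything about one entity together.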
  45. 45. Skill Set Focus Database Admins ● Ability to convey what database is optimal to what type of data. ● Design and Build Models for various kinds of data on paper ● Practice Modeling of Data Extensively.
  46. 46. Skill Set Focus Domain Experts/CxOs ● Same as Business Analysts/Managers. ● Formulating the Business around Data ● AI is used to solve business problem ● AI is used for Automation
  47. 47. Typical Team
  48. 48. So, is it possible to be Jack of All, and Master of Everything?
  49. 49. Typical Team ● Domain Expert (Business Analyst/Product Manager) ● Solutions Architect ● Developer - Data Collection/Preparation ● Data Scientist/NLP ● Data Engineer ● Application Developer (Expert in Visualization domain)
  50. 50. Data Scientist vs Data Engineers
  51. 51. Data Scientist ● Mathematics, Statistics etc. ● Expert in at least one area such as ML or NLP ● Knows how to Apply Different ML Models and Algorithms
  52. 52. Data Engineers ● Takes inputs from Data Scientists once the problem is solved ● Exceptionally good at Programming and different tools ● Solves problems at Scale ● Productionizes the solutions.
  53. 53. Concluding Remarks: Major Challenges in Data ● Programming ● Engineering Problems ● Data Organization and Modeling Problems ● Data Collection and Preparation
  54. 54. Data Science - Task Allocation Data Acquisition and Preparation - Major time consuming task
  55. 55. Concluding Remarks: Major Challenges in Companies ● Everything is on cloud, let’s use it. ● Unaware of Business Value ● Clueless About Data Science and Related Technologies ● Solutions Architect and Domain Experts are critical to know before you join. ● Vision of the Company
  56. 56. Final Remarks ● Work with a Mentor ● Have a Photographer’s Mind. ● Choose your career wisely
  57. 57. References to Start
  58. 58. Online Course ● Coursera : Andrew Ng Machine Learning Course https://goo.gl/fDTwSE ● Youtube : Prof. Sengupta https://goo.gl/JGG6th
  59. 59. People and Books ● People to follow. ○ Andrew Ng ○ Bernard Marr - AI Journalist. ○ Geoffrey Hinton ○ Roman Trusov ○ Many people : https://www.quora.com/Who-are-some-notable-machine-learning-researchers ● Books ○ Programming Collective Intelligence
  60. 60. Q & A
  61. 61. Thank You :)
