introduction to data science

Defining Data Science
• What Does a Data Science Professional Do?
• Data Science in Business
• Use Cases for Data Science
• Installation of R and RStudio

More by bhavesh lande (20)

introduction to data science

  1. Introduction to Data Science, Week 1 (www.swaraadyasolutions.co.in)
  2. Agenda
     • Defining Data Science
     • What Does a Data Science Professional Do?
     • Data Science in Business
     • Use Cases for Data Science
     • Installation of R and RStudio
  4. Defining Data Science
     • Data Science deals with the science and algorithms related to data.
     • Data is generated from many kinds of sources.
     • A report says, “Every day, approximately 2 quintillion bytes of data is generated. If it grows at this pace, then by the next 3 years, it is expected that 2MB of data will be created every second for every individual on this planet.”
     • The last two years have seen the creation of 90% of the world's data.
  5. Sources of data
     • Data comes from two kinds of sources: structured and unstructured.
     • Structured sources hold information that fits a relational database.
     • Examples are ATM transactions and flight tickets, which SQL can query and update.
     • Unstructured data comes from tweets and comments on social media and from audio and video files, which SQL cannot process directly.
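
The contrast on slide 5 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `atm_transactions` table, its columns, the sample rows, and the sample tweet are all hypothetical and do not come from the deck.

```python
import sqlite3

# Hypothetical ATM transactions; table and column names are illustrative only.
rows = [
    ("acc-001", "withdrawal", 200.0),
    ("acc-002", "deposit", 500.0),
    ("acc-001", "withdrawal", 50.0),
]

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute(
    "CREATE TABLE atm_transactions (account TEXT, kind TEXT, amount REAL)"
)
conn.executemany("INSERT INTO atm_transactions VALUES (?, ?, ?)", rows)

# Structured data fits rows and columns, so SQL can filter and aggregate it.
totals = conn.execute(
    "SELECT account, SUM(amount) FROM atm_transactions "
    "WHERE kind = 'withdrawal' GROUP BY account ORDER BY account"
).fetchall()
print(totals)
conn.close()

# Unstructured data such as a tweet has no fixed columns to query.
# It stays free text until some processing step extracts features from it.
tweet = "Loving the new electric bike! #commute"  # hypothetical example
hashtags = [word for word in tweet.split() if word.startswith("#")]
print(hashtags)
```

The SQL query works because every row has the same fields; the tweet needs custom text handling before anything comparable can be asked of it.
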
  6. Definition
     “Data Science is a broad field: an assembly of scientific techniques, methods, and processes used to clean data and then extract useful patterns and insights in the form of visualizations.”
     • Visualizations are crucial for making important business decisions and devising strategies that are instrumental to an organization's well-being.
  7. History
     In 1997, C. F. Jeff Wu of the University of Michigan proposed that the following concepts be studied under the name Data Science:
     • Data Collection
     • Data Modeling
     • Data Analysis
  8. Role of Data Science on Statistics
     • Statistics
     • Mathematics
     • Computer Science
     • Data Analysis
     • Critical Thinking
     • Problem Solving
     • Machine Learning
     • Data Visualization
  9. Data Science??
     In 2012, Harvard Business Review called data scientist “the sexiest job of the 21st century”.
  11. Statistics
      • Statistics is the branch of mathematics that deals with data collection, categorization, interpretation and presentation.
      • These techniques help process and analyze data at large scale.
  12. Statistics Techniques To Deal with Data
      • Data Collection
        – Collecting relevant data/information.
        – Primary data includes surveys, observations and experiments.
        – Secondary data includes internal records and government-published data.
      • Data Categorization and Classification
        – Data is organized to yield insights.
        – For example, take the heights of 10 people: 160cm, 165cm, 155cm, 190cm, 177cm, 181cm, 179cm, 185cm, 159cm, 173cm.
        – Arranged as an ordered array, the data becomes: 155cm, 159cm, 160cm, 165cm, 173cm, 177cm, 179cm, 181cm, 185cm, 190cm.
        – The ordered data shows that 155cm is the shortest height and 190cm is the tallest.
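
The ordering step on slide 12 can be reproduced with a short Python sketch. The heights are the ones given on the slide; the variable names are illustrative.

```python
# Heights of the 10 people from slide 12, in centimetres.
heights_cm = [160, 165, 155, 190, 177, 181, 179, 185, 159, 173]

# Arrange the raw observations as an ordered array.
ordered = sorted(heights_cm)

# Once the data is ordered, the extremes sit at the two ends.
shortest, tallest = ordered[0], ordered[-1]

print(ordered)
print(f"shortest: {shortest}cm, tallest: {tallest}cm")
```

Sorting is the simplest form of organizing data: it makes the minimum, the maximum, and the overall spread readable at a glance.
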
  13. Statistics Techniques To Deal with Data
      • Data Classification
        – Assembly of relevant facts/data into different categories/groups according to their features.
        – Factors are:
          • Geographical
          • Chronological (on the basis of time)
          • Qualitative
          • Quantitative
      • Data Presentation
        – Includes frequency distributions shown as histograms.
        – For example, assume you are looking for prospective clients for your new product, which is an electric bike.
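
A frequency distribution of the kind mentioned on slide 13 might look like the sketch below. The slide gives no data for the electric bike example, so this reuses the ten heights from slide 12; the 10cm bin width and the text rendering of the histogram are assumptions made for illustration.

```python
from collections import Counter

# Heights from slide 12, in centimetres.
heights_cm = [160, 165, 155, 190, 177, 181, 179, 185, 159, 173]

BIN_WIDTH = 10  # assumed bin width; the slides do not specify one


def bin_start(value, width=BIN_WIDTH):
    """Return the lower edge of the bin that contains value."""
    return (value // width) * width


# Count how many observations fall into each bin.
frequency = Counter(bin_start(h) for h in heights_cm)

# Print a simple text histogram, one row per bin.
for start in sorted(frequency):
    count = frequency[start]
    print(f"{start}-{start + BIN_WIDTH - 1}cm | {'#' * count} ({count})")
```

Grouping values into classes and counting them is the quantitative classification described above; the bars are just a presentation of those counts.
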
  14. Applications
      • Data Science has many real-world applications.
      • Recommender Systems
        – Content-based: keeps track of a user's watching habits.
        – Collaborative: recognizes users with similar tastes.
      • Voice and Image Recognition
      • Spam and Fraud Detection
      • And many more.
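
The collaborative idea on slide 14, recognizing users with similar tastes, can be sketched as follows. This is one simple variant (user-based filtering with Jaccard similarity), chosen here for illustration and not named in the deck; the user names and film titles are hypothetical.

```python
# Hypothetical viewing histories: user name -> set of watched titles.
watched = {
    "asha": {"Film A", "Film B", "Film C"},
    "ben": {"Film A", "Film B", "Film D"},
    "chen": {"Film E"},
}


def jaccard(a, b):
    """Overlap between two sets: size of intersection over size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(user, histories):
    """Suggest titles watched by the most similar other user."""
    others = [name for name in histories if name != user]
    nearest = max(
        others, key=lambda name: jaccard(histories[user], histories[name])
    )
    # Recommend what the nearest neighbour has seen and the user has not.
    return sorted(histories[nearest] - histories[user])


print(recommend("asha", watched))
```

A content-based system would instead compare attributes of the titles themselves (genre, cast, and so on) against one user's own history.
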
  15. Data Scientists and Their Role
      • A data scientist is a rock star!
      • A data scientist is an individual who has the power and freedom to experiment with many different kinds of data.
      • The role is based on knowledge of:
        – Mathematics
        – Problem solving
        – Critical thinking
        – Careful analysis
  16. Anyone who wants to carry this “tag” should be well versed in many concepts. Some of them are:
      • Mathematics
      • Statistics
      • Problem-solving
      • Data wrangling or data munging
      • Coding prowess in both R and Python
      • SQL
      • Hadoop
      • Machine learning and AI
      • Data visualization
      • Communication skills
  17. Data Analyst vs. Data Scientist
      • A data analyst mostly converts data into a structured format so that it can be processed further.
      • The focus is on data mining and data auditing.
      • Data mining involves retrieving information from large databases with the help of SQL to extract new data/information.
      • Data auditing involves checking the quality of the data and working out whether it can support useful insights.
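
A data audit of the kind slide 17 describes could start with checks like the ones below. The specific checks (missing values and exact duplicate rows) are assumptions about what such an audit might include, and the records are hypothetical.

```python
# Hypothetical customer records, as an analyst might receive them.
records = [
    {"id": 1, "age": 34, "city": "Pune"},
    {"id": 2, "age": None, "city": "Mumbai"},
    {"id": 3, "age": 29, "city": ""},
    {"id": 1, "age": 34, "city": "Pune"},  # exact duplicate of the first row
]


def audit(rows):
    """Count missing values per field and exact duplicate rows."""
    missing = {}
    for row in rows:
        for field, value in row.items():
            if value is None or value == "":
                missing[field] = missing.get(field, 0) + 1

    seen = set()
    duplicates = 0
    for row in rows:
        key = tuple(sorted(row.items()))  # hashable fingerprint of the row
        if key in seen:
            duplicates += 1
        seen.add(key)

    return {"rows": len(rows), "missing": missing, "duplicates": duplicates}


report = audit(records)
print(report)
```

If the report shows many gaps or duplicates, the data probably needs cleaning before it is handed on for modeling.
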
  18. Data Analyst vs. Data Scientist
      • A data scientist takes the clean data and tries to gain meaningful insights from it.
      • A classification or regression algorithm is used to build a model robust enough to yield business insights, with the help of visualization tools.
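
As a minimal sketch of the modeling step on slide 18, the code below fits a one-feature linear regression by ordinary least squares. The choice of this particular algorithm and the advertising-spend numbers are assumptions for illustration; the deck only says that a classification or regression algorithm is used.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical clean data: advertising spend vs. sales, both in arbitrary units.
spend = [1.0, 2.0, 3.0, 4.0]
sales = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(spend, sales)


def predict(x):
    """Use the fitted line to estimate sales for a new spend value."""
    return slope * x + intercept


print(f"sales = {slope:.2f} * spend + {intercept:.2f}")
print(predict(5.0))
```

In practice the fitted model would be checked against held-out data and then presented with a visualization tool, as the slide suggests.
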
  20. Are There Enough Skilled Data Scientists In The Industry?
      • According to a survey conducted by IBM, the demand for data scientists will soar by 28% by 2020.
      • That includes all jobs that require machine learning, big data, expertise in visualization tools like Tableau and PowerBI, and knowledge of data analysis.
      • This demand is spread across the finance, insurance, professional services, and IT sectors.
  21. A candidate who is always thirsty for new challenges and loves problem-solving of any kind can become a skilled data scientist. Such a candidate likes observing and defining a problem from different angles and perspectives. Coding is a daily routine they love, not because the problem demands it, but because they know how interesting it is to come up with new findings and insights and then make a cute little story out of them!
  22. Data Science Effects: How Can Data Science Help A Business/Company Grow?
      • Data science has been present in the IT industry for a long time.
      • The sudden increase in the amount of data prompted companies to adopt it gradually as the norm.
      • There are numerous ways in which this emerging discipline can help an organization grow and reach new heights:
        – Business logistics, including supply chain optimization
        – Finance
        – Health and wellness
        – Education and electronic teaching
        – Climate and energy
  23. Popular Data Processing Tools in Data Science
      • Jupyter – open-source tool for creating and sharing documents.
      • RStudio – open-source tool for R programming.
      • SAS – analytics tool.
      • Apache Spark – open-source engine that specializes in cluster computing.
      • Microsoft Excel – spreadsheet.
      • SQL – query language for relational databases.
      • Tableau – data visualization tool used for representing data as charts.
      • PowerBI – business intelligence tool developed by Microsoft.
  24. What Does a Data Science Professional Do?
  27. Installation of R and RStudio
  28. Conclusion/Endnotes
      • Data Science is turning out to be one of the fastest-growing fields in the US and India.
      • Today, it is used in weather forecasting, sales prediction, fraud and spam detection, pattern recognition, taxi fare prediction, sentiment analysis, and neural networks.
      • The future of data science is going to be dominated by Artificial Intelligence and Automation.
      • These two heavyweights are capable of changing the current market scenario into something that data scientists describe as the “age of revolution”.
      • Machines are enriching themselves with new concepts and technology every second, which is making them smarter and sharper than humans.
      • Looking at the current market, data science is gradually making its way into businesses and enterprises.