Data analytics

Assistant Professor at Bharathiar University
30 Apr 2016

  1. DATA SCIENCE. Dr. V. Bhuvaneswari, Assistant Professor, Department of Computer Applications, Bharathiar University, Coimbatore. Email: bhuvanes_v@yahoo.com, bhuvana_v@buc.edu.in. Web: www.budca.in/faculty.php
  2. Data Analytics
     ◦ Data Science
     ◦ Data Classification
     ◦ Components
     ◦ Data Analytics – Need
     ◦ Data Analytics – Classification
     ◦ Data Science – Roles
     ◦ Data Analytics – Use Cases
     ◦ Data Analytics – Success Stories
     Dr. V. Bhuvaneswari, Asst. Professor, Dept. of Computer Applications, Bharathiar University
  3. Data Science
     ◦ The term "data science" was used by statisticians and economists in the early 1970s and was defined by Peter Naur in 1974.
     ◦ It has gained popularity in the last couple of years because of the massive volumes of data being accumulated.
     ◦ The use of Big Data technology to explore data in large corporations, government, and industry made the term catchy.
  4. Data Science as Discipline
     ◦ Data science has emerged as a new discipline for gaining deep insight into large volumes of data.
     ◦ Data science is a fusion of major disciplines such as computational algorithms, statistics, and visualization.
     ◦ 90% of the world’s data has been created in the last two years, comprising 10% structured data and 80% unstructured data.
     ◦ The digital universe is in a data deluge: it is estimated to be larger than the physical universe, and the unit of data measurement is predicted to reach the geopbyte.
  6. Data Classification
     ◦ Open Data
     ◦ Closed Data
     ◦ Hot Data
     ◦ Warm Data
     ◦ Cold Data
     ◦ Thin Data
     ◦ Thick Data
  7. Data Science Components
     ◦ Pre-processing: ETL
     ◦ Dashboards: charts (pie, bar, histogram)
     ◦ Data models: linear regression, decision tree, dimensionality reduction, clustering, outlier analysis, association analysis
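Of the data models listed on this slide, linear regression is the simplest to show in code. The following is a minimal sketch in Python, not taken from the slides: it fits a straight line by ordinary least squares using only the standard library. The function name `fit_line` and the sample points are illustrative assumptions.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and co-deviation of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data lying exactly on the line y = 2x + 1.
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
```

With real data the points will not lie exactly on a line, and the fitted slope and intercept describe the best linear approximation in the least-squares sense.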
  8. Data Analytics – Need for Today
     ◦ Data is considered a digital asset, similar to other property.
     ◦ Organizations believe that the data they generate will provide deep insight into their business processes and support strategic decisions.
     ◦ Earlier limits on computational storage and processing have been overcome by cloud computing and big data technologies.
  9. Data Science Vs Data Analytics
     ◦ Data science is a discipline that groups techniques and methods from various domains to study data; data analytics is one component of data science.
     ◦ Data analytics is the process of analyzing a dataset to gain deep insight into the data using computational algorithms and statistical methods. There exists no common procedure to ...
  10. Data Analytics Vs Big Data Analytics
      ◦ Data analytics explores and analyzes datasets using statistical methods and models.
      ◦ Big data analytics analyzes data characterized by volume, velocity, and variety by integrating statistics, mathematics, and computational algorithms on a big data platform.
  11. UNDERSTANDING DATA ANALYTICS – A SCENARIO
  12. Data Analytics - Classification
  14. Efforts – Data Analytics
  15. Data Insights
  16. Data Analytics - Methods
  17. Data Science - Landscape
  18. Statistics in Data Analytics
      ◦ Basics – Exploratory Data Analysis
      ◦ Descriptive Data Analysis – Central Tendency, Normal Distributions
      ◦ Inferential Data Analysis – Sampling Population – ANOVA, Paired t-test
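As a rough illustration of the descriptive and inferential steps named on this slide, the sketch below computes measures of central tendency and a paired t statistic with Python's standard `statistics` module. The helper names `describe` and `paired_t_statistic` and the before/after scores are assumptions made for the example, not material from the slides.

```python
import math
import statistics


def describe(values):
    """Descriptive analysis: central tendency and spread of one sample."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
    }


def paired_t_statistic(before, after):
    """Inferential analysis: paired t statistic and its degrees of freedom."""
    if len(before) != len(after):
        raise ValueError("paired samples must have equal length")
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    std_err = statistics.stdev(diffs) / math.sqrt(n)  # standard error of the mean difference
    return mean_diff / std_err, n - 1


# Hypothetical scores for the same five subjects, measured twice.
before = [72, 65, 80, 58, 90]
after = [75, 70, 82, 63, 91]

summary = describe(before)
t_value, dof = paired_t_statistic(before, after)
```

The resulting t value is then compared against a t distribution with `dof` degrees of freedom to judge whether the mean difference is statistically significant.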
  19. Predictive Analytics - Tasks
  21. Data Science – Emerging Roles
      ◦ Data Scientist: responsible for scrubbing data to bring out deep insights. Skills: expertise in computer science, mathematics, and statistics; works on open-ended research problems.
      ◦ Data Engineer: responsible for managing and administering the infrastructure and storage of data. Skills: strong programming and software engineering skills; deep knowledge of data warehousing; expertise in Hadoop, NoSQL, and SQL technologies.
      ◦ Data Analyst: views data from a single source and develops deep insight into it under the organization's guidance. Skills: competency in statistics.
  22. Data Analytics Use Case Scenario
  23. Data Science Applications
      ◦ Data Personalization – Logs, Tweets, Likes
      ◦ Smart Pricing – Air Transportation
      ◦ Financial Services – Fraud Detection, Insurance
      ◦ Smart Grids – Energy Management
  24. Air Fare Management – Use Case 1
      ◦ Objective: hike airfares based on high-value customers (CRM).
      ◦ The strategic decision requires insight into the data: How are customers divided? Which customers are high value? Who is a frequent flyer? How can customers be retained?
      ◦ Data sources: conventional enterprise information; data from weblogs, social media, and competitors' pricing.
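The segmentation questions on this slide (who is high value, who is a frequent flyer) can be sketched as a simple rule-based classifier. This is only an illustration under stated assumptions: the `Customer` fields, the `segment` function, and both thresholds are hypothetical, and a real system would more likely derive segments from clustering or a trained model.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    name: str
    flights_per_year: int   # hypothetical field: bookings in the last 12 months
    annual_spend: float     # hypothetical field: total fare revenue


def segment(customer, min_flights=12, min_spend=5000.0):
    """Assign a customer to a segment using two illustrative thresholds."""
    frequent = customer.flights_per_year >= min_flights
    high_value = customer.annual_spend >= min_spend
    if frequent and high_value:
        return "high-value frequent flyer"
    if high_value:
        return "high-value"
    if frequent:
        return "frequent flyer"
    return "occasional"


# Made-up customers for demonstration only.
customers = [
    Customer("A", flights_per_year=20, annual_spend=9000.0),
    Customer("B", flights_per_year=3, annual_spend=7000.0),
    Customer("C", flights_per_year=15, annual_spend=2500.0),
    Customer("D", flights_per_year=1, annual_spend=300.0),
]
segments = {c.name: segment(c) for c in customers}
```

Keeping the thresholds as parameters makes it easy to test how sensitive the segment sizes are to the chosen cut-offs.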
  25. Data Engineering
      ◦ Airfare classification (Economy, Business, First)
      ◦ Analyse factors (enterprise data sources) using data exploration techniques: passenger booking information; forecasted data (statistics); inventory
      ◦ Customers' behavioral data: predictive analytics using statistical models (decision tree, classification)
      ◦ Information has to be gained from websites that provide route information, dining, and preferred locations
      ◦ Holistic analytics: analyzing customer data from social profiles, sales, CRM, etc.
  26. Complexities and Challenges
      ◦ Data is larger than terabytes
      ◦ Data integration
      ◦ Variety of data formats
      Solution
      ◦ Big data accelerators
      ◦ Hadoop ecosystem
      ◦ Analytic components
      ◦ Integrated data warehouses
      Source: Big Data Spectrum, Infosys
  27. Insurance Fraud Detection – Use Case Scenario
      Data Engineering
      ◦ Verifying customer data
      ◦ Customer profile analysis
      ◦ Verification of claims raised
      ◦ Fraud detection from disparate systems
      ◦ Exact claim reimbursement
      Data Sources
      ◦ Data about customers and products sold, from ERP and CRM
      ◦ Credit history from other sources
      ◦ Data from social networking: customer profiles, product ratings, credit ratings from third parties
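One simple way to shortlist claims for verification is outlier analysis on the claim amounts. The sketch below flags amounts whose z-score exceeds a threshold; it is an illustrative assumption rather than the method used in the slides, and the function name `flag_outliers`, the threshold of 2.0, and the claim amounts are all made up for the example.

```python
import statistics


def flag_outliers(amounts, threshold=2.0):
    """Return the amounts lying more than `threshold` sample standard deviations from the mean."""
    mean = statistics.mean(amounts)
    stdev = statistics.stdev(amounts)
    if stdev == 0:
        return []  # all amounts identical, nothing stands out
    return [a for a in amounts if abs(a - mean) / stdev > threshold]


# Hypothetical claim amounts; one is far larger than the rest.
claims = [1200, 1350, 1100, 1250, 1300, 1150, 1400, 9800]
suspicious = flag_outliers(claims)
```

A flagged claim is only a candidate for review. In small samples a single extreme value inflates the standard deviation, so the threshold needs tuning, and a production system would combine such a check with the profile and credit data listed above.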
  28. Health Epidemics
      Data Engineering
      ◦ Kinds of epidemics and target users
      ◦ Causes and effects with respect to location
      ◦ Environmental and other related issues of epidemics
      ◦ Data on awareness
      Data Sources
      ◦ EHR records, medical insurance claims, social media (awareness), ERP systems
      Data Analytics
      ◦ Descriptive analytics
      ◦ Predictive analytics (model-based analysis)
  29. Big Data Challenges
      ◦ Privacy protection across all big data stages: collect, store, process, knowledge
      ◦ Integration with the enterprise landscape: all systems store data in RDBMS or data warehouses, with no support for bulk loading into the big data store
      ◦ Limited number of analytics available from Mahout
      ◦ Big data technologies lack visualization support and delivery methods
      ◦ Leveraging cloud computing for big data applications
      ◦ Addressing real-time needs with varied formats and volumes
  30. PART B : Big Data Analytics – Success Scenario
  31. ORION - Franz Edelman Award
      ◦ The Franz Edelman Award recognizes excellence in operations research and analytics.
      ◦ ORION, an acronym for On-Road Integrated Optimization and Navigation, is perhaps the largest commercial analytics project ever undertaken.
      ◦ It required well over a decade to build and roll out, and more than $250 million of investment by UPS.
      ◦ At its peak, over 700 UPSers were working on change management and rollout of the system, so the company clearly went all in on this project.
      ◦ The company is receiving something in return for its investment: savings (in driver productivity and fuel economy) of between $300 and $400 million a year, 100 million fewer miles driven, and a resulting cut in carbon emissions of 100,000 metric tons a year.
      ◦ Benefits at this level are not seen from an analytics project very often; these have been confirmed through intensive measurement and reported to Wall Street analysts.
      ◦ See more at: http://data-informed.com/prescriptive-analytics-project-delivering-big-dividends-at-ups/#sthash.HcY5kYwu.dpuf
  32. Predictive Analytics - Netflix
      ◦ Netflix has raised the TV show batting average considerably.
      ◦ The company uses predictive analytics to improve its customer recommendation algorithms for movies.
      ◦ The company has also used analytics to predict whether TV shows will be home runs, solid base hits, or strikeouts.
  33. Predictive Analytics – 2022 (Source: Dataquest)
      ◦ Antitheft. As you enter your car, a predictive model establishes your identity based on several biometric readings, rendering it virtually impossible for an imposter to start the engine.
      ◦ Entertainment. Spotify plays new music it predicts you will like.
      ◦ Traffic. Your navigator pipes up and suggests alternative routing due to predicted traffic delays. Because the new route has hills and your car’s battery – its only energy source – is low, your maximum acceleration is decreased.
      ◦ Breakfast. An en route drive-through restaurant is suggested by a recommendation system that knows its daily food preference predictions must be accurate or you will disable it.
      ◦ Social. Your Social Techretary offers to read you select Facebook feeds and Match.com responses it predicts will be of greatest interest. Inappropriate comments are filtered out. CareerBuilder offers to read postings of jobs for which you’re predicted to apply. When playing your voice mail, solicitations such as robocall messages are screened by a predictive model just like e-mail spam.
      ◦ Deals. You accept your smartphone’s offer to read to you a text message from your wireless carrier. Apparently, they’ve predicted you’re going to switch to a competitor, because they are offering a huge discount on the ...
  34. IoT - Data Analytics - Manufacturing
      ◦ According to Accenture, the Industrial Internet of Things has the potential to add more than $14 trillion to the global economy by 2030.
      ◦ Small sensors placed on complex machinery emit performance data that can be used to adjust scheduled maintenance.
      ◦ With this functionality, industries such as energy and oil extraction are now able to predict and mitigate equipment failures, significantly reducing downtime, increasing site safety, and cutting costs.
  35. IoT – Big Data Analytics
      ◦ Experts are predicting fully automated farms in the next five years, but already monster machines, such as the New Holland T8.435 tractor, are becoming commonplace not only on very big farms, but also on mid-sized ones.
      ◦ The tractor’s steering is assisted by satellite. It downloads crop and soil data straight to agronomists and farm managers, works 24/7, and can link with ground sensors and drones using infrared thermal cameras to tell, within a square meter, the size of a field and where the most fertile or waterlogged places are. Big data, machinery, climatology, and agronomy are all combining to increase productivity and reduce labor costs.
      ◦ Livestock farming has not gone unnoticed by big data and IoT developers, either. Wearable technology is no longer just for humans. Any animal, from elephants to cows to cats and dogs, can wear or be injected with devices that capture health and behavioral data.
      ◦ iNOVOTEC Animal Care, for example, has created wearable and ingestible devices that provide information about an animal’s condition that is not easily observable. This enables farmers to catch illnesses much earlier, leading to healthier stock and cost savings.