WDABT 2016 – BHARATHIAR UNIVERSITY
Dr.V.Bhuvaneswari
Assistant Professor
Department of Computer Applications
Bharathiar University
Coimbatore
bhuvanes_v@yahoo.com, bhuvana_v@buc.edu.in
visit at www.budca.in/faculty.php
BIG DATA ROADMAP
Big Data Roadmap
 Timeline – Big Data Predictions
 Data Growth in Units
 Data Landscape
 Data Explosion
 Big Data Myths
 Big Data
 5Vs of Big Data
 Why Big Data
 Data as Data Science
Dr.V.Bhuvaneswari, Asst.Professor, Dept. of Computer Applications, Bhararthiar University
Timeline – Big Data Predictions
1944 - The Yale Library in 2040 will have "approximately 200,000,000 volumes."
1961 - Scientific journals will grow exponentially rather than linearly, doubling every fifteen years and increasing by a factor of ten during every half-century.
1975 - The Ministry of Posts and Telecommunications in Japan introduced words as the unifying unit of measurement.
1997 - Michael Cox and David Ellsworth published the first article in the ACM Digital Library to use the term "big data."
Big Data emerged in 1997, exploded to greater heights in 2010, and became popular in 2012.
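The two figures in the 1961 prediction agree with each other: doubling every fifteen years compounds to roughly a tenfold increase over fifty years. A minimal check in Python (the function name is illustrative, not from the slides):

```python
def growth_factor(doubling_period_years: float, span_years: float) -> float:
    """Growth multiple after span_years for a quantity that doubles every doubling_period_years."""
    return 2 ** (span_years / doubling_period_years)


# Doubling every 15 years, observed over a half-century.
factor = growth_factor(15, 50)
print(f"Growth over 50 years: about {factor:.1f}x")
```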
Data Growth – in Units
Data Landscape
BIG DATA FACTS
 Every 2 days we create as much information
as we did from the beginning of time until
2003
 Over 90% of all the data in the world was
created in the past 2 years.
 It is expected that by 2020 the amount of
digital information in existence will have
grown from 3.2 zettabytes today to 40
zettabytes.
 Every minute we send 204 million emails, generate 1.8 million Facebook likes, send 278 thousand Tweets, and upload 200,000
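To put the per-minute figures and the zettabyte forecast on a common footing, the sketch below scales the quoted rates to a full day and computes the growth multiple implied by 3.2 ZB to 40 ZB. The numbers are taken from the slide; the variable names are illustrative.

```python
MINUTES_PER_DAY = 24 * 60

# Per-minute activity figures quoted on the slide.
per_minute = {
    "emails": 204_000_000,
    "facebook_likes": 1_800_000,
    "tweets": 278_000,
}

# Scale each rate from one minute to one day.
per_day = {name: count * MINUTES_PER_DAY for name, count in per_minute.items()}

# Growth multiple implied by the forecast (3.2 ZB growing to 40 ZB).
zb_growth = 40 / 3.2

for name, count in per_day.items():
    print(f"{name}: {count:,} per day")
print(f"Forecast growth: {zb_growth:.1f}x")
```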
Big Data Explosion
 12+ TBs of tweet data every day
 25+ TBs of log data every day
 ? TBs of data every day
 2+ billion people on the Web by end of 2011
 30 billion RFID tags today (1.3B in 2005)
 4.6 billion camera phones worldwide
 100s of millions of GPS-enabled devices sold annually
 76 million smart meters in 2009… 200M by 2014
Data Deluge
Big Data Market Size
Potential Talent Pool – Big Data
India will require a minimum of 1 lakh data scientists in the next couple
of years in addition to data analysts and data managers to support the
Big Data space.
BIG DATA MYTHS
Big Data
• Is New
• Is Only About Massive Data Volume
• Means Hadoop
• Needs a Data Warehouse
• Means Unstructured Data
• Is for Social Media & Sentiment Analysis
Let Us Clarify
Big Data
Big Data is
 A complete subject with its own tools, techniques, and frameworks.
 A technology that deals with large and complex datasets that vary in format and structure and do not fit into memory.
 Not only about huge volumes of data; it provides an opportunity to find new insights in existing data and guidelines for capturing and analyzing future data.
Big Data : A Definition
 Big data is the realization of greater
business intelligence by storing,
processing, and analyzing data that
was previously ignored due to the
limitations of traditional data
management technologies
Source: Harness the Power of Big Data: The IBM Big Data Platform
BIG DATA as Platform
Source: IBM
4 Vs of Big Data
5Vs of Big Data
Volume
Velocity
Variety
Veracity
Value
Why Big Data?
The 5 Key Big Data Use Cases
 Big Data Exploration: Find, visualize, and understand all big data to improve decision making
 Enhanced 360° View of the Customer: Extend existing customer views (MDM, CRM, etc.) by incorporating additional internal and external information sources
 Security/Intelligence Extension: Lower risk, detect fraud, and monitor cyber security in real time
 Data Warehouse Augmentation: Integrate big data and data warehouse capabilities to increase operational efficiency
 Operations Analysis: Analyze a variety of machine data for improved business results
Data Science
 The term "Data Science" was used by statisticians and economists in the early 1970s and was defined by Peter Naur in 1974.
 "Data Science" has gained popularity in the last couple of years because of the massive deposits of data.
 The use of Big Data technology to explore data in large corporations, government, and industry made the term "data science" catchy.
Data Science as Discipline
 Data Science has emerged as a new discipline that provides deep insight into large volumes of data.
 Data Science is a fusion of major disciplines such as Computational Algorithms, Statistics, and Visualization.
 90% of the world's data has been created in the last two years, which includes 10% structured data and 80% unstructured data.
 The digital universe is in a data deluge and is estimated to be larger than the physical universe; the unit of data measurement is predicted to reach geopbytes.
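For a sense of scale, the sketch below lists the byte units up to the geopbyte and converts between them. It assumes decimal steps (each unit is 1000 times the previous one); "brontobyte" and "geopbyte" are informal names rather than SI units, and the helper names are illustrative.

```python
# Unit ladder in decimal steps of 1000. The last two names are informal, not SI.
UNITS = [
    "byte", "kilobyte", "megabyte", "gigabyte", "terabyte", "petabyte",
    "exabyte", "zettabyte", "yottabyte", "brontobyte", "geopbyte",
]


def bytes_in(unit: str) -> int:
    """Number of bytes in one of the named units, assuming powers of 1000."""
    return 1000 ** UNITS.index(unit)


def convert(value: float, from_unit: str, to_unit: str) -> float:
    """Convert a quantity between two named units."""
    return value * bytes_in(from_unit) / bytes_in(to_unit)


# 40 zettabytes expressed in terabytes.
print(f"{convert(40, 'zettabyte', 'terabyte'):,.0f} TB")
```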
Data Growth in Bytes
Data Classification
◦ Open Data
◦ Closed Data
◦ Hot Data
◦ Warm Data
◦ Cold Data
◦ Thin Data
◦ Thick Data
Data Analytics – Need for Today
 Data is considered a digital asset, similar to other property.
 Organizations believe the data they generate will provide deep insights into their business processes and support strategic decisions.
 The earlier limitations of computational storage and processing have been overcome by cloud computing and big data techniques.
Data Science Components
 Pre-Processing: ETL
 Dashboards
 Charts: Pie, Bar, Histogram
 Data Models: Linear Regression, Decision Tree, Dimensionality Reduction
 Clustering
 Outlier Analysis
 Association Analysis
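As one concrete instance of the data models named on this slide, here is a minimal sketch of simple linear regression fitted by ordinary least squares, using only the standard library. The sample points are made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and co-deviation of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up points that lie exactly on y = 2x + 1.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
slope, intercept = fit_line(xs, ys)
print(f"y = {slope:.2f} * x + {intercept:.2f}")
```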
Data Science - Big Data Technology
 Collect, Load, Transform
◦ ETL, SCRIBE, FLUME
 Store
◦ HADOOP, SPARK, STORM
 Process, Analyze and Reason
◦ Computational Algorithms
◦ Statistical Methods and Models
◦ R, PIG, HIVE, PYTHON, JAVA, SCALA, CLOJURE, MAHOUT
 Visualization
◦ DASHBOARD, APP
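Hadoop, listed above, is closely associated with the MapReduce programming model. The sketch below imitates the map, shuffle, and reduce phases on a single machine in plain Python. It is only an illustration of the idea; it does not use any Hadoop API, and the function names are assumptions.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group all emitted values by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Sum the grouped values to get one count per word."""
    return {key: sum(values) for key, values in groups.items()}


lines = ["big data big insight", "data science"]
pairs = [pair for line in lines for pair in map_phase(line)]
counts = reduce_phase(shuffle(pairs))
print(counts)
```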
Data Science Vs Data Analytics
 Data Science is a discipline that brings together techniques and methods from various domains to study data; data analytics is one component of Data Science.
 Data Analytics is the process of analyzing a dataset to find deep insights using computational algorithms and statistical methods.
There exists no common procedure to
Data Analytics Vs Big Data Analytics
 Data Analytics is used to explore and analyze datasets using statistical methods and models.
 Big Data Analytics is used to analyze data characterized by Volume, Velocity, and Variety by integrating statistics, mathematics, and computational algorithms on a Big Data platform.
Data Science – Emerging Roles
 A Data Scientist is responsible for scrubbing data to bring out deep insights.
Skills: Expertise in CS, Mathematics, and Statistics
Works on open-ended research problems
 A Data Engineer is responsible for managing and administering the infrastructure and storage of data.
Skills: Strong skills in Programming and Software Engineering
 Deep knowledge of Data Warehousing
 Expertise in Hadoop, NoSQL, and SQL technologies
 A Data Analyst views the data from one source and develops deep insight into it based on the organization's guidance.
Skills: Competency in understanding Statistics
Data Analytics Use Case Scenario
Data Science Applications
 Data Personalization - Logs, Tweets, Likes
 Smart Pricing – Air Transportation
 Financial Services – Fraud Detection, Insurance
 Smart Grids – Energy Management
Air Fare Management – Use Case 1
Objective: Hike airfare based on high-value customers (CRM).
Strategic decisions require an understanding of data insights:
How are customers divided?
Which customers are high-value customers?
Who is a frequent flyer?
How can customers be retained?
Data sources:
Conventional enterprise information
Data from weblogs, social media, and competitors' pricing
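The questions on this slide amount to customer segmentation. A minimal rule-based sketch follows; the fields, thresholds, and segment labels are hypothetical and chosen only to illustrate the idea, not taken from any airline's CRM.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    name: str
    flights_per_year: int
    annual_spend: float


# Hypothetical cut-offs; a real system would derive these from the data.
FREQUENT_FLYER_MIN_FLIGHTS = 12
HIGH_VALUE_MIN_SPEND = 5000.0


def segment(customer: Customer) -> str:
    """Assign a customer to a segment based on flight frequency and spend."""
    frequent = customer.flights_per_year >= FREQUENT_FLYER_MIN_FLIGHTS
    high_value = customer.annual_spend >= HIGH_VALUE_MIN_SPEND
    if frequent and high_value:
        return "high-value frequent flyer"
    if high_value:
        return "high-value"
    if frequent:
        return "frequent flyer"
    return "occasional"


# Made-up customers for illustration.
customers = [
    Customer("A", 20, 8000.0),
    Customer("B", 2, 6000.0),
    Customer("C", 15, 1200.0),
    Customer("D", 1, 300.0),
]
segments = {c.name: segment(c) for c in customers}
print(segments)
```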
Data Engineering
Airfare classification (Economy, Business, First)
Analyse factors (enterprise data sources) using data exploration techniques
Passenger booking information
Forecasted data (statistics)
Inventory
Customers' behavioral data: predictive analytics using statistical models (decision tree, classification)
Information has to be gathered from websites that provide route information, dining, and preferred locations
Holistic Analytics
Analyzing customer data from social profiles, sales, CRM, etc.
Complexities and Challenges
Data is larger than terabytes
Data integration
Variety of data formats
Solution
Big data Accelerators
Hadoop ecosystem
Analytic components
Integrated data warehouses
Source: Big data spectrum Infosys
Insurance Fraud Detection – Use Case Scenario
Data Engineering
Verifying customer data
Customer Profile analysis
Verification of claims raised
Fraud detection from disparate systems
Exact claim reimbursement
Data Sources
Data about customer, product sold from ERP,
CRM
Credit history from other sources
Data from social networking: customer profiles, product ratings, and credit ratings from third parties
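Verification of raised claims can start with a simple outlier screen. The sketch below flags claim amounts using the modified z-score, which is based on the median and the median absolute deviation. This is one common screening technique, swapped in here for illustration; the slides do not name a specific method, and the claim amounts are made up.

```python
from statistics import median


def flag_unusual_claims(amounts, threshold=3.5):
    """Return amounts whose modified z-score exceeds the threshold."""
    med = median(amounts)
    # Median absolute deviation: a spread measure that resists outliers.
    mad = median(abs(a - med) for a in amounts)
    if mad == 0:
        return []  # No spread to measure against.
    return [a for a in amounts if 0.6745 * abs(a - med) / mad > threshold]


# Made-up claim amounts; one is far larger than the rest.
claims = [1200, 1350, 1100, 1280, 1420, 9800]
flagged = flag_unusual_claims(claims)
print(flagged)
```

A flagged claim is only a candidate for manual review, not evidence of fraud on its own.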
Health Epidemics
Data Engineering
Kind of epidemics and target users
Causes and effects with respect to locations
Environmental and other related issues of
epidemics
Data on Awareness
Data Sources
EHR records, medical insurance claims, social media (awareness), ERP systems
Data Analytics
Descriptive Analytics
Predictive Analytics (model-based analysis)
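The two kinds of analytics on this slide can be contrasted in a few lines. In the sketch below, counting cases per location is descriptive, and a moving-average forecast of the next week's total is a very simple stand-in for predictive, model-based analysis. The case reports are made up and the names are illustrative.

```python
from collections import Counter

# Hypothetical case reports as (location, week number) pairs.
reports = [
    ("North", 1), ("North", 1), ("South", 1),
    ("North", 2), ("North", 2), ("North", 2), ("South", 2),
]

# Descriptive analytics: how many cases per location and per week.
cases_by_location = Counter(location for location, _ in reports)
weekly_totals = Counter(week for _, week in reports)


def moving_average_forecast(series, window=2):
    """Predict the next value as the mean of the last `window` values."""
    recent = series[-window:]
    return sum(recent) / len(recent)


# Predictive analytics (naive): forecast next week's total from recent weeks.
history = [weekly_totals[week] for week in sorted(weekly_totals)]
forecast = moving_average_forecast(history)
print(cases_by_location, forecast)
```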
Big Data Challenges
Privacy Protection
All Big Data stages: collect, store, process, knowledge
Integration with the enterprise landscape
All systems store data in RDBMS or data warehouses (DW)
Do not support bulk loading to a Big Data store
Limited number of analytics from Mahout
Big Data technologies lack visualization support and deliverable methods
Leveraging cloud computing for Big Data applications
Addressing real-time needs with varied formats and volumes
PART B: Big Data Use Cases – Scenario
Big Data Applications
Big Data Applications - India
 Big Data – Elections
 SBI uses big data mining to check
defaults
 Karnataka Govt – Identify water
leakage
Big Data - Election
 Mined data from every Internet user in
the country, to accurately understand
voter sentiments and local issues.
 Data-based analysis was used to raise
funds and create different models for
different regions targeting on local
issues.
 India's elections involve more than 800 million voters with different ideologies and expectations.
 Innovative usage of Big Data marked a
huge change in the way elections were
fought traditionally.
Data Analytics
 Modak Analytics built electoral data.
 Processing huge volumes of
unstructured data (around 10TB of
PDF documents), and also structured
data.
 Modak chose Hadoop, and self-built a
64-node cluster that had 128TB of
storage. Apart from Hadoop, the team
used PostgreSQL as the front-end
database.
 They have developed Rapid ETL to
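A quick sizing calculation for the cluster described above. The slide does not say whether 128TB is raw or usable capacity; the sketch assumes it is raw capacity and applies a replication factor of 3, which is the HDFS default but may not match the actual deployment.

```python
NODES = 64
TOTAL_STORAGE_TB = 128

# Assumption: HDFS default replication; the real cluster setting is not stated.
REPLICATION_FACTOR = 3

# Average storage per node, if capacity is spread evenly.
per_node_tb = TOTAL_STORAGE_TB / NODES

# Capacity left for distinct data once every block is stored three times.
usable_tb = TOTAL_STORAGE_TB / REPLICATION_FACTOR

print(f"{per_node_tb:.1f} TB per node, about {usable_tb:.1f} TB usable")
```

Under these assumptions, the roughly 10TB of PDF documents mentioned above would fit several times over even after replication.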
SBI
 State Bank of India (SBI) ran its newly
acquired data-mining software recently to
check for purity of data.
 Made an interesting find - close to one crore
accountholders have not provided any
nomination for their savings accounts. What
is worse, over half of them are senior
citizens.
 To analyse trends in Banks, SBI has hired a
whole team of statisticians and economists.
 Identify default patterns, high value
customers.
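The nomination finding is the result of a data-quality check: count the records with a missing field, then profile them. A minimal sketch follows with made-up account records; the field names and the senior-citizen age cut-off of 60 are assumptions for illustration.

```python
# Made-up account records; None or an empty string means no nominee on file.
accounts = [
    {"id": 1, "age": 67, "nominee": None},
    {"id": 2, "age": 34, "nominee": "A. Kumar"},
    {"id": 3, "age": 72, "nominee": ""},
    {"id": 4, "age": 45, "nominee": None},
]

SENIOR_AGE = 60  # Assumed cut-off for "senior citizen".

# Accounts with no nomination recorded.
missing = [a for a in accounts if not a["nominee"]]

# Of those, the accounts held by senior citizens.
senior_missing = [a for a in missing if a["age"] >= SENIOR_AGE]
senior_share = len(senior_missing) / len(missing)

print(f"{len(missing)} accounts lack a nominee; {senior_share:.0%} are seniors")
```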
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...ZurliaSoop
 
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...amitlee9823
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Researchmichael115558
 

Dernier (20)

BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceBDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
 
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
 
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get CytotecAbortion pills in Doha Qatar (+966572737505 ! Get Cytotec
Abortion pills in Doha Qatar (+966572737505 ! Get Cytotec
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptx
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
 
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night StandCall Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
Call Girls In Bellandur ☎ 7737669865 🥵 Book Your One night Stand
 
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service BangaloreCall Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
Call Girls Begur Just Call 👗 7737669865 👗 Top Class Call Girl Service Bangalore
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
ALSO dropshipping via API with DroFx.pptx
ALSO dropshipping via API with DroFx.pptxALSO dropshipping via API with DroFx.pptx
ALSO dropshipping via API with DroFx.pptx
 
Probability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter LessonsProbability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter Lessons
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interaction
 
Anomaly detection and data imputation within time series
Anomaly detection and data imputation within time seriesAnomaly detection and data imputation within time series
Anomaly detection and data imputation within time series
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFx
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptx
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
Mg Road Call Girls Service: 🍓 7737669865 🍓 High Profile Model Escorts | Banga...
 
Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Research
 

Big data road map

  • 1. WDABT 2016 – BHARATHIAR UNIVERSITY
  • 2. Dr.V.Bhuvaneswari Assistant Professor Department of Computer Applications Bharathiar University Coimbatore bhuvanes_v@yahoo.com, bhuvana_v@buc.edu.in visit at www.budca.in/faculty.php BIG DATA ROADMAP
  • 3. Big Data Roadmap: Timeline – Big Data Predictions; Data Growth in Units; Data Landscape; Data Explosion; Big Data Myths; Big Data; 5Vs of Big Data; Why Big Data; Data as Data Science
  • 4. Timeline – Big Data Predictions. 1944: Yale Library in 2040 will have "approximately 200,000,000 volumes". 1961: Scientific journals will grow exponentially rather than linearly, doubling every fifteen years and increasing by a factor of ten during every half-century. 1975: The Ministry of Posts and Telecommunications in Japan introduced words as the unifying unit of measurement. 1997: Michael Cox and David Ellsworth published the first article in the ACM Digital Library to use the term "big data". Big Data emerged in 1997, grew rapidly in 2010 and became popular in 2012.
  • 5. Data Growth – in Units
  • 6. Data Landscape
  • 7. BIG DATA FACTS. Every 2 days we create as much information as we did from the beginning of time until 2003. Over 90% of all the data in the world was created in the past 2 years. It is expected that by 2020 the amount of digital information in existence will have grown from 3.2 zettabytes today to 40 zettabytes. Every minute we send 204 million emails, generate 1.8 million Facebook likes, send 278 thousand Tweets, and upload 200,000
  • 8. Big Data Explosion: 12+ TB of tweet data every day; 25+ TB of log data every day; ? TB of data every day; 2+ billion people on the Web by end of 2011; 30 billion RFID tags today (1.3 billion in 2005); 4.6 billion camera phones worldwide; 100s of millions of GPS-enabled devices sold annually; 76 million smart meters in 2009, 200 million by 2014
  • 11. Potential Talent Pool – Big Data. India will require a minimum of 1 lakh data scientists in the next couple of years, in addition to data analysts and data managers, to support the Big Data space.
  • 12. BIG DATA MYTHS. Big Data: is new; is only about massive data volume; means Hadoop; needs a data warehouse; means unstructured data; is for social media and sentiment analysis
  • 13. Let Us Clarify
  • 14. Big Data. Big Data is: a complete subject with tools, techniques and frameworks; a technology that deals with large and complex datasets that vary in format and structure and do not fit into memory; not only about huge volumes of data, but an opportunity to find new insight into existing data and guidelines to capture and analyze future data.
  • 15. Big Data: A Definition. "Big data is the realization of greater business intelligence by storing, processing, and analyzing data that was previously ignored due to the limitations of traditional data management technologies." Source: Harness the Power of Big Data: The IBM Big Data Platform
  • 16. BIG DATA as Platform (Source: IBM)
  • 17. 4 Vs of Big Data
  • 18. 5Vs of Big Data: Volume, Velocity, Variety, Veracity, Value
  • 19. Why Big Data?
  • 20. The 5 Key Big Data Use Cases. Big Data Exploration: find, visualize and understand all big data to improve decision making. Enhanced 360° View of the Customer: extend existing customer views (MDM, CRM, etc.) by incorporating additional internal and external information sources. Security/Intelligence Extension: lower risk, detect fraud and monitor cyber security in real time. Data Warehouse Augmentation: integrate big data and data warehouse capabilities to increase operational efficiency. Operations Analysis: analyze a variety of machine data for improved business results.
  • 22. Data Science. The term "Data Science" was used by statisticians and economists in the early 1970s and defined by Peter Naur in 1974. "Data Science" has gained popularity in the last couple of years because of massive data deposits. The use of Big Data technology to explore data in large corporations, government and industry made the term data science catchy.
  • 23. Data Science as Discipline. Data Science has emerged as a new discipline to provide deep insight into large volumes of data. Data Science is a fusion of major disciplines such as Computational Algorithms, Statistics and Visualization. 90% of the world's data has been created in the last two years, which includes 10% of structured data and 80% of unstructured data. The digital universe is in a data deluge and is estimated to be larger than the physical universe; the unit of data measurement is predicted to reach Geopbytes.
  • 25. Data Growth in Bytes
  • 26. Data Classification ◦ Open Data ◦ Closed Data ◦ Hot Data ◦ Warm Data ◦ Cold Data ◦ Thin Data ◦ Thick Data
  • 27. Data Analytics – Need for Today. Data is considered a digital asset, similar to other property. Organizations believe the data they generate will provide deep insights into their business processes for arriving at strategic decisions. The earlier limitations of computational storage and processing are overcome by cloud computing and big data techniques.
  • 28. Data Science Components: Pre-Processing – ETL; Dashboards; Charts: Pie, Bar, Histogram; Data Models: Linear Regression, Decision Tree; Dimensionality Reduction; Clustering; Outlier Analysis; Association Analysis
  • 29. Data Science – Big Data Technology. Collect, Load, Transform ◦ ETL SCRIBE, FLUME. Store ◦ HADOOP, SPARK, STORM. Process, Analyze and Reasoning ◦ Computational Algorithms ◦ Statistical Methods and Models: R, PIG, HIVE, PYTHON, JAVA, SCALA, CLOJURE, MAHOUT. Visualization ◦ DASHBOARD, APP
  • 30. Data Science Vs Data Analytics. Data Science is a discipline that groups techniques and methods from various domains to study data; data analytics is a component of Data Science. Data Analytics is the process of analyzing a dataset to find deep insights using computational algorithms and statistical methods. There exists no common procedure to
  • 31. Data Analytics Vs Big Data Analytics. Data Analytics is used to explore and analyze datasets using statistical methods and models. Big Data Analytics is used to analyze data with the characteristics of Volume, Velocity and Variety by integrating statistics, mathematics and computational algorithms on a Big Data platform.
  • 32. Data Science – Emerging Roles. Data Scientist: responsible for scrubbing data to bring out deep insights. Skills: expert in CS, mathematics and statistics; works on open-ended research problems. Data Engineer: responsible for managing and administering the infrastructure and storage of data. Skills: strong skills in programming and software engineering; deep knowledge of data warehousing; expertise in Hadoop, NoSQL and SQL technologies. Data Analyst: views the data from one source and has deep insight into the data based on the organization's guidance. Skills: competency in understanding statistics.
  • 33. Data Analytics Use Case Scenario
  • 34. Data Science Applications: Data Personalization (logs, tweets, likes); Smart Pricing (air transportation); Financial Services (fraud detection, insurance); Smart Grids (energy management)
  • 35. Air Fare Management – Use Case 1. Objectives: hike airfare based on high-value customers (CRM). The strategic decision requires an understanding of data insights: How are customers divided? Which customers are high value? Who is a frequent flyer? How can customers be retained? Data sources: conventional enterprise information; data from weblogs, social media and competitors' pricing.
  • 36. Data Engineering. Airfare classification (Economy, Business, First). Analyse factors (enterprise data sources) using data exploration techniques: passenger booking information; forecasted data (statistics); inventory; customers' behavioral data (predictive analytics, statistical models such as decision tree and classification). Information has to be gained from websites that provide route information, dining and preferable locations. Holistic analytics: analyzing customer data from social profiles, sales, CRM, etc.
  • 37. Complexities and Challenges: data is larger than terabytes; data integration; variety of data formats. Solution: big data accelerators; Hadoop ecosystem; analytic components; integrated data warehouses. Source: Big data spectrum, Infosys
  • 38. Insurance Fraud Detection – Use Case Scenario. Data Engineering: verifying customer data; customer profile analysis; verification of claims raised; fraud detection from disparate systems; exact claim reimbursement. Data Sources: data about customers and products sold from ERP and CRM; credit history from other sources; data from social networking: customer profiles, product rating, credit rating from 3rd parties.
  • 39. Health Epidemics. Data Engineering: kinds of epidemics and target users; causes and effects with respect to locations; environmental and other related issues of epidemics; data on awareness. Data Sources: EHR records, medical insurance claims, social media (awareness), ERP systems. Data Analytics: descriptive analytics; predictive analytics (model-based analysis).
  • 40. Big Data Challenges. Privacy protection: all Big Data stages (collect, store, process, knowledge). Integration with the enterprise landscape: all systems store data in RDBMS or DW; does not support bulk loading to the Big Data store. Limited number of analytics from Mahout. Big data technologies lack visualization support and deliverable methods. Leveraging cloud computing for big data applications. Addressing real-time needs with varied format and volume.
  • 41. PART B: Big Data Use Cases – Scenario
  • 42. Big Data Applications
  • 43. Big Data Applications – India: Big Data in elections; SBI uses big data mining to check defaults; Karnataka Govt identifies water leakage
  • 44. Big Data – Election. Data was mined from every Internet user in the country to accurately understand voter sentiments and local issues. Data-based analysis was used to raise funds and create different models for different regions, targeting local issues. Elections in India involve more than 800 million voters with different ideologies and expectations. Innovative use of Big Data marked a huge change in the way elections were traditionally fought.
  • 45. Data Analytics. Modak Analytics built electoral data, processing huge volumes of unstructured data (around 10 TB of PDF documents) as well as structured data. Modak chose Hadoop and self-built a 64-node cluster with 128 TB of storage. Apart from Hadoop, the team used PostgreSQL as the front-end database. They have developed Rapid ETL to
  • 46. SBI. State Bank of India (SBI) recently ran its newly acquired data-mining software to check the purity of its data. It made an interesting find: close to one crore account holders have not provided any nomination for their savings accounts. What is worse, over half of them are senior citizens. To analyse trends in banking, SBI has hired a whole team of statisticians and economists. Goals: identify default patterns and high-value customers.
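
Several figures quoted in the slides above can be cross-checked with simple arithmetic. The following is a minimal sketch in Python, assuming only the numbers stated on slides 4, 7 and 45; the function and variable names are illustrative and do not come from the deck.

```python
# Sanity checks on figures quoted in slides 4, 7 and 45.
# All names here are illustrative; only the numbers come from the slides.

def growth_factor(doubling_period_years: float, span_years: float) -> float:
    """Growth multiple over span_years when a quantity doubles
    every doubling_period_years."""
    return 2 ** (span_years / doubling_period_years)

# Slide 4: "doubling every fifteen years" versus
# "a factor of ten during every half-century".
per_half_century = growth_factor(15, 50)

# Slide 7: digital information growing from 3.2 to 40 zettabytes.
zettabyte_multiple = 40 / 3.2

# Slide 45: 128 TB of storage spread over a 64-node cluster.
tb_per_node = 128 / 64

print(f"{per_half_century:.2f}x per 50 years")
print(f"{zettabyte_multiple:.1f}x growth in digital information")
print(f"{tb_per_node:.0f} TB per node")
```

The first value comes out at roughly ten, so the two halves of the 1961 prediction are consistent with each other. The per-node figure assumes the 128 TB is spread evenly across the cluster, which the slide does not state.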

Editor's notes

  1. Obviously, there are many other forms and sources of data. Let's start with the hottest topic associated with Big Data today: social networks. Twitter generates about 12 terabytes of tweet data every single day. Keep in mind that these numbers are hard to count on, so the point is that they're big. Don't fixate on the actual numbers, because they change all the time; even if they are out of date in 2 years, the volume is already too staggering to handle exclusively using traditional approaches. Facebook, over a year ago, was generating 25 terabytes of log data every day (Facebook log data reference: http://www.datacenterknowledge.com/archives/2009/04/17/a-look-inside-facebooks-data-center/ ) and probably about 7 to 8 terabytes of data that goes up on the Internet. Google, who knows? Look at Google Plus, YouTube, Google Maps, and all that kind of stuff. So that's the left hand of this chart: the social network layer. Now let's get back to instrumentation: there are massive amounts of proliferated technologies that allow us to be more interconnected than at any time in the history of the world, and it isn't just P2P (people to people) interconnections, it's M2M (machine to machine) as well. Again, with these numbers, who cares what the current number is; I try to keep them updated, but the point is that even if they are out of date, it's almost unimaginable how large these numbers are. There are over 4.6 billion camera phones that leverage built-in GPS to tag the location of your photos, plus purpose-built GPS devices and smart meters. If you recall the bridge that collapsed in Minneapolis in the USA a number of years ago, it was rebuilt with smart sensors inside it that measure the contraction and flex of the concrete based on weather conditions, ice build-up, and so much more. I didn't realise how true it was when Sam P launched Smart Planet: I thought it was a marketing play. But truly the world is more instrumented, interconnected, and intelligent than it's ever been, and this capability allows us to address new problems and gain new insight never before thought possible. That's what the Big Data opportunity is all about!
  2. Our product management, engineering, marketing, CTP and other teams have all been working together to better understand the big data market. We've done surveys, met with analysts and studied their findings, and met in person with customers and prospects (over 300 meetings), and we are confident that we have found the market "sweet spots" for big data. These 5 use cases are our sweet spots. They will resonate with the majority of prospects that you meet with. In the coming slides we'll cover each of these in detail, walking through the need, the value and a customer example.