Sizes of data
What Is Big Data?
Big data: data so large that it becomes difficult to
process using a traditional system.
Example...
Have you ever tried opening a 0.5 GB file on
your machine?
It is difficult to edit a 10 TB file in a limited
time on a traditional system.
Difficult to Process with a Traditional System
● 100 GB image
● 100 MB document
● 100 TB video
Unable to view, unable to edit, unable to send.
Depends on the capability of the system.
Organization Specific
500 TB of text, audio, and video data per day:
● Company A: big data
● Company B: not big data
Depends on the capabilities of the organization.
Areas of Challenges
● Capture
● Storage
● Curation
● Transfer
● Visualization
● Analysis
● Search
● Sharing
Classification of Big Data
1. Structured Data
● Data that has a defined length and format.
● E.g. numbers, dates, and groups of words and numbers
called strings.
● It is usually stored in a database.
2. Unstructured Data
● No predefined fields
● Massive data, e.g. newspapers
● Music (audio), applications, movies (video),
X-rays, pictures
3. Semi-Structured Data
Data that does not have a fixed format attached to
it but still carries some organizing structure, e.g.:
– Data within an email
– Data in a Doc file
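The three classes above can be illustrated with a minimal Python sketch. The sample records (the CSV row, the email, and the newspaper sentence) are invented for illustration and are not taken from the slides.

```python
import csv
import io
from email import message_from_string

# Structured: every record has the same named fields, so it maps
# directly onto the rows and columns of a database table.
structured = io.StringIO("id,date,amount\n1,2014-02-25,19.99\n")
rows = list(csv.DictReader(structured))

# Semi-structured: an email has named header fields, but its body
# is free text with no fixed format.
raw_email = "From: a@example.com\nSubject: Hello\n\nSee you at 5pm, bring the report."
msg = message_from_string(raw_email)
headers = dict(msg.items())
body = msg.get_payload()

# Unstructured: plain newspaper-style text has no fields at all;
# any structure has to be derived by analysis (here, naive word splitting).
article = "Heavy rain flooded the old town on Monday."
words = article.split()

print(rows[0]["amount"], headers["Subject"], len(words))
```

Only the first record can be loaded into a fixed table as-is; the email needs its body handled separately, and the article needs processing before any field exists.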
Why Do We Need This?
● How do you like a new movie?
● Election exit polls
● Chess board
● Why did Facebook purchase WhatsApp?
Characteristics of Big Data
1) Velocity
2) Volume
3) Variety
4) Value
5) Veracity
6) Variability
7) Visualization
Examples of Velocity
● Almost 2.5 million queries are performed on Google.
● Around 20 million photos are viewed.
● Every minute, we upload 100 hours of
video to YouTube.
● Every minute, over 200 million emails are
sent.
● 300,000 tweets are sent.
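To get a feel for these rates, the two figures explicitly stated as per-minute values can be converted to per-second and per-day rates. This is a small arithmetic sketch; the helper names are illustrative.

```python
# Per-minute figures quoted on the slide (only the two that are
# explicitly stated as "every minute").
PER_MINUTE = {
    "hours of video uploaded to YouTube": 100,
    "emails sent": 200_000_000,
}

SECONDS_PER_MINUTE = 60
MINUTES_PER_DAY = 24 * 60


def per_second(rate_per_minute):
    """Convert a per-minute rate to a per-second rate."""
    return rate_per_minute / SECONDS_PER_MINUTE


def per_day(rate_per_minute):
    """Convert a per-minute rate to a per-day rate."""
    return rate_per_minute * MINUTES_PER_DAY


for label, rate in PER_MINUTE.items():
    print(f"{label}: {per_second(rate):,.1f} per second, {per_day(rate):,} per day")
```

At 200 million emails per minute, the daily total is 288 billion, which is in line with the figure of roughly 300 billion emails per day quoted later in the deck.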
● The amount of data generated every second.
● Here we are talking about zettabytes or more.
● It is the task of big data to convert such
Hadoop data into valuable information.
● Data is generated by machines, networks, and
human interaction on systems like social
media.
● The volume of data to be analyzed is massive.
Example of Volume...1
Airbus
● Airbus generates 10 TB every 30 minutes.
● About 640 TB is generated in one flight.
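As a quick consistency check, the two Airbus figures can be related to each other. The slide does not say whether the 10 TB rate is per aircraft, per engine, or for several sources combined, so the result below is only what the numbers imply at face value.

```python
# Figures quoted on the slide.
TB_PER_INTERVAL = 10      # terabytes generated per interval
INTERVAL_MINUTES = 30     # length of one interval
TB_PER_FLIGHT = 640       # terabytes generated in one flight

# Number of 30-minute intervals needed to reach 640 TB at the quoted rate.
intervals = TB_PER_FLIGHT / TB_PER_INTERVAL

# Equivalent hours of data generation at that rate.
hours = intervals * INTERVAL_MINUTES / 60

print(f"{intervals:.0f} intervals, i.e. {hours:.0f} hours of generation at the quoted rate")
```

The result is 32 hours of generation, which is longer than a single flight, so the per-flight total presumably aggregates several data sources running in parallel. That reading is an assumption, not something the slide states.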
Example of Volume...2
● Self-driving cars will generate 2 petabytes of data
every year.
● From now on, the amount of data in the world will
double every two years.
● By 2020, we will have 50 times the amount of data
that we had in 2011.
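The last two growth claims can be compared with a short calculation. This is a sketch that assumes smooth exponential growth, which the slide does not state explicitly.

```python
import math


def growth_factor(years, doubling_years):
    """Total growth after `years` if the amount doubles every `doubling_years`."""
    return 2 ** (years / doubling_years)


def implied_doubling_time(years, factor):
    """Doubling time (in years) needed to grow by `factor` within `years`."""
    return years / math.log2(factor)


years = 2020 - 2011

# Doubling every two years over nine years.
print(f"growth at 2-year doubling: {growth_factor(years, 2):.1f}x")

# Doubling time that a 50-fold increase over the same period would require.
print(f"doubling time for 50x: {implied_doubling_time(years, 50):.2f} years")
```

Doubling every two years gives roughly a 23-fold increase over nine years, while a 50-fold increase corresponds to doubling about every 1.6 years. The two claims are therefore of the same order but not identical.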
Variety
● Refers to the different types of data we can now use.
● In the past, data was structured and fitted into columns and
rows.
– Stored in databases
– Spreadsheets
● But now much of the data is unstructured, which makes it
difficult to store, analyse, and mine.
– Email, photos, audio
– Monitoring devices, PDFs
Value...1
● Having access to big data is no good unless we can turn it into
value.
● Companies are starting to generate amazing value from their
big data:
-> Discovering a consumer preference or sentiment
-> Making a relevant offer by location
-> Identifying a piece of equipment that is about to fail
Value...2
● The real big data challenge is a human one, which is:
-> learning to ask the right questions
-> recognizing patterns
-> making informed assumptions
-> predicting behavior
Veracity
● Are the results meaningful for the given problem
space?
● It's about data quality and understandability.
● Especially in automated decision-making, where no
human is involved anymore, you need to be sure that
both the data and the analyses are correct.
Variability...1
● Dissect an answer into its meaning to figure out
what the right question was.
● Variability is often confused with variety:
-> Say you have a bakery that sells 10 different
breads. That is variety.
-> Now imagine you go to that bakery three days in a
row and every day you buy the same type of bread,
but each day it tastes and smells different. That is
variability.
Variability...2
● Variability means that the meaning is changing.
Visualization
● This is the hard part of big data.
● Making that vast amount of data comprehensible in
a manner that is easy to understand and read.
● Visualizations, of course, do not mean ordinary graphs
or pie charts.
● They mean complex graphs that can include many
variables of data while still remaining understandable
and readable.
Big Data Sources
● Users
● Applications
● Systems
● Sensors
These sources are creating large and growing files.
Data Generating Points
Smartphones
● There are 5 billion camera phones in the world.
● Most of them have location awareness (GPS).
● By the end of 2013, the number of smartphones
exceeded the number of PCs.
Internet
● 2 billion people use the internet.
● By the end of 2015: 4.8 ZB of internet traffic
per year (Cisco).
Emails:
● 300 billion emails are sent every day.
Blogs:
● There are 200 million entries on the web.
Social Media
Facebook:
● 34K likes every minute
● It deals with 3-4 PB of data each day.
● There are 1 billion active users.
Twitter:
● It generates 12 TB of data daily.
● 200 million users generate 230 million tweets daily.
Google:
● It performs 2 million searches every minute.
● It deals with 20 PB of data each day.
YouTube:
● 2.9 billion hours of video are watched per month.
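Some of these figures can be combined into rough per-item numbers. The sketch below assumes decimal units (1 TB = 10^12 bytes). The 12 TB that Twitter generates daily very likely includes more than tweet text, so the average is only an upper-level estimate.

```python
BYTES_PER_TB = 10 ** 12  # assumption: decimal terabytes

# Twitter: data generated per day divided by tweets per day.
twitter_bytes_per_day = 12 * BYTES_PER_TB
tweets_per_day = 230_000_000
avg_bytes_per_tweet = twitter_bytes_per_day / tweets_per_day

# Google: searches per minute scaled up to one day.
google_searches_per_minute = 2_000_000
google_searches_per_day = google_searches_per_minute * 24 * 60

print(f"about {avg_bytes_per_tweet / 1000:.0f} kB of data per tweet")
print(f"{google_searches_per_day:,} Google searches per day")
```

This works out to about 52 kB of generated data per tweet and 2.88 billion searches per day.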
Limitations of Traditional
System
Data Warehouse
● Cost
● Fixed schema of RDBMS
● Saving huge files and accessing them
● Performing analysis
● Time to do all these tasks
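The "fixed schema" limitation can be demonstrated with a minimal sketch that uses SQLite from the Python standard library as a stand-in for an RDBMS. The table and column names are hypothetical.

```python
import sqlite3

# In-memory database with a fixed two-column schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (sensor_id INTEGER, value REAL)")
conn.execute("INSERT INTO readings (sensor_id, value) VALUES (?, ?)", (1, 20.5))


def try_insert_with_location(connection):
    """Try to store a record that carries an extra 'location' field."""
    try:
        connection.execute(
            "INSERT INTO readings (sensor_id, value, location) VALUES (?, ?, ?)",
            (2, 21.0, "hall"),
        )
        return "accepted"
    except sqlite3.OperationalError as exc:
        return f"rejected: {exc}"


# The new field does not fit the schema, so the insert fails.
before = try_insert_with_location(conn)

# The schema has to be migrated before such records can be stored.
conn.execute("ALTER TABLE readings ADD COLUMN location TEXT")
after = try_insert_with_location(conn)

print(before)
print(after)
```

Every new kind of field requires a schema change first, which becomes costly when the incoming data is large and varied.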
Applications of Big Data
● Companies gaining an edge by collecting, analyzing, and
understanding information
● Governments forecasting events and taking proactive
actions
– Like the spread of diseases
Tools for Handling Big Data
● Traditional systems (e.g. RDBMS): not able to handle
big data
● Big data tools (e.g. Hadoop): created to handle
big data
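Hadoop is best known for the MapReduce programming model. The sketch below imitates that model in plain Python on a single machine to show the idea of mapping, grouping, and reducing. It does not use the Hadoop API, and all function names are illustrative.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over the input lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


counts = word_count(["big data is big", "data tools handle big data"])
print(counts)
```

In a real cluster, the map and reduce steps run in parallel on many machines, each working on its own part of the data. That is what lets such tools handle volumes that a single traditional system cannot.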

Big Data Presentation
