Sizes of data
What actually is Big Data?
Big data: data so large that it becomes difficult to
process using a traditional system.
Example...
Have you ever tried opening a 0.5GB file on
your machine?
It is difficult to edit a 10TB file in limited
time on a traditional system.
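To put "limited time" in perspective, here is a rough sketch of how long a single sequential read of a file would take. The 150 MB/s throughput is an assumed figure for one disk, not a number from the slides.

```python
# Rough estimate of how long one sequential read of a file takes.
# ASSUMPTION: 150 MB/s sustained throughput (hypothetical single-disk figure).
MB = 10**6
GB = 10**9
TB = 10**12

def read_time_seconds(size_bytes: int, throughput_bytes_per_s: float = 150 * MB) -> float:
    """Seconds needed to read size_bytes once at the given throughput."""
    return size_bytes / throughput_bytes_per_s

for label, size in [("0.5 GB", GB // 2), ("10 TB", 10 * TB)]:
    seconds = read_time_seconds(size)
    print(f"{label}: {seconds:,.0f} s (~{seconds / 3600:.1f} h)")
```

Under that assumption the 0.5GB file takes a few seconds, while the 10TB file needs roughly 18.5 hours just to be read once, before any editing happens.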
Difficult to process with a
traditional system
100GB image, 100MB document,
100TB video
Unable to view
Unable to edit
Unable to send
Depends on the
capability of the
system.
Organization Specific
500TB of text, audio, and video data per day
Big Data / Not Big Data
Depends on the capabilities of the organization
Company A, Company B
Areas of Challenges
Capture
Storage
Curation
Transfer
Visualization
Analysis
Search
Sharing
Classification of Big Data
1. Structured Data:
●
It refers to data that has a defined length and format.
●
Ex. numbers, dates, and groups of words and numbers
called strings.
●
It’s usually stored in a database.
2. Unstructured Data
●
No fields
●
Massive data, ex. newspapers
●
Music (audio), applications, movies (video),
X-rays, pictures
3. Semi-Structured Data
Data that does not have a fixed, formal format attached to
it, though it still carries some organizing markers. Ex.
– Data within an email
– Data in a Doc file
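The three classes above can be sketched with small samples. All sample values (the CSV rows, the email, the newspaper sentence) are hypothetical and only illustrate how much structure a program can rely on in each case.

```python
import csv
import io
from email import message_from_string

# Structured: every row has the same named fields, as in a database table.
structured = io.StringIO("id,date,amount\n1,2013-05-01,19.99\n2,2013-05-02,5.00\n")
rows = list(csv.DictReader(structured))

# Semi-structured: an email has labeled headers, but the body is free text.
raw_email = "From: a@example.com\nSubject: Order status\n\nHi, where is my parcel?\n"
msg = message_from_string(raw_email)
headers = {"from": msg["From"], "subject": msg["Subject"]}
body = msg.get_payload()

# Unstructured: no fields at all, only raw content (e.g. newspaper text).
unstructured = "Yesterday the city council met to discuss the new bridge."

print(rows[0]["amount"])      # field lookup works directly
print(headers["subject"])     # headers can be looked up, the body cannot
print(unstructured.split())   # only generic text processing is possible
```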
Why do we need this?
●
How did people like a new movie?
●
In election exit polls
●
Chess board
●
Why did Facebook purchase WhatsApp?
Characteristics of Big Data
1) Velocity
2) Volume
3) Variety
4) Value
5) Veracity
6) Variability
7) Visualization
Examples of Velocity
●
Almost 2.5 million queries on Google
are performed.
●
Around 20 million photos are viewed.
●
Every minute we upload 100 hours of
video to YouTube.
●
Every minute over 200 million emails are
sent.
●
300,000 tweets are sent.
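A small sketch that converts the two figures explicitly quoted "every minute" into per-second and per-day rates. Only the figures themselves come from the slides; the variable names are illustrative.

```python
# Per-minute figures quoted above, converted to other time scales.
SECONDS_PER_MINUTE = 60
MINUTES_PER_DAY = 24 * 60

emails_per_minute = 200_000_000
video_hours_per_minute = 100

emails_per_second = emails_per_minute / SECONDS_PER_MINUTE
emails_per_day = emails_per_minute * MINUTES_PER_DAY
video_hours_per_day = video_hours_per_minute * MINUTES_PER_DAY

print(f"{emails_per_second:,.0f} emails per second")
print(f"{emails_per_day:,} emails per day")
print(f"{video_hours_per_day:,} hours of video uploaded per day")
```

At that rate the email figure works out to 288 billion per day, the same order of magnitude as the daily email count quoted later in the deck.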
●
The amount of data generated every second.
●
Here we are talking about zettabytes or more.
●
It is the task of big data to convert such
Hadoop data into valuable information.
●
Data is generated by machines, networks and
human interaction on systems like social
media.
●
The volume of data to be analyzed is massive.
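To keep the units in this deck straight (GB, TB, PB, ZB), a minimal helper that prints a byte count in the largest fitting unit. It assumes decimal prefixes (1 KB = 1000 B); binary prefixes would give slightly different numbers.

```python
# ASSUMPTION: decimal (SI) prefixes, each unit is 1000 times the previous one.
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]

def human_readable(num_bytes: float) -> str:
    """Format a byte count using the largest unit that keeps the value below 1000."""
    value = float(num_bytes)
    for unit in UNITS:
        if value < 1000 or unit == UNITS[-1]:
            return f"{value:.1f} {unit}"
        value /= 1000

print(human_readable(10**21))         # one zettabyte
print(human_readable(500 * 10**12))   # 500 terabytes
```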
Example of Volume...1
Airbus
●
Airbus generates 10TB every 30 minutes
●
About 640TB is generated in one flight
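A quick arithmetic check on the two Airbus figures, taken at face value. The slides do not say how the two numbers relate, so the result is only what they imply together.

```python
# Figures from the slide; the relation between them is not stated there.
TB_PER_INTERVAL = 10      # TB generated per interval
INTERVAL_HOURS = 0.5      # one interval is 30 minutes
TB_PER_FLIGHT = 640       # TB quoted for one flight

tb_per_hour = TB_PER_INTERVAL / INTERVAL_HOURS
implied_recording_hours = TB_PER_FLIGHT / tb_per_hour

print(f"{tb_per_hour:.0f} TB per hour")
print(f"{implied_recording_hours:.0f} hours implied by 640 TB")
```

The two figures together imply 32 hours of data generation, so they probably count different things (the slide gives no further detail).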
Example of Volume...2
●
Self-driving cars will generate 2 petabytes of data
every year.
●
From now on, the amount of data in the world will
double every two years.
●
By 2020, we will have 50 times the amount of data
that we had in 2011.
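The last two claims can be compared with a short compound-growth sketch. It assumes steady exponential growth between 2011 and 2020, which the slides do not state.

```python
import math

def growth_factor(years: float, doubling_time_years: float) -> float:
    """Multiplicative growth after `years` if data doubles every `doubling_time_years`."""
    return 2 ** (years / doubling_time_years)

years = 2020 - 2011

# Growth from 2011 to 2020 if data doubles every two years.
factor_if_doubling_every_two_years = growth_factor(years, 2.0)

# Doubling time that a 50-fold increase over the same period would require.
doubling_time_for_50x = years / math.log2(50)

print(f"x{factor_if_doubling_every_two_years:.1f} with a two-year doubling time")
print(f"{doubling_time_for_50x:.2f} years doubling time needed for x50")
```

Doubling every two years gives roughly a 23-fold increase over nine years, so a 50-fold increase corresponds to a faster doubling time of about 1.6 years.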
Variety
●
Refers to the different types of data we can now use.
●
In the past, data was structured and fitted into columns and
rows.
– Stored in databases
– Spreadsheets
●
But now the data is unstructured, which makes it difficult to store,
analyse, and mine.
– Email, photo, audio
– Monitoring devices, PDFs
Value...1
●
Having access to big data is no good unless we can turn it into
value.
●
Companies are starting to generate amazing value from their
big data:
-> Discovering a consumer preference or sentiment
-> Making a relevant offer by location
-> Identifying a piece of equipment that is about to fail
Value...2
●
The real big data challenge is a human one, which is
-> learning to ask the right questions
-> recognizing patterns
-> making informed assumptions
-> predicting behavior
Veracity
●
Are the results meaningful for the given problem
space?
●
It’s about data quality and understandability.
●
Especially in automated decision-making, where no
human is involved anymore, you need to be sure that
both the data and the analyses are correct.
Variability...1
●
Dissect an answer into its meaning and figure out
what the right question was.
●
Variability is often confused with variety.
-> Say you have a bakery that sells 10 different
breads. That is variety.
-> Now imagine you go to that bakery three days in a
row and every day you buy the same type of bread,
but each day it tastes and smells different. That is
variability.
Variability...2
●
Variability means that the meaning is changing.
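The bakery analogy can be put into numbers: variety counts how many different kinds exist, while variability measures how much one kind changes over time. The bread names and taste scores below are hypothetical.

```python
from statistics import pstdev

# Hypothetical bakery: 10 different breads on the menu.
menu = [f"bread_{i}" for i in range(10)]

# Hypothetical taste scores for the SAME bread bought on three days in a row.
same_bread_scores = [8.0, 5.0, 9.5]

variety = len(set(menu))                   # how many different kinds there are
variability = pstdev(same_bread_scores)    # how much one kind changes day to day

print(f"variety: {variety} kinds")
print(f"variability: {variability:.2f} points of spread")
```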
Visualization
●
This is the hard part of big data.
●
Making all that vast amount of data comprehensible in
a manner that is easy to understand and read.
●
Visualizations of course do not mean ordinary graphs
or pie charts.
●
They mean complex graphs that can include many
variables of data while still remaining understandable
and readable.
Big Data Sources
●
Users
●
Application
●
System
●
Sensors
Are creating
large and growing files
Data Generating Points
Smart Phones
●
There are 5 billion camera phones in the world
●
Most of them have location awareness (GPS)
●
By the end of 2013, the number of smartphones
exceeded the number of PCs
Internet
●
2 billion people use the internet
●
By the end of 2015, internet traffic reaches 4.8ZB
per year, according to Cisco.
Emails:
●
300 billion emails are sent every day
Blogs:
●
There are 200 million entries on the web
Social Media
Facebook:
●
34K likes every minute
●
It deals with 3-4 PB of data each day
●
There are 1 billion active users
Twitter:
●
It generates 12TB of data daily
●
200 million users generate 230 million tweets daily
Google:
●
It performs 2 million searches every minute
●
It deals with 20PB of data each day
YouTube:
●
2.9 billion hours of video are watched per month
Limitations of Traditional
System
Data Warehouse
●
Cost
●
Fixed Schema of RDBMS
●
Saving huge files and accessing them
●
Performing analysis
●
Time to do all these tasks
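The "fixed schema" limitation can be sketched with Python's built-in sqlite3 module standing in for an RDBMS. The table, column names, and records are hypothetical; the point is only that a record with an unexpected field is rejected until the schema is changed, while a schema-less store of JSON lines accepts it as-is.

```python
import json
import sqlite3

# A relational table: the set of columns is fixed when the table is created.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, username TEXT, amount REAL)")
conn.execute("INSERT INTO events (username, amount) VALUES (?, ?)", ("alice", 9.5))

# A new kind of record arrives with an extra field the schema does not know.
new_record = {"username": "bob", "amount": 3.0, "gps": "48.85,2.35"}

try:
    conn.execute(
        "INSERT INTO events (username, amount, gps) VALUES (?, ?, ?)",
        (new_record["username"], new_record["amount"], new_record["gps"]),
    )
    schema_rejected = False
except sqlite3.OperationalError:
    # The table has no "gps" column, so the insert fails.
    schema_rejected = True

# Schema-less alternative: store each record as a JSON line, interpret on read.
json_lines = [json.dumps({"username": "alice", "amount": 9.5}), json.dumps(new_record)]
parsed = [json.loads(line) for line in json_lines]

print("rejected by fixed schema:", schema_rejected)
print("fields seen on read:", sorted({key for rec in parsed for key in rec}))
```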
Applications of big data
●
Companies gaining an edge by collecting, analyzing, and
understanding information
●
Governments forecasting events and taking proactive
actions
– Like the spread of diseases
Tools for handling big data
Traditional System
ex. RDBMS
Not able to handle
Big Data
Big Data Tools
ex. Hadoop
Created to handle
Big Data
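Hadoop is commonly associated with the MapReduce programming model. The sketch below shows that model as a word count in plain Python on one machine; it does not use any Hadoop API, and the function names are illustrative. In a real cluster the map and reduce steps would run in parallel on many nodes.

```python
from collections import defaultdict
from itertools import chain

def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: combine the values of each key into one result."""
    return {key: sum(values) for key, values in groups.items()}

lines = ["big data is big", "data is data"]
mapped = chain.from_iterable(map_phase(line) for line in lines)
counts = reduce_phase(shuffle(mapped))
print(counts)
```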

Big Data Presentation
