Big Data
Hello
Rezaur Rahman (Jitu)
CTO, G&R Ad Network
jitu@gandr.com.bd
@jituboss
Data is growing at an exponential rate, and traditional tools like
RDBMSs are not enough to process it
Data is everywhere:
• Flickr (87 million registered members and 3.5 million
photos per day)
• YouTube (4B videos streamed per day)
• Yahoo! Webmap (3 trillion links, 300TB compressed,
5PB disk)
• Facebook collects 500 terabytes of data a day
• Walmart handles more than 1 million customer
transactions every hour
• IDC estimates that by 2020, business transactions on
the internet (business-to-business and business-to-consumer)
will reach 450 billion per day.
Data is growing at a rate of 40% per year, reaching nearly 45 ZB by 2020,
according to IDC
1 ZB is equal to 1 billion TB
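
The unit conversion above can be checked with a few lines of Python. This is a minimal sketch, assuming decimal (SI) units and assuming the 40% figure is an annual growth rate; the function names are made up for illustration.

```python
import math

BYTES_PER_TB = 10**12  # SI terabyte (assumption: decimal, not binary, units)
BYTES_PER_ZB = 10**21  # SI zettabyte


def zb_to_tb(zettabytes: float) -> float:
    """Convert zettabytes to terabytes."""
    return zettabytes * BYTES_PER_ZB / BYTES_PER_TB


def doubling_time_years(annual_growth: float) -> float:
    """Years needed for a quantity to double at a fixed annual growth rate."""
    return math.log(2) / math.log(1 + annual_growth)


if __name__ == "__main__":
    print(zb_to_tb(1))                # 1 ZB expressed in TB
    print(doubling_time_years(0.40))  # doubling time at 40% per year
```

At 40% per year the volume doubles roughly every two years, which is why the growth is described as exponential.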
What is Big Data and what is not?
• Order details of an e-commerce site
• All Orders across 1000s of e-commerce sites
• One person’s voter ID information
• Every citizen’s voter ID information dataset
Simple Definition: Big Data is data that is too big to
process on a single machine
What is Big Data?
The 3 V's of Big Data: Volume, Velocity, Variety
Types of Data:
• Relational Data (Tables/Transaction/Legacy
Data)
• Unstructured Data – Apache weblogs
• Text Data (Web)
• Semi-structured Data (XML)
• Graph Data
– Social Network, Semantic Web (RDF)
• Streaming Data
Data Processing Tasks:
• Aggregation and Statistics - Data warehouse
• Contextual Advertising – Real Time Bidding,
Remarketing
• Indexing, Searching, and Querying - Keyword-based
search, Pattern recognition
• Knowledge discovery - Data Mining, Statistical
Modeling
Traditional Architecture
• Relational Data is everything
– SQL
– Embedded
– Client-Server Based
• Data Stack
– Web, CDN, Load Balancers, Application, Database
and Storage
Traditional Scalability
• Scale-up
  – Memory and hardware have limits
• Scale-out
  – Reads
    • Cache is everything
      – Query Cache
      – Memcache
    • Pre-fetching, Replication
  – Writes
    • Redundant Disk Arrays, RAID
    • Sharding
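
Sharding, the last item above, can be sketched as hash-based routing of keys to independent stores. This is a minimal illustration under stated assumptions, not any particular database's scheme: `shard_for` and `ShardedStore` are hypothetical names, and each shard is just an in-memory dict standing in for a separate database server.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedStore:
    """Toy key-value store that spreads writes across several shards."""

    def __init__(self, num_shards: int) -> None:
        # Each dict stands in for one database server (assumption).
        self.shards = [{} for _ in range(num_shards)]

    def put(self, key: str, value: str) -> None:
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key: str):
        return self.shards[shard_for(key, len(self.shards))].get(key)


if __name__ == "__main__":
    store = ShardedStore(num_shards=4)
    for user_id in range(1000):
        store.put(f"user:{user_id}", f"profile-{user_id}")
    print([len(shard) for shard in store.shards])  # keys per shard
```

A cryptographic hash is used here only because it is stable across processes; Python's built-in `hash()` for strings is randomized per run. Plain modulo routing also remaps most keys when the shard count changes, which is one reason production systems often prefer consistent hashing.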
NoSQL Solution
• A lot of companies emerged to solve the data problem
• Bigtable: Google started to implement a massively
distributed, scalable system
• Many companies followed, building scale-out
architectures on commodity hardware
• ACID was seen as bad for scaling, so relaxed
consistency models emerged
• Google Bigtable and Amazon Dynamo are
notable examples
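
One way to picture a relaxed consistency model is a quorum scheme in the style described for Dynamo: data is kept on N replicas, a write is acknowledged by W of them, and a read consults R of them. When R + W > N the read and write sets must overlap, so a read sees the latest write; otherwise it may return stale data. The sketch below is a deliberately simplified, single-process illustration with hypothetical names (`QuorumRegister`, `Versioned`). It is not how Bigtable or Dynamo are implemented, and it picks the worst-case replica sets on purpose to keep the result deterministic.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Versioned:
    """Value stored on one replica, tagged with a version number."""
    version: int = 0
    value: Optional[str] = None


class QuorumRegister:
    """Single value replicated on n nodes with read quorum r and write quorum w."""

    def __init__(self, n: int, r: int, w: int) -> None:
        if not (1 <= r <= n and 1 <= w <= n):
            raise ValueError("r and w must be between 1 and n")
        self.n, self.r, self.w = n, r, w
        self.replicas: List[Versioned] = [Versioned() for _ in range(n)]
        self.clock = 0

    def write(self, value: str) -> None:
        # Worst case for readers: only the FIRST w replicas get the update.
        self.clock += 1
        for replica in self.replicas[: self.w]:
            replica.version, replica.value = self.clock, value

    def read(self) -> Optional[str]:
        # Worst case for readers: only the LAST r replicas are consulted.
        contacted = self.replicas[self.n - self.r:]
        return max(contacted, key=lambda rep: rep.version).value


if __name__ == "__main__":
    overlapping = QuorumRegister(n=3, r=2, w=2)  # r + w > n
    overlapping.write("v1")
    print(overlapping.read())

    relaxed = QuorumRegister(n=3, r=1, w=2)      # r + w == n
    relaxed.write("v1")
    print(relaxed.read())
```

Lowering R or W makes reads or writes cheaper and more available at the cost of possibly stale answers, which is the trade-off the slide describes as relaxing ACID for scale.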
Big Data Tools
Big Data Landscape
Thanks
Questions?

Presentation at Google Day on Big Data
