Understanding Big Data
Overview 
Why Big Data 
Big Data Users 
What is Big Data
Big Data 
Big data is a popular term used to describe the exponential growth and availability of data, both structured and 
unstructured. And big data may be as important to business – and society – as the Internet has become. Why? 
More data may lead to more accurate analyses. 
More accurate analyses may lead to more confident decision making. And better decisions can mean greater 
operational efficiencies, cost reductions and reduced risk.
The mainstream definition of big data rests on the three V's:
• Volume: Many factors contribute to the increase in data volume: transaction-based data stored through 
the years, unstructured data streaming in from social media, and increasing amounts of sensor and 
machine-to-machine data being collected. 
• Velocity: Data is streaming in at unprecedented speed and must be dealt with in a timely manner. RFID 
tags, sensors and smart metering are driving the need to deal with torrents of data in near-real time. 
Reacting quickly enough to deal with data velocity is a challenge for most organizations. 
• Variety: Data today comes in all types of formats: structured, numeric data in traditional databases; 
information created from line-of-business applications; and unstructured text documents, email, video, 
audio, stock ticker data and financial transactions. Managing, merging and governing different varieties 
of data is something many organizations still grapple with.
Let us consider two additional dimensions when thinking about big data: 
• Variability: In addition to the increasing velocities and varieties of data, data flows can be highly 
inconsistent, with periodic peaks. Is something trending in social media? Daily, seasonal and event-triggered 
peak data loads can be challenging to manage, even more so when unstructured data is involved. 
• Complexity: Today's data comes from multiple sources, and it is still an undertaking to link, match, cleanse 
and transform data across systems. However, it is necessary to connect and correlate relationships, 
hierarchies and multiple data linkages, or your data can quickly spiral out of control.
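
The link, match and cleanse steps mentioned under Complexity can be illustrated with a minimal Python sketch. The two sources (`crm` and `billing`), their field names, and the choice of email as the match key are all hypothetical and are not taken from the slides. Real systems usually need fuzzier matching than this.

```python
def cleanse(record):
    """Cleanse: trim whitespace and lowercase the email used as the match key."""
    return {**record, "email": record["email"].strip().lower()}


def link(crm_rows, billing_rows):
    """Link and match: join two sources on the cleansed email address."""
    # Index the second source by its cleansed key for constant-time lookups.
    billing_by_email = {cleanse(r)["email"]: r for r in billing_rows}
    linked = []
    for row in map(cleanse, crm_rows):
        match = billing_by_email.get(row["email"])
        # Transform: emit one unified record; balance is None when unmatched.
        linked.append({
            "email": row["email"],
            "name": row["name"],
            "balance": match["balance"] if match else None,
        })
    return linked


# Hypothetical sample data from two systems with inconsistent formatting.
crm = [
    {"name": "Ada", "email": " Ada@Example.com "},
    {"name": "Bob", "email": "bob@example.com"},
]
billing = [{"email": "ada@example.com", "balance": 120.0}]

linked = link(crm, billing)
print(linked)
```

Records without a counterpart are kept with an empty balance rather than dropped, so gaps between systems stay visible.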
Why Big Data?
The real issue is not that you are acquiring large amounts of data. It's what you do with the data that counts. The 
hopeful vision is that organizations will be able to take data from any source, harness relevant data and analyse 
it to find answers that enable 
1) cost reductions 
2) time reductions 
3) new product development and optimized offerings 
4) smarter business decision making.
Big Data Ecosystem
A Big Data platform typically works by first storing data in clusters and then processing it through 
MapReduce workflows. A workflow splits the input data into independent chunks, each of which is processed 
by an appropriate algorithm in the Map phase. The output of the Map phase then moves to the Shuffle/Sort 
phase, and the output of the Shuffle phase finally becomes the input to the Reduce phase. 
Typical Big Data MapReduce workflow:
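
As a rough illustration of the Map, Shuffle/Sort and Reduce phases, here is a minimal single-process word count in Python. It is a sketch of the idea only. It does not use Hadoop or any cluster, and the function names and sample chunks are made up for this example.

```python
from collections import defaultdict
from itertools import chain


def map_phase(chunk):
    """Map: emit a (word, 1) pair for every word in one independent chunk."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle_phase(mapped_pairs):
    """Shuffle/Sort: group the values by key and order the keys."""
    groups = defaultdict(list)
    for key, value in mapped_pairs:
        groups[key].append(value)
    return sorted(groups.items())


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def run_job(chunks):
    # Each chunk is mapped independently; on a cluster this step runs in parallel.
    mapped = chain.from_iterable(map_phase(c) for c in chunks)
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped)


# Hypothetical input, already split into chunks.
chunks = ["big data big clusters", "data streams in", "big streams"]
word_counts = run_job(chunks)
print(word_counts)
```

Because each chunk is mapped without reference to the others, the Map step can be spread across many machines. Only the Shuffle/Sort phase needs to bring together values that share a key.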
Big Data users in the Next Five Years
Big Data users in Private Sector 
• Ebay.com uses two data warehouses, at 7.5 PB and 40 PB, as well as a 40 PB Hadoop cluster for search, 
consumer recommendations, and merchandising. 
• Amazon.com handles millions of back-end operations every day, as well as queries from more than half a 
million third-party sellers. The core technology that keeps Amazon running is Linux-based and as of 2005 
they had the world’s three largest Linux databases, with capacities of 7.8 TB, 18.5 TB, and 24.7 TB. 
• Walmart handles more than 1 million customer transactions every hour, which are imported into databases 
estimated to contain more than 2.5 petabytes (2560 terabytes) of data – the equivalent of 167 times the 
information contained in all the books in the US Library of Congress. 
• Facebook handles 50 billion photos from its user base.
• The FICO Falcon Credit Card Fraud Detection System protects 2.1 billion active accounts worldwide. 
• According to estimates, the volume of business data worldwide, across all companies, doubles every 1.2 years. 
• Windermere Real Estate uses anonymous GPS signals from nearly 100 million drivers to help new home 
buyers determine their typical drive times to and from work throughout various times of the day.
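
To put the doubling estimate quoted above in perspective, the short Python sketch below works out what a 1.2-year doubling period implies under steady exponential growth. The steady-growth assumption and the function names are mine, not the slides'.

```python
DOUBLING_PERIOD_YEARS = 1.2  # estimate quoted in the slide


def growth_factor(years, doubling_period=DOUBLING_PERIOD_YEARS):
    """Factor by which volume grows after `years`, assuming a fixed doubling period."""
    return 2 ** (years / doubling_period)


def annual_growth_rate(doubling_period=DOUBLING_PERIOD_YEARS):
    """Year-over-year growth implied by the doubling period."""
    return growth_factor(1, doubling_period) - 1


for years in (1.2, 2.4, 6):
    print(f"after {years} years: x{growth_factor(years):.1f}")
print(f"implied annual growth: {annual_growth_rate():.0%}")
```

Under this assumption, data volume grows by roughly 78% per year and by a factor of about 32 over six years.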
Thank you. 
Queries are welcome 
Praneet Samaiya

Understanding the Basics of Big Data and its Growing Impact
