SKILLWISE-BIG DATA
Big Data Analytics
What is the aim of the course?
Focus is on “Systems” and applications for cloud-based
storage and processing of BIG DATA.
+Big Data - Definition
+Big Data - Analytics
+Big Data - Storage (HDFS)
+Big Data - Computing (Map/Reduce)
+Big Data - Database (HBase)
+Big Data - Graph DB (Titan)
+Big Data - Streaming (Storm)
• Get Convinced about “Big Data”
• Understand why we need a different paradigm.
• Ascertain with confidence the need to look at data computing in a different
way.
• Realize the potential of big data
– All of you are skilled enough to get into it.
• What we will not do
– Research why things have evolved into the current trends.
– Guarantee hands-on work, though we will try.
Introduction to Big Data
What are we going to understand
• What is Big Data?
• Why did we end up here?
• To whom does it matter?
• Where is the money?
• Are we ready to handle it?
• What are the concerns?
• Tools and Technologies
– Is Big Data <=> Hadoop?
Simple to start
• What is the maximum file size you have dealt with so far?
– Movies/Files/Streaming video that you have used?
– What have you observed?
• What is the maximum download speed you get?
• Simple computation
– How much time does it take just to transfer it?
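The "simple computation" above can be sketched as follows. This is a minimal illustration, not part of the original slides: the 1 TB file size and the 100 Mbit/s link speed are assumed example values, and the function ignores protocol overhead and latency.

```python
def transfer_time_seconds(size_bytes: float, bandwidth_bits_per_s: float) -> float:
    """Ideal time to move size_bytes over a link, ignoring overhead and latency."""
    return size_bytes * 8 / bandwidth_bits_per_s


TB = 10**12    # bytes in a terabyte (decimal definition)
MBPS = 10**6   # bits per second in 1 Mbit/s

# Assumed example values: a 1 TB file over a 100 Mbit/s connection.
seconds = transfer_time_seconds(1 * TB, 100 * MBPS)
hours = seconds / 3600
print(f"{hours:.1f} hours")  # prints "22.2 hours"
```

Even before any processing starts, moving a single terabyte over such a link takes most of a day, which is one reason big data systems tend to move the computation to the data rather than the other way round.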
What is big data?
• “Every day, we create 2.5 quintillion bytes of data — so
much that 90% of the data in the world today has been
created in the last two years alone. This data comes
from everywhere: sensors used to gather climate
information, posts to social media sites, digital pictures
and videos, purchase transaction records, and cell
phone GPS signals, to name a few.
This data is ‘big data.’”
Huge amount of data
• There are huge volumes of data in the world:
+ From the beginning of recorded time until 2003, we created 5 billion
gigabytes (5 exabytes) of data.
+ In 2011, the same amount was created every two days.
+ In 2013, the same amount was created every 10 minutes.
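To make these figures comparable, they can be converted into sustained data rates. The sketch below uses only the numbers quoted on the slide (5 exabytes per two days, then per 10 minutes); the function and variable names are illustrative.

```python
EXABYTE = 10**18       # bytes, decimal (SI) definition
VOLUME = 5 * EXABYTE   # the "5 billion gigabytes" quoted on the slide


def rate_bytes_per_second(volume_bytes: int, period_seconds: int) -> float:
    """Average rate needed to produce volume_bytes within period_seconds."""
    return volume_bytes / period_seconds


rate_2011 = rate_bytes_per_second(VOLUME, 2 * 24 * 3600)  # every two days
rate_2013 = rate_bytes_per_second(VOLUME, 10 * 60)        # every 10 minutes

print(f"2011: {rate_2011 / 10**12:.1f} TB/s")
print(f"2013: {rate_2013 / 10**15:.1f} PB/s")
print(f"growth factor: {rate_2013 / rate_2011:.0f}x")  # prints "growth factor: 288x"
```

The 2011 figure works out to 2.5 exabytes per day, which agrees with the "2.5 quintillion bytes" per day quoted on the previous slide.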
Big data spans three dimensions:
Volume, Velocity and Variety
• Volume: Enterprises are awash with ever-growing data of all types, easily amassing
terabytes—even petabytes—of information.
– Turn 12 terabytes of Tweets created each day into improved product sentiment
analysis
– Convert 350 billion annual meter readings to better predict power consumption
• Velocity: Sometimes 2 minutes is too late. For time-sensitive processes such as catching
fraud, big data must be used as it streams into your enterprise in order to maximize its
value.
– Scrutinize 5 million trade events created each day to identify potential fraud
– Analyze 500 million daily call detail records in real-time to predict customer churn
faster
– The latest I have heard is that even a 10-nanosecond delay is too much.
• Variety: Big data is any type of data - structured and unstructured data such as text,
sensor data, audio, video, click streams, log files and more. New insights are found
when analyzing these data types together.
– Monitor hundreds of live video feeds from surveillance cameras to target points of
interest
– Exploit the 80% data growth in images, video and documents to improve customer
satisfaction
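The Velocity point, using data as it streams in rather than after it has been stored, can be illustrated with a sliding-window counter. This is a hypothetical sketch, not a fraud-detection method from the slides: the class name, the 60-second window, the threshold of 3 and the account identifiers are all assumptions, and timestamps are assumed to arrive in non-decreasing order for each key.

```python
from collections import defaultdict, deque


class SlidingWindowCounter:
    """Flags a key when it produces too many events within a time window."""

    def __init__(self, window_seconds: float, threshold: int):
        self.window_seconds = window_seconds
        self.threshold = threshold
        self._events = defaultdict(deque)  # key -> timestamps still in the window

    def observe(self, key: str, timestamp: float) -> bool:
        """Record one event; return True if key now exceeds the threshold."""
        timestamps = self._events[key]
        timestamps.append(timestamp)
        # Drop events that have fallen out of the window.
        while timestamps and timestamps[0] <= timestamp - self.window_seconds:
            timestamps.popleft()
        return len(timestamps) > self.threshold


# Assumed example stream of (account, time-in-seconds) trade events.
events = [("acct-1", 0), ("acct-1", 10), ("acct-1", 20), ("acct-2", 25), ("acct-1", 30)]
detector = SlidingWindowCounter(window_seconds=60, threshold=3)
flagged = [(key, t) for key, t in events if detector.observe(key, t)]
print(flagged)  # prints [('acct-1', 30)]
```

Each event is examined once, on arrival, and only the timestamps inside the current window are kept in memory, so the decision is available immediately instead of after a batch job.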
Finally…
‘Big Data’ is similar to ‘Small Data’, but bigger.
But bigger data requires different approaches:
techniques, tools, architecture
… with the aim of solving new problems,
or old problems in a better way.
To whom does it matter?
• Research Community
• Business Community - New tools, new
capabilities, new infrastructure, new business
models, etc.
• By sector: Financial Services, …
What do the revenues look like?
The Social Layer in an Instrumented Interconnected World
• 2+ billion people on the Web by end of 2011
• 30 billion RFID tags today (1.3B in 2005)
• 4.6 billion camera phones worldwide
• 100s of millions of GPS-enabled devices sold annually
• 76 million smart meters in 2009… 200M by 2014
• 12+ TBs of tweet data every day
• 25+ TBs of log data every day
• ? TBs of data every day
What does Big Data trigger?
BIG DATA is not just HADOOP
• Manage & store huge volume of any data: Hadoop File System, MapReduce
• Manage streaming data: Stream Computing
• Analyze unstructured data: Text Analytics Engine
• Structure and control data: Data Warehousing
• Integrate and govern all data sources: Integration, Data Quality, Security,
Lifecycle Management, MDM
• Understand and navigate federated big data sources: Federated Discovery and Navigation
Types of tools typically used in a Big
Data scenario
• Where is the processing hosted?
– Distributed server/cloud
• Where is the data stored?
– Distributed storage (e.g., Amazon S3)
• What is the programming model?
– Distributed processing (MapReduce)
• How is the data stored and indexed?
– High-performance schema-free database
• What operations are performed on the data?
– Analytic/semantic processing (e.g., RDF/OWL)
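The MapReduce programming model named above can be sketched in a single process. This is a toy illustration of the map, shuffle and reduce phases using a word count; it is not Hadoop code, the function names are illustrative, and a real framework would run the phases in parallel across many machines.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document: str):
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: list):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    return dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())


counts = word_count(["big data is big", "data is everywhere"])
print(counts)  # prints {'big': 2, 'data': 2, 'is': 2, 'everywhere': 1}
```

Because each call to `map_phase` sees only one document and each call to `reduce_phase` sees only one key, both phases can be distributed without the workers needing to coordinate with each other.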
When is dealing with Big Data hard?
• When the operations on data are complex:
– e.g., simple counting is not a complex problem.
– Modeling and reasoning with data of different kinds can get
extremely complex
• Good news with big-data:
– Often, because of the vast amount of data, modeling
techniques can get simpler (e.g., smart counting can
replace complex model-based analytics)…
– …as long as we deal with the scale.
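As one illustration of "smart counting" standing in for a more complex model, the sketch below suggests the next word purely from counts of which word followed which in a corpus. The tiny corpus and the function names are assumptions made for the example; with so little data the suggestions are crude, and the idea only becomes useful at scale.

```python
from collections import Counter, defaultdict


def train_bigram_counts(sentences):
    """Count, for every word, which words follow it and how often."""
    following = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            following[current][nxt] += 1
    return following


def suggest_next(following, word):
    """Return the most frequent follower of word, or None if it was never seen."""
    counts = following.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Assumed toy corpus; a real system would count over billions of sentences.
corpus = ["big data needs new tools", "big data needs scale", "big data is big"]
model = train_bigram_counts(corpus)
print(suggest_next(model, "data"))   # prints "needs"
print(suggest_next(model, "tools"))  # prints "None"
```

No grammar or language model is involved, only counting, which is exactly the kind of operation that distributes well across a cluster.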
Time for thinking
• What do you do with the data?
– Let's take an example:
• “From application developers to video streamers, organizations of all
sizes face the challenge of capturing, searching, analyzing, and
leveraging as much as terabytes of data per second—too much for the
constraints of traditional system capabilities and database
management tools.”
Why Big-Data?
• Key enablers for the appearance and growth
of ‘Big-Data’ are:
+Increase in storage capabilities
+Increase in processing power
+Availability of data