
02 a holistic approach to big data

Raul F. Chong
Senior Big Data and Cloud Program Manager
Big Data University Community Leader
@raulchong
A holistic approach to Big Data
© 2013 BigDataUniversity.com
Agenda
• Introduction to Big Data
• The state of Big Data adoption
• Big Data – A holistic approach
• The 5 high value Big Data use cases
• Technical details of key Big Data components
• The future of Big Data and Cloud
• Demos
• Resources
What is Big Data?
Big data are datasets that grow so large
that they become awkward to work with
using on-hand database management tools.
Difficulties include capture, storage, search,
sharing, analytics, and visualizing.
Source: Wikipedia
Big Data Characteristics
Information is growing at a phenomenal rate: 44x as much data and content over the coming decade
• 2009: 800,000 petabytes
• 2020: 35 zettabytes = 4 trillion 8GB iPods
Source: IDC, The Digital Universe Decade – Are You Ready?, May 2010
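The figures on this slide can be checked with a little arithmetic. The sketch below is only an illustration and assumes decimal (SI) units (1 petabyte = 10^15 bytes, 1 zettabyte = 10^21 bytes, 1 GB = 10^9 bytes); the slide does not state which convention was used, and the variable names are made up for this example.

```python
# Assumption: decimal (SI) units; the slide does not say which convention applies.
PETABYTE = 10**15
ZETTABYTE = 10**21
GIGABYTE = 10**9

data_2009 = 800_000 * PETABYTE  # equivalent to 0.8 zettabytes
data_2020 = 35 * ZETTABYTE

# Ratio of the 2020 projection to the 2009 volume.
growth_factor = data_2020 / data_2009

# Number of 8 GB devices needed to hold the 2020 projection.
ipods_8gb = data_2020 / (8 * GIGABYTE)

print(f"Growth 2009 -> 2020: {growth_factor:.2f}x")
print(f"8GB iPods for the 2020 volume: {ipods_8gb:.3e}")
```

Under these assumptions the ratio comes out at 43.75, which rounds to the 44x quoted, and the device count is about 4.4 trillion, consistent with the "4 trillion" on the slide.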
Big Data Characteristics
• About 80% of the world’s data is unstructured
• It may be data we have been collecting all along but could not process

02 a holistic approach to big data

  • 1. Raul F. Chong Senior Big Data and Cloud Program Manager Big Data University Community Leader @raulchong A holistic approach to Big Data © 2013 BigDataUniversity.com
  • 2. Agenda  Introduction to Big Data  The state of Big Data adoption  Big Data – A holistic approach  The 5 high value Big Data use cases  Technical details of key Big Data components  The future of Big Data and Cloud  Demos  Resources
  • 4. What is Big Data? Big data are datasets that grow so large that they become awkward to work with using on-hand database management tools. Difficulties include capture, storage, search, sharing, analytics, and visualizing. Source: Wikipedia
  • 5. Big Data Characteristics Information is growing at a phenomenal rate: 44x as much data and content over the coming decade. 2009: 800,000 petabytes. 2020: 35 zettabytes (= 4 trillion 8 GB iPods). Source: IDC, The Digital Universe Decade – Are You Ready?, May 2010
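The figures on this slide can be checked with a few lines of arithmetic. The sketch below assumes decimal (SI) units, which the slide does not state; the variable names are purely illustrative.

```python
# Assumption: decimal (SI) units. The slide does not say which convention IDC used.
PETABYTE = 10**15
ZETTABYTE = 10**21
GIGABYTE = 10**9

data_2009 = 800_000 * PETABYTE   # 0.8 zettabytes
data_2020 = 35 * ZETTABYTE

# Ratio of the 2020 projection to the 2009 figure.
growth_factor = data_2020 / data_2009

# Number of 8 GB devices needed to hold the 2020 projection.
ipods_8gb = data_2020 / (8 * GIGABYTE)

print(f"growth factor: {growth_factor:.2f}x")
print(f"8 GB iPods:    {ipods_8gb:.3e}")
```

Under these assumptions the ratio comes out just under 44 and the device count a little above 4 trillion, consistent with the rounded values quoted on the slide.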
  • 6. Big Data Characteristics • About 80% of the world’s data is unstructured • It may be data we have been collecting all along but could not process
  • 7. Types of Big Data • Data in movement - streams • Twitter / Facebook comments • Stock market data • Sensors: Vital signs of a newly-born • Data at rest - oceans • Collection of what has streamed • Web logs, emails, social media • Unstructured documents: forms, claims • Structured data from disparate systems
  • 8. Traditional vs. big data business approaches. Traditional Approach (Structured & Repeatable Analysis): Business users determine what question to ask; IT structures the data to answer that question. Examples: monthly sales reports, profitability analysis, customer surveys. Big Data Approach (Iterative & Exploratory Analysis): IT delivers a platform to enable creative discovery; business explores what questions could be asked. Examples: brand sentiment, product strategy, maximum asset utilization.
  • 9. Applications for Big Data Analytics: Homeland Security, Finance, Smarter Healthcare, Multi-channel Sales, Telecom, Manufacturing, Traffic Control, Trading Analytics, Fraud and Risk, Log Analysis, Search Quality, Retail: Churn, NBO
  • 10. Agenda  The state of Big Data adoption  Big Data – A holistic approach  The 5 high value Big Data use cases  Technical details of key Big Data components  The future of Big Data and Cloud  Demos  Resources
  • 12. Use of Big Data globally and in the financial sector Multiple responses accepted
  • 13. Big Data: In Demand Well Paying Skill Skills are in Demand Pays well “If you can claim to be a data scientist and have the chops to back that up, you can pretty much write your own ticket even in this tough job market.” Source: Gigaom http://gigaom.com/cloud/big-data-skills-bring-big-dough/
  • 14. Agenda  The state of Big Data adoption  Big Data – A holistic approach  The 5 high value Big Data use cases  Technical details of key Big Data components  The future of Big Data and Cloud  Demos  Resources
  • 15. KTH Swedish Royal Institute of Technology Reducing Traffic Congestion • Deployed real-time Smarter Traffic system to predict and improve traffic flow. • Analyzes streaming real-time data gathered from cameras at entry/exit to city, GPS data from taxis and trucks, and weather information. • Predicts best time and method to travel, such as when to leave to catch a flight at the airport Results • Enables ability to analyze and predict traffic faster and more accurately than ever before • Provides new insight into mechanisms that affect a complex traffic system • Smarter, more efficient, and more environmentally friendly traffic
  • 16. Benefits  Real-time display of public sentiment as candidates respond to questions  Debate winner prediction based on public opinion instead of solely political analysts University of Southern California Innovation Lab Monitors Political Debates
  • 17. Big Data – A holistic approach Big Data is Not Only Hadoop!  Examples where Hadoop is not entirely applicable: – Cyber security, Stock market, Traffic control, Sensor information, monitoring trends in Social Media – What if your company has many silos of information, difficult to move to HDFS? – What about governance? Can we trust the source of this data?
  • 18. Solutions Big Data Platform Analytics and Decision Management Big Data Infrastructure Big data holistic approach: A platform
  • 19. Solutions Big Data Platform Analytics and Decision Management Big Data Infrastructure The IBM Big Data Platform Data Warehouse: Delivers deep insight with advanced in-database analytics & operational analytics Big data holistic approach: A platform
  • 20. Solutions Big Data Platform Analytics and Decision Management Big Data Infrastructure Data Warehouse Stream Computing: Analyze streaming data and large data bursts for real-time insights Big data holistic approach: A platform
  • 21. Solutions Big Data Platform Analytics and Decision Management Big Data Infrastructure The IBM Big Data Platform Hadoop System Stream Computing Data Warehouse Cost-effectively analyze Petabytes of unstructured and structured data Hadoop System Big data holistic approach: A platform
  • 22. Solutions Big Data Platform Analytics and Decision Management Big Data Infrastructure Information Integration & Governance Hadoop System Stream Computing Data Warehouse Govern data quality and manage the information lifecycle Information Integration & Governance Big data holistic approach: A platform
  • 23. Solutions Big Data Platform Analytics and Decision Management Big Data Infrastructure Accelerators Information Integration & Governance Hadoop System Stream Computing Data Warehouse Speed time to value with analytic and application accelerators Accelerators Big data holistic approach: A platform
  • 24. Solutions Big Data Platform Analytics and Decision Management Big Data Infrastructure Accelerators Information Integration & Governance Hadoop System Stream Computing Data Warehouse Systems Management Application Development Visualization & Discovery The IBM Big Data Platform Discover, understand, search, and navigate federated sources of big data Visualization & Discovery Big data holistic approach: A platform
  • 25.  Process any type of data – Structured, unstructured, in-motion, at-rest, in-place  Built-for-purpose engines – Designed to handle different requirements  Manage and govern data in the ecosystem  Enterprise data integration  Grow and evolve on current infrastructure  The whole is greater than the sum of parts  Integrated components  Out of the box, standards-based services  Start small (value is additive) Solutions Big Data Platform Analytics and Decision Management Big Data Infrastructure Accelerators Information Integration & Governance Hadoop System Stream Computing Data Warehouse Systems Management Application Development Visualization & Discovery Big data holistic approach: A platform
  • 26. ETL, MDM, Data Governance Metadata and Governance Zone Warehousing Zone Enterprise Warehouse Data Marts Ingestion and Real-time Analytic Zone Streams Connectors BI & Reporting Predictive Analytics Analytics and Reporting Zone Visualization & Discovery Landing and Analytics Sandbox Zone Hive/HBase Col Stores Documents in variety of formats MapReduce Hadoop An example of the big data platform in practice
  • 27. Agenda  The state of Big Data adoption  Big Data – A holistic approach  The 5 high value Big Data use cases  Technical details of key Big Data components  The future of Big Data and Cloud  Demos  Resources
  • 28. Big Data Exploration Find, visualize, understand all big data to improve business knowledge Enhanced 360º View of the Customer Achieve a true unified view, incorporating internal and external sources Security/Intelligence Extension Lower risk, detect fraud and monitor cyber security in real-time Data Warehouse Augmentation Integrate big data and data warehouse capabilities to increase operational efficiency Operations Analysis Analyze a variety of machine data for improved business results The 5 High Value Big Data Use Cases
  • 29. Find, visualize and understand all big data to improve business knowledge • Greater efficiencies in business processes • New insights from combining and analyzing data types in new ways • Develop new business models with resulting increased market presence and revenue CM, RM, DM RDBMS Feeds Web 2.0 Email Web CRM, ERP File Systems Connector Framework App Builder Hadoop Integration & Governance UI / User Streams Big Data Exploration: Illustrated Warehouse Data Explorer
  • 30. Big Data Exploration: Example in Practice • Exploring 4 TB to drive point business solutions (supplier portal, call center, etc.) • Single-point of data fusion for all employees to use • Reduced costs & improved operational performance for the business  How do you enable employees to navigate and explore enterprise and external content? Can you present this in a single user interface?  How do you identify areas of data risk before they become a problem?  What is the starting point for your big data initiatives? Is Big Data Exploration Right for You?  How do you separate the “noise” from useful content?  How do you perform data exploration on large and complex data?  How do you find insights in new or unstructured data types (e.g. social media and email)? Airplane Manufacturer Blinded for confidentiality Big Data Platform Component Starting Point: Data Explorer
  • 31. Enhanced 360º View of the Customer: Illustrated CRM J Robertson Pittsburgh, PA 15213 35 West 15th Name: Address: Address: ERP Janet Robertson Pittsburgh, PA 15213 35 West 15th St. Name: Address: Address: Legacy Jan Robertson Pittsburgh, PA 15213 36 West 15th St. Name: Address: Address: SOURCE SYSTEMS Janet 35 West 15th St Pittsburgh Robertson PA / 15213 F 48 1/4/64 First: Last: Address: City: State/Zip: Gender: Age: DOB: 360 View of Party Identity Master Data Management Unified View of Party’s Information Hadoop Streams Warehouse
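The three source records on this slide can be turned into a toy matching-and-merging exercise. The sketch below is only an illustration of the idea, not how a Master Data Management product works: the matching key (first initial, last name, ZIP), the street normalization, and the "longest first name / most common street" survivorship rules are all assumptions made for this example.

```python
from collections import Counter, defaultdict

# The three source-system records shown on the slide.
source_records = [
    {"system": "CRM", "name": "J Robertson", "street": "35 West 15th",
     "city": "Pittsburgh", "state": "PA", "zip": "15213"},
    {"system": "ERP", "name": "Janet Robertson", "street": "35 West 15th St.",
     "city": "Pittsburgh", "state": "PA", "zip": "15213"},
    {"system": "Legacy", "name": "Jan Robertson", "street": "36 West 15th St.",
     "city": "Pittsburgh", "state": "PA", "zip": "15213"},
]


def match_key(record):
    """Hypothetical matching key: first initial, last name, ZIP code."""
    first, last = record["name"].split()
    return (first[0].upper(), last.upper(), record["zip"])


def normalize_street(street):
    """Strip a trailing 'St' / 'St.' so spelling variants compare equal."""
    street = street.rstrip(".")
    if street.endswith(" St"):
        street = street[: -len(" St")]
    return street


def merge(group):
    """Hypothetical survivorship rules: longest first name, most common street."""
    first_names = [r["name"].split()[0] for r in group]
    streets = Counter(normalize_street(r["street"]) for r in group)
    return {
        "first": max(first_names, key=len),
        "last": group[0]["name"].split()[1],
        "street": streets.most_common(1)[0][0],
        "city": group[0]["city"],
        "state": group[0]["state"],
        "zip": group[0]["zip"],
        "sources": [r["system"] for r in group],
    }


# Group records that share a matching key, then merge each group.
groups = defaultdict(list)
for record in source_records:
    groups[match_key(record)].append(record)

golden_records = [merge(group) for group in groups.values()]
print(golden_records)
```

A production system would use probabilistic or rule-based matching across many more attributes; the point here is only that the "36 West 15th St." outlier is outvoted and the fullest name survives.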
  • 32. Logs Events Alerts Configuration information System audit trails External threat intelligence feeds Network flows and anomalies Identity context Web page text Video/audio surveillance E-mail and social activity Business process data Customer transactions Traditional Security Operations and Technology Big Data Analytics New Considerations Collection, Storage and Processing Collection and integration Size and speed Enrichment and correlation Analytics and Workflow Visualization Unstructured analysis Learning and prediction Customization Sharing and export Security/Intelligence Extension: Illustrated
  • 33. “Reconstructing Events” – Integrating Multimedia from Diverse Sources • Correlate multimedia content across a wide diversity of sources and dynamic topology of cameras • Exploit partial overlaps in field of view, re-identification of objects/people and contextual information • Obtain real-time operational picture across diverse content • 100K security cameras (static cameras, slowly changing topology) • 10M mobile photos/day (limited knowledge about locations) • 50M social media photos/video (uncertain geo-temporal context) • Moving vehicles (patrol cars), overhead drones, broadcast, retail, 311, etc. Overhead, Social Media, Mobile Cameras, Security Cameras
  • 34. Security/Intelligence Extension: Customer Example  What are your plans to enrich your security or intel system with unused or underleveraged data sources (video, audio, smart devices, network, Telco, social media)?  How will you address the need for sub-second detection, identification, and resolution of physical or cyber threats?  How do you intend to follow activities of criminals, terrorists, or persons in a blacklist?  How do you plan to enhance your surveillance system with real-time data from video, acoustic, thermal or other security sensors?  Do you want to correlate lots of technical or human intel data and sources looking for associations or patterns (big data forensics)?  How are you going to deal with unstructured data (email, social, etc.) in your Security Information & Event Management (SIEM) solution to improve cyber threat detection & remediation? Would the Security / Intelligence Extension benefit you? Captured and analyzed 42TB of daily traffic in real-time for tracking persons of interest to take suitable action and reduce risk. Big Data Platform Component Starting Point: Streams, Hadoop
  • 35. Raw Logs and Machine Data Indexing, Search Statistical Modeling Root Cause Analysis Federated Navigation & Discovery Real-time Analysis Only store what is needed Operations Analysis: Illustrated Machine Data Accelerator
  • 36. Operations analysis is a Business Imperative Cost of System Down Time – 49% of Fortune 500 companies > 80 hrs down time/year [1] • Cost of down time: $90,000/hr to $6.48 million/hr • 80 hours * $6.48M = approx $500M per year – System downtime costs North American businesses $26.5 billion a year in lost revenue [2] [1] http://www.information-management.com/infodirect/2009_133/downtime_cost-10015855-1.html [2] http://www.itchannelplanet.com/business_news/article.php/3916786/IT-System-Downtime-Costs-265-Billion-A-Year-Study-Finds.htm
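The annual cost range implied by these figures can be worked out directly. This is a small sketch using only the numbers quoted on the slide; the constant names are illustrative.

```python
# Figures quoted on the slide.
HOURS_DOWN_PER_YEAR = 80
COST_PER_HOUR_LOW = 90_000        # $90,000/hr
COST_PER_HOUR_HIGH = 6_480_000    # $6.48 million/hr

# Annual cost at each end of the hourly range.
annual_low = HOURS_DOWN_PER_YEAR * COST_PER_HOUR_LOW
annual_high = HOURS_DOWN_PER_YEAR * COST_PER_HOUR_HIGH

print(f"low end:  ${annual_low:,}")
print(f"high end: ${annual_high:,}")
```

The high end works out to $518.4 million, which the slide rounds to roughly $500M per year; the low end is $7.2 million.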
  • 37. Operations Analysis: Customer Example • Intelligent Infrastructure Management: log analytics, energy bill forecasting, energy consumption optimization, anomalous energy usage detection, presence-aware energy management • Optimized building energy consumption with centralized monitoring; Automated preventive and corrective maintenance • Utilized InfoSphere Streams, InfoSphere BigInsights, IBM Cognos  Do you deal with large volumes of machine data?  How do you access and search that data?  How do you perform root cause analysis?  How do you perform complex real-time analysis to correlate across different data sets?  How do you monitor and visualize streaming data in real time and generate alerts? Would Operations Analysis benefit you? Big Data Platform Component Starting Point: Hadoop, Streams
  • 38. Integrate big data and data warehouse capabilities to increase operational efficiency Data Warehouse Augmentation: Needs Need to leverage variety of data Extend warehouse infrastructure • Optimized storage, maintenance and licensing costs by migrating rarely used data to Hadoop • Reduced storage costs through smart processing of streaming data • Improved warehouse performance by determining what data to feed into it • Structured, unstructured, and streaming data sources required for deep analysis • Low latency requirements (hours—not weeks or months) • Required query access to data
  • 39. Filter and summarize big data for the warehouse Hadoop Data Warehouse Augmentation: Illustrated
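As a rough illustration of "filter and summarize", the sketch below reduces raw events to one row per day and product before anything is loaded into a warehouse. In practice this step would run as a Hadoop job (for example MapReduce or Hive); plain Python stands in here, and the event fields and sample data are invented for the example.

```python
from collections import defaultdict

# Hypothetical raw events, standing in for data landed in Hadoop.
raw_events = [
    {"date": "2013-05-01", "product": "A", "action": "view", "amount": 0.0},
    {"date": "2013-05-01", "product": "A", "action": "purchase", "amount": 20.0},
    {"date": "2013-05-01", "product": "A", "action": "purchase", "amount": 5.0},
    {"date": "2013-05-01", "product": "B", "action": "view", "amount": 0.0},
    {"date": "2013-05-02", "product": "B", "action": "purchase", "amount": 12.5},
]


def summarize_purchases(events):
    """Filter to purchases, then aggregate per (date, product)."""
    totals = defaultdict(lambda: {"orders": 0, "revenue": 0.0})
    for event in events:
        if event["action"] != "purchase":   # filter step: drop non-purchases
            continue
        key = (event["date"], event["product"])
        totals[key]["orders"] += 1          # summarize step: count and sum
        totals[key]["revenue"] += event["amount"]
    return dict(totals)


warehouse_rows = summarize_purchases(raw_events)
for (date, product), row in sorted(warehouse_rows.items()):
    print(date, product, row["orders"], row["revenue"])
```

Only the two summary rows would be fed to the warehouse; the five raw events stay in the cheaper store.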
  • 40. Hadoop as a query-ready archive for a data warehouse Hadoop Data Warehouse Augmentation: Illustrated
  • 41. Agenda  The state of Big Data adoption  Big Data – A holistic approach  The 5 high value Big Data use cases  Technical details of key Big Data components  The future of Big Data and Cloud  Demos  Resources
  • 42. Open Source Hadoop Visualization & Discovery Connectors Workload Optimization Flume Runtime Advanced Engines File System MapReduce HDFS Data Store HBase Development Tools Eclipse Plug-ins Systems Management Jaql Pig ZooKeeper Lucene Oozie Hive Open Source Mahout Whirr Sqoop Hue H Catalog R
  • 43. Visualization & Discovery Integration Workload Optimization Streams Netezza Flume DB2 DataStage IBM InfoSphere BigInsights v2.1 Enterprise Edition Runtime Advanced Analytic Engines File System MapReduce HDFS Data Store HBase Text Processing Engine & Extractor Library BigSheets JDBC Applications & Development Text Analytics Administration Index Splittable Text Compression Enhanced Security Flexible Scheduler Jaql Pig ZooKeeper Lucene Oozie Adaptive MapReduce Hive Integrated Installer Admin Console Sqoop Adaptive Algorithms Dashboard & Visualization Apps Workflow Monitoring Management Security Audit & History Lineage R Guardium Platform Computing Cognos GPFS IBM Open Source High Availability Big SQL H Catalog Whirr Mahout Hue Added Value on Top of Open Source Hadoop
  • 44. InfoSphere BigInsights Added Value InfoSphere BigInsights Administration & Security Workload Optimization (MapReduce/SQL) Connectors Development Tools IBM tested & supported open source components Accelerators Open source based components Workload Management Security Development Environment Analytics/Extractors Analytics Extraction engine (System T) Visualization & Exploration Extractors and APIs SQL API
  • 45. InfoSphere BigInsights Added Value: Accelerators Data Ingest and Prep Extract Buzz, Intent , Sentiment Entity Analytics: Profile Resolution Real time analytics. Pre-defined views and charts Dashboard Stream Computing and Analytics BigInsights System and Analytics Online flow: Data-in-motion analysis Offline flow: Data-at-rest analysis Pre-defined Workbooks and Dashboards Social Media Data Extract Buzz, Intent , Sentiment And Consumer Profiles Entity Analytics and Integration Comprehensive Social Media Customer Profiles Social Media Optional: Indexed Search Index using Push API Data Explorer Ad hoc access Social Data Analytics Accelerator Architecture
  • 46. InfoSphere BigInsights Added Value: BigSheets InfoSphere BigInsights Administration & Security Workload Optimization (MapReduce/SQL) Connectors Development Tools IBM tested & supported open source components Accelerators Open source based components Workload Management Security Development Environment Analytics/Extractors Analytics Extraction engine (System T) Visualization & Exploration Extractors and APIs SQL API BigSheets Visualization and Exploration • Web-based analysis and visualization for Users • Familiar spreadsheet-like interface • Define and manage long running data collection jobs
  • 47. InfoSphere BigInsights Added Value: BigSheets No programming knowledge needed! How it works  Model “big data” collected from various sources as collections  Filter and enrich content with built-in functions  Combine data in different collections  Visualize results through spreadsheets, charts  Export data into common formats (if desired)
  • 48. InfoSphere BigInsights Added Value: Dev Tools InfoSphere BigInsights Administration & Security Workload Optimization (MapReduce/SQL) Connectors Development Tools IBM tested & supported open source components Accelerators Open source based components Workload Management Security Development Environment Analytics/Extractors Analytics Extraction engine (System T) Visualization & Exploration Extractors and APIs SQL API Development Environment • Eclipse based dev environment • Developer tools and a set of analytic extractors for fast adoption and reduction in coding and debugging time • Plugin for Text Analytics, MapReduce programming, Jaql development, Hive query development, …. and more
  • 49. InfoSphere BigInsights Added Value: Dev Tools How it works • Built-in Apps make it easy to run Big Data applications & tasks:  Import and Export Data from a Database or files  Import and Export Web and Social Data  Perform Text Analytics on specified content  Query HBase Content  Query content stored in BigInsights using Big SQL  Execute Pig or Jaql applications • Extensible! Build your own applications and make them easy to execute from an appealing Application launcher © 2013 IBM Corporation
  • 50. InfoSphere BigInsights Added Value: Dev Tools
  • 51. InfoSphere BigInsights Added Value: Text Analytics Advanced Text Analytics Engine Automatically identify and understand key information in text Football World Cup 2010, one team distinguished themselves well, losing to the eventual champions 1-0 in the Final. Early in the second half, Netherlands’ striker, Arjen Robben, had a breakaway, but the keeper for Spain, Iker Casillas made the save. Winger Andres Iniesta scored for Spain for the win. InfoSphere BigInsights Administration & Security Workload Optimization Connectors Advanced Engines Visualization & Exploration Development Tools Open source Hadoop components
  • 53. Architecture Diagram AQL: Rule language with familiar SQL-like syntax; specify annotator semantics declaratively. Text Analytics Optimizer: Choose an efficient execution plan that implements the semantics. Compiled Operator Graph (.aog). Text Analytics Runtime: Highly scalable, embeddable Java runtime. Input Document Stream → Annotated Document Stream
  • 54. InfoSphere BigInsights – Added Value: Connectors Connectors • Databases • DB2, Netezza, Oracle, Teradata Integrations • InfoSphere DataStage (data collection and integration) • InfoSphere Streams (real-time streams processing) • InfoSphere Guardium (security and monitoring) • Cognos Business Intelligence (Business Intelligence capabilities) • IBM Platform Computing (cluster/grid infrastructure and management) and more… InfoSphere BigInsights Administration & Security Workload Optimization Connectors Advanced Engines Visualization & Exploration Development Tools Open source Hadoop components
  • 55. BigInsights – Added Value: Workload optimization Task Map Adaptive Map Reduce Hadoop System Scheduler • Identifies small and large jobs from prior experience • Sequences work to reduce overhead Adaptive MapReduce • Drop-in replacement for Hadoop batch scheduler • Dramatic performance gains for latency-sensitive application workloads • Agile scheduling, dynamically adjust priorities at run-time InfoSphere BigInsights Administration & Security Workload Optimization (MapReduce/SQL) Connectors Development Tools IBM tested & supported open source components Accelerators Open source based components Workload Management Security Development Environment Analytics/Extractors Analytics Analytics Extraction Engine Visualization & Exploration Extractors and APIs SQL API
  • 56. BigInsights – Added Value: Web Console Web Console • Start / stop services • Run / monitor jobs (applications) • Explore / modify file system • Built in Apps simplify common tasks InfoSphere BigInsights Administration & Security Workload Optimization Connectors Advanced Engines Visualization & Exploration Development Tools Open source Hadoop components
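To give a feel for what "identifying key information in text" means, here is a minimal sketch that pulls teams and a goal event out of part of the sample passage. It uses plain Python regular expressions and a hand-written team dictionary; it is not AQL and does not represent the BigInsights text analytics engine. The pattern and dictionary are assumptions made for this example.

```python
import re

# Part of the sample passage from the slide (straight apostrophe used for simplicity).
text = (
    "Early in the second half, Netherlands' striker, Arjen Robben, had a breakaway, "
    "but the keeper for Spain, Iker Casillas made the save. "
    "Winger Andres Iniesta scored for Spain for the win."
)

# Hypothetical dictionary of known team names.
TEAMS = {"Netherlands", "Spain"}

# Hypothetical rule: "<First Last> scored for <Team>".
GOAL_PATTERN = re.compile(r"([A-Z][a-z]+ [A-Z][a-z]+) scored for ([A-Z][a-z]+)")


def extract_teams(document):
    """Return the dictionary teams mentioned in the document, sorted."""
    words = re.findall(r"[A-Z][a-z]+", document)
    return sorted({word for word in words if word in TEAMS})


def extract_goals(document):
    """Return goal events whose team is in the dictionary."""
    return [
        {"player": player, "team": team}
        for player, team in GOAL_PATTERN.findall(document)
        if team in TEAMS
    ]


teams = extract_teams(text)
goals = extract_goals(text)
print(teams)
print(goals)
```

A rule language such as AQL expresses this kind of dictionary and pattern logic declaratively and leaves the execution plan to an optimizer, whereas the regex version above hard-codes how the text is scanned.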
  • 57. BigInsights – Added Value: Security • LDAP authentication • Support for PAM & flat file configuration • Administrators restrict access to authorized users • HTTPS support for the InfoSphere BigInsights console, and reverse proxy • Role-based access (Diagram: InfoSphere BigInsights component stack.)
  • 58. How Streams Works • Continuous ingestion • Continuous analysis • Achieve scale: by partitioning applications into software components; by distributing across stream-connected hardware hosts • Infrastructure provides services for scheduling analytics across hardware hosts and establishing streaming connectivity • Operators (diagram): Transform, Filter / Sample, Classify, Correlate, Annotate • Where appropriate, elements can be fused together for lower communication latency
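    The operator pipeline on slide 58 can be sketched with plain Python generators. This is only an analogy for continuous ingestion and operator fusion, not the InfoSphere Streams API: the operator functions, the `fuse` helper, and the temperature readings are hypothetical names and data chosen for illustration.

    ```python
    def sensor_readings():
        """Stand-in for a continuous source; yields one tuple at a time."""
        for i, fahrenheit in enumerate([68.0, 95.0, 104.0, 50.0]):
            yield {"id": i, "fahrenheit": fahrenheit}

    def transform(stream):
        """Transform: add a Celsius field to every tuple."""
        for t in stream:
            yield {**t, "celsius": (t["fahrenheit"] - 32) * 5 / 9}

    def filter_hot(stream):
        """Filter: keep only tuples at or above 30 degrees Celsius."""
        for t in stream:
            if t["celsius"] >= 30:
                yield t

    def annotate(stream):
        """Annotate: attach a label to every tuple that passes through."""
        for t in stream:
            yield {**t, "label": "hot"}

    def fuse(*operators):
        """Chain operators into one in-process unit, loosely mirroring fusion."""
        def fused(stream):
            for op in operators:
                stream = op(stream)
            return stream
        return fused

    pipeline = fuse(transform, filter_hot, annotate)
    results = list(pipeline(sensor_readings()))
    print([r["id"] for r in results])
    ```

    Each tuple flows through all three operators before the next one is read, so nothing is buffered between stages. In a distributed deployment the same operators could instead sit on different hosts, trading that low latency for scale.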
  • 59. Agenda • The state of Big Data adoption • Big Data – A holistic approach • The 5 high value Big Data use cases • Technical details of key Big Data components • The future of Big Data and Cloud • Demos • Resources
  • 60. The Future of Big Data and Cloud • SQL for Hadoop support improvements, moving towards full ANSI support: Hive, Impala (Cloudera), Big SQL (IBM), Stinger (Hortonworks), Drill (MapR), HAWQ (Pivotal), SQL-H (Teradata) • Improvements in multimedia analytics • Growth in usage and adoption of the R programming language • Cloud: bare metal support helping with Hadoop workloads; private network; full support with APIs
  • 61. Big SQL overview. Big SQL fully integrates with SQL applications and BI tooling, with benefits including: • Existing queries run with no or few modifications • Existing JDBC and ODBC compliant tools can be leveraged • Applications do not have to compensate for constraints of Hive QL, which may result in more statements and potentially moving more data over the network to the application (Diagram: Application → SQL Language → JDBC / ODBC Driver → JDBC / ODBC Server → Big SQL Engine (BigInsights) → Data Sources: Hive tables, HBase tables, CSV files.) Try it out! Big SQL 3.0 Technology Preview: bigsql.imdemocloud.com
  • 62. Agenda • The state of Big Data adoption • Big Data – A holistic approach • The 5 high value Big Data use cases • Technical details of key Big Data components • The future of Big Data and Cloud • Demos • Resources
  • 63. BigInsights on the Cloud - Making Learning Hadoop Easy and Fun • M2M Demos (using Streams): The Connected Car Demo – http://ausgsa.ibm.com/projects/c/connected_car/index.html – http://m2m.demos.ibm.com/ • YouTube IBM Big Data Channel – http://www.youtube.com/user/ibmbigdata • Big Data University (bigdatauniversity.com)
  • 64. Agenda • The state of Big Data adoption • Big Data – A holistic approach • The 5 high value Big Data use cases • Technical details of key Big Data components • The future of Big Data and Cloud • Demos • Resources
  • 65. Big Data University (bigdatauniversity.com) • Flexible on-line delivery allows learning @your place and @your pace • Free courses, free study materials • Cloud-based sandbox for exercises: zero setup, with robust Course Management System and Content Distribution infrastructure • 169,000 registered students • Free IBM Hadoop, BigInsights publications
  • 66. BigInsights on the Cloud - Making Learning Hadoop Easy and Fun • Quick Start Editions available (free, non-production, no time bomb): – IBM InfoSphere BigInsights (IBM’s Hadoop Distribution) ibm.co/QuickStart – IBM InfoSphere Streams ibm.co/streamsqs • Big Data University (bigdatauniversity.com)
  • 67. My contact information. Twitter: @raulchong Facebook: facebook.com/raul.f.chong LinkedIn: linkedin.com/pub/raul-f-chong/8/aa2/b63
  • 68. Thank You! © 2013 BigDataUniversity.com