BIG DATA SCIENCE 
Chandan Rajah [ @ChandanRajah ] 
“The price of light is far less than the cost of darkness”
BENEFITS OF BIG DATA
COST, SPEED, AGILITY, CAPABILITY
Steps to the EPIPHANY
WHAT, WHY, WHERE, DEMO
What is Big Data ? 
Big Data ≠ Data Volume 
Big Data = Crude Oil 
Think of data like ‘Crude Oil’ 
Big Data is about extracting ‘crude oil’; transporting it in ‘pipelines’; storing it in ‘mega tanks’
What is Data Science ? 
Data Science ≠ Statistical Analysis 
Data Science = Oil Refinery 
Data Science is about ‘treating’ data; applying ‘science’ to the data; refining the ‘results’; and combining them to form ‘insight’
Knowns, Unknowns & DIKUW FTW! 
known knowns: we know we know
known unknowns: we know we don’t know
unknown unknowns: we don’t know we don’t know

PAST → FUTURE
D, DATA: raw; numbers, letters, symbols
I, INFORMATION: what; description, context, relationship
K, KNOWLEDGE: how to; experience, tested, instruction
U, UNDERSTANDING: why; cause & effect, proven
W, WISDOM: when; prediction, what’s best

Data Engineer, Data Analyst, Data Miner, Data Scientist
known knowns, known unknowns, unknown unknowns
signals, reports, programs, models
Data Analytics to Data Discovery ? 
data you know 
data you don’t know 
questions you’re asking 
questions you’re not asking 
Data Analyst 
Data Scientist 
Data Analytics
Data Discovery
DATA MODELLING
Y ← F( X, random noise, parameters )
ALGORITHMIC MODELLING
Y ← [ BLACK BOX ] ← X
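The contrast between the two modelling approaches can be sketched in a few lines of Python. This is an illustration only and is not taken from the slides: the toy data, `fit_line` (a least-squares straight line standing in for a parametric data model) and `nearest_neighbour` (a 1-nearest-neighbour lookup standing in for the black box) are all assumptions chosen for brevity.

```python
# Hypothetical toy data, invented for illustration: y = 2*x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]


def fit_line(xs, ys):
    """Data modelling: assume Y = a*X + b + noise and estimate a and b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def nearest_neighbour(xs, ys, x_new):
    """Algorithmic modelling: no assumed formula, predict from the closest known X."""
    closest = min(range(len(xs)), key=lambda i: abs(xs[i] - x_new))
    return ys[closest]


a, b = fit_line(xs, ys)
parametric_prediction = a * 2.4 + b                     # uses the fitted parameters
black_box_prediction = nearest_neighbour(xs, ys, 2.4)   # uses only the observed pairs
```

The first function yields parameters that can be read and interpreted; the second yields only predictions, which is the trade-off the slide labels as a black box.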
What is the Big Idea ?
DIVIDE (SCATTER): Split Data into Blocks; Replicate and Store; Petabytes of Resilience
CONQUER (EXPLORE): 1000s of Parallel Threads; Explore Every Path; Machine Learning
INSIGHT (GATHER): Real Time Action; Periodic Dashboards; Iterative Evolution
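Divide = HDFS
[Diagram, WRITE: Client to Name Node, "1. Create Metadata"; Client to Data Nodes, "2. Put Blocks" (blocks 1, 2, 3, each stored on more than one Data Node); Name Node to Data Nodes, "Control / Monitoring"]
[Diagram, READ: Client to Name Node, "1. Get Metadata"; Client to Data Nodes, "2. Fetch Blocks" (blocks 1 to 4, each stored on more than one Data Node); Name Node to Data Nodes, "Control / Monitoring"]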
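The write and read paths in the diagram can be sketched as a toy, in-memory Python model. This is an assumption-laden illustration, not HDFS code: `split_into_blocks`, `place_replicas`, `write_file` and `read_file` are hypothetical names, the round-robin placement is a simplification of whatever placement policy a real cluster applies, and the 4-byte block size is chosen only so the example stays small.

```python
def split_into_blocks(data: bytes, block_size: int) -> list[bytes]:
    """Divide: cut the payload into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_replicas(num_blocks: int, nodes: list[str], replication: int) -> dict[int, list[str]]:
    """Toy placement: round-robin over nodes; assumes replication <= len(nodes)."""
    return {
        block_id: [nodes[(block_id + r) % len(nodes)] for r in range(replication)]
        for block_id in range(num_blocks)
    }


def write_file(data: bytes, block_size: int, nodes: list[str], replication: int):
    blocks = split_into_blocks(data, block_size)
    metadata = place_replicas(len(blocks), nodes, replication)  # 1. create metadata
    storage = {node: {} for node in nodes}
    for block_id, replicas in metadata.items():                 # 2. put blocks
        for node in replicas:
            storage[node][block_id] = blocks[block_id]
    return metadata, storage


def read_file(metadata, storage, failed=frozenset()) -> bytes:
    parts = []
    for block_id in sorted(metadata):                           # 1. get metadata
        # 2. fetch each block from the first replica that is still available
        node = next(n for n in metadata[block_id] if n not in failed)
        parts.append(storage[node][block_id])
    return b"".join(parts)


payload = b"the price of light"
metadata, storage = write_file(payload, block_size=4, nodes=["dn1", "dn2", "dn3"], replication=2)
restored = read_file(metadata, storage)
restored_after_failure = read_file(metadata, storage, failed={"dn1"})
```

Because every block lives on two nodes here, the read still succeeds when `dn1` is marked as failed, which is the resilience the "Replicate and Store" step is meant to provide.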
Conquer = MapReduce
Insight = Functional Paradigm
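A single-process Python sketch of the map and reduce phases, written in a functional style, is shown below as one possible reading of these two slides. The word-count task, the sample `chunks` and the function names `map_phase` and `reduce_phase` are assumptions for illustration; in a real MapReduce job the map calls would run in parallel on the nodes holding each block, with a shuffle by key between the two phases.

```python
from collections import Counter
from functools import reduce


def map_phase(chunk: str) -> Counter:
    """Map: turn one chunk of text into partial word counts, independently of other chunks."""
    return Counter(chunk.lower().split())


def reduce_phase(left: Counter, right: Counter) -> Counter:
    """Reduce: merge two partial results; merging is associative, so order does not matter."""
    return left + right


# Hypothetical input, one string per data block.
chunks = [
    "big data is crude oil",
    "data science is the refinery",
    "big data big insight",
]

partials = list(map(map_phase, chunks))              # conquer: one map call per block
totals = reduce(reduce_phase, partials, Counter())   # gather: fold partials into one result
```

Both phases are pure functions with no shared state, which is what lets the work be scattered across many machines and gathered afterwards.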
Why is Big Data needed ? 
VOLUME: Exponential growth, 2x in 2 yrs; PB (1000 TB) is now common
VELOCITY: Event streams, never at rest; 640k GB per internet minute
VARIETY: 100s of data sources; 85% not in a table
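The "2x in 2 yrs" figure implies simple compound growth, which can be worked through in Python. This is a back-of-the-envelope sketch: `projected_volume` is a hypothetical helper, and the 1000 TB starting point and 10-year horizon are assumptions picked to match the petabyte scale mentioned on the slide.

```python
def projected_volume(initial_tb: float, years: float, doubling_period_years: float = 2.0) -> float:
    """Volume after `years` if it doubles every `doubling_period_years`."""
    return initial_tb * 2 ** (years / doubling_period_years)


# Doubling every 2 years is equivalent to this constant yearly growth rate.
annual_growth_rate = 2 ** (1 / 2.0) - 1

# Assumed starting point: 1 PB (1000 TB), projected 10 years ahead.
volume_after_10_years_tb = projected_volume(1000.0, 10.0)
```

Under these assumptions the yearly growth is roughly 41%, and 1 PB becomes 32 PB after a decade (five doublings).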
Where in the Value Chain ? 
Generation Transport Knowledge Output Value 
BIG DATA SCIENCE 
Straddles all four Challenge Areas
Big Data Heat Map – Gartner 2012
Big Data Potential by Sector – McKinsey for USBLS, 2011
Big Data Investment by Industry – Gartner, 2012
Top Big Data Challenges – Gartner, 2012
Survey on Big Data Investments – IDG Survey, 2013
Survey on Main Drivers to Invest – IDG Survey, 2014
DEMO
RECAP OF BENEFITS
COST, SPEED, AGILITY, CAPABILITY
LAST WORDS OF WISDOM
TIME VALUE OF DATA
KNOWLEDGE IS POWER
NOT ALL ROADS LEAD TO ROME
I AM AN INDIVIDUAL
“The price of light is far less than the cost of darkness”

Big Data Science at the Digital Catapult


Editor's notes

  1. COST: 20x less per TB vs. Teradata, Netezza, Oracle; 75% less average marginal cost per capacity. SPEED: 10x faster than Teradata, Netezza. AGILITY: 115% lower average cost per data source vs. Oracle. SCIENCE: machine learning, prediction.
  2. WHAT: What is Big Data Science? WHY: Why is it needed? WHERE: Where is it being used? HOW: How will it evolve?
  3. TIME VALUE: Yesterday’s data is less valuable than today’s data, yet historical data is more valuable than current data alone. POWER: Getting from unknown unknowns to known unknowns or known knowns is powerful. LEAD TO ROME: Exploring with no direct business impact is not a bad thing. INDIVIDUAL: Treat every customer as an individual, not an aggregate, and analyse; aggregate only individual insights.