Conflict in the Cloud 
Big data and cloud computing 
Keith Peterson, CEO 
Halo BI 
©2014 Halo Business Intelligence | All Rights Reserved
Starting points 
• Information management and analytics issues hurting business objectives 
• Taking days and weeks to get to the data 
• Multiple copies of data around the organization 
• No shared view of the truth 
• ETL and data warehouse unable to handle loads 
• BI and reporting eating up capacity 
• Data volumes growing but budgets static 
• Desire to leverage new machine data sources 
The “Big Data” challenge (Executive View) 
… 
The “Big Data” challenge (Business View) 
… 
Big Data Is… 
• Ever escalating volumes 
• Expanding sources, such as Internet Of Things 
• Increasingly high velocities 
• With a widening variety of unstructured formats and semantic contexts 
Big Data Is Not Really Useful 
Unless insight can be gleaned through analytics...with a reasonable effort!
Three Strategies to Deal with Big Data 
1. Ignore It 
Don’t jump on the bandwagon. You have better things to focus on. 
2. Archive It 
Just collect and store it. You can always analyze it when resources free up. 
3. Analyze It 
With a clear business problem and ROI, invest in infrastructure to derive the insights needed.
Big Data 
Google: 1 trillion web pages per year 
Facebook: 1 million GB of disk storage 
Yelp: 100 GB of log data per day 
YouTube: 20 petabytes of new video per year 
… 
Regional medical center (patient sensors): 25 TB 
Mid-market retailer (POS): 10 TB 
Mid-market manufacturer (machine sensors): 6 TB 
http://www.google.com/trends/explore#q=%2Fm%2F04y7lrx%2C%20Amazon%20Aws%2C%20Rackspace&cmpt=q 
Five Big Data Questions 
1. Left Behind? 
Everyone is doing it…you need to. Really? Or will cost exceed benefit? 
2. Cloud? 
Big Data requires Big Compute 
Outsourcing risks: 
• Loss of Control 
• Platform Reliability 
• Privacy 
• Security 
3. Data? 
Which data and sources? Too much to handle. Some or all? 
4. Tools? 
All vendors have a Big Data suite. Which one? 
5. Usefulness? 
How to query the data 
Skills needed 
Machine Learning
Big Data and Cloud Computing 
Commodity computing to execute distributed queries across multiple data sets 
Rent commodity server instances to execute computation remotely 
Cloud hosting for $10/TB/month 
Traditional BI Architecture 
On Premise Or Cloud 
Operational Data 
• Data Volumes = 100 GB – 5 TB 
• Manageable on-premise or in the cloud 
Data Warehouse
Add Big Data 
Data volumes = 6 TB + 
Big Data Logs 
Cluster
Big Data in the Cloud 
The “Traditional” Approach 
[Architecture diagram. Labels: Data Platform, Commodity Storage, Traditional RDBMS, Client, Familiar BI Tools, MPP, SQL, SSAS, SharePoint, BI, Stream, Machine Learning, Browser, SQOOP, HIVE ODBC] 
• Use Amazon Redshift, Azure HDInsight or similar 
• Use Blob storage to persist big data 
• Spin up Compute clusters as needed 
• Keep Data in Cloud perpetually
Big Data Storage Costs 
Provider Type                  Cost per TB per year   Cost per PB per year 
Amazon EBS SSD Storage         $1,229                 $1,258,291 
Amazon EBS Magnetic Storage    $614                   $629,145 
Amazon S3 Storage              $411                   $420,372 
Azure Tables & Queues          $792                   $811,302 
Azure Blob Storage             $288                   $294,912 
As of Nov 2014 
Sources: 
http://calculator.s3.amazonaws.com/index.html 
http://azure.microsoft.com/en-us/pricing/calculator/
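The per-PB column appears to be the per-TB figure scaled by 1,024 before rounding. As a rough sketch in Python, the Amazon EBS SSD row can be reproduced from a flat per-GB monthly price. The $0.10 per GB-month rate is an assumption back-derived from the table, not a figure stated on the slide, and the function names are illustrative.

```python
GB_PER_TB = 1024
TB_PER_PB = 1024
MONTHS_PER_YEAR = 12

# Assumed rate (back-derived from the table, not stated on the slide).
ASSUMED_EBS_SSD_RATE = 0.10  # dollars per GB per month


def cost_per_tb_year(price_per_gb_month: float) -> float:
    """Annual cost of keeping 1 TB stored at a flat per-GB monthly price."""
    return price_per_gb_month * GB_PER_TB * MONTHS_PER_YEAR


def cost_per_pb_year(price_per_gb_month: float) -> float:
    """Annual cost of keeping 1 PB stored at the same flat price."""
    return cost_per_tb_year(price_per_gb_month) * TB_PER_PB


if __name__ == "__main__":
    print(f"per TB per year: ${cost_per_tb_year(ASSUMED_EBS_SSD_RATE):,.0f}")
    print(f"per PB per year: ${cost_per_pb_year(ASSUMED_EBS_SSD_RATE):,.0f}")
```

Real provider pricing is usually tiered and changes over time, so a flat rate like this is only a first approximation.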
Hosting Considerations 
• What if you host big data on-premise? 
• Cost of managing hundreds of servers, expensive processing power 
• Costs can be hidden in data center budget until too late 
• What is your Big Data output? 
• Beyond about 25 TB of data, cloud hosting costs become significant. 
• Data Transfer costs must be considered as well 
• Inbound is usually free 
• Outbound can cost thousands of dollars per month 
• Direct connect or physically ship 
• For audit purposes, data may need to be kept for up to 7 years 
• Factor this into your storage costs 
• Location 
• Will regulations impact your ability to store or process data on machines in different countries? 
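The retention and transfer points above can be combined into a rough budget estimate. The Python sketch below multiplies a per-TB annual storage rate by the retention period and adds a flat monthly outbound-transfer charge. The function name, the constant data volume, and the $1,000 monthly transfer figure are illustrative assumptions. The $411 per TB per year rate is the Amazon S3 figure from the storage cost slide.

```python
def retention_cost(volume_tb, cost_per_tb_year, years=7, outbound_per_month=0.0):
    """Estimate total spend for holding a fixed data volume over a retention period.

    Assumes the volume stays constant and prices do not change, which is a
    simplification; growing data sets will cost more.
    """
    storage = volume_tb * cost_per_tb_year * years
    transfer = outbound_per_month * 12 * years
    return storage + transfer


if __name__ == "__main__":
    # 25 TB kept for 7 years at the Amazon S3 rate from the storage cost slide.
    print(retention_cost(25, 411))
    # Same volume with a hypothetical $1,000 per month in outbound transfer.
    print(retention_cost(25, 411, outbound_per_month=1000.0))
```

Even in this simplified model, a sustained outbound-transfer charge can exceed the storage bill, which is why the slide lists it as a separate consideration.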
Big Data Considerations 
Databases 
• High speed analysis of transactional data 
• Multi-step computations 
• Interactive querying 
• Lots of updates (adds/deletes/mods) 
MapReduce HDFS 
• Low cost storage and compute 
• High performance queries on large data 
• Complex data, simple queries 
• Simple scaling 
Note: Ideas in this slide are borrowed and adapted from “Running, Managing, and Adapting Hadoop at Sears,” by Andy McNalis, Senior Manager, Hadoop Infrastructure, Sears Holdings. 
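To make the MapReduce side of this comparison concrete, here is a minimal single-process sketch of the map, shuffle, and reduce steps in Python. It is not Hadoop code: the log format, the sample lines, and all function names are made up for illustration. A real cluster would run the map and reduce steps in parallel across many machines.

```python
from collections import defaultdict


def map_phase(line):
    """Emit (status_code, 1) for one raw log line; skip lines that do not parse."""
    parts = line.split()
    if len(parts) >= 2 and parts[-1].isdigit():
        yield parts[-1], 1


def shuffle(pairs):
    """Group emitted values by key, as the framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Collapse all values for one key into a single count."""
    return key, sum(values)


def run_job(lines):
    """Run map, shuffle, and reduce over an iterable of log lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


# Hypothetical sample data, including one line that does not parse.
SAMPLE_LOG = [
    "GET /index.html 200",
    "GET /missing 404",
    "POST /orders 200",
    "corrupted line",
]

if __name__ == "__main__":
    print(run_job(SAMPLE_LOG))
```

The query itself is simple (count by key), while the tolerance for messy input lives in the map step. Interactive, multi-step, update-heavy work remains the database's strength, as the first list notes.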
Cloud Considerations 
• Big Data needs Big Compute 
• Which cloud services will you choose? 
• Time, effort and skills will vary considerably 
• Microsoft Azure 
• Amazon EC2 
• Google Cloud Platform 
• Verizon Cloud 
• Rackspace 
http://online.wsj.com/articles/little-space-remains-for-rackspace-ahead-of-the-tape-1415557510 
Big Data in the Cloud 
The “Traditional” Approach 
[Architecture diagram. Labels: Data Platform, Traditional RDBMS, Commodity Storage, Client, Familiar BI Tools, MPP, SQL, SSAS, SharePoint, BI, Stream, Machine Learning, Browser, SQOOP, HIVE ODBC] 
Big Data in the Cloud 
Premise-Cloud Hybrid Approach 
[Architecture diagram. Labels: Data Platform, Traditional RDBMS, Commodity Storage, Client, Familiar BI Tools, MPP, SQL, SSAS, SharePoint, BI, Stream, Machine Learning, Browser, SQOOP, HIVE ODBC] 
ETL and Pre-aggregate on-premise 
Analyze Visualize in Cloud
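One way to read "ETL and pre-aggregate on-premise" is that raw readings are rolled up locally so that only compact summaries travel to the cloud for analysis. The Python sketch below shows that idea under stated assumptions: the record layout (machine id, ISO timestamp, value), the sample data, and the function name are hypothetical and do not come from the slides.

```python
from collections import defaultdict
from datetime import datetime


def pre_aggregate_hourly(readings):
    """Roll raw (machine_id, iso_timestamp, value) readings up to hourly summaries."""
    buckets = defaultdict(list)
    for machine_id, timestamp, value in readings:
        # Truncate each timestamp to the start of its hour.
        hour = datetime.fromisoformat(timestamp).replace(
            minute=0, second=0, microsecond=0
        )
        buckets[(machine_id, hour.isoformat())].append(value)
    return [
        {
            "machine_id": machine_id,
            "hour": hour,
            "count": len(values),
            "min": min(values),
            "max": max(values),
            "mean": sum(values) / len(values),
        }
        for (machine_id, hour), values in sorted(buckets.items())
    ]


# Hypothetical raw sensor readings held on-premise.
RAW = [
    ("press-1", "2014-11-03T08:05:00", 71.0),
    ("press-1", "2014-11-03T08:40:00", 73.0),
    ("press-1", "2014-11-03T09:10:00", 70.0),
    ("press-2", "2014-11-03T08:15:00", 65.0),
]

if __name__ == "__main__":
    for row in pre_aggregate_hourly(RAW):
        print(row)
```

Only the summary rows would need to be uploaded, which reduces both the stored volume and the data moved between premise and cloud. The trade-off is that detail dropped at this stage is no longer available to cloud-side analysis.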
Enterprise Data Hub 
On-premise Hadoop Clusters 
Data Warehouse Accelerator 
Cloud Hosting 
Cloud BI Reporting and Analytics
ROI Strategies 
Finding critical applications 
Cost of Labor 
• Use lower-skill, lower-cost resources 
• Avoid extra headcount 
• Share experiences among plants 
• Move experienced talent to higher-value activity 
Cost of Capital 
• Use under-resourced equipment / assets more efficiently 
• Make equipment last longer, run more efficiently 
• Avoid more equipment purchases 
Cost of Materials 
• Use fewer raw materials 
• Improve quality of raw materials sourced 
• Improve delivery and inventory 
Cost of Overheads 
• Reduce transportation costs 
• Reduce or optimize energy and resource costs 
• Reduce management layers 
Cost of Lost Opportunities 
• Reduce time to market 
• Improve product end-of-life 
• Reduce downtime 
• Reduce order to cash 
Cost of Reputation 
• Reduce product defects 
• Anticipate customer reactions 
• Tailor service and response profiles 
More available: info@halobi.com
Warehouse Operations 
Machine sensor data for inventory and labor optimization 
$300K 
Cases per man hour 
Picking accuracy 
Drought Management for Growers 
Smarter water use 
$475K potential 
Water per output
Retail promotions 
Demand forecasting, sentiment analysis, and pricing 
$6.2M 
Sales per Square Foot 
Returns Rate
Summary 
• The value of investing in Big Data in the Cloud depends on your use case 
• Cost is an issue – cloud hosting costs become significant beyond about 25 TB 
• Skills are an issue – steep learning curves 
• Process is an issue – requires change in the way people think and operate 
• Partners are an issue – do you want a large or niche provider? 
• Database design is important 

 
Introducing Microsoft’s new Enterprise Work Management (EWM) Solution
Introducing Microsoft’s new Enterprise Work Management (EWM) SolutionIntroducing Microsoft’s new Enterprise Work Management (EWM) Solution
Introducing Microsoft’s new Enterprise Work Management (EWM) Solution
 
How to Choose the Right Laravel Development Partner in New York City_compress...
How to Choose the Right Laravel Development Partner in New York City_compress...How to Choose the Right Laravel Development Partner in New York City_compress...
How to Choose the Right Laravel Development Partner in New York City_compress...
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
 
8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
 

Conflict in the Cloud – Issues & Solutions for Big Data

  • 1. Conflict in the Cloud: Big data and cloud computing. Keith Peterson, CEO, Halo BI. ©2014 Halo Business Intelligence | All Rights Reserved
  • 2. Starting points
      • Information management and analytics issues hurting business objectives
      • Taking days and weeks to get to the data
      • Multiple copies of data around the organization
      • No shared view of the truth
      • ETL and data warehouse unable to handle loads
      • BI and reporting eating up capacity
      • Data volumes growing but budgets static
      • Desire to leverage new machine data sources
  • 3. The “Big Data” challenge (Executive View) …
  • 4. The “Big Data” challenge (Business View) …
      Big Data Is…
      • Ever-escalating volumes
      • Expanding sources, such as the Internet of Things
      • Increasingly high velocities
      • A widening variety of unstructured formats and semantic contexts
      Big Data Is Not Really Useful unless insight can be gleaned through analytics...with a reasonable effort!
  • 5. Three Strategies to Deal with Big Data
      • 1. Ignore It: Don’t jump on the bandwagon. You have better things to focus on.
      • 2. Archive It: Just collect and store it. You can always analyze it when resources free up.
      • 3. Analyze It: With a clear business problem and ROI, invest in infrastructure to derive the insights needed.
  • 6. Big Data
      • Google: 1 trillion web pages per year
      • Facebook: 1 million GB of disk storage
      • Yelp!: 100 GB of log data per day
      • YouTube: 20 petabytes of new video per year
      • Regional medical center, patient sensors: 25 TB
      • Mid-market retailer, POS: 10 TB
      • Mid-market manufacturer, machine sensors: 6 TB
      http://www.google.com/trends/explore#q=%2Fm%2F04y7lrx%2C%20Amazon%20Aws%2C%20Rackspace&cmpt=q
  • 7. Five Big Data Questions
      • 1. Left Behind? Everyone is doing it…you need to. Really? Or will cost exceed benefit?
      • 2. Cloud? Big Data requires Big Compute. Outsourcing risks: loss of control, platform reliability, privacy, security.
      • 3. Data? Which data and sources? Too much to handle. Some or all?
      • 4. Tools? All vendors have a Big Data suite. Which one?
      • 5. Usefulness? How to query the data. Skills needed. Machine learning.
  • 8. Big Data and Cloud Computing
      • Commodity computing to execute distributed queries across multiple data sets
      • Rent commodity server instances to execute computation remotely
      • Cloud hosting for $10/TB/month
  • 9. Traditional BI Architecture: On-Premise or Cloud
      Diagram labels: Operational Data, Data Warehouse
      • Data volumes = 100 GB – 5 TB
      • Manageable on-premise or in the cloud
  • 10. Add Big Data
      Diagram labels: image, Big Data, Logs, Cluster
      • Data volumes = 6 TB +
  • 11. Big Data in the Cloud: The “Traditional” Approach
      Diagram labels: Data Platform, Commodity Storage, Traditional RDBMS, Client, Familiar BI Tools, MPP SQL, SSAS, SharePoint BI, Stream, Machine Learning, Browser, Sqoop, Hive, ODBC
      • Use Amazon Redshift, Azure HDInsight or similar
      • Use Blob storage to persist big data
      • Spin up compute clusters as needed
      • Keep data in the cloud perpetually
  • 12. Big Data Storage Costs (as of Nov 2014)
      Sources: http://calculator.s3.amazonaws.com/index.html and http://azure.microsoft.com/en-us/pricing/calculator/
      • Amazon EBS SSD storage: $1,229 per TB per year; $1,258,291 per PB per year
      • Amazon EBS Magnetic Storage: $614 per TB per year; $629,145 per PB per year
      • Amazon S3 Storage: $411 per TB per year; $420,372 per PB per year
      • Azure Tables & Queues: $792 per TB per year; $811,302 per PB per year
      • Azure Blob Storage: $288 per TB per year; $294,912 per PB per year
  • 13. Hosting Considerations
      • What if you host big data on-premise?
          • Cost of managing hundreds of servers, expensive processing power
          • Costs can be hidden in the data center budget until too late
      • What is your Big Data output?
          • Beyond about 25 TB of data, cloud hosting costs become significant
      • Data transfer costs must be considered as well
          • Inbound is usually free
          • Outbound can be $1,000’s per month
          • Direct connect or physically ship
      • For audit purposes, data may need to be kept for up to 7 years
          • Factor this into your storage costs
      • Location
          • Will regulations impact the ability to store or process on machines in different countries?
  • 14. Big Data Considerations
      • Databases: high-speed analysis of transactional data; multi-step computations; interactive querying; lots of updates (adds/deletes/mods)
      • MapReduce / HDFS: low-cost storage and compute; high-performance queries on large data; complex data, simple query; simple scaling
      Note: Ideas in this slide are borrowed and adapted from “Running, Managing, and Adapting Hadoop at Sears,” by Andy McNalis, Senior Manager, Hadoop Infrastructure, Sears Holdings.
  • 15. Cloud Considerations
      • Big Data needs Big Compute
      • Which cloud services will you choose? Microsoft Azure, Amazon EC2, Google Cloud Platform, Verizon Cloud, Rackspace
      • Time, effort and skills will vary considerably
      http://online.wsj.com/articles/little-space-remains-for-rackspace-ahead-of-the-tape-1415557510
  • 16. Big Data in the Cloud: The “Traditional” Approach
      Diagram labels: Data Platform, Traditional RDBMS, Commodity Storage, Client, Familiar BI Tools, MPP SQL, SSAS, SharePoint BI, Stream, Machine Learning, Browser, Sqoop, Hive, ODBC
  • 17. Big Data in the Cloud: Premise-Cloud Hybrid Approach
      Diagram labels: Data Platform, Traditional RDBMS, Commodity Storage, Client, Familiar BI Tools, MPP SQL, SSAS, SharePoint BI, Stream, Machine Learning, Browser, Sqoop, Hive, ODBC
      • ETL and pre-aggregate on-premise
      • Analyze and visualize in the cloud
  • 18. Enterprise Data Hub
      Diagram labels: On-premise Hadoop Clusters, Data Warehouse Accelerator, Cloud Hosting, Cloud BI Reporting and Analytics
  • 19. ROI Strategies: Finding critical applications
      • Cost of Labor: use lower skill-lower cost resources; avoid extra headcount; share experiences among plants; move experienced talent to higher value activity
      • Cost of Capital: use under-resourced equipment / assets more efficiently; make equipment last longer, run more efficiently; avoid more equipment purchases
      • Cost of Materials: use fewer raw materials; improve quality of raw materials sourced; improve delivery and inventory
      • Cost of Overheads: reduce transportation costs; reduce or optimize energy and resource costs; reduce management layers
      • Cost of Lost Opportunities: reduce time to market; improve product end-of-life; reduce downtime; reduce order to cash
      • Cost of Reputation: reduce product defects; anticipate customer reactions; tailor service and response profiles
      More available: info@halobi.com
  • 20. Warehouse Operations: Machine sensor data for inventory and labor optimization. $300K. Cases per man hour; picking accuracy.
  • 21. Drought Management for Growers: Smarter water use. $475K potential. Water per output.
  • 22. Retail promotions: Demand forecasting, sentiment analysis, and pricing. $6.2M. Sales per square foot; returns rate.
  • 23. Summary
      • The value of investing in Big Data in the Cloud depends on your use case
      • Cost is an issue – 25 TB
      • Skills are an issue – steep learning curves
      • Process is an issue – requires change in the way people think and operate
      • Partners are an issue – do you want a large or niche provider?
      • Database design is important
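
The storage prices on slide 12 and the retention point on slide 13 can be combined into a rough yearly estimate. The sketch below is illustrative only: the per-TB prices are the November 2014 figures quoted in the slides, while the function names, the 25 TB example volume, and the assumption of flat prices with a constant volume added each year are ours. It covers storage only and leaves out compute and outbound transfer costs.

```python
# Cost per TB per year in USD, as quoted on slide 12 (Nov 2014 figures).
COST_PER_TB_YEAR = {
    "Amazon EBS SSD": 1229,
    "Amazon EBS Magnetic": 614,
    "Amazon S3": 411,
    "Azure Tables & Queues": 792,
    "Azure Blob": 288,
}


def yearly_storage_cost(terabytes, provider):
    """Storage-only cost in USD for holding `terabytes` for one year.

    Hypothetical helper for illustration; not part of the original deck.
    """
    return terabytes * COST_PER_TB_YEAR[provider]


def retention_cost(terabytes_added_per_year, provider, years=7):
    """Total storage cost when data accumulates and nothing is deleted.

    Assumption (ours, not the slides'): the same volume is added every
    year and prices stay flat, so year k stores k times the yearly volume.
    """
    total = 0
    for year in range(1, years + 1):
        stored = year * terabytes_added_per_year
        total += yearly_storage_cost(stored, provider)
    return total


if __name__ == "__main__":
    for provider in COST_PER_TB_YEAR:
        cost = yearly_storage_cost(25, provider)
        print(f"{provider}: ${cost:,} per year for 25 TB")
    total = retention_cost(25, "Amazon S3")
    print(f"Amazon S3, 25 TB added per year, kept 7 years: ${total:,}")
```

Under these assumptions the 7-year total is 28 times the first-year cost, because each year's data keeps being billed in every later year. That compounding is why the retention period needs to be factored into the estimate.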

Editor's notes

  1. Static Slide. Big Data is being collected in ever-escalating volumes, from more and more sources such as the Internet of Things, at increasingly high velocities, with a widening variety of unstructured formats and semantic contexts. Big Data is not really useful unless insight can be gleaned through analytics!
  2. Static Slide. Big Data is being collected in ever-escalating volumes, from more and more sources such as the Internet of Things, at increasingly high velocities, with a widening variety of unstructured formats and semantic contexts. Big Data is not really useful unless insight can be gleaned through analytics!
  3. Google: 1 trillion web pages per year. Facebook: 1 million GB of disk storage. YouTube: 20 petabytes of new video per year. … that a user configures and controls, rather than on a local desktop. Cloud providers charge under $0.10 per CPU hour for renting MIPS, memory and space.
  4. Google: 1 trillion web pages per year. Facebook: 1 million GB of disk storage. YouTube: 20 petabytes of new video per year. … that a user configures and controls, rather than on a local desktop. Cloud providers charge under $0.10 per CPU hour for renting MIPS, memory and space.
  5. Static Slide
  6. Static Slide
  7. The advantage of using non-relational DBs is that they handle both types of data, but unstructured data could be much harder to use long term. Hard choices about converting unstructured to structured. Initial DB designs won't support … Have to load, maintain and power hundreds of servers if not in the cloud. Processing power will be expensive. Because the cost is rolled into the data center budget, it comes as a surprise. Differences in technology mean the amount of time, effort and expertise will vary considerably.
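
Note 7 mentions hard choices about converting unstructured data to structured data, and slide 14 describes the MapReduce style as "complex data, simple query". A minimal sketch of that idea follows, in plain Python rather than on a Hadoop cluster. The log format, machine names and function names are invented for this example and do not come from the deck.

```python
import re
from collections import Counter

# Hypothetical machine-sensor log format, invented for this example:
#   "2014-11-03 10:15:02 machine=press-7 status=OK temp=71.5"
LINE_PATTERN = re.compile(
    r"(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"machine=(?P<machine>\S+) status=(?P<status>\S+) "
    r"temp=(?P<temp>\d+(?:\.\d+)?)$"
)

SAMPLE_LINES = [
    "2014-11-03 10:15:02 machine=press-7 status=OK temp=71.5",
    "2014-11-03 10:15:07 machine=press-7 status=FAULT temp=98.2",
    "garbled line that does not fit the format",
    "2014-11-03 10:15:09 machine=press-7 status=OK temp=71.9",
]


def parse_line(line):
    """Map step: turn one raw log line into a structured record, or None."""
    match = LINE_PATTERN.match(line.strip())
    if match is None:
        return None  # unparseable lines are skipped, not guessed at
    record = match.groupdict()
    record["temp"] = float(record["temp"])
    return record


def count_by_status(lines):
    """Reduce step: count parsed records per (machine, status) pair."""
    counts = Counter()
    for line in lines:
        record = parse_line(line)
        if record is not None:
            counts[(record["machine"], record["status"])] += 1
    return counts


if __name__ == "__main__":
    for key, count in sorted(count_by_status(SAMPLE_LINES).items()):
        print(key, count)
```

The choice the note points at shows up in `parse_line`: whatever the pattern does not capture is lost once only the structured records are kept, which is one argument for also archiving the raw lines.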