© 2013, Pentaho. All Rights Reserved. pentaho.com. Worldwide +1 (866) 660-7555
BI for Big Data
Beyond the Hype
Pentaho Mission
The Future of Analytics: Big Data Exploration without Boundaries
Modern, unified data integration and business
analytics platform
• Native integration into big data ecosystem
• Embeddable, cloud-ready analytics
Fast and Broad Innovation
• Open source development model
Critical mass achieved
• Over 1,000 commercial customers
• Over 10,000 production deployments
Ian Fyfe
Big Data Solutions Engineering, Pentaho
Ian brings over 20 years of experience in the business analytics software market
with roles spanning consulting services, pre-sales engineering, product
management and product marketing. Ian started his career by co-founding a
business intelligence startup and has worked at Business Objects, Informix,
Epiphany, PeopleSoft and Jaspersoft.
Common Use Cases
The Value of Big Data for our Customers
Big opportunities
Improve operational effectiveness
• Machines/sensors: predict failures, network attacks
• Financial risk management: reduce fraud, increase security
Reduce data warehouse cost
• Integrate new data sources without increased database cost
• Provide online access to ‘dark data’
Drive incremental revenue
• Predict customer behavior across all channels
• Understand and monetize customer behavior
Example Use Cases Today
Transactional
•Fraud detection
•Financial services / stock
markets
Sub-Transactional
•Weblogs
•Social/online media
•Telecoms events
Non-Transactional
•Web pages, blogs etc
•Documents
•Physical events
•Application events
•Machine events
Click Stream Analytics
From buying patterns to revenue
Business Challenge
• Monetize buying patterns hidden in billions of
data points
• Quickly analyze multi-channel click stream data
Pentaho Benefits
• Reduced ETL time to analyze blended data
from Hadoop, HBase & data warehouse
• Use of big data analytics to grow revenue from
targeted campaigns
Device Data Analytics
Big Data for Fortune 100 Enterprise Storage provider
Business Challenge
• Affordably scale machine data from storage
devices for customer support app
• Predict device failure
• Enhance product performance
Pentaho Benefits
• Easy-to-use ETL & analysis for Hadoop, HBase
& Oracle data sources
• 15x cost improvement
• Stronger performance against customer SLAs
Innovative Organizations Use Pentaho
to Unlock Value from Big Data Stores
Healthcare
Embedded Pentaho to improve
patient care & compliance through
analysis of unstructured digital pen
data stored in CouchDB
Online Retailer
Understanding the buying patterns
of 5 million users from click stream
data stored in Hadoop & HBase
Gaming
Better monetization of premium
game features through analyzing
large volumes of player data
stored in MongoDB & Infobright
Social Commerce
Better campaign performance
through monitoring social media,
page clicks and email marketing
data stored in HP Vertica
Travel & Entertainment
Helping thousands of travel
partners like expedia.co.uk and
thomascook.fr improve promotional
targeting using HBase and Hadoop
Mobile & Digital Media
Embedded Pentaho to measure
massive volumes of mobile and
event data generated from mobile
devices stored in MongoDB
Pentaho Embedded Analytics
New Revenue Stream in Eight Weeks
Business Challenge
• Gain new revenue source from add-on
module with reporting, analysis & dashboards
• Get to market fast to differentiate
Pentaho Benefits
• Easy to embed & brand
• Broad capabilities result in new revenue stream
• Increased functionality & compelling
visualizations
Embedded Analytics
Pentaho Uniquely Positioned to Win
Dashboard Framework
Dashboard Designer
Why We Win in Embedded:
• Architectural ‘sweet spot’ for Pentaho
platform
• Flexible pricing, adaptable to fit partner
pricing
• Open source and innovation
• Fastest time-to-market for embedded
analytics
Continued Leadership:
• Cloud & multi-tenancy ease-of-use
• Simplified REST services for ISVs
• BI Platform SDK enhancements – deep
solution examples, tutorials and training
• Continued focus on standards and
extensibility
Big Data Technologies
BI Strengths and Weaknesses
The Current Solutions
Current Database Solutions are designed for
structured data.
• Optimized to answer known questions quickly
• Schemas dictate form/context
• Difficult to adapt to new data types and new
questions
• Expensive at petabyte scale
[Chart: gigabytes of data created (in billions), 2005 to 2015, structured vs. unstructured data; callout: 10%]
Main Big Data Technologies
Hadoop, NoSQL Databases, Analytic Databases
Hadoop
• Low cost, reliable
scale-out architecture
• Distributed computing
• Proven success in
Fortune 500 companies
• Exploding interest
NoSQL Databases
• Huge horizontal scaling
and high availability
• Highly optimized for
retrieval and appending
• Types
  • Document stores
  • Key-value stores
  • Graph databases
Analytic RDBMS
• Optimized for bulk-load
and fast aggregate
query workloads
• Types
  • Column-oriented
  • MPP
  • In-memory
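To illustrate why a column-oriented layout suits aggregate queries, here is a minimal Python sketch. The sales records, field names and the `to_columns` helper are hypothetical and are not taken from the slides or from any particular analytic database; real engines add compression, vectorized execution and on-disk formats that this toy version omits.

```python
# Hypothetical sample data: three sales records in a row-oriented layout.
rows = [
    {"order_id": 1, "region": "EMEA", "amount": 120.0},
    {"order_id": 2, "region": "APAC", "amount": 80.0},
    {"order_id": 3, "region": "EMEA", "amount": 50.0},
]


def to_columns(records):
    """Pivot a row-oriented list of dicts into a column-oriented dict of lists."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# Row layout: every whole record is visited to total a single field.
total_from_rows = sum(record["amount"] for record in rows)

# Column layout: only the "amount" column is scanned.
total_from_columns = sum(columns["amount"])

print(total_from_rows, total_from_columns)
```

Both totals agree; the difference is how much data has to be touched to compute them, which is the property the "fast aggregate query workloads" bullet refers to.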
Hadoop Core Components
HADOOP DISTRIBUTED FILE SYSTEM (HDFS)
❯ Massive redundant storage across a commodity
cluster
MAPREDUCE
❯ Map: distribute a computational problem
across a cluster
❯ Reduce: collect the answers to all the
sub-problems and combine them into the final result
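The map and reduce steps can be sketched in a few lines of plain Python. This is a single-process word count for illustration only, not Hadoop's API: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample input are assumptions, and a real cluster would run the map and reduce tasks in parallel on separate nodes.

```python
from collections import defaultdict
from itertools import chain


def map_phase(chunk):
    """Map: emit a (word, 1) pair for every word in one input chunk."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(mapped_pairs):
    """Group intermediate values by key, as a framework does between the phases."""
    groups = defaultdict(list)
    for key, value in mapped_pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values collected for one key into a single result."""
    return key, sum(values)


def word_count(chunks):
    """Run map over every chunk, group by key, then reduce each group."""
    mapped = chain.from_iterable(map_phase(chunk) for chunk in chunks)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical input: each string stands in for one block of a file in HDFS.
chunks = ["big data big hype", "data beyond the hype"]
counts = word_count(chunks)
print(counts)
```

Each chunk is mapped independently, which is what lets the work be spread across a cluster; only the grouping step needs to see the output of every mapper.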
MANY DISTROS AVAILABLE
Major Hadoop Utilities
• Apache Hive: SQL-like language and metadata repository
• Apache Pig: High-level language for expressing data analysis programs
• Apache HBase: The Hadoop database. Random, real-time read/write access
• Sqoop: Integrating Hadoop with RDBMS
• Oozie: Server-based workflow engine for Hadoop activities
• Hue: Browser-based desktop interface for interacting with Hadoop
• Flume: Distributed service for collecting and aggregating log and event data
• Apache Whirr: Library for running Hadoop in the cloud
• Apache ZooKeeper: Highly reliable distributed coordination service
Hadoop & Databases
“The working conditions can
be shocking”
ETL Developer
Big Data Platform Challenges
Challenges
1. Somewhat immature
2. Lack of tooling
3. Steep technical learning curve
4. Hiring qualified people
5. Availability of enterprise-ready products and tools
6. High latency (Hadoop)
7. Running inside the cluster
Challenges
WOULD YOU RATHER DO THIS?
Scheduling
Modeling
Ingestion / Manipulation /
Integration
… OR THIS?
Investigating
BI & Big Data Solutions
Questions to Ask
Business Drivers
1. Mandate to reduce EDW costs?
2. Clear use case that you need to solve?
3. Do you have access to the technical skill set?
Technical
1. Do you have more than one kind of big data store, for example Hadoop as well as HBase,
MongoDB or Cassandra?
2. Would you prefer to use the same tool for big data stores in addition to your traditional relational
data stores?
3. Are you OK waiting minutes or even hours to access your big data?
4. Are you OK using a spreadsheet-like interface to access and analyze your data?
5. Do you need complete BI capabilities, including reporting, interactive visualization, and predictive
analytics?
6. Do you need to enrich your big data with data from outside of the big data platform?
7. Is the big data you want to analyze bigger than the amount of memory you have available?
http://blog.pentaho.com/tag/ian-fyfe/
Demo
Data Ingestion
Manipulation
Integration
Enterprise &
Ad Hoc Reporting
Data Discovery
Visualization
Predictive Analytics
Complete Big Data Analytics &
Visual Data Management
Relational, Hadoop, NoSQL, Analytic Databases
Pentaho Big Data Analytics
Open
Discussion
Thank You
JOIN THE CONVERSATION. YOU CAN FIND US ON:
blog.pentaho.com
@Pentaho
Facebook.com/Pentaho
Pentaho Business Analytics

Big Data for BI - Beyond the Hype - Pentaho

  • 1. © 2013, Pentaho. All Rights Reserved. pentaho.com. Worldwide +1 (866) 660-75551 BI for Big Data Beyond the Hype
  • 2. © 2013, Pentaho. All Rights Reserved. pentaho.com. Worldwide +1 (866) 660-75552 Pentaho Mission The Future of Analytics: Big Data Exploration without Boundaries Modern, unified data integration and business analytics platform • Native integration into big data ecosystem • Embeddable, cloud-ready analytics Fast and Broad Innovation • Open source development model Critical mass achieved • Over 1,000 commercial customers • Over 10,000 production deployments
  • 3. © 2013, Pentaho. All Rights Reserved. pentaho.com. Worldwide +1 (866) 660-75553 3 Ian Fyfe Big Data Solutions Engineering, Pentaho Ian brings over 20 years of experience in the business analytics software market with roles spanning consulting services, pre-sales engineering, product management and product marketing. Ian started his career by co-founding a business intelligence startup and has worked at Business Objects, Informix, Epiphany, PeopleSoft and Jaspersoft.
  • 4. © 2013, Pentaho. All Rights Reserved. pentaho.com. Worldwide +1 (866) 660-75554 Common Use Cases 4
  • 5. © 2013, Pentaho. All Rights Reserved. pentaho.com. Worldwide +1 (866) 660-75555 The Value of Big Data for our Customers Big opportunities Improve operational effectiveness • Machines/sensors: predict failures, network attacks • Financial risk management: reduce fraud, increase security Reduce data warehouse cost • Integrate new data sources without increased database cost • Provide online access to ‘dark data’ Drive incremental revenue • Predict customer behavior across all channels • Understand and monetize customer behavior
  • 6. © 2013, Pentaho. All Rights Reserved. pentaho.com. Worldwide +1 (866) 660-75556 © 2010, Pentaho. All Rights Reserved. www.pentaho.com. US and Worldwide: +1 (866) 660-7555 | Slide Example Use Cases Today Transactional •Fraud detection •Financial services / stock markets Sub-Transactional •Weblogs •Social/online media •Telecoms events Non-Transactional •Web pages, blogs etc •Documents •Physical events •Application events •Machine events
  • 7. © 2013, Pentaho. All Rights Reserved. pentaho.com. Worldwide +1 (866) 660-75557 Click Stream Analytics From buying patterns to revenue Business Challenge • Monetize buying patterns hidden in billions of data points • Quickly analyze multi-channel click stream data Pentaho Benefits • Reduced ETL time to analyze blended data from Hadoop, Hbase & data warehouse • Use of big data analytics to grow revenue from targeted campaigns
  • 8. © 2013, Pentaho. All Rights Reserved. pentaho.com. Worldwide +1 (866) 660-75558 Device Data Analytics Big Data for Fortune 100 Enterprise Storage provider Business Challenge • Affordably scale machine data from storage devices for customer support app • Predict device failure • Enhance product performance Pentaho Benefits • Easy to use ETL & analysis for Hadoop, Hbase, & Oracle data sources • 15x cost improvement • Stronger performance against customer SLA’s
  • 9. © 2013, Pentaho. All Rights Reserved. pentaho.com. Worldwide +1 (866) 660-75559 Healthcare Embedded Pentaho to better patient care & compliance through analysis of unstructured digital pen data stored in CouchDB Online Retailer Understanding the buying patterns of 5 million users from click stream data stored in Hadoop & HBase Gaming Better monetization of premium game features through analyzing large volumes of player data - stored in MongoDB & Infobright Social Commerce Better campaign performance through monitoring social media, page clicks and email marketing data stored in HP Vertica Travel & Entertainment Helping thousands of travel partners like expedia.co.uk and thomascook.fr improve promotional targeting using Hbase and Hadoop Mobile & Digital Media Embedded Pentaho to measure massive volumes of mobile and event data generated from mobile devices stored in MongoDB Innovative Organizations Use Pentaho to Unlock Value from Big Data Stores
  • 10. © 2013, Pentaho. All Rights Reserved. pentaho.com. Worldwide +1 (866) 660-755510 Pentaho Embedded Analytics New Revenue Stream in Eight Weeks Business Challenge • Gain new revenue source from add-on module with reporting, analysis & dashboards • Get to market fast to differentiate Pentaho Benefits • Easy to embed & brand • Broad capabilities result in new revenue stream • Increased functionality & compelling visualizations
  • 11. © 2013, Pentaho. All Rights Reserved. pentaho.com. Worldwide +1 (866) 660-755511 Embedded Analytics Pentaho Uniquely Positioned to Win Dashboard Framework Dashboard Designer Why We Win in Embedded: • Architectural ‘sweet spot’ for Pentaho platform • Flexible pricing, adaptable to fit partner pricing • Open source and innovation • Fastest time-to-market for embedded analytics Continued Leadership: • Cloud & multi-tenancy ease-of-use • Simplified REST services for ISVs • BI Platform SDK enhancements – deep solution examples, tutorials and training • Continued focus on standards and extensibility
  • 12. © 2013, Pentaho. All Rights Reserved. pentaho.com. Worldwide +1 (866) 660-755512 Big Data Technologies BI Strengths and Weaknesses © 2012, Pentaho. All Rights Reserved. 12
  • 13. © 2013, Pentaho. All Rights Reserved. pentaho.com. Worldwide +1 (866) 660-755513 The Current Solutions 10,000 2005 20152010 5,000 0 Current Database Solutions are designed for structured data. • Optimized to answer known questions quickly • Schemas dictate form/context • Difficult to adapt to new data types and new questions • Expensive at petabyte scale STRUCTURED DATA UNSTRUCTURED DATA GIGABYTESOFDATACREATED(INBILLIONS) 10%
  • 14. © 2013, Pentaho. All Rights Reserved. pentaho.com. Worldwide +1 (866) 660-755514 Main Big Data Technologies Hadoop NoSQL Databases Analytic Databases Hadoop • Low cost, reliable scale-out architecture • Distributed computing Proven success in Fortune 500 companies • Exploding interest NoSQL Databases • Huge horizontal scaling and high availability • Highly optimized for retrieval and appending • Types • Document stores • Key Value stores • Graph databases Analytic RDBMS • Optimized for bulk-load and fast aggregate query workloads • Types • Column-oriented • MPP • In-memory
  • 15. Hadoop Core Components. HADOOP DISTRIBUTED FILE SYSTEM (HDFS) ❯ Massive redundant storage across a commodity cluster. MAPREDUCE ❯ Map: distribute a computational problem across a cluster ❯ Reduce: Master node collects the answers to all the sub-problems and combines them. MANY DISTROS AVAILABLE
  • 16. Major Hadoop Utilities. • Apache Hive: SQL-like language and metadata repository • Apache Pig: High-level language for expressing data analysis programs • Apache HBase: The Hadoop database. Random, real-time read/write access • Apache Zookeeper: Highly reliable distributed coordination service • Apache Whirr: Library for running Hadoop in the cloud • Flume: Distributed service for collecting and aggregating log and event data • Hue: Browser-based desktop interface for interacting with Hadoop • Oozie: Server-based workflow engine for Hadoop activities • Sqoop: Integrating Hadoop with RDBMS
  • 17. Hadoop & Databases
  • 18. Big Data Platform Challenges. “The working conditions can be shocking” (ETL Developer)
  • 19. Challenges 1. Somewhat immature 2. Lack of tooling 3. Steep technical learning curve 4. Hiring qualified people 5. Availability of enterprise-ready products and tools 6. High latency (Hadoop) 7. Running inside the cluster
  • 20. Challenges. WOULD YOU RATHER DO THIS? Scheduling, Modeling, Ingestion / Manipulation / Integration … OR THIS?
  • 21. Investigating BI & Big Data Solutions
  • 22. Questions to Ask. Business Drivers: 1. Mandate to reduce EDW costs? 2. Clear use case that you need to solve? 3. Do you have access to the technical skill set? Technical: 1. Do you have more than one kind of big data store, for example Hadoop as well as HBase, MongoDB or Cassandra? 2. Would you prefer to use the same tool for big data stores in addition to your traditional relational data stores? 3. Are you OK waiting minutes or even hours to access your big data? 4. Are you OK using a spreadsheet-like interface to access and analyze your data? 5. Do you need complete BI capabilities, including reporting, interactive visualization, and predictive analytics? 6. Do you need to enrich your big data with data from outside of the big data platform? 7. Is the big data you want to analyze bigger than the amount of memory you have available? http://blog.pentaho.com/tag/ian-fyfe/
  • 23. Demo
  • 24. Pentaho Big Data Analytics: Complete Big Data Analytics & Visual Data Management. Capabilities: Data Ingestion, Manipulation, Integration; Enterprise & Ad Hoc Reporting; Data Discovery; Visualization; Predictive Analytics. Data sources: Relational, Hadoop, NoSQL, Analytic Databases
  • 25. Open Discussion
  • 26. Thank You. JOIN THE CONVERSATION. YOU CAN FIND US ON: blog.pentaho.com, @Pentaho, Facebook.com/Pentaho, Pentaho Business Analytics

Editor's notes

  1. * Not many companies have transactional data that qualifies as Big Data. Credit card companies and financial services companies are about it. * With stock market data we are talking about every stock trade and the bid and ask prices between the transactions, for every stock on multiple markets over a significant time period. For many other companies the Big Data is sub-transactional: it is the events that lead up to transactions. * Weblogs are semi-structured or badly structured. Consider the number of weblog entries created as you look for a book online, researching 5-10 books and reading reviews and comments. You might generate 1000 entries and may or may not buy a book, so there are potentially lots of entries for no transaction. We also want to enrich this data with metadata about the URLs and information about the location of the user. * In an online game or world, every interaction between participants and the system, and between participants, is logged. An individual participant might generate more than 1 million events for their 1 monthly transaction. * A single phone call or text message generates many events within a telecoms company.
  2. TAKE-AWAYS: Pentaho has many big data customers across a range of industries and big data platforms.
  3. TAKE-AWAYS: Pentaho provides complete integrated DI+BI for every leading big data platform.
  4. Big Data solutions are not databases. They don’t provide the capabilities that BI toolsets expect of a database. Hadoop also has high latency: even the smallest possible query takes much longer to execute than it would on a database. Hadoop is optimized for executing very intensive data processing tasks on very large amounts of data. It is not optimized for quick queries. Some Hadoop experts recommend configuring the workloads so that Hadoop jobs take an hour or more. This conflicts with OLAP performance criteria of 5-10 seconds per query. There are database implementations within the Hadoop world, such as Hive and HBase.
  5. Unfortunately for developers who are used to working with data transformation tools, the productivity within the Hadoop environment is not what they are used to.
  6. TAKE-AWAYS: The better choice is obviously visual development.
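
The Map and Reduce steps named on the Hadoop Core Components slide can be illustrated with a small sketch that counts weblog entries per user, in the spirit of the weblog example in note 1. This is a single-process simplification in plain Python, not Hadoop code: the log format, the names `LOG`, `split_into_chunks`, `map_chunk`, `reduce_counts` and `entries_per_user`, and the sample data are all assumptions made for illustration. A real Hadoop job would also shuffle intermediate results by key between the two steps.

```python
from collections import Counter
from functools import reduce

# Hypothetical weblog entries as (user_id, url) pairs; the format is an assumption.
LOG = [
    ("alice", "/book/1"),
    ("alice", "/book/1/reviews"),
    ("bob", "/book/2"),
    ("alice", "/checkout"),
    ("bob", "/book/3"),
    ("bob", "/book/3/reviews"),
]


def split_into_chunks(entries, n_chunks):
    """Stand-in for splitting the input across the nodes of a cluster."""
    return [entries[i::n_chunks] for i in range(n_chunks)]


def map_chunk(chunk):
    """Map step: each 'node' counts entries per user in its own chunk only."""
    return Counter(user for user, _url in chunk)


def reduce_counts(partials):
    """Reduce step: combine the partial per-chunk counts into one result."""
    return reduce(lambda a, b: a + b, partials, Counter())


def entries_per_user(entries, n_chunks=3):
    """Run the map step on every chunk, then reduce the partial results."""
    partials = [map_chunk(chunk) for chunk in split_into_chunks(entries, n_chunks)]
    return dict(reduce_counts(partials))


if __name__ == "__main__":
    print(entries_per_user(LOG))
```

The result does not depend on how many chunks the input is split into, which is the property that lets the map step run on many machines at once.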