CONSULTING SOLUTIONS OUTSOURCING
PARTNER FOR A NEW
ERA
Transform Your Business with
Big Data and Hortonworks
Tom Kersnick – Pactera – Director Big Data Solutions
Robby Richardson – Hortonworks – Enterprise Account Manager
Topics
© Pactera. Confidential. All Rights Reserved.
1 Hortonworks Intro
2 Who is Hortonworks?
3 Hortonworks HDP: Enterprise Hadoop Distribution
4 Hadoop 2.0: The Enterprise Generation
5 Pactera Intro
6 Big Data Deep Dive
Hortonworks Snapshot
• We distribute the only 100%
Open Source Enterprise
Hadoop Distribution:
Hortonworks Data Platform
• We engineer, test & certify HDP
for enterprise usage
• We employ the core
architects, builders and
operators of Apache Hadoop
• We drive innovation within
Apache Software Foundation
projects
• We are uniquely positioned to
deliver the highest quality of
Hadoop support
• We enable the ecosystem to
work better with Hadoop
Develop Distribute Support
We develop, distribute and support
the ONLY 100% open source
Enterprise Hadoop distribution
Endorsed by Strategic Partners
Headquarters: Palo Alto, CA
Employees: 200+ and growing
Investors: Benchmark, Index, Yahoo
Rapid Customer Growth
Hortonworks HDP: Enterprise Hadoop 1.x Distribution
Hortonworks Data Platform (HDP): Enterprise Hadoop
• The ONLY 100% open source and complete distribution
• Enterprise grade, proven and tested at scale
• Ecosystem endorsed to ensure interoperability
HDP components (from the stack diagram):
• Deployment options: OS, Cloud, VM, Appliance
• PLATFORM SERVICES: Enterprise Readiness (High Availability, Disaster Recovery, Security and Snapshots)
• HADOOP CORE: HDFS, MAP REDUCE
• DATA SERVICES: HIVE (HCATALOG), PIG, HBASE
• LOAD & EXTRACT: SQOOP, FLUME, NFS, WebHDFS
• OPERATIONAL SERVICES: OOZIE, AMBARI
Hadoop 2.0… The Enterprise Generation
Business Value
Big Data: Transactions, Interactions, Observations
Single Platform, Multiple Use: BATCH, INTERACTIVE, ONLINE
1.0 Architected for the Large Web Properties
2.0 Architected for the Broad Enterprise
Enterprise Requirements | Hadoop 2.0 Features
Mixed workloads | YARN
Interactive Query | Hive on Tez
Reliability | Full Stack HA
Point in time Recovery | Snapshots
Multi Data Center | Disaster Recovery
ZERO downtime | Rolling Upgrades
Security | Knox Gateway
HDP: Enterprise Hadoop 2.0 Distribution
Hortonworks Data Platform (HDP): Enterprise Hadoop
• The ONLY 100% open source and complete distribution
• Enterprise grade, proven and tested at scale
• Ecosystem endorsed to ensure interoperability
HDP components (from the stack diagram; asterisks as marked on the slide):
• Deployment options: OS/VM, Cloud, Appliance
• PLATFORM SERVICES: Enterprise Readiness (High Availability, Disaster Recovery, Rolling Upgrades, Security and Snapshots)
• HADOOP CORE: HDFS, YARN*, MAP REDUCE, TEZ*, OTHER
• DATA SERVICES: HIVE & HCATALOG, PIG, HBASE
• LOAD & EXTRACT: SQOOP, FLUME, NFS, WebHDFS, KNOX*
• OPERATIONAL SERVICES: OOZIE, AMBARI, FALCON*
Seamless Interoperability with Microsoft Tools
• Integrated with Microsoft
tools for native big data
analysis
» Bi-directional connectors for SQL
Server and SQL Azure through
SQOOP
» Excel ODBC integration through
Hive
• Addressing demand for
Hadoop on Windows
» Ideal for Windows customers with
Hadoop operational experience
• Enables most common
Hadoop workloads in the
Enterprise
» Data refinement and ETL offload
for high-volume data landing
» Data exploration for discovery of
new business opportunities
» Data enrichment for fine-tuned
delivery and recommendation
engines
APPLICATIONS: Microsoft Applications
DATA SYSTEMS: HORTONWORKS DATA PLATFORM For Windows
DATA SOURCES: MOBILE DATA; OLTP, POS SYSTEMS
Traditional Sources (RDBMS, OLTP, OLAP)
New Sources (web logs, email, sensor data, social media)
Transferring Our Hadoop Expertise to You
The expert source for
Apache Hadoop training & certification
• World class training programs designed to help you learn
fast
• Role-based hands-on classes with 50% lab time
• Certification to demonstrate Hadoop Expertise in
Development and Administration
• Expert consulting services
• Programs designed to transfer knowledge
• Industry leading Hadoop Sandbox
• Free download
• Fastest way to learn Apache Hadoop
• Personal, portable Hadoop environment
Hortonworks Summary
• Leading the Innovation in Core Hadoop
• Addressing the requirements for Enterprise usage
• Enabling interoperability of the ecosystem
• No lock-in. 100% Open Source.
• Best in industry support with flexible pricing model
• Find out more
» www.hortonworks.com/hadoop-training/
» www.hortonworks.com/sandbox
Big Data is Critical
Challenges to Using Big Data
Given that nearly one-third of businesses are in the dark about their
available data, it makes sense that silos are the primary hurdle in using this
information.
• Lack of sharing data is an obstacle to measuring marketing ROI: 51%
• Not using data effectively to personalize marketing communications: 45%
• Not able to link data together at the individual customer level: 42%
• Data collected infrequently or not quickly enough: 39%
• Too little or no customer/consumer data: 29%
What Initiatives Are Using Big Data
Obstacles to Define Big Data ROI
Not enough skilled resources for adaptation
• Advance competencies
Traditional IT Architectures cause limitations
• Identifying the right technologies
• Adapting to particular needs
• Assemble business use cases
• Silos
Optimizing Solutions
• Strong internal use cases
• Inability to effectively automate data
Keys to a Successful Big Data Initiative
Define the Impact
• Short-term vs. long-term measures
What cannot be answered today?
• This is your starting point
Create User Centric Internal Applications
• Decision support framework
Predicting the Consumer
• Algorithms, Models, Testing, and
More Testing!
Solution Architecture using Multiple Ecosystems
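[Diagram: incoming and outgoing data flows connect a Real Time In-Memory Solution, the EDW, and Hadoop (with Sandbox) to Models, Algorithms, and Simulations. Numbered callouts 1 to 9 are described in the list that follows.]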
1. Data feeds into a real-time in-memory solution that ingests data into the EDW, Hadoop, and other platforms such as mobile, APIs, etc.
2. ELT streaming into the in-memory solution provides visibility into real-time social, mobile, and shell approaches to algorithms, models, and simulations.
3. An in-memory real-time solution such as YARN or Storm digests data to the EDW, Hadoop, social media, and other such platforms.
4. EDW for structured information from the sources in 1.
5. Hadoop for semi-structured and unstructured data. The solution architecture includes Sandbox availability.
6. Shell UI interfaces use data from the real-time in-memory solution as well as the EDW, Hadoop, etc. for models, algorithms, and simulations.
7. Structured and unstructured reporting in reporting interfaces.
8. Deep-dive analytics in Hadoop and real-time streaming.
9. Real-time customer interaction for social and other similar platforms.
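The routing in steps 1, 4, and 5 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the presenters' implementation: the record shape, the `route_record` and `ingest` functions, the sink names, and the rule that "structured" means "matches a fixed schema" are all hypothetical.

```python
from collections import defaultdict

# Hypothetical sink names borrowed from the diagram labels; not a real API.
EDW = "edw"
HADOOP = "hadoop"

# Assumed fixed schema that marks a record as "structured" in this sketch.
STRUCTURED_FIELDS = {"customer_id", "amount", "timestamp"}


def route_record(record):
    """Return the sink for one incoming record.

    Dict records that carry exactly the assumed schema go to the EDW (step 4);
    everything else (free text, partial or extra fields) goes to Hadoop (step 5).
    """
    if isinstance(record, dict) and set(record) == STRUCTURED_FIELDS:
        return EDW
    return HADOOP


def ingest(feed):
    """Group an iterable of incoming records by destination sink (step 1)."""
    sinks = defaultdict(list)
    for record in feed:
        sinks[route_record(record)].append(record)
    return dict(sinks)
```

In a real deployment the in-memory layer would make this decision on a stream rather than on a list, but the split between structured and semi-structured or unstructured data would follow the same idea.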
Predictive Analysis Use Case for Online Travel Company
Flight Cost by Variants Determination
Data feeds use real-time in-memory streaming to execute matching algorithms.
These algorithms determine how often particular one-way and round-trip flights
are viewed by users within a session.
Predictive analytics algorithms determine how to increase or decrease prices based on
views, market pricing, time, and availability.
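As a rough illustration of how the four factors named above (views, market pricing, time, and availability) could feed a price adjustment, here is a small Python sketch. Everything in it is an assumption made for the example: the `FlightSignal` fields, the `adjust_price` function, the thresholds, and the weights are hypothetical and do not describe the actual algorithms used in this use case.

```python
from dataclasses import dataclass


@dataclass
class FlightSignal:
    """Hypothetical per-flight inputs, one per factor named on the slide."""
    session_views: int       # views of this flight across user sessions
    market_price: float      # current market price for a comparable flight
    days_to_departure: int   # time remaining before the flight
    seats_available: int     # remaining availability


def adjust_price(current_price, signal, max_step=0.10):
    """Return a new price moved up or down by at most max_step (a fraction).

    Thresholds and weights are illustrative assumptions only.
    """
    pressure = 0.0
    # Many views suggest demand; few views suggest the opposite.
    pressure += 0.04 if signal.session_views >= 100 else -0.04
    # Scarce seats push the price up; plentiful seats push it down.
    pressure += 0.04 if signal.seats_available <= 10 else -0.02
    # Assumption: buyers close to departure are less price-sensitive.
    if signal.days_to_departure <= 7:
        pressure += 0.02
    # Nudge the price toward the market price.
    pressure += 0.02 if current_price < signal.market_price else -0.02
    # Clamp the combined adjustment so a single update stays bounded.
    pressure = max(-max_step, min(max_step, pressure))
    return round(current_price * (1.0 + pressure), 2)
```

A heavily viewed, nearly sold-out flight a few days before departure would be priced up, while a rarely viewed flight with many open seats would be priced down. A production system would replace these hand-set weights with a model fitted and tested on historical data.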
[Diagram: incoming sources (http, logs, partners, custom) pass through the Real Time In-Memory Solution (Storm) to outgoing destinations (rdbms, hadoop, application, mobile).]
Solution Architecture using YARN
• Created to manage resource needs across all uses
• Ensures predictable performance & QoS for all apps
• Enables apps to run “IN” Hadoop rather than “ON”
» Key to leveraging all other common services of the Hadoop platform:
security, data lifecycle management, etc.
Applications Run Natively IN Hadoop
• BATCH (MapReduce)
• INTERACTIVE (Tez)
• STREAMING (Storm, S4,…)
• GRAPH (Giraph)
• IN-MEMORY (Spark)
• HPC MPI (OpenMPI)
• ONLINE (HBase)
• OTHER (Search) (Weave…)
YARN (Cluster Resource Management)
HDFS2 (Redundant, Reliable Storage)
Pactera Big Data Capability
Big Data Solution Architecture
• In-Memory Solutions
• Scalable Distributed Platforms
Next Generation Analytics
• Models, Algorithms, and Simulations
• Visualization
Improving Operational Ability
• Help companies drive more operational efficiencies from existing investments.
• Moving from the realm of data scientists into everyday business transactions and encounters.
New Business Processes
• Impact on both customer intelligence and operational efficiency by making everything immediately actionable.
• Armed with immediate decision-making capability and intelligence, companies will be able to implement new business processes that will change how business is done.
• We ask the Right Questions
How Pactera can help with Big Data
Executive Workshop
Strategies, Planning, and Expectations
• Big Data strategy on what tomorrow will look like
• Using Big Data to establish market dominance
• Big Data project takeaways
• Roadblocks to implementing Big Data analytics
• Defining an ROI for Big Data
• Getting the right ROI on Big Data
Workshop (4 Hours)
Proof of Concept (2-4 Weeks)
Projects:
• Benchmark & Monitoring
• Integrations & Migrations
• Implementation & Architecture
• Project Management
• Analytics
• Reporting
Technical Workshop
End-To-End Management
• System tuning/auto-tuning and configuration management
• Dealing with both structured and unstructured data
• Monitoring, diagnosis, and automated behavior detection
Solution Architecture
• Processor, memory, and system architectures for data analysis
• Benchmarks, metrics, and workload characterization for big
data
• Availability, fault tolerance and recovery issues
• Data management and analytics for vast amounts of
unstructured data
Thank You
Tom Kersnick
tom.kersnick@pactera.com
Robby Richardson
rrichardson@hortonworks.com

Contenu connexe

Tendances

Hortonworks Presentation at Big Data London
Hortonworks Presentation at Big Data LondonHortonworks Presentation at Big Data London
Hortonworks Presentation at Big Data London
Hortonworks
 

Tendances (20)

Big Data Analytics - Is Your Elephant Enterprise Ready?
Big Data Analytics - Is Your Elephant Enterprise Ready?Big Data Analytics - Is Your Elephant Enterprise Ready?
Big Data Analytics - Is Your Elephant Enterprise Ready?
 
Discover hdp 2.2: Data storage innovations in Hadoop Distributed Filesystem (...
Discover hdp 2.2: Data storage innovations in Hadoop Distributed Filesystem (...Discover hdp 2.2: Data storage innovations in Hadoop Distributed Filesystem (...
Discover hdp 2.2: Data storage innovations in Hadoop Distributed Filesystem (...
 
Democratizing Big Data with Microsoft Azure HDInsight
Democratizing Big Data with Microsoft Azure HDInsightDemocratizing Big Data with Microsoft Azure HDInsight
Democratizing Big Data with Microsoft Azure HDInsight
 
Hortonworks sqrrl webinar v5.pptx
Hortonworks sqrrl webinar v5.pptxHortonworks sqrrl webinar v5.pptx
Hortonworks sqrrl webinar v5.pptx
 
Getting to What Matters: Accelerating Your Path Through the Big Data Lifecycl...
Getting to What Matters: Accelerating Your Path Through the Big Data Lifecycl...Getting to What Matters: Accelerating Your Path Through the Big Data Lifecycl...
Getting to What Matters: Accelerating Your Path Through the Big Data Lifecycl...
 
Discover.hdp2.2.storm and kafka.final
Discover.hdp2.2.storm and kafka.finalDiscover.hdp2.2.storm and kafka.final
Discover.hdp2.2.storm and kafka.final
 
Yahoo! Hack Europe
Yahoo! Hack EuropeYahoo! Hack Europe
Yahoo! Hack Europe
 
Hortonworks and Voltage Security webinar
Hortonworks and Voltage Security webinarHortonworks and Voltage Security webinar
Hortonworks and Voltage Security webinar
 
Simplify and Secure your Hadoop Environment with Hortonworks and Centrify
Simplify and Secure your Hadoop Environment with Hortonworks and CentrifySimplify and Secure your Hadoop Environment with Hortonworks and Centrify
Simplify and Secure your Hadoop Environment with Hortonworks and Centrify
 
YARN Ready: Integrating to YARN with Tez
YARN Ready: Integrating to YARN with Tez YARN Ready: Integrating to YARN with Tez
YARN Ready: Integrating to YARN with Tez
 
Discover HDP 2.2: Comprehensive Hadoop Security with Apache Ranger and Apache...
Discover HDP 2.2: Comprehensive Hadoop Security with Apache Ranger and Apache...Discover HDP 2.2: Comprehensive Hadoop Security with Apache Ranger and Apache...
Discover HDP 2.2: Comprehensive Hadoop Security with Apache Ranger and Apache...
 
Bridging the Big Data Gap in the Software-Driven World
Bridging the Big Data Gap in the Software-Driven WorldBridging the Big Data Gap in the Software-Driven World
Bridging the Big Data Gap in the Software-Driven World
 
Hortonworks Presentation at Big Data London
Hortonworks Presentation at Big Data LondonHortonworks Presentation at Big Data London
Hortonworks Presentation at Big Data London
 
Webinar turbo charging_data_science_hawq_on_hdp_final
Webinar turbo charging_data_science_hawq_on_hdp_finalWebinar turbo charging_data_science_hawq_on_hdp_final
Webinar turbo charging_data_science_hawq_on_hdp_final
 
Discover Enterprise Security Features in Hortonworks Data Platform 2.1: Apach...
Discover Enterprise Security Features in Hortonworks Data Platform 2.1: Apach...Discover Enterprise Security Features in Hortonworks Data Platform 2.1: Apach...
Discover Enterprise Security Features in Hortonworks Data Platform 2.1: Apach...
 
Discover HDP 2.1: Apache Falcon for Data Governance in Hadoop
Discover HDP 2.1: Apache Falcon for Data Governance in HadoopDiscover HDP 2.1: Apache Falcon for Data Governance in Hadoop
Discover HDP 2.1: Apache Falcon for Data Governance in Hadoop
 
Hortonworks Data Platform for Systems Integrators Webinar 9-5-2012.pptx
Hortonworks Data Platform for Systems Integrators Webinar 9-5-2012.pptxHortonworks Data Platform for Systems Integrators Webinar 9-5-2012.pptx
Hortonworks Data Platform for Systems Integrators Webinar 9-5-2012.pptx
 
Don't Let Security Be The 'Elephant in the Room'
Don't Let Security Be The 'Elephant in the Room'Don't Let Security Be The 'Elephant in the Room'
Don't Let Security Be The 'Elephant in the Room'
 
Discover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.next
Discover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.nextDiscover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.next
Discover HDP 2.2: Even Faster SQL Queries with Apache Hive and Stinger.next
 
Accelerate Big Data Application Development with Cascading and HDP, Hortonwor...
Accelerate Big Data Application Development with Cascading and HDP, Hortonwor...Accelerate Big Data Application Development with Cascading and HDP, Hortonwor...
Accelerate Big Data Application Development with Cascading and HDP, Hortonwor...
 

En vedette

How Executives Work with Private Equity
How Executives Work  with Private EquityHow Executives Work  with Private Equity
How Executives Work with Private Equity
John A. Bova
 
Pactera Big Data Solutions for Retail
Pactera Big Data Solutions for Retail Pactera Big Data Solutions for Retail
Pactera Big Data Solutions for Retail
Pactera_US
 

En vedette (18)

Actividad virtual 2
Actividad virtual 2Actividad virtual 2
Actividad virtual 2
 
Eval q1
Eval q1 Eval q1
Eval q1
 
Creative education summit
Creative education summitCreative education summit
Creative education summit
 
EPA's Environmental Justice Strategic Plan 2016-2020
EPA's Environmental Justice Strategic Plan 2016-2020EPA's Environmental Justice Strategic Plan 2016-2020
EPA's Environmental Justice Strategic Plan 2016-2020
 
universality
universalityuniversality
universality
 
Paisjet dalese te kompjuterit
Paisjet dalese te kompjuteritPaisjet dalese te kompjuterit
Paisjet dalese te kompjuterit
 
#Glocal16 Che aria tira sui dati ambientali in Italia
#Glocal16 Che aria tira sui dati ambientali in Italia #Glocal16 Che aria tira sui dati ambientali in Italia
#Glocal16 Che aria tira sui dati ambientali in Italia
 
Birdwatching: an introduction
Birdwatching: an introduction Birdwatching: an introduction
Birdwatching: an introduction
 
Evaluation Activity 6
Evaluation Activity  6Evaluation Activity  6
Evaluation Activity 6
 
How Executives Work with Private Equity
How Executives Work  with Private EquityHow Executives Work  with Private Equity
How Executives Work with Private Equity
 
Bases de datos scopus
Bases de datos scopusBases de datos scopus
Bases de datos scopus
 
Pactera Big Data Solutions for Retail
Pactera Big Data Solutions for Retail Pactera Big Data Solutions for Retail
Pactera Big Data Solutions for Retail
 
[Pem zhipeng Xie] reading notes《自控力》读书笔记
[Pem zhipeng Xie] reading notes《自控力》读书笔记[Pem zhipeng Xie] reading notes《自控力》读书笔记
[Pem zhipeng Xie] reading notes《自控力》读书笔记
 
Ghhh
GhhhGhhh
Ghhh
 
Big Data with KNIME is as easy as 1, 2, 3, ...4!
Big Data with KNIME is as easy as 1, 2, 3, ...4!Big Data with KNIME is as easy as 1, 2, 3, ...4!
Big Data with KNIME is as easy as 1, 2, 3, ...4!
 
Asus eee-pc-1101 ha-netbook-manual
Asus eee-pc-1101 ha-netbook-manualAsus eee-pc-1101 ha-netbook-manual
Asus eee-pc-1101 ha-netbook-manual
 
Ap ps tecnologicas
Ap ps tecnologicasAp ps tecnologicas
Ap ps tecnologicas
 
Having Fun Building Web Applications (Day 1 Slides)
Having Fun Building Web Applications (Day 1 Slides)Having Fun Building Web Applications (Day 1 Slides)
Having Fun Building Web Applications (Day 1 Slides)
 

Similaire à Transform Your Business with Big Data and Hortonworks

Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...
Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...
Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...
Innovative Management Services
 
Create a Smarter Data Lake with HP Haven and Apache Hadoop
Create a Smarter Data Lake with HP Haven and Apache HadoopCreate a Smarter Data Lake with HP Haven and Apache Hadoop
Create a Smarter Data Lake with HP Haven and Apache Hadoop
Hortonworks
 
Cw13 big data and apache hadoop by amr awadallah-cloudera
Cw13 big data and apache hadoop by amr awadallah-clouderaCw13 big data and apache hadoop by amr awadallah-cloudera
Cw13 big data and apache hadoop by amr awadallah-cloudera
inevitablecloud
 

Similaire à Transform Your Business with Big Data and Hortonworks (20)

A Comprehensive Approach to Building your Big Data - with Cisco, Hortonworks ...
A Comprehensive Approach to Building your Big Data - with Cisco, Hortonworks ...A Comprehensive Approach to Building your Big Data - with Cisco, Hortonworks ...
A Comprehensive Approach to Building your Big Data - with Cisco, Hortonworks ...
 
Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...
Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...
Open-BDA Hadoop Summit 2014 - Mr. Slim Baltagi (Building a Modern Data Archit...
 
Create a Smarter Data Lake with HP Haven and Apache Hadoop
Create a Smarter Data Lake with HP Haven and Apache HadoopCreate a Smarter Data Lake with HP Haven and Apache Hadoop
Create a Smarter Data Lake with HP Haven and Apache Hadoop
 
Hortonworks - What's Possible with a Modern Data Architecture?
Hortonworks - What's Possible with a Modern Data Architecture?Hortonworks - What's Possible with a Modern Data Architecture?
Hortonworks - What's Possible with a Modern Data Architecture?
 
Supporting Financial Services with a More Flexible Approach to Big Data
Supporting Financial Services with a More Flexible Approach to Big DataSupporting Financial Services with a More Flexible Approach to Big Data
Supporting Financial Services with a More Flexible Approach to Big Data
 
Discover.hdp2.2.ambari.final[1]
Discover.hdp2.2.ambari.final[1]Discover.hdp2.2.ambari.final[1]
Discover.hdp2.2.ambari.final[1]
 
Rescue your Big Data from Downtime with HP Operations Bridge and Apache Hadoop
Rescue your Big Data from Downtime with HP Operations Bridge and Apache HadoopRescue your Big Data from Downtime with HP Operations Bridge and Apache Hadoop
Rescue your Big Data from Downtime with HP Operations Bridge and Apache Hadoop
 
Introduction to Hadoop
Introduction to HadoopIntroduction to Hadoop
Introduction to Hadoop
 
Hitachi Data Systems Hadoop Solution
Hitachi Data Systems Hadoop SolutionHitachi Data Systems Hadoop Solution
Hitachi Data Systems Hadoop Solution
 
Hadoop Reporting and Analysis - Jaspersoft
Hadoop Reporting and Analysis - JaspersoftHadoop Reporting and Analysis - Jaspersoft
Hadoop Reporting and Analysis - Jaspersoft
 
Supporting Financial Services with a More Flexible Approach to Big Data
Supporting Financial Services with a More Flexible Approach to Big DataSupporting Financial Services with a More Flexible Approach to Big Data
Supporting Financial Services with a More Flexible Approach to Big Data
 
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
C-BAG Big Data Meetup Chennai Oct.29-2014 Hortonworks and Concurrent on Casca...
 
OOP 2014
OOP 2014OOP 2014
OOP 2014
 
Why Hadoop as a Service?
Why Hadoop as a Service?Why Hadoop as a Service?
Why Hadoop as a Service?
 
Building a Modern Data Architecture with Enterprise Hadoop
Building a Modern Data Architecture with Enterprise HadoopBuilding a Modern Data Architecture with Enterprise Hadoop
Building a Modern Data Architecture with Enterprise Hadoop
 
Intro to Big Data and Apache Hadoop by Dr. Amr Awadallah at CLOUD WEEKEND '13...
Intro to Big Data and Apache Hadoop by Dr. Amr Awadallah at CLOUD WEEKEND '13...Intro to Big Data and Apache Hadoop by Dr. Amr Awadallah at CLOUD WEEKEND '13...
Intro to Big Data and Apache Hadoop by Dr. Amr Awadallah at CLOUD WEEKEND '13...
 
Cw13 big data and apache hadoop by amr awadallah-cloudera
Cw13 big data and apache hadoop by amr awadallah-clouderaCw13 big data and apache hadoop by amr awadallah-cloudera
Cw13 big data and apache hadoop by amr awadallah-cloudera
 
Hortonworks and Red Hat Webinar_Sept.3rd_Part 1
Hortonworks and Red Hat Webinar_Sept.3rd_Part 1Hortonworks and Red Hat Webinar_Sept.3rd_Part 1
Hortonworks and Red Hat Webinar_Sept.3rd_Part 1
 
Combine Apache Hadoop and Elasticsearch to Get the Most of Your Big Data
Combine Apache Hadoop and Elasticsearch to Get the Most of Your Big DataCombine Apache Hadoop and Elasticsearch to Get the Most of Your Big Data
Combine Apache Hadoop and Elasticsearch to Get the Most of Your Big Data
 
Apresentação Hadoop
Apresentação HadoopApresentação Hadoop
Apresentação Hadoop
 

Plus de Pactera_US

Unlock Big Data's Potential in Financial Services with Hortonworks
Unlock Big Data's Potential in Financial Services with Hortonworks Unlock Big Data's Potential in Financial Services with Hortonworks
Unlock Big Data's Potential in Financial Services with Hortonworks
Pactera_US
 
Using Visualization to Succeed with Big Data
Using Visualization to Succeed with Big Data Using Visualization to Succeed with Big Data
Using Visualization to Succeed with Big Data
Pactera_US
 
Predicting Customer Behavior With Big Data
Predicting Customer Behavior With Big Data Predicting Customer Behavior With Big Data
Predicting Customer Behavior With Big Data
Pactera_US
 
How do you monitor your Basel III compliance?
How do you monitor your Basel III compliance? How do you monitor your Basel III compliance?
How do you monitor your Basel III compliance?
Pactera_US
 

Plus de Pactera_US (10)

How to Achieve Measurable Benefits Through Project and Organizational Change
How to Achieve Measurable Benefits Through Project and Organizational ChangeHow to Achieve Measurable Benefits Through Project and Organizational Change
How to Achieve Measurable Benefits Through Project and Organizational Change
 
Unlock Big Data's Potential in Financial Services with Hortonworks
Unlock Big Data's Potential in Financial Services with Hortonworks Unlock Big Data's Potential in Financial Services with Hortonworks
Unlock Big Data's Potential in Financial Services with Hortonworks
 
Using Visualization to Succeed with Big Data
Using Visualization to Succeed with Big Data Using Visualization to Succeed with Big Data
Using Visualization to Succeed with Big Data
 
Predicting Customer Behavior With Big Data
Predicting Customer Behavior With Big Data Predicting Customer Behavior With Big Data
Predicting Customer Behavior With Big Data
 
Big Data - How to Get Started
Big Data - How to Get Started Big Data - How to Get Started
Big Data - How to Get Started
 
Siebel to Salesforce
Siebel to Salesforce Siebel to Salesforce
Siebel to Salesforce
 
Big Data Webinar
Big Data WebinarBig Data Webinar
Big Data Webinar
 
Business Process Management - Enabling The Business Drivers
Business Process Management - Enabling The Business DriversBusiness Process Management - Enabling The Business Drivers
Business Process Management - Enabling The Business Drivers
 
China IT Outsourcing
China IT Outsourcing China IT Outsourcing
China IT Outsourcing
 
How do you monitor your Basel III compliance?
How do you monitor your Basel III compliance? How do you monitor your Basel III compliance?
How do you monitor your Basel III compliance?
 

Dernier

Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
vu2urc
 

Dernier (20)

Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 

Transform Your Business with Big Data and Hortonworks

  • 1. CONSULTING SOLUTIONS OUTSOURCING PARTNER FOR A NEW ERA Transform Your Business with Big Data and Hortonworks Tom Kersnick – Pactera – Director Big Data Solutions Robby Richardson – Hortonworks – Enterprise Account Manager
  • 2. Topics © Pactera. Confidential. All Rights Reserved. 2 Who is Hortonworks? 3 Hortonworks HDP: Enterprise Hadoop Distribution 4 5 Pactera Intro 6 Big Data Deep Dive Hadoop 2.0: The Enterprise Generation 1 Hortonworks Intro 2
  • 3. Hortonworks Snapshot • We distribute the only 100% Open Source Enterprise Hadoop Distribution: Hortonworks Data Platform • We engineer, test & certify HDP for enterprise usage • We employ the core architects, builders and operators of Apache Hadoop • We drive innovation within Apache Software Foundation projects • We are uniquely positioned to deliver the highest quality of Hadoop support • We enable the ecosystem to work better with Hadoop Develop Distribute Support We develop, distribute and support the ONLY 100% open source Enterprise Hadoop distribution Endorsed by Strategic Partners Headquarters: Palo Alto, CA Employees: 200+ and growing Investors: Benchmark, Index, Yahoo 3© Pactera. Confidential. All Rights Reserved. 3
  • 4. Rapid Customer Growth
  • 5. Hortonworks HDP: Enterprise Hadoop 1.x Distribution OS Cloud VM Appliance PLATFORM SERVICES HADOOP CORE Enterprise Readiness High Availability, Disaster Recovery, Security and Snapshots HORTONWORKS DATA PLATFORM (HDP) OPERATIONAL SERVICES DATA SERVICES HIVE (HCATALOG) PIG HBASE OOZIE AMBARI HDFS MAP REDUCE Hortonworks Data Platform (HDP) Enterprise Hadoop • The ONLY 100% open source and complete distribution • Enterprise grade, proven and tested at scale • Ecosystem endorsed to ensure interoperability SQOOP FLUME NFS LOAD & EXTRACT WebHDFS
  • 6. Hadoop 2.0… The Enterprise Generation Business Value Big Data Transactions, Interactions, Observations Single Platform Multiple Use BATCH INTERACTIVE ONLINE 1.0 Architected for the Large Web Properties 2.0 Architected for the Broad Enterprise Enterprise Requirements and the matching Hadoop 2.0 Features: Mixed workloads: YARN; Interactive Query: Hive on Tez; Reliability: Full Stack HA; Point in time Recovery: Snapshots; Multi Data Center: Disaster Recovery; ZERO downtime: Rolling Upgrades; Security: Knox Gateway
  • 7. HDP: Enterprise Hadoop 2.0 Distribution OS/VM Cloud Appliance PLATFORM SERVICES HADOOP CORE Enterprise Readiness High Availability, Disaster Recovery, Rolling Upgrades, Security and Snapshots HORTONWORKS DATA PLATFORM (HDP) OPERATIONAL SERVICES DATA SERVICES HIVE & HCATALOG PIG HBASE HDFS MAP REDUCE Hortonworks Data Platform (HDP) Enterprise Hadoop • The ONLY 100% open source and complete distribution • Enterprise grade, proven and tested at scale • Ecosystem endorsed to ensure interoperability SQOOP FLUME NFS LOAD & EXTRACT WebHDFS KNOX* OOZIE AMBARI FALCON* YARN* TEZ* OTHER
  • 8. Seamless Interoperability with Microsoft Tools • Integrated with Microsoft tools for native big data analysis » Bi-directional connectors for SQL Server and SQL Azure through SQOOP » Excel ODBC integration through Hive • Addressing demand for Hadoop on Windows » Ideal for Windows customers with Hadoop operational experience • Enables most common Hadoop workloads in the Enterprise » Data refinement and ETL offload for high-volume data landing » Data exploration for discovery of new business opportunities » Data enrichment for fine-tuned delivery and recommendation engines APPLICATIONS DATA SYSTEMS Microsoft Applications HORTONWORKS DATA PLATFORM For Windows DATA SOURCES MOBILE DATA OLTP, POS SYSTEMS Traditional Sources (RDBMS, OLTP, OLAP) New Sources (web logs, email, sensor data, social media)
  • 9. Transferring Our Hadoop Expertise to You The expert source for Apache Hadoop training & certification • World class training programs designed to help you learn fast • Role-based hands-on classes with 50% lab time • Certification to demonstrate Hadoop Expertise in Development and Administration • Expert consulting services • Programs designed to transfer knowledge • Industry leading Hadoop Sandbox • Free download • Fastest way to learn Apache Hadoop • Personal, portable Hadoop environment
  • 10. Hortonworks Summary • Leading the Innovation in Core Hadoop • Addressing the requirements for Enterprise usage • Enabling interoperability of the ecosystem • No lock-in. 100% Open Source. • Best in industry support with flexible pricing model • Find out more: hortonworks.com » www.hortonworks.com/hadoop-training/ » www.hortonworks.com/sandbox
  • 11. Big Data is Critical Challenges to Using Big Data Given that nearly one-third of businesses are in the dark about their available data, it makes sense that silos are the primary hurdle in using this information. • Lack of sharing data is an obstacle to measuring marketing ROI: 51% • Not using data effectively to personalize marketing communications: 45% • Not able to link data together at the individual customer level: 42% • Data collected infrequently or not quickly enough: 39% • Too little or no customer/consumer data: 29%
  • 12. What Initiatives Are Using Big Data
  • 13. Obstacles to Defining Big Data ROI Not enough skilled resources for adaptation • Advanced competencies Traditional IT Architectures cause limitations • Identifying the right technologies • Adapting to particular needs • Assembling business use cases • Silos Optimizing Solutions • Strong internal use cases • Inability to effectively automate data
  • 14. Keys to a Successful Big Data Initiative Define the Impact • Short term vs. long term measures What cannot be answered today? • This is your starting point Create User Centric Internal Applications • Decision support framework Predicting the Consumer • Algorithms, Models, Testing, and More Testing!
  • 15. Solution Architecture using Multiple Ecosystems Diagram labels: incoming, outgoing, Real Time In-Memory Solution, EDW, Hadoop, Sandbox, Models, Algorithms, Simulations (numbered 1 to 9 as follows). 1. Data feeds into a real-time in-memory solution that ingests data into the EDW, Hadoop, and other platforms such as mobile, APIs, etc. 2. ELT streaming into the in-memory solution to provide visibility into real-time social, mobile, and shell approaches to Algorithms, Models, and Simulations. 3. In-memory real-time solution such as YARN or Storm to digest data to the EDW, Hadoop, social media, and other such platforms. 4. EDW for structured information from the sources in 1. 5. Hadoop for semi-structured and unstructured data. Solution architecture including Sandbox availability. 6. Shell UI interfaces utilizing data from the real-time in-memory solution as well as the EDW, Hadoop, etc. for Models, Algorithms, and Simulations. 7. Structured and unstructured reporting in reporting interfaces. 8. Deep-dive analytics in Hadoop and real-time streaming. 9. Real-time customer interaction for social and other similar platforms.
  • 16. Predictive Analysis Use Case for Online Travel Company
  • 17. Flight Cost by Variants Determination Data feeds utilize real-time in-memory streaming to execute matching algorithms. These determine which one-way and round-trip flights users view within a session. Predictive analytics algorithms determine how to increase/decrease prices based on views, market pricing, time, and availability. Diagram labels: incoming (http logs, partners, custom), Real Time In-Memory Solution (Storm), outgoing destinations (rdbms, hadoop, application, mobile)
  • 18. Solution Architecture using YARN • Created to manage resource needs across all uses • Ensures predictable performance & QoS for all apps • Enables apps to run “IN” Hadoop rather than “ON” » Key to leveraging all other common services of the Hadoop platform: security, data lifecycle management, etc. Applications Run Natively IN Hadoop HDFS2 (Redundant, Reliable Storage) YARN (Cluster Resource Management) BATCH (MapReduce) INTERACTIVE (Tez) STREAMING (Storm, S4,…) GRAPH (Giraph) IN-MEMORY (Spark) HPC MPI (OpenMPI) ONLINE (HBase) OTHER (Search) (Weave…)
  • 19. Pactera Big Data Capability Big Data Solution Architecture • In-Memory Solutions • Scalable Distributed Platforms Next Generation Analytics • Models, Algorithms, and Simulations • Visualization Improving Operational Ability • Help companies drive more operational efficiencies from existing investments. • Moving from the realm of data scientists into everyday business transactions and encounters. New Business Processes • Impact on both customer intelligence and operational efficiency by making everything immediately actionable. • Armed with immediate decision-making capability and intelligence, companies will be able to implement new business processes that will change how business is done. • We ask the Right Questions
  • 20. How Pactera can help with Big Data Executive Workshop Strategies, Planning, and Expectations • Big Data strategy on what tomorrow will look like • Using Big Data to establish market dominance • Big Data project takeaways • Roadblocks to implementing Big Data analytics • Defining an ROI for Big Data • Getting the right ROI on Big Data Workshop (4 Hours) Proof of Concept (2-4 Weeks) Projects: • Benchmark & Monitoring • Integrations & Migrations • Implementation & Architecture • Project Management • Analytics • Reporting Technical Workshop End-To-End Management • System tuning/auto-tuning and configuration management • Dealing with both structured and unstructured data • Monitoring, diagnosis, and automated behavior detection Solution Architecture • Processor, memory, and system architectures for data analysis • Benchmarks, metrics, and workload characterization for big data • Availability, fault tolerance and recovery issues • Data management and analytics for vast amounts of unstructured data
  • 21. Thank You Tom Kersnick tom.kersnick@pactera.com Robby Richardson rrichardson@hortonworks.com
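
Slide 15 sends structured information to the EDW and semi-structured or unstructured data to Hadoop. A minimal sketch of that routing decision follows. The field names in `REQUIRED_FIELDS`, the destination labels, and the function `route_record` are illustrative assumptions, not part of the deck or of any Hortonworks API.

```python
import json

# Hypothetical schema: a record counts as "structured" only if it is a
# JSON object carrying every one of these fields.
REQUIRED_FIELDS = {"customer_id", "timestamp", "amount"}


def route_record(raw: str) -> str:
    """Return an illustrative destination label for one incoming record."""
    try:
        record = json.loads(raw)
    except json.JSONDecodeError:
        # Free text such as web logs is treated as unstructured.
        return "hadoop"
    if isinstance(record, dict) and REQUIRED_FIELDS.issubset(record):
        # Fully structured rows go to the enterprise data warehouse.
        return "edw"
    # Valid JSON that does not match the schema is semi-structured.
    return "hadoop"
```

In a real deployment the schema check would come from the warehouse's own table definitions; the point here is only that the split is decided per record at ingest time.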

Editor's notes

  1. Big Data is critical for organizations just to keep up with the masses. In most retail organizations, internal data is very hard to interpret when trying to understand both the customer and demand. Publications state that 1/3 of retailers are in the dark regarding data that could be available to them. The silo approach within organizations is the primary cause of the broken data pipeline. The primary reasons this is a hurdle are:
     * The lack of sharing data: definitely a major obstacle in measuring ROI
     * Misuse of available data in marketing communications: not able to personalize directly to your customer
     * Linking data at the customer level: this is needed to thoroughly understand user behavior
     * Infrequent data collection: only extracting from logs and online serving systems used within your traditional reporting ecosystem
     * Not enough customer data: not capturing the details of the customer (including when a product was viewed, key indicators of why a user looks at one product versus another, and so on)
  2. Flight Cost Variant Determination. Flight Cost is one of the algorithmic methods being used to increase/decrease revenue based on page views, consumer marketing, and the time a consumer spends on a particular one-way or round-trip flight. The goal is not only to provide alternatives but also to increase or decrease the cost while other consumers are viewing the same flights. This is determined by sales from all related airlines and competitors during the flight's availability. This method can be extended to use other sources as well. Destinations: web applications, mobile applications, Hadoop, RDBMS. In the solution architecture shown, the in-memory solution processes views, marketing, customer behavior, time, and competitor results to derive an increased or decreased price for a given one-way or round-trip flight. This allows the travel company to determine the proper pricing based on these measures within an algorithm. The architecture also allows the travel company to try out other predictive models at any point in time to see whether one model outperforms another. These models could use similar measures and outcomes as well as new measures derived from their predictive models. Overall, this is a win for the travel company, which never loses revenue from the original 'bread and butter' model it always applies. Fascinating, right? As you can see in the outgoing destinations, this provides consistent results on all platforms, allowing a finite understanding of how the travel company is generating results overall. The solution can provide endless results based on predictive models that can be applied in real time. Any day, any time, any millisecond.
  3. Pactera offers complete life-cycle solutions within your organization. We offer a free 4-hour executive and technical workshop within your organization. We just ask you to fill out a 1-page questionnaire to help us understand your expectations. The executive workshop covers strategy, planning, and your current and future goals. The technical workshop is a deep dive involving end-to-end management and a proper solution architecture based on your current and upcoming goals. Once the workshops are complete, we will provide you with an assessment of the outcome. We also offer a 2-4 week proof of concept to ensure your project is put into action. And finally, we offer full life-cycle services in the following:
     * Benchmark & Monitoring
     * Integrations & Migrations
     * Implementation & Architecture
     * Project Management
     * Analytics
     * Reporting
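
The flight cost note above describes raising or lowering a fare from views, market pricing, time, and availability, but gives no formula. The sketch below is one hypothetical way such a rule could look. The class `FlightSignals`, the function `adjust_price`, the weights, the 100-view and 72-hour scales, and the 15% cap are all assumptions made for illustration and are not the travel company's model.

```python
from dataclasses import dataclass


@dataclass
class FlightSignals:
    base_price: float          # current listed fare (must be > 0)
    session_views: int         # views of this flight across active sessions
    market_price: float        # representative competitor fare
    hours_to_departure: float  # time remaining before departure
    seats_left: int            # unsold seats
    seats_total: int           # total seats (must be > 0)


def adjust_price(s: FlightSignals, max_change: float = 0.15) -> float:
    """Return an adjusted fare; every weight here is an illustrative guess."""
    # Demand: more concurrent views push the price up, saturating at 100 views.
    demand = min(s.session_views / 100.0, 1.0)
    # Scarcity: the fewer seats remain, the higher the pressure.
    scarcity = 1.0 - s.seats_left / s.seats_total
    # Urgency: rises linearly from 0 to 1 over the last 72 hours.
    urgency = max(0.0, 1.0 - s.hours_to_departure / 72.0)
    # Market gap: positive when competitors charge more than we do.
    gap = (s.market_price - s.base_price) / s.base_price
    gap = max(-1.0, min(gap, 1.0))

    score = 0.4 * demand + 0.3 * scarcity + 0.1 * urgency + 0.2 * gap
    # A score of 0.25 is treated as neutral; the change is capped both ways.
    change = (score - 0.25) * 2.0 * max_change
    change = max(-max_change, min(change, max_change))
    return round(s.base_price * (1.0 + change), 2)
```

A heavily viewed, nearly sold-out flight close to departure hits the upper cap, while an unviewed flight far from departure that is priced above the market is discounted. Trying an alternative predictive model, as the note suggests, would amount to swapping in a different `adjust_price` and comparing outcomes.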