This document discusses how Syncsort helps companies access and integrate data from many sources to power analytics. It gives examples of how Syncsort has helped companies in insurance, media, and hospitality onboard and integrate both historical and streaming data from sources such as mainframes, databases, and IoT devices, delivering faster insights, increased productivity, cost savings, and future-proofed applications.
Use Cases from Batch to Streaming, MapReduce to Spark, Mainframe to Cloud: Today's ETL Does it All!
1. Powering the Connected Data Platform With ETL Onboarding
@Scott_Gnau
CTO, Hortonworks
@TenduYogurtcu
Big Data GM, Syncsort
2. Global Leader in Big Iron to Big Data Solutions
• Provider of enterprise software and leader in Big Iron to Big Data solutions in more than 85 countries around the world
• Global presence in 87% of enterprise Fortune 500 companies
• High-performance & scalable software harnessing valuable data assets to power business and operational analytics, while dramatically reducing the cost of mainframe and legacy systems
• Unique focus on customer value through cost-effective solutions and unparalleled support; trusted leader for nearly 50 years
Offices: Woodcliff Lake, NJ · Japan · Singapore
Global customer base of leaders and emerging businesses across all major industries
Strategic partnerships in Big Iron and Big Data ecosystems
3. Meet Today’s Presenters
Scott Gnau
CTO, Hortonworks
Tendu Yogurtcu, PhD
GM, Big Data, Syncsort
10. Our Strategy: Simplify Big Data Integration
• Deploy on premise or in the cloud
• Choose among multiple execution frameworks – Hadoop, Spark, Linux, Unix, Windows
• Integrate streaming and batch data with a single data pipeline for innovative applications, like IoT (see the sketch after this list)
• Future-proof applications to avoid re-writing jobs in order to take advantage of innovations in new execution frameworks
• Access and integrate ALL enterprise data sources – including mainframe – for advanced analytics
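The deck itself contains no code, so here is a minimal sketch of the single-pipeline idea, assuming PySpark as the engine: one transformation function serves both a historical batch load and a live Kafka feed. The broker address, topic name, and paths are hypothetical placeholders; DMX-h itself is a visual tool, not hand-written Spark.

# A minimal sketch of one pipeline serving batch and streaming (PySpark).
# "events", the broker address, and all paths are hypothetical placeholders.
from pyspark.sql import SparkSession, DataFrame
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("unified-pipeline").getOrCreate()

def transform(df: DataFrame) -> DataFrame:
    # The same business logic, applied to data at rest and data in motion.
    return (df.withColumn("value", F.col("value").cast("string"))
              .filter(F.col("value").isNotNull()))

# Batch: historical records at rest.
history = transform(spark.read.text("/data/history"))
history.write.mode("overwrite").parquet("/data/curated")

# Streaming: the identical transform over a live Kafka topic.
live = transform(
    spark.readStream.format("kafka")
         .option("kafka.bootstrap.servers", "broker:9092")
         .option("subscribe", "events")
         .load())
(live.writeStream.format("parquet")
     .option("path", "/data/curated-live")
     .option("checkpointLocation", "/chk/unified")
     .start())

Because both paths call the same transform(), a job written once does not have to be redeveloped when the execution framework underneath it changes – which is the future-proofing point above.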
11. Three Commitments Underpin Our Big Data Integration Strategy
1. Ongoing Contributions to the Open Source Community
JIRA: MAPREDUCE-2454, MAPREDUCE-4807, MAPREDUCE-4049, MAPREDUCE-5455, HIVE-8347, SQOOP-1272, PARQUET-134, Spark-packages, and more!
2. Leverage Syncsort Technology Innovations & Mainframe Heritage
Light footprint; self-tuning engine; single install with no 3rd-party dependencies; world-class data processing and mainframe expertise
3. Strong Partnerships with Strategic Big Data & Hadoop Players
13. Insurance: Easy Access to ALL Data for Better Analytics
• Challenge: Needed hard-to-access operational data for advanced analytics
• Solution:
• Quickly load ~1,000 database tables into HDP with the click of a button (a rough sketch of the underlying pattern follows this slide)
• Access & integrate complex mainframe VSAM files, plus data from DB2/z, Oracle & SQL Server
• Track changes & keep data up to date
• Benefits:
• Insight: Better and faster analytics
• Agility: Reclaim development time; single tool to ingest, detect changes and populate the data lake
• Compliance: Build audit trails, keep the EDW current
• Productivity: No need for deep understanding of Hadoop
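DMX-h does this onboarding through its point-and-click interface; purely as an illustration of the underlying pattern, here is a hypothetical metadata-driven PySpark loop that pulls a list of source tables over JDBC and lands each one in the data lake. The connection URL, credentials, and table names are invented placeholders, not the customer's actual setup.

# Sketch of metadata-driven table onboarding over JDBC (PySpark).
# URL, credentials, and the table list are hypothetical placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("table-onboarding").getOrCreate()

jdbc_url = "jdbc:oracle:thin:@//dbhost:1521/ORCL"
props = {"user": "etl_user", "password": "***", "fetchsize": "10000"}

# In practice the list would come from the source catalog (~1,000 tables).
tables = ["CLAIMS", "POLICIES", "CUSTOMERS"]

for table in tables:
    df = spark.read.jdbc(url=jdbc_url, table=table, properties=props)
    # Land each table in the lake, ready to be exposed through Hive.
    df.write.mode("overwrite").parquet(f"/datalake/raw/{table.lower()}")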
14. Leading Media Company: Accelerate New Business Initiatives
• Challenge: Build a scalable platform to support new business initiatives & scale for double-digit data growth, while reducing escalating EDW & ELT costs
• Solution:
• Shift data storage & processing out of the EDW into Hadoop
• Migrate 500+ SQL ELT workloads to DMX-h on HDP (see the sketch after this slide)
• Benefits:
• Agility: Scalable architecture to deploy new business initiatives – analyze more set-top box data, blend website user activity data, etc.
• Cost: Millions of dollars in savings from the EDW, including SQL tuning & maintenance costs
• Productivity: ETL developers can stop coding & tuning, and get up & running on Hadoop quickly
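The deck does not show what a migrated workload looks like; as a hedged sketch, this is the kind of EDW-style SQL ELT statement that can be re-pointed at Hive tables and run on the cluster through Spark SQL. The database, table, and column names are hypothetical.

# Sketch: an EDW-style ELT aggregation re-expressed on Hive tables (Spark SQL).
# All table and column names are hypothetical.
from pyspark.sql import SparkSession

spark = (SparkSession.builder.appName("elt-offload")
         .enableHiveSupport().getOrCreate())

# The same SQL that ran in the warehouse now runs on the cluster,
# reading and writing Hive tables instead of EDW tables.
spark.sql("""
    INSERT OVERWRITE TABLE analytics.daily_viewing
    SELECT device_id,
           to_date(event_ts)  AS view_date,
           count(*)           AS events,
           sum(watch_seconds) AS total_watch_seconds
    FROM   raw.set_top_box_events
    GROUP BY device_id, to_date(event_ts)
""")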
15. Hotel Chain: Ease of Use, Timely & Up-to-Date Reporting
• Challenge: More timely collection & reporting on room availability, event bookings, inventory and other hotel data from 4,000+ properties globally
• Solution:
• Near real-time reporting (a sketch of this cadence follows this slide)
• DMX-h consumes property updates from Kafka every 10 seconds
• DMX-h processes data on HDP, loading to Teradata every 30 minutes
• Deployed on Google Cloud Platform
• Benefits:
• Time to Value: DMX-h ease of use drastically cut development time
• Agility: Reports updated every 30 minutes vs. every 24 hours
• Productivity: Leveraging the existing ETL team for Hadoop (Spark), with a visual understanding of the data pipeline
• Insight: Up-to-date data = better business decisions = happier customers
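As a rough illustration of the cadence described above – consume continuously from Kafka, land ORC for reporting every 30 minutes – here is a minimal Spark Structured Streaming sketch. The broker, topic, and paths are hypothetical, and the actual project used DMX-h rather than hand-written Spark.

# Sketch: continuous Kafka consumption with 30-minute ORC micro-batches.
# Broker, topic, and paths are hypothetical placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("property-updates").getOrCreate()

updates = (spark.readStream.format("kafka")
           .option("kafka.bootstrap.servers", "broker:9092")
           .option("subscribe", "property-updates")
           .load()
           .selectExpr("CAST(value AS STRING) AS json", "timestamp"))

# Trigger every 30 minutes, matching the reporting interval above.
(updates.writeStream
        .format("orc")
        .option("path", "/warehouse/staging/property_updates")
        .option("checkpointLocation", "/chk/property_updates")
        .trigger(processingTime="30 minutes")
        .start())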
16. Syncsort DMX-h: Benefits to Business
• Faster Time to Value:
• Faster & better insights with readily accessible data
• Compliance:
• Secure data access, ability to build audit trails
• Increased Productivity:
• Reclaim development time by automating, optimizing and future-proofing development
• Across platforms, on premise and in the cloud
• Cost:
• Lower archival costs
• Reduced development time
• Reduced Total Cost of Ownership, higher ROI
17. See For Yourself!
***
Take a 30-day Free Trial @
www.syncsort.com/try
Editor's notes
TALK TRACK
Actionable intelligence means that you can capture perishable insights in real-time by analyzing data in motion.
It means drilling into terabytes or petabytes of data at rest for historical insights.
And, in turn, those historical insights help you tune your streaming analytics and data flows.
Modern data applications live and breathe at the intersection of those Connected Data Platforms and the data they manage.
Those are the innovative killer applications that deliver actionable intelligence for data discovery, a single view of the data or predictive analytics.
[NEXT SLIDE]
The Important Shift is FROM Converged TO Connected – this is the opposite of traditional systems.
The Process Was: Find the data sources, pull them all together with ETL, centralize and normalize the data, and then run analytics.
BUT… new data sources are too large and variable to make this process effective. IN ADDITION, your business can't wait for this to happen; the customer may already be gone!
In the NEW WORLD, data is everywhere: at the edge, in multiple clouds, and on premise. CONNECTED DATA PLATFORMS enable analytics to move to the data, portably and in real time across this decentralized footprint, even to the edge.
Syncsort's data integration has always delivered the ability to process large data volumes, in less time, with fewer resources. However, performance and efficiency are just our starting points. It became apparent in speaking with our customers a few years ago – particularly when "Big Data" and Hadoop took off – that they were facing new challenges with a common theme of complexity.
The rapid evolution of Big Data technologies presents several challenges on its own:
New technologies require new specialized skills that continue to be in short supply, and are very expensive if you can find them.
Because execution frameworks continue to improve, customers don't want to feel locked in. They don't want to have to redevelop all their jobs to take advantage of innovative new frameworks. A great example is MapReduce v1 to MapReduce v2 to Spark.
New sources and types of data add to the complexity as well – with streaming sources as a recent example – bringing both connectivity and skills challenges.
Many of our customers are large enterprises that still rely significantly on the mainframe, and these companies found it very difficult to bring the mainframe into the rest of their Big Data integration strategy.
Organizations were building data lakes but then struggling to fill them. We heard from customers and partners alike that ingesting data from all their enterprise sources – mainframe, data warehouse, etc. – into Hadoop was a big problem to solve.
So, our product strategy has focused not only on delivering data integration products with exceptional performance, efficiency and lower TCO, but also on simplifying the data integration process for all enterprise data sources and across all platforms – Linux, Unix, Windows, Hadoop, Spark – on premise or in the cloud.
Summarize our Strategy and Focus
We are a big part of the open source community. We were rated the 7th-largest contributor to open source projects based on the volume of code.
We have been able to leverage our DMX technology and integrate natively into Hadoop. We have a very light footprint on the cluster; our engine is self-tuning and doesn't consume resources unless a DMX job is running.
And we have very strong partnerships with the Hadoop players. At Cloudera, for example, we have everything from regular executive-level meetings with Mike Olson, to bi-weekly meetings between our product managers, to engineering partnerships for early development. And we test each other's software in our development cycles so we can make sure our integration is not going to suffer when a new feature is introduced.
Access – Get best-in-class data ingestion capabilities for Hadoop: mainframes, RDBMS, MPP, JSON, ORC, Parquet, Avro, NoSQL, Kafka and more.
Integrate – Single interface for streaming and batch processes. Single data pipeline for all enterprise data, batch or streaming.
Comply – Secure data access, data governance and lineage. Seamless integration with Kerberos, Apache Ranger, Apache Ambari, Apache Sentry, and more.
Simplify – Design once, deploy anywhere & insulate your organization from a rapidly changing ecosystem. Future-proof your applications for new compute frameworks, on premise or in the cloud.
IHG currently has the Holidex system, which gets information from about 4,000 IHG properties globally – their property policies, check-in/check-out info, discounts, blackout dates, penalties, etc., and how the inventory of rooms changes each day. Every IHG property currently sends this information, which is fed to the Teradata warehouse. With Kafka and cloud processing, they want to get changes in property policies more quickly, and they are certainly interested in real-time changes in inventory. The new system is Amadeus. They later plan to focus on booking data, which would entail customer information, booking status, booking history, membership details, etc.
The new system will send a JSON message about any inventory or policy change to Kafka, which is set up as a separate cluster so that multiple clusters can consume from it. They are envisioning a couple of data layers – the first being a data lake in which the hierarchies and structures of the JSON are left untouched and stored as Parquet. This layer will provide the data in its most raw format to their data scientists. They don't have any specific use case for it right now. FYI… since DMX has challenges with JSON arrays and limited Parquet support, this layer will be implemented outside of DMX for now.
But DMX will still provide the end-to-end implementation of their EZ tools project – policies and inventory. So we take their hierarchical JSON data and normalize it into multiple records (see the sketch below), and this read-from-Kafka-and-normalize step runs as a continuous process.
They have ORC-backed Hive tables which mirror tables in the Teradata warehouse. The continuous process just described provides data for a 30-minute batch process that updates the Hive tables. So the messages are captured in real time but made available every 30 minutes. All of this data and the Hive tables are stored in Google Storage, which is set up as an HDFS alternate storage system. DMX provides a mechanism to hit this layer via HDFS or directly, depending on how the customer wants to access it.
Finally, we generate load-ready files for the Teradata warehouse. The idea is to use the cloud cluster for any processing and push the final dataset to Teradata. Currently, reports still come out of Teradata, and that's where these final files are needed.
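As a hedged sketch of that normalization step – flattening one hierarchical JSON message into one record per nested element – here is a minimal PySpark version. The schema and field names are invented; in the project this step is performed by DMX-h, not hand-written Spark.

# Sketch: flatten a hierarchical inventory message into one row per room type.
# Field names ("property_id", "room_types", ...) are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("json-normalize").getOrCreate()

raw = spark.read.json("/staging/kafka/inventory")  # one JSON message per line

flat = (raw.select("property_id", "event_ts",
                   F.explode("room_types").alias("room"))
           .select("property_id", "event_ts",
                   F.col("room.code").alias("room_code"),
                   F.col("room.available").alias("rooms_available")))

flat.write.mode("append").parquet("/datalake/inventory")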
FROM FERNANDA:
Old process:
Once a day, every property sends a feed. It's loaded into Teradata using Informatica. A daily report is produced.
New process they are building with DMX-h:
Every property sends a Kafka message.
House policy (check-in/out, gym, room availability) will also go into Kafka.
DMX-h on a single node reads the Kafka messages (JSON with lots of structures and repeated elements), normalizes them, and writes text to HDFS – at 10-second intervals for now, which is easily customizable. It can also be scaled to more nodes as the message load increases.
Every 30 min:
DMX-h does ETL in the cluster: sort by timestamp, join, CDC, and write to Hive ORC (takes about 5 minutes) in a Google bucket. (A rough sketch of this step follows these notes.)
The same ETL process also produces compressed text on HDFS to be loaded into Teradata.
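As a hedged sketch of that 30-minute batch – sort by timestamp, keep the latest change per property (the CDC step), write Hive ORC, and emit compressed text for the Teradata load – here is a minimal PySpark version. Paths and table names are hypothetical placeholders for what DMX-h actually does in the cluster.

# Sketch of the 30-minute ETL: CDC, Hive ORC refresh, Teradata-ready export.
# Paths and table names are hypothetical placeholders.
from pyspark.sql import SparkSession, Window
from pyspark.sql import functions as F

spark = (SparkSession.builder.appName("thirty-min-batch")
         .enableHiveSupport().getOrCreate())

msgs = spark.read.parquet("/datalake/inventory")

# CDC: order each property's messages by timestamp and keep the newest one.
latest = (msgs.withColumn(
              "rn", F.row_number().over(
                  Window.partitionBy("property_id")
                        .orderBy(F.col("event_ts").desc())))
              .filter("rn = 1").drop("rn"))

# Refresh the ORC-backed Hive table that mirrors the Teradata schema.
latest.write.mode("overwrite").format("orc").saveAsTable("reporting.inventory")

# Load-ready, compressed text files for the Teradata warehouse.
(latest.write.mode("overwrite")
       .option("compression", "gzip")
       .csv("/exports/teradata/inventory"))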