1. Spark Usage in
Enterprise Business
Operations
Ken Tsai
VP, Data Management & Platform-as-Services
SAP
@kentsaiSAP
2.17.16: Spark Summit, NYC
2. © 2016 SAP SE or an SAP affiliate company. All rights reserved. Spark Summit New York, 2.17.16
© 2016 SAP SE or an SAP affiliate company. All rights reserved.
No part of this publication may be reproduced or transmitted in any form or for any purpose without the express permission of SAP SE or an
SAP affiliate company.
SAP and other SAP products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of SAP SE
(or an SAP affiliate company) in Germany and other countries. Please see http://global12.sap.com/corporate-en/legal/copyright/index.epx for additional trademark
information and notices.
Some software products marketed by SAP SE and its distributors contain proprietary software components of other software vendors.
National product specifications may vary.
These materials are provided by SAP SE or an SAP affiliate company for informational purposes only, without representation or warranty of any kind, and SAP SE or its
affiliated companies shall not be liable for errors or omissions with respect to the materials. The only warranties for SAP SE or
SAP affiliate company products and services are those that are set forth in the express warranty statements accompanying such products and
services, if any. Nothing herein should be construed as constituting an additional warranty.
In particular, SAP SE or its affiliated companies have no obligation to pursue any course of business outlined in this document or any related presentation, or to develop or
release any functionality mentioned therein. This document, or any related presentation, and SAP SE’s or its affiliated companies’ strategy and possible future
developments, products, and/or platform directions and functionality are all subject to change and may be changed by SAP SE or its affiliated companies at any time for
any reason without notice. The information in this document is not a commitment, promise, or legal obligation to deliver any material, code, or functionality. All forward-
looking statements are subject to various risks and uncertainties that could cause actual results to differ materially from expectations. Readers are cautioned not to place
undue reliance on these forward-looking statements, which speak only as of their dates, and they should not be relied upon in making purchasing decisions.
3. SAP – Our Quick Snapshot in the Enterprise Computing World
74% of the world’s
transaction revenue
touches an SAP system.
SAP’s product focus:
Enterprise Applications
Business Networks
Platforms – 15 yrs on IMC
SAP customers represent
87% of Forbes Global
2,000 companies.
SAP touches
$16 trillion of world
consumer purchases.
4. SAP HANA – An In-Memory Platform to Enable New Business Scenarios
Previously Not Feasible
[Figure: finance tables COSP, COEP, COBK, BKPF, BSEG, BSIS, BSIK, BSET, LFC1, GLT0]
SAP Finance with aggregates and indices: 10 inserts, 5 updates
SAP Simple Finance (no indices, no aggregates, no redundancies): 4 inserts, 0 updates
Core data structure remains unchanged
• Soft financial close anytime
• Real-time revenue and cost analysis
• Real-time liquidity forecasts
• Real-time alerts and blocks on suspicious
transactions
5. Distributed Big Data Is Everywhere
How to better use it in core enterprise business applications?
~79% of Data
Reservoirs/Lakes are still
disconnected from core
business operations
How do I embed big data signal
into my business applications
and enterprise analytics?
53%: Difficulty integrating with CRM and/or other systems
49%: Unable to apply or integrate external data quickly enough to inform real-time decision making
59%: Only a few analysts with specialized training can analyze big data
Harvard Business Review Analytic Services, Global Survey of 251 Respondents, Sept. 2015
6. Introducing SAP HANA Vora
An in-memory query engine that extends the Apache Spark execution framework
to enrich the interactive analytics experiences on massively distributed computing clusters
• OLAP processing
• In-Memory
Computing for high
performance
• Connecting to
Enterprise
Systems
• Unified System
Management
SAP HANA
ERP DATA BIG DATA
Parallelized
Queries
Vora
7. Key Open Source Contribution to Apache Spark Ecosystem
Spark to HANA Push-downs & Data Hierarchies
scala> val hierarchy = sqlContext.sql( s"""
  SELECT
    LVL, COUNT(*), ROUND( AVG(P_RETAILPRICE), 2)
  FROM (
    SELECT LEVEL(node) AS LVL, P_RETAILPRICE
    FROM
      HIERARCHY(
        USING PART_HIERARCHY AS c
        JOIN PARENT p ON c.P_PARENT = p.P_PARTKEY
        SEARCH BY
          P_PARTKEY ASC
        START WHERE
          P_PARTKEY = 1
        SET node ) AS H0
  ) T1 GROUP BY LVL
  """.stripMargin ).collect().foreach(println)
+-------+-------+--------------------+
| LEVEL | COUNT | AVG(P_RETAILPRICE) |
+-------+-------+--------------------+
| 0     | 1     | 901                |
| 1     | 2     | 903.5              |
| 2     | 3     | 912                |
+-------+-------+--------------------+
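To make the result table above easier to read, here is a minimal plain-Scala sketch of what a level-wise hierarchy aggregation computes: each node gets its depth below a start node, then rows are counted and averaged per level. It does not use Spark or Vora; the `Part` case class, the `HierarchyLevels` object, and the sample keys and prices are all hypothetical and are not the data behind the slide.

```scala
// Hypothetical parent-child rows; keys and prices are made up for illustration.
final case class Part(key: Int, parent: Option[Int], retailPrice: Double)

object HierarchyLevels {
  // Illustrative sample: node 1 is the root, 2 and 3 are its children, 4 is a child of 2.
  val sample: Seq[Part] = Seq(
    Part(1, None, 10.0),
    Part(2, Some(1), 20.0),
    Part(3, Some(1), 30.0),
    Part(4, Some(2), 40.0)
  )

  // Assign each reachable node its depth below the root, walking breadth-first.
  def levels(parts: Seq[Part], rootKey: Int): Map[Int, Int] = {
    val children: Map[Int, Seq[Int]] =
      parts.collect { case Part(k, Some(p), _) => p -> k }
        .groupBy(_._1)
        .map { case (p, pairs) => p -> pairs.map(_._2) }

    @annotation.tailrec
    def walk(frontier: Seq[Int], depth: Int, acc: Map[Int, Int]): Map[Int, Int] =
      if (frontier.isEmpty) acc
      else {
        val seen = acc ++ frontier.map(k => k -> depth)
        // Skip nodes already visited so malformed (cyclic) input cannot loop forever.
        val next = frontier.flatMap(k => children.getOrElse(k, Seq.empty)).filterNot(seen.contains)
        walk(next, depth + 1, seen)
      }

    walk(Seq(rootKey), 0, Map.empty)
  }

  // Per level: (level, node count, average retail price rounded to 2 decimals).
  def summarize(parts: Seq[Part], rootKey: Int): Seq[(Int, Int, Double)] = {
    val levelOf = levels(parts, rootKey)
    parts
      .flatMap(p => levelOf.get(p.key).map(lvl => lvl -> p.retailPrice))
      .groupBy(_._1)
      .toSeq
      .sortBy(_._1)
      .map { case (lvl, rows) =>
        val prices = rows.map(_._2)
        (lvl, prices.size, math.round(prices.sum / prices.size * 100.0) / 100.0)
      }
  }
}
```

Calling `HierarchyLevels.summarize(HierarchyLevels.sample, 1)` yields one row per level, in the same shape as the table above.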
val options = Map("dbschema" -> config.user, "host" -> config.host,
  "instance" -> config.instance)

// HANA Live CustomerBasicData Virtual Data Model
val custConf = options + ("path" -> s"""sap.hba.ecc/CustomerBasicData""")
val cust = sqlContext.read.format("com.sap.spark.hana").options(custConf).load()
cust.registerTempTable("customer")

// HANA Live SalesOrderHeader VDM
val sohConf = options + ("path" -> s"""sap.hba.ecc/SalesOrderHeader""")
val soh = sqlContext.read.format("com.sap.spark.hana").options(sohConf).load()
// Register under the table name that the query below refers to
soh.registerTempTable("salesOrder")

// Top 5 Countries by Sales Order Volume
val salesOrder = sqlContext.sql(
  """SELECT Country, COUNT(*) AS Frequency
    |FROM salesOrder AS s LEFT OUTER JOIN customer AS c
    |  ON s.soldToParty = c.Customer
    |GROUP BY Country ORDER BY Frequency DESC""".stripMargin)
// Print the five most frequent countries
salesOrder.show(5)
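As a rough illustration of what the sales-order query computes, the sketch below runs the same left outer join and per-country count over in-memory Scala collections instead of HANA-backed tables. The `Customer` and `SalesOrder` case classes, the `TopCountries` object, and the sample rows are hypothetical. The lookup map also assumes customer keys are unique, which a real SQL join does not require.

```scala
// Hypothetical row types standing in for the two HANA Live views.
final case class Customer(customer: String, country: String)
final case class SalesOrder(orderId: String, soldToParty: String)

object TopCountries {
  // Illustrative data only: C9 has no customer record, so its order has no country.
  val customers: Seq[Customer] = Seq(Customer("C1", "US"), Customer("C2", "DE"))
  val orders: Seq[SalesOrder] = Seq(
    SalesOrder("o1", "C1"), SalesOrder("o2", "C1"), SalesOrder("o3", "C1"),
    SalesOrder("o4", "C2"), SalesOrder("o5", "C2"), SalesOrder("o6", "C9")
  )

  // LEFT OUTER JOIN orders to customers, then count orders per country.
  // Orders without a matching customer are kept under country = None.
  def byOrderVolume(orders: Seq[SalesOrder],
                    customers: Seq[Customer],
                    n: Int): Seq[(Option[String], Int)] = {
    // Assumes each customer key appears once.
    val countryOf: Map[String, String] = customers.map(c => c.customer -> c.country).toMap
    orders
      .map(o => countryOf.get(o.soldToParty))
      .groupBy(identity)
      .map { case (country, rows) => country -> rows.size }
      .toSeq
      // Highest count first; ties broken by country name for a stable order.
      .sortBy { case (country, count) => (-count, country.getOrElse("")) }
      .take(n)
  }
}
```

In the Spark version the join and aggregation are expressed in SQL and, per the slide title, can be pushed down to HANA; this sketch only mirrors the result shape.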
8. Airline Use Case – Optimize MRO scheduling with Sensor Data
Challenges
• $10,000 loss for every hour spent
on maintenance, repair, and
overhaul (MRO)
• Predictive MRO generates TB of
sensor data per flight
Solution
• SAP HANA Vora rapidly processes
sensor data in HDFS and
combines it with flight schedule
and staffing data in SAP HANA to
prioritize maintenance jobs and
accelerate MRO
Why SAP
HANA Vora
• Optimize MRO operations with
interactive, on-demand drill down
by airport, flight route, etc.
9. Utility Use Case – CenterPoint Energy
Challenge
• Smart meters generate TBs of
data/month
• Regulatory requirement to retain
data for 10 years
• Current storage solution full by
end-2016
• Need to leverage HDFS as an
additional tier for storage
Solution
• SAP HANA for most recent sensor
signal and operational data,
Dynamic Tiering for 1-2 yr old
data, HDFS for historical sensor
data
• SAP HANA Vora accesses and
queries data across all tiers
Why SAP
HANA Vora
• SAP HANA Vora provides
enterprise analytics & OLAP-like
experience across data
warehouse and HDFS.
10. Utility Use Case – How It Works
CenterPoint Energy
“Our benchmark tests proved that
SAP HANA paired with SAP
HANA Vora are the right
solutions for us. We expect
immediate cost benefits and to
see competitive differentiation
in the future.”
Gary Hayes,
CIO & SVP at CenterPoint Energy
SAP HANA
MOST RECENT
SENSOR DATA
Dynamic
Tiering
1-2 YR OLD DATA
Parallelized
Queries
HDFS
HISTORICAL SENSOR DATA
Query data within and across tiers
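The three-tier layout can be sketched as an age-based routing rule plus a query that combines partial results from every tier. This is a plain-Scala illustration, not Vora or HANA code: the `Tier`, `Reading`, and `TieredStore` names are hypothetical, and the one-year and two-year cutoffs are assumptions, since the slides only say "most recent", "1-2 yr old", and "historical".

```scala
import java.time.LocalDate
import java.time.temporal.ChronoUnit

// Hypothetical names for the three storage tiers on the slide.
sealed trait Tier
case object HanaInMemory extends Tier
case object DynamicTiering extends Tier
case object Hdfs extends Tier

final case class Reading(meterId: String, day: LocalDate, kwh: Double)

object TieredStore {
  // Assumed cutoffs: under 1 full year is "most recent", under 2 is "1-2 yr old",
  // anything older is "historical". The source does not state exact boundaries.
  def tierFor(day: LocalDate, today: LocalDate): Tier = {
    val ageYears = ChronoUnit.YEARS.between(day, today)
    if (ageYears < 1) HanaInMemory
    else if (ageYears < 2) DynamicTiering
    else Hdfs
  }

  // Place every reading in the tier that matches its age.
  def partition(readings: Seq[Reading], today: LocalDate): Map[Tier, Seq[Reading]] =
    readings.groupBy(r => tierFor(r.day, today))

  // Cross-tier query: apply the same filter in each tier, then combine the partial sums.
  def totalKwh(tiers: Map[Tier, Seq[Reading]], meterId: String): Double =
    tiers.values.map(_.filter(_.meterId == meterId).map(_.kwh).sum).sum
}
```

The point of the design is that the caller of `totalKwh` never names a tier: the same question is answered whether the rows sit in memory, in Dynamic Tiering, or in HDFS.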
11. Financial Services Use Case – Extend Fraud Pattern Detection
Challenges
• 100+ million business transactions
daily, 25% growth YoY
• Limited access to archived data
• Difficult to detect patterns in
historical transactions
Solution
• Current transactions in SAP
HANA, historical transactions in
HDFS clusters
• Real-time detection of
abnormalities
Why SAP
HANA Vora
• Real-time, aggregated insights
from current and historical
transactions
12. 2016 and the Road Ahead
Customers in North
America, APJ, and
EMEA
Dev edition
available on AWS
TODAY
General Availability
Vora Modeler to
build and query
OLAP style cubes on
data
COMING
SOON
Planning (HR, Financial)
Extend engine support
for time series
Transaction
management
Analytics on archived
ERP data in Hadoop
FUTURE
13. Contribute to Spark Ecosystem, Embrace Best of Community Innovation
Contribution to
Open Source:
Hierarchy capabilities
Connection to ERP: predicate
pushdown to HANA
14. Thank you!
Ken Tsai: ken.tsai@sap.com
@kentsaiSAP
Enter to Win a
GoPro HERO4
Session at
SAP Booth 102
Learn More @
hana.sap.com/vora
Try Dev Edition
bit.ly/1K1qLyo
We’re Hiring: https://spark-summit.org/east-2016/jobs/