SAP HANA Vora
Bridging the gap between Corporate and Big Data
Henrique Pinto, Global HANA COE
July 2016
© 2016 SAP SE or an SAP affiliate company. All rights reserved.

SAP HANA Vora SITMTY 20160707


SAP HANA Vora presentation at the SAP Inside Track Monterrey (#SITMTY) on 07/07/2016

Published in: Technology

  1. SAP HANA Vora: Bridging the gap between Corporate and Big Data. Henrique Pinto, Global HANA COE, July 2016
  2. The Five Megatrends Driving Our Digitized World, and Their Implications for Distributed Big Data Management
     • Hyper Connectivity: Everybody has access
     • Super Computing: Super computers power everywhere
     • Cloud Computing: The cloud is where we compute
     • Smart World: Your fridge knows what you want for dinner
     • Cyber-Security: High-powered security is now the norm
  3. Hadoop and Spark at a Glance
     Hadoop: What is it?
     • Scalable, fault-tolerant, distributed file system
     • Sits on top of a native file system
     • HDFS (Hadoop Distributed File System) is an append-only file system, designed for batch, not real-time
     • Splits files into blocks and distributes them to data nodes
     Why?
     • Organizations want more business value from Big Data
     • Hadoop configurations scale and perform at very low cost
     • Hadoop complements Data Warehouses, Data Integration and Analytics, but doesn’t replace them
     Data Processing
     • MapReduce was invented to query data residing in a Hadoop file system
     • MapReduce was not designed for interactive queries but for long-running batch jobs
     • For more details see http://hortonworks.com/hadoop/mapreduce/
     Spark
     • An open source in-memory analytics execution engine for fast, large-scale data processing
     • Used on top of Hadoop
     • Does not replace Hadoop
     • Built to replace MapReduce
  4. What’s Stopping Us? The Digital Divide between Enterprise and Big Data
     • Too Complex
     • Too Slow
     • Unable to Work Together
  5. Bridging the Digital Divide between Enterprise and Big Data: Introducing SAP HANA Vora
  6. SAP HANA Vora: What’s Inside and What Does It Do?
     SAP HANA Vora is an in-memory query engine which leverages and extends the Apache Spark execution framework to provide enriched interactive analytics on Hadoop.
     Goals:
     • Democratize Data Access
     • Make Precision Decisions
     • Simplify Big Data Ownership
     Features:
     • Drill Downs on HDFS
     • Mashup API Enhancements
     • Compiled Queries
     • HANA-Spark Controller
     • Unified Landscape
     • Open Programming
     • Any Hadoop Clusters
  7. Enable Precision Decisions With Contextual Insights In Enterprise Systems
     Gain business coherence with business data and big data.
     • Spark Controller: HANA-Spark Controller for improved performance between distributed systems
     • Compiled Queries: Compiled queries enable applications & data analysis to work more efficiently across nodes
     • Drill Downs: Familiar OLAP experience on Hadoop to derive business insights from big data, such as drill-down into HDFS data
     (Diagram labels: SAP HANA in-memory platform; SAP HANA Platform with Application Services, Database Services, Integration Services, Processing Services; HANA Smart Data Access; Spark Controller; Vora and Spark nodes with In-Memory Store; YARN; HDFS; Files; Other Apps)
  8. Democratize Data Access for Data Science Discovery
     Pursue new inquiries without compromising on data, and easily integrate these insights with all data.
     • Open Programming: Extensive programming support for Scala, Python, C, C++, R, and Java allows data scientists to use their tool of choice
     • Mashup Enhancements: Enable data scientists and developers who prefer SparkR or Spark ML to mash up corporate data with Hadoop/Spark data easily (Spark data-source API enhancement)
     • Optional use of SAP HANA for delegated, multi-engine pre-processing: leverage HANA’s multiple data processing engines to develop new insights from business and contextual data
     (Diagram labels: SAP HANA Platform with Application Services, Database Services, Integration Services, Processing Services; Vora and Spark nodes with In-Memory Store; YARN; HDFS; Files)
  9. Vora Modeling Tool
     • Vora Tools use the Thriftserver to provide access to the Modeler under http://<DNS_NAME_OF_JUMPBOX_NODE>:9225
     • Perspectives:
       • Data Browser
       • SQL Editor
       • Modeler
  10-12. (No slide text.)
  13. SAP HANA Vora: Use Cases
     • Fraud Detection: Get access to all your data, including historical and contextual trends and current business data, to analyze anomalies
     • Risk Mitigation: Be assured of more precise data to perform Monte Carlo simulations that produce distributions of possible outcome values with more precise context
     • Targeted Marketing Campaigns: React rapidly to customer sentiment and pinpoint targeting for sales and marketing campaigns with a more complete view of customer needs and wants
     • 360° Customer Service: Ensure a more complete picture of the customer with analysis of unstructured customer data, such as social media profiles, emails, calls, complaint logs, discussion forums, and website history
  14. Use Cases of Existing Vora Customers
     DW / Tiering
     • Challenges: Current DW with more than 100 TB of data is at end of life and no longer cost effective; regulatory requirement to retain data for 10 years
     • Solution: SAP HANA for most recent data, Hadoop for historical data; SAP HANA Vora accesses and queries data across all tiers
     • Why Vora: SAP HANA Vora provides enterprise analytics and an OLAP-like experience across data warehouse and HDFS
     IoT
     • Challenges: Perform detailed predictive analytics throughout the manufacturing processes based on sensor data; more than 1 PB of data
     • Solution: SAP HANA Vora rapidly processes sensor data in HDFS and combines it with data in SAP HANA for predictive analytics
     • Why Vora: SAP HANA Vora processing of HDFS data combined with HANA data reduced query runtime dramatically
     Data Lake
     • Challenges: Demand forecast accuracy for flu-related products is relatively low; difficult to detect and react quickly/intelligently to anticipate demand spikes created by outbreaks
     • Solution: Data Lake using Vora, combining internal and external data sources (internal: shipment; external: weather, Twitter, Google Search, Centers for Disease Control)
     • Why Vora: SAP HANA Vora enables fast analysis and forecasting of all types of data in HDFS
  15. SAP HANA Multi-temperature Data Management
     Big Data: HANA In-Memory + HANA Dynamic Tiering + Hadoop
     Hot Data: HANA In-Memory
     • Modern in-memory platform
     • Transact/analyze data in real-time
     • Native predictive, text, graph and spatial algorithms
     • Real-time analytics on top of streaming data
     Warm Data: HANA DT
     • Disk-backed, smart column store
     • High performance and efficient compression
     • Transparent for all operations; no changes required for BW operations
     • Excels at queries on structured data from terabyte to petabyte scale
     • No data duplication
     Cold Data: Hadoop
     • Hadoop virtualization possible with Smart Data Access (read only), via Hive or Spark (SP10+)
     • Also possible to access HDFS & MR jobs directly via HANA vUDFs, which can be embedded in SQL queries
     • Future roadmap and new functionalities available on top of SAP HANA Vora:
       • Native bi-directional communication between HANA & Hadoop via Spark for fast analytical scenarios
       • Added "BI-like" features on top of Hadoop (hierarchies, UoM & currency conversions, etc.)
  16. Data Tiering w/ HANA & Vora: Comparison of the different strategies
     Columns on the slide: Component, Performance, Cost Factor, Volume, Processing
     HANA In-Memory
     • Cost factor: $$$$ (4 out of 4)
     • Volume: Up to several TBs (no technical limit)
     • Processing: ACID compliant; SQL, SQLScript, predictive, time series, spatial, text, …
     HANA Dynamic Tiering
     • Cost factor: $$$ (3 out of 4)
     • Volume: 100s of TB natively integrated in HANA
     • Processing: ACID compliant; SQL
     Hadoop Vora
     • Cost factor: $$ (2 out of 4)
     • Volume: 100s of TB (depending on available memory in Hadoop cluster)
     • Processing: In-memory OLAP engine for Hadoop; compiled SQL code
     Hadoop Spark
     • Cost factor: $ (1 out of 4)
     • Volume: 100s of PB or more
     • Processing: General-purpose in-memory engine; transformations and actions
     (Performance column, shown graphically on the slide; values as extracted, in order: 4 out of 4, 3 out of 4, 1 out of 4, 2 out of 4)
  17. Vora 1.3 Highlights (Beta Program)
     • Simplified installation
     • Enhanced modeler
     • New engines (graph, time-series, doc store, disk store)
     • Kerberos support
     • UoM conversion, currency conversion
  18. SAP HANA Vora: Latest innovations
     • Graph engine: SAP HANA Vora embeds an in-memory graph database for real-time graph analysis. The primary focus is on complex read-only analytical queries on very large graphs.
     • Time Series: SAP HANA Vora provides a highly distributed time series analysis engine which supports storing and analyzing time series data. With efficient (memory and speed) time series compression and features like standard aggregation, granularization, and advanced analysis, SAP HANA Vora allows you to join relational data with series data to build efficient SQL models in Hadoop and other Big Data environments.
     • Document Store: SAP HANA Vora introduces NoSQL features like storing JSON documents using the new Document Store as part of the SAP HANA Vora 1.3 release. The new DocStore supports schema-less tables, allowing you to flexibly add or remove fields from any document, and helps scale horizontally.
     • Disk store: SAP HANA Vora provides relational capabilities without loading all the data into memory, for data too large to fit in memory.
     (Chart: Temperature °C, Halifax and Waterloo)
  19. SAP HANA Platform. The SAP focus: End-to-end value chain
     But there is more work to do…
     (Diagram stages: Source, Ingest, Storage, Compute, Consume)
     (Diagram labels, in extracted order: Spatial Processing; Analytics, Text, Graph, Predictive Engines; Application Development Environment; Transformations & Cleansing; Smart Data Integration; Smart Data Quality; Stream Processing; Smart Data Streaming; Logs, Text, OLTP, Social, Machine, Geo, ERP, Sensor; Store & forward; Mobile applications and BI; Smart Data Access; Virtual Tables; User Defined Functions; Dynamic Tiering: Aged data in Disk; In-Memory: Data model & data; Calculation engine: Fast computing; Column Storage: High performance analytics; Series Data Storage: Store time-series data; Reporting & Dashboards; High Performance Applications; Data Exploration & Visualization; Adhoc & OLAP Analytics; Predictive Analysis; Business Planning & Forecasting; Lumira / BI; Hadoop / Vora; MapReduce; YARN; HDFS)
  20. Thank You. Henrique Pinto, Director, Global HANA COE, henrique.pinto@sap.com
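
Slides 15 and 16 describe hot, warm, and cold data tiers but do not say how a record gets assigned to one. As a rough sketch only, assuming record age is the sole criterion and using made-up cut-offs, the routing decision might look like the Python below. `HOT_MAX_AGE_DAYS`, `WARM_MAX_AGE_DAYS`, and `choose_tier` are hypothetical names for illustration and are not part of any SAP product or API.

```python
from datetime import date

# Hypothetical age thresholds in days; the slides give no cut-offs.
HOT_MAX_AGE_DAYS = 90      # assumption: last quarter stays in memory
WARM_MAX_AGE_DAYS = 730    # assumption: up to two years stays on disk-backed storage


def choose_tier(record_date: date, today: date) -> str:
    """Return the storage tier for a record, based only on its age.

    Tier names follow slide 15: hot (HANA in-memory),
    warm (HANA Dynamic Tiering), cold (Hadoop).
    """
    age_days = (today - record_date).days
    if age_days < 0:
        raise ValueError("record_date lies in the future")
    if age_days <= HOT_MAX_AGE_DAYS:
        return "hot: HANA in-memory"
    if age_days <= WARM_MAX_AGE_DAYS:
        return "warm: HANA Dynamic Tiering"
    return "cold: Hadoop"


if __name__ == "__main__":
    today = date(2016, 7, 7)
    for record_date in (date(2016, 6, 1), date(2015, 1, 15), date(2008, 3, 3)):
        print(record_date, "->", choose_tier(record_date, today))
```

In practice the criterion would more likely combine age with access frequency and retention rules, such as the 10-year regulatory requirement mentioned on slide 14; a single age threshold is used here only to keep the sketch short.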

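Slide 3 states that HDFS splits files into blocks and distributes them to data nodes. The toy Python sketch below illustrates that idea under loud simplifications: the block size is a few hundred bytes rather than the far larger blocks HDFS uses, and the round-robin replica placement in `place_blocks` is an assumption made for illustration, whereas real HDFS placement is rack-aware. All function and node names are hypothetical.

```python
def split_into_blocks(data: bytes, block_size: int) -> list[bytes]:
    """Cut a byte string into consecutive blocks of at most block_size bytes."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_blocks(num_blocks: int, nodes: list[str], replication: int = 3) -> dict[int, list[str]]:
    """Assign each block index to `replication` distinct nodes.

    Simplified round-robin policy (an assumption, not the HDFS algorithm):
    block b goes to nodes b, b+1, ... modulo the number of nodes.
    """
    if not 1 <= replication <= len(nodes):
        raise ValueError("replication must be between 1 and the number of nodes")
    return {
        block: [nodes[(block + r) % len(nodes)] for r in range(replication)]
        for block in range(num_blocks)
    }


# Illustrative input: a 1000-byte "file" and four made-up data nodes.
file_bytes = b"x" * 1000
blocks = split_into_blocks(file_bytes, block_size=256)
placement = place_blocks(len(blocks), ["node-a", "node-b", "node-c", "node-d"])

for index, replicas in placement.items():
    print(f"block {index} ({len(blocks[index])} bytes) -> {replicas}")
```

Because every block is stored on several nodes, losing one node leaves at least one copy of each block available, which is the fault tolerance the slide refers to.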