Eric.kavanagh@bloorgroup.com




Twitter Tag: #briefr
•  Reveal the essential characteristics of enterprise software, good and bad

•  Provide a forum for detailed analysis of today's innovative technologies

•  Give vendors a chance to explain their product to savvy analysts

•  Allow audience members to pose serious questions... and get answers!



•  November: Cloud
•  December: Innovators
•  January: Big Data
•  February: Performance
•  March: Integration

•  Databases were designed primarily to store information for retrieval at a later time.

•  Big Data requires big databases.

•  The convergence of multi-structured data and the need to perform both transactional and operational analytics has led to substantial innovations in database technologies.

•  Today some of the biggest databases blend the best of both worlds, transforming the way organizations store and analyze enterprise data.




Robin Bloor is Chief Analyst at The Bloor Group.

Robin.Bloor@Bloorgroup.com




•  German-founded SAP is one of the largest software companies in the world. Its best-known products are SAP ERP, SAP Business Warehouse, SAP Business Objects, SAP Sybase IQ and SAP HANA.

•  SAP offers a comprehensive set of database management solutions that spans the needs of the enterprise, leveraging in-memory, cloud and mobile technologies.

•  Recent innovations include a Big Data analytics platform that loads, processes and delivers massive amounts of multi-structured data and is accessible on demand enterprise-wide.




Courtney Claussen is a product manager at Sybase, Inc., concentrating on Sybase's data warehousing and analytics products. She has enjoyed a 30-year career in software development, technical support and product marketing in the areas of computer-aided design, computer-aided software engineering, database management systems, middleware, and analytics.




The New Possible: Very Big Data for Serious Business Value
The Briefing Room with Dr. Robin Bloor and SAP


October 9, 2012




CONFIDENTIAL
AGENDA




  •         Big Data Analytics: A Reality
  •         SAP Sybase IQ: Built for Big Data Analytics
  •         SAP Sybase IQ: Continuing Innovation




©  2012 SAP AG. All rights reserved.
Big Data Analytics
     A Reality
THE NEW DYNAMICS OF BUSINESS
COMPETING ON BIG DATA DRIVEN ANALYTICS




[Figure: Business Value* at the center, surrounded by New Strategies & Business Models, Operational Efficiencies, and Revenue Growth]




*A McKinsey study titled “Big Data: Next frontier for innovation, competition, and productivity” (May 2011) found huge potential for Big Data analytics, with metrics as impressive as a 60% improvement in retail operating margins, an 8% reduction in (US) national healthcare expenditures, and roughly $150 billion in savings from operational efficiencies in European economies.


Getting Value from Big Data

[Figure: Applied Big Data Analytics at the center, surrounded by six uses]

•  Find supply chain inefficiencies
•  Uncover insurance fraud
•  Dispense correct health care
•  Maintain customer loyalty
•  Optimize stocking of products
•  Predict financial performance




EDW AND BIG DATA PLATFORMS
CONTRASTS


EDW: Enterprise data, relational, structured, indexed text
Big Data: Clickstreams, sensors, log data, unstructured social media

EDW -> Big Data
•  Large -> ENORMOUS
•  Pre-processed data -> Raw data
•  Schema -> No schema
•  SQL -> Programmatic
•  OLAP -> Batch processing
•  Scale up -> Scale out


EDW AND BIG DATA PLATFORMS
PARTNERSHIP


EDW: Enterprise data, relational, structured, indexed text
Big Data: Clickstreams, sensors, log data, unstructured social media

•  Combine all relevant data for better insights
•  Real-time BI
•  SQL declarative processing
•  Big Data pre-processing with EDW deep analytics




Big-data analytics plus data warehousing
Deserves a new platform

•  Volume
•  Velocity
•  Variety
•  Costs
•  Skills

Support massive numbers of users and workloads:
•  Data loading
•  Mobile
•  OLAP*
•  Web
•  Operational reporting
•  Data mining
•  Specialized apps
•  Integrated workflow

Analyze massive volumes of complex data from many sources:
•  In-DB analytics
•  MapReduce
•  RDBMS+
•  EDW
•  HDFS

•  Platform accessible to all business processes and all business users
•  Requirement for data and algorithms together in the platform
•  Ability to distribute interactions throughout the enterprise

*Online analytical processing
+Relational database management system
SAP SYBASE IQ
BUILT FOR BIG DATA ANALYTICS
Grid architecture
System scale out


[Figure: grid nodes connected by a Full Mesh Interconnect and a shared Storage Fabric]




Multi-dimensional scale out
•  Multiple resources can scale out independently
      –  Storage, server (CPU, memory), SAN switches, interconnect can scale on their own
•  Scale out is incremental and linear
      –  No need to add large units of monolithic CPU/storage pairs



Deployed use case
comScore Networks measures the digital world


Ÿ  comScore provides solutions for online audience measurement, e-commerce, advertising,
    search, video and mobile to analysts with digital marketing and vertical-specific industry
    expertise
Ÿ  Large SAP Sybase IQ Multiplex Grid on v15.x with 10s of servers and hundreds of CPU
    cores
Ÿ  Manages more than 150TB of data with trillions of rows and 10s of thousands of tables
Ÿ  More than 200+ concurrent users with highly parallel and distributed workload
Ÿ  Incrementally scalable on commodity hardware



                                                      ……………

                                         Storage Fabric




Community platform
Elastic virtual data marts

[Figure: two virtual data marts (VDM1, VDM2) plus shared resources. Logical Server 1 and Logical Server 2 use virtual shared CPU and memory over a Full Mesh High Speed Interconnect; a Storage Fabric provides virtual shared storage.]




Virtual data marts
•  A VDM is a logical binding of mutually exclusive nodes, memory and storage
      –  A Logical Server (LS) is a mutually exclusive logical binding of nodes and memory
      –  A Logical Server (LS) is a subset of a VDM
           –    Bindings are elastic, i.e. they can dynamically grow or shrink
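The elastic, mutually exclusive bindings described above can be pictured with a small sketch. This is only an illustration under stated assumptions: the `Grid` and `VirtualDataMart` classes and the node names are hypothetical, they model nodes only (not memory, storage or logical servers), and they are not part of any SAP Sybase IQ API.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualDataMart:
    """Hypothetical model of a VDM: an exclusive set of grid nodes."""
    name: str
    nodes: set = field(default_factory=set)


class Grid:
    """Tracks which VDM owns each node so bindings stay mutually exclusive."""

    def __init__(self, node_names):
        self.free = set(node_names)  # nodes not bound to any VDM
        self.marts = {}

    def create_mart(self, name):
        self.marts[name] = VirtualDataMart(name)
        return self.marts[name]

    def grow(self, name, count):
        """Bind `count` free nodes to the named VDM (elastic growth)."""
        if count > len(self.free):
            raise ValueError("not enough free nodes")
        taken = {self.free.pop() for _ in range(count)}
        self.marts[name].nodes |= taken
        return taken

    def shrink(self, name, count):
        """Release `count` nodes from the named VDM back to the free pool."""
        mart = self.marts[name]
        if count > len(mart.nodes):
            raise ValueError("VDM has fewer nodes than requested")
        released = {mart.nodes.pop() for _ in range(count)}
        self.free |= released
        return released


grid = Grid(f"node{i}" for i in range(6))
grid.create_mart("VDM1")
grid.create_mart("VDM2")
grid.grow("VDM1", 3)
grid.grow("VDM2", 2)
grid.shrink("VDM1", 1)
```

Because a node is always moved between the free pool and exactly one VDM, no node can belong to two marts at once, which is the exclusivity property the slide states.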



Robust load engine


Loading can be from multiple modes:
•  Parallel bulk load processing:
      –  Load rates in excess of 250 GB/hr are common even with modest-size hardware nodes
•  Continuous and trickle feed via microbatching (change data capture)

Page-level snapshot versioning:
•  Allows non-blocking concurrent loads and queries

Load from client machines

[Figure: Extraction, transformation, and load (ETL) in SAP software. ETL projects scale out across nodes joined by a full-mesh interconnect over a storage fabric.]
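The trickle-feed idea (microbatching a change stream) can be sketched in a few lines. This is a generic illustration, not the Sybase IQ load engine: the `microbatches` and `load` helpers, the record shape and the batch size are all assumptions made for the example.

```python
from itertools import islice


def microbatches(changes, batch_size):
    """Yield lists of at most `batch_size` change records, in arrival order."""
    iterator = iter(changes)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def load(batch, table):
    """Stand-in for one bulk-load call: append the whole batch at once."""
    table.extend(batch)


# Hypothetical change-data-capture feed of (operation, key) records.
feed = [("insert", key) for key in range(7)]
target = []
batch_sizes = []
for batch in microbatches(feed, batch_size=3):
    load(batch, target)
    batch_sizes.append(len(batch))
```

Grouping changes into small batches keeps the per-load overhead low while still delivering data shortly after it arrives.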




Query engine
Distributed query processing


[Figure: Query 1 runs as a 5 node DQP and Query 2 as a 4 node DQP, both over a shared Storage Fabric]




 Massively parallel processing
•  Leader node: Receives and initiates queries
      –  Any node can be a leader
      –  Leader node may satisfy query within itself
•  Worker node: Nodes pick up work units from leader
      –  Many worker nodes per query
      –  Same worker node can serve multiple queries
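The leader/worker pattern above can be sketched as follows. This is a minimal illustration that assumes threads stand in for grid nodes and a simple sum stands in for a query; the function names are hypothetical and do not reflect how the product schedules work units.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_work_units(rows, unit_size):
    """Leader step: cut the input into independent work units."""
    return [rows[i:i + unit_size] for i in range(0, len(rows), unit_size)]


def run_work_unit(unit):
    """Worker step: compute a partial aggregate for one work unit."""
    return sum(unit)


def distributed_sum(rows, workers, unit_size=4):
    """Leader hands units to a pool of workers and combines partial results."""
    units = split_into_work_units(rows, unit_size)
    if workers == 0:
        # A leader may satisfy the query within itself.
        return sum(run_work_unit(u) for u in units)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(run_work_unit, units))


data = list(range(1, 101))
total_parallel = distributed_sum(data, workers=4)
total_leader_only = distributed_sum(data, workers=0)
```

Both paths return the same total, which is the point: distributing work units changes where the work runs, not the answer.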


Text search and analysis


Processing stages:
•  Text load: File ingestion into blob or clob
•  Text filtering: Filtering to plain text and formatting
•  Schema transform: Hierarchical to relational
•  Entity extraction: Categorization, tokenization

[Figure: Table in SAP Sybase IQ, Text index, Full text queries, Visualization]

Table in SAP Sybase IQ:
TextCol
abc
feed
dad
dead
beef
…

Text index:
ID   Term   Pos Info
0    a      1,3,4
1    b      1,5
2    c      1
3    d      2,3,4
4    e      2,4,5
5    f      2,5



Full-text queries:
SELECT * FROM myTable WHERE CONTAINS (TextCol, 'd'); -- returns rows
SELECT * FROM myTable CONTAINS (TextCol, 'd'); -- returns rows and scoring
SELECT * FROM myTable WHERE CONTAINS (TextCol, 'a AND NOT b'); -- Boolean
SELECT * FROM myTable WHERE CONTAINS (TextCol, 'a NEAR b'); -- proximity
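The text index on this slide behaves like an inverted index. The sketch below rebuilds it in Python from the slide's five example values, assuming (as the slide's numbers suggest) that each character is a term and that Pos Info lists the row numbers containing it. The helper names are hypothetical, and a real full-text engine tokenizes words rather than characters.

```python
from collections import defaultdict

# Rows from the slide's example column; row numbers start at 1.
text_col = ["abc", "feed", "dad", "dead", "beef"]


def build_index(values):
    """Map each term (here: one character) to the sorted rows containing it."""
    index = defaultdict(set)
    for row, value in enumerate(values, start=1):
        for term in value:
            index[term].add(row)
    return {term: sorted(rows) for term, rows in index.items()}


def contains(index, term):
    """Rows matching a single-term query, like CONTAINS (TextCol, 'd')."""
    return set(index.get(term, []))


index = build_index(text_col)
rows_d = sorted(contains(index, "d"))                               # [2, 3, 4]
rows_a_not_b = sorted(contains(index, "a") - contains(index, "b"))  # [3, 4]
```

A Boolean query such as 'a AND NOT b' reduces to set operations on the row lists, which is why the index can answer it without scanning the text column.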



In-database analytics


No compromise for complex analytics:
Ÿ  Basic to advanced analytical functions available to SQL
Ÿ  Data never leaves the database until results are materialized
Ÿ  Analytics code and models are shareable
Ÿ  Analytics code and models are applicable to the latest data set
Ÿ  Average developer can build in database analytical models




[Figure: Process in SAP Sybase IQ. Built-in functions and external DLLs run inside the database. Database = logic and filtering applied in database.]

Analytics simplified: Logic to data = fast and efficient
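The "logic to data" idea can be illustrated with Python's built-in `sqlite3` module as a stand-in database. This is an assumption-laden sketch: SQLite is not Sybase IQ, and the `sales` table and `score` function are hypothetical. It only shows the contrast between running a function inside a query and pulling every row out to the client.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 120.0), ("east", 80.0), ("west", 300.0), ("west", 20.0)],
)


def score(amount):
    """Hypothetical scoring model: flag large transactions."""
    return 1 if amount >= 100 else 0


# Register the function so SQL statements can call it by name.
conn.create_function("score", 1, score)

# Logic goes to the data: only the aggregated result is returned.
in_db = conn.execute(
    "SELECT region, SUM(score(amount)) FROM sales GROUP BY region ORDER BY region"
).fetchall()

# Contrast: fetch every row and compute in the client.
rows = conn.execute("SELECT region, amount FROM sales").fetchall()
client_side = {}
for region, amount in rows:
    client_side[region] = client_side.get(region, 0) + score(amount)
```

Both approaches agree on the result, but the first returns two rows while the second moves the whole table, a difference that grows with data volume.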



Federation
With external file systems (Hadoop distributed file system)


1.  Client-side federation: Join data from SAP Sybase IQ and Hadoop at a client-application level

2.  ETL. Load Hadoop data into column store of SAP Sybase IQ: Extract, transform, and load data from Hadoop distributed file system (HDFS) into schemas of SAP Sybase IQ

3.  Join HDFS data with data of SAP Sybase IQ on the fly: Fetch and join subsets of HDFS data on demand, using SQL queries from SAP Sybase IQ (data federation technique)

4.  Combine results of Hadoop MapReduce (MR) jobs with SAP Sybase IQ data on the fly: Initiate and join results of MR jobs on demand using SQL queries from data in SAP Sybase IQ (query federation technique)
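Option 1, client-side federation, amounts to joining two result sets inside the client application. A minimal sketch, assuming both sources have already been fetched into lists of dictionaries; the `hash_join` helper, the column names and the sample rows are hypothetical.

```python
def hash_join(left_rows, right_rows, key):
    """Join two lists of dicts on `key` using an in-memory hash table."""
    lookup = {}
    for row in right_rows:
        lookup.setdefault(row[key], []).append(row)
    joined = []
    for row in left_rows:
        for match in lookup.get(row[key], []):
            joined.append({**match, **row})
    return joined


# Hypothetical result set fetched from the warehouse.
warehouse_customers = [
    {"customer_id": 1, "segment": "retail"},
    {"customer_id": 2, "segment": "wholesale"},
]
# Hypothetical output of a Hadoop job, e.g. click counts per customer.
hadoop_clicks = [
    {"customer_id": 1, "clicks": 42},
    {"customer_id": 3, "clicks": 7},
]

combined = hash_join(warehouse_customers, hadoop_clicks, "customer_id")
```

The trade-off is that the client must hold both inputs in memory, which is one reason the slide also lists the server-side options 2 to 4.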


Native MapReduce
Highly distributed processing without Hadoop

        SELECT (Reducer… (Mapper… OVER PARTITION BY…) OVER PARTITION BY…)




[Figure: parallel mapper TPFs and parallel reducer TPFs over a shared Storage Fabric]




 •     TPFs (Table Parameterized Functions) consume/produce data sets in bulk
 •     TPFs run in parallel
 •     TPFs are fed with disjoint data sets
 •     TPFs can be arbitrarily nested to multiple levels via sub-queries
 •     TPFs are currently available in C++, a popular, performance-efficient language
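The mapper/reducer nesting shown in the SELECT above follows the general MapReduce pattern. The sketch below shows that pattern in plain Python, assuming a word count as the job and threads as the parallel units. It is not TPF code (the slide says TPFs are written in C++), and all names are hypothetical.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def mapper(partition):
    """Consume one disjoint partition in bulk, emit (word, 1) pairs."""
    return [(word, 1) for line in partition for word in line.split()]


def reducer(item):
    """Consume all values for one key, emit a single (key, total) pair."""
    key, values = item
    return key, sum(values)


def map_reduce(partitions):
    with ThreadPoolExecutor() as pool:
        # Mappers run in parallel, one per disjoint partition.
        mapped = pool.map(mapper, partitions)
        # Shuffle: repartition the mapper output by key.
        groups = defaultdict(list)
        for pairs in mapped:
            for key, value in pairs:
                groups[key].append(value)
        # Reducers run in parallel, one per key group.
        return dict(pool.map(reducer, groups.items()))


partitions = [["big data big"], ["data analytics"], ["big analytics"]]
counts = map_reduce(partitions)
```

The two PARTITION BY clauses in the slide's query play the role of the two partitioning steps here: one splits the input for the mappers, the other regroups mapper output by key for the reducers.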

SAP Sybase IQ: Continuing Innovation




     IMPORTANT LEGAL DISCLAIMER CONCERNING PROGRAM DATES, RELEASE-RELATED INFORMATION & CONTENT

     All forward-looking statements are subject to various risks and uncertainties that could cause actual results to differ
     materially from expectations. Readers are cautioned not to place undue reliance on these forward-looking statements,
     which speak only as of their dates, and they should not be relied upon in making purchasing decisions.
SAP Sybase IQ: Next Wave
Innovations for extremely large databases (XLDB)




Storage Architecture
•  New generation column store
•  New partitioning and compression

Loading Engine
•  Fully parallel bulk loading
•  Real-time loading into delta store

System Reliability
•  Grid resiliency
•  Data availability

Query Processing
•  Data affinity
•  Aggressively parallel and distributed

[Figure: SAP Sybase IQ: Next Wave at the center, labeled Petabytes and Real-time]



Summary
SAP SYBASE IQ
A COMPREHENSIVE PLATFORM FOR BIG DATA ANALYTICS

Eco-System
•  Optimized BI, EIM, Model, Replicate: Sybase PowerDesigner, Sybase Replication Server, SAP BusinessObjects, ISYS, Panopticon
•  Dev and admin tools: Bradmark, Symantec, Whitesands, Quest, ZEND
•  Predictive Analytics: SAS, SPSS, KXEN, Fuzzy Logix, Zementis, Visual Numerics
•  Packaged ILM apps: BMMSoft, SOLIX, PBS
•  Hadoop, R

App. Services
•  Comprehensive ANSI SQL w/OLAP
•  Built-in Full Text Search
•  InDB Analytics w/ MapReduce + simulator
•  Web 2.0 APIs
•  Big Data OpnSrc APIs

DBMS
•  Most mature column store
•  Comprehensive lifecycle tiering
•  MPP queries + Virtual Marts + User scaling
•  High Speed loads
•  Structured + Unstructured Store
© 2012 SAP AG. All rights reserved.


No part of this publication may be reproduced or transmitted in any form or for any purpose without the express permission of SAP AG. The information contained herein may be changed without prior notice.

Some software products marketed by SAP AG and its distributors contain proprietary software components of other software vendors.

Microsoft, Windows, Excel, Outlook, and PowerPoint are registered trademarks of Microsoft Corporation.

IBM, DB2, DB2 Universal Database, System i, System i5, System p, System p5, System x, System z, System z10, System z9, z10, z9, iSeries, pSeries, xSeries, zSeries, eServer, z/VM, z/OS, i5/OS, S/390, OS/390, OS/400, AS/400, S/390 Parallel Enterprise Server, PowerVM, Power Architecture, POWER6+, POWER6, POWER5+, POWER5, POWER, OpenPower, PowerPC, BatchPipes, BladeCenter, System Storage, GPFS, HACMP, RETAIN, DB2 Connect, RACF, Redbooks, OS/2, Parallel Sysplex, MVS/ESA, AIX, Intelligent Miner, WebSphere, Netfinity, Tivoli and Informix are trademarks or registered trademarks of IBM Corporation.

Linux is the registered trademark of Linus Torvalds in the U.S. and other countries.

Adobe, the Adobe logo, Acrobat, PostScript, and Reader are either trademarks or registered trademarks of Adobe Systems Incorporated in the United States and/or other countries.

Oracle and Java are registered trademarks of Oracle.

UNIX, X/Open, OSF/1, and Motif are registered trademarks of the Open Group.

Citrix, ICA, Program Neighborhood, MetaFrame, WinFrame, VideoFrame, and MultiWin are trademarks or registered trademarks of Citrix Systems, Inc.

HTML, XML, XHTML and W3C are trademarks or registered trademarks of W3C®, World Wide Web Consortium, Massachusetts Institute of Technology.

SAP, R/3, SAP NetWeaver, Duet, PartnerEdge, ByDesign, SAP BusinessObjects Explorer, StreamWork, SAP HANA, and other SAP products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of SAP AG in Germany and other countries.

Business Objects and the Business Objects logo, BusinessObjects, Crystal Reports, Crystal Decisions, Web Intelligence, Xcelsius, and other Business Objects products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of Business Objects Software Ltd. Business Objects is an SAP company.

Sybase and Adaptive Server, iAnywhere, Sybase 365, SQL Anywhere, and other Sybase products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of Sybase, Inc. Sybase is an SAP company.

All other product and service names mentioned are the trademarks of their respective companies. Data contained in this document serves informational purposes only. National product specifications may vary.

The information in this document is proprietary to SAP. No part of this document may be reproduced, copied, or transmitted in any form or for any purpose without the express prior written permission of SAP AG.
The Universe
 of Big Data




The Bloor Group
In marketing terms, BIG DATA is as big a trend as cloud computing (if you measure the trend in terms of column inches)
The Big Data Trend
    q    Corporate data volumes
          grow at about 55% per
          annum
    q    VLDB volumes grow at
          about 55% per annum
    q    This is exponential
    q    Data has been growing
          at this rate for at least
          20 years
    q    As such there is nothing
          new about big data
          other than the current
          data volumes which
          follow a well established
          trend
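To see what "about 55% per annum" implies when compounded, here is a short worked calculation. It assumes a constant annual rate, which is a simplification of the slide's claim; the function names are hypothetical.

```python
import math


def growth_factor(annual_rate, years):
    """Total multiplication of data volume after compounding for `years`."""
    return (1 + annual_rate) ** years


def doubling_time(annual_rate):
    """Years needed for volume to double at a constant annual growth rate."""
    return math.log(2) / math.log(1 + annual_rate)


factor_20_years = growth_factor(0.55, 20)
years_to_double = doubling_time(0.55)
```

Under that assumption, volume doubles roughly every 1.6 years and grows several thousandfold over 20 years, which is why steady growth at this rate still produces volumes that feel new.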

So What's New?

•  Volume, velocity, variety, verifiability and other words beginning with V, but not all at once
•  Hadoop is new
•  Big Data in the cloud is new
•  And there's a new dynamic in data analytics
•  Volume (and velocity) is now mostly about events, not transactions, and the world of embedded processors is going to expand the number of events worth processing


The Analytics Two-Step




The Future?

q    The data growth trend is likely to continue
q    More and more companies will be drawn into using
      Big Data technologies

q    Will the two-step become a one-step? Not sure. If
      you gather Big Data, you also need to be able to
      throw it away
q    RDBMS (column store) will remain as the analytics
      engine




•  Please explain why you believe that the Sybase IQ shared some things architecture is equal to or better than a shared nothing architecture.

•  Are you seeing the same trend that I seem to be noticing with Big Data in respect to analytics?

•  Roughly how many of your customers are using Hadoop?

•  If I were a Sybase customer, would you recommend Hadoop as an ETL mechanism, or is it your view that Sybase IQ can do it all?


•  Please describe the most extensive use of Sybase IQ (in respect of data volumes, daily ingest, instances, etc.).

•  How difficult is it to use (in other words, what are the labor/DBA overheads compared to a traditional RDBMS)?

•  Are your competitors always the usual suspects (i.e. other column store products)? Do you ever compete with the NoSQL crowd?

•  Explain how you usually fit with HANA in sites where both products are in use. Is HANA promoting sales of Sybase IQ?
•  This Month: Database
•  November: Cloud
•  December: Innovators
•  January: Big Data
•  2013 Editorial Calendar (www.insideanalysis.com)




Twitter Tag: #briefr
Twitter Tag: #briefr

The New Possible: Very Big Data for Serious Business Value

  • 3. !   Reveal the essential characteristics of enterprise software, good and bad !   Provide a forum for detailed analysis of today's innovative technologies !   Give vendors a chance to explain their product to savvy analysts !   Allow audience members to pose serious questions... and get answers! Twitter Tag: #briefr
  • 4. !  November: Cloud !  December: Innovators !  January: Big Data !  February: Performance !  March: Integration Twitter Tag: #briefr
  • 5. !  Databases were designed primarily to store information for retrieval at a later time. !  Big Data requires big databases. !  The convergence of multi-structured data and the need to perform both transactional and operational analytics has led to substantial innovations in database technologies. !  Today some of the biggest databases blend the best of both worlds, transforming the way organizations store and analyze enterprise data. Twitter Tag: #briefr
  • 6. Robin Bloor is Chief Analyst at The Bloor Group. Robin.Bloor@Bloorgroup.com Twitter Tag: #briefr
  • 7. !  German-founded SAP is one of the largest software companies in the world. Its best-known products are SAP ERP, SAP Business Warehouse, SAP Business Objects, SAP Sybase IQ and SAP HANA. !  SAP offers a comprehensive set of database management solutions that spans the needs of the enterprise, leveraging in-memory, cloud and mobile technologies. !  Recent innovations include a Big Data analytics platform that loads, processes and delivers massive amounts of multi-structured data and is accessible on demand enterprise-wide. Twitter Tag: #briefr
  • 8. Courtney Claussen is a product manager at Sybase, Inc., concentrating on Sybase's data warehousing and analytics products. She has enjoyed a 30-year career in software development, technical support and product marketing in the areas of computer-aided design, computer-aided software engineering, database management systems, middleware, and analytics. Twitter Tag: #briefr
  • 9. The New Possible: Very Big Data for Serious Business Value. The Briefing Room with Dr. Robin Bloor and SAP, October 9, 2012. CONFIDENTIAL
  • 10. AGENDA •  Big Data Analytics: A Reality •  SAP Sybase IQ: Built for Big Data Analytics •  SAP Sybase IQ: Continuing Innovation ©  2012 SAP AG. All rights reserved. 10
  • 11. Big Data Analytics A Reality
  • 12. THE NEW DYNAMICS OF BUSINESS: COMPETING ON BIG DATA DRIVEN ANALYTICS. Business Value*: New Strategies & Business Models; Operational Efficiencies; Revenue Growth. *A McKinsey study titled "Big Data: The next frontier for innovation, competition, and productivity", May 2011, found huge potential for Big Data Analytics, with metrics as impressive as 60% improvements in Retail operating margins, 8% reduction in (US) national healthcare expenditures, and $150M savings in operational efficiencies in European economies © 2012 SAP AG. All rights reserved. 12
  • 13. Getting Value from Big Data. Applied Big Data Analytics: Find supply chain inefficiencies; Predict financial performance; Uncover insurance fraud; Optimize stocking of products; Dispense correct health care; Maintain customer loyalty © 2012 SAP AG. All rights reserved. 13
  • 14. EDW AND BIG DATA PLATFORMS: CONTRASTS. EDW: Enterprise data, relational, structured, indexed text. Big Data: Clickstreams, sensors, log data, unstructured social media. EDW -> Big Data: Large -> ENORMOUS; Pre-processed data -> Raw data; Schema -> No schema; SQL -> Programmatic; OLAP -> Batch processing; Scale up -> Scale out. Business Value* © 2012 SAP AG. All rights reserved. 14
  • 15. EDW AND BIG DATA PLATFORMS: PARTNERSHIP. EDW: Enterprise data, relational, structured, indexed text. Big Data: Clickstreams, sensors, log data, unstructured social media. •  Combine all relevant data for better insights •  Real-time BI •  SQL declarative processing •  Big Data pre-processing with EDW deep analytics. Business Value* © 2012 SAP AG. All rights reserved. 15
  • 16. Big-data analytics plus data warehousing deserves a new platform. Data loading; Mobile; OLAP*; Integrated workflow; Web; Operational reporting; Specialized apps; Data mining. Support massive numbers of users and workloads. Analyze massive volumes of complex data from many sources. • Platform accessible to all business processes and all business users • Requirement for data and algorithms together in the platform • Ability to distribute interactions throughout the enterprise. • Volume • Velocity • Variety • Costs • Skills. • MapReduce • RDBMS+ • EDW • In-DB analytics • HDFS. *Online analytical processing +Relational database management system © 2012 SAP AG. All rights reserved. 16
  • 17. SAP SYBASE IQ BUILT FOR BIG DATA ANALYTICS
  • 18. Grid architecture System scale out Full Mesh Interconnect Storage Fabric Multi-dimensional scale out •  Multiple resources can scale out independently –  Storage, server (CPU, memory), SAN switches, interconnect can scale on their own •  Scale out is incremental and linear –  No need to add large units of monolithic CPU/storage pairs ©  2012 SAP AG. All rights reserved. 18
  • 19. Deployed use case: comScore Networks measures the digital world • comScore provides solutions for online audience measurement, e-commerce, advertising, search, video and mobile to analysts with digital marketing and vertical-specific industry expertise • Large SAP Sybase IQ Multiplex Grid on v15.x with tens of servers and hundreds of CPU cores • Manages more than 150 TB of data with trillions of rows and tens of thousands of tables • More than 200 concurrent users with highly parallel and distributed workload • Incrementally scalable on commodity hardware. Storage Fabric © 2012 SAP AG. All rights reserved. 19
  • 20. Community platform Elastic virtual data marts VDM1 VDM2 Shared Full Mesh High Speed Interconnect Virtual Shared CPU, Memory Logical Server 1 Logical Server 2 Storage Fabric Virtual Shared Storage Virtual data marts •  VDM is logical binding of mutually exclusive nodes, memory, storage –  Logical Server (LS) is a mutually exclusive logical binding of nodes, memory –  Logical Server (LS) is a subset of VDM –  Bindings are elastic i.e. they can dynamically grow/shrink ©  2012 SAP AG. All rights reserved. 20
  • 21. Robust load engine. Loading can be from multiple modes: • Parallel bulk load processing: load rates in excess of 250 GB/hr are common even with modest-size hardware nodes • Continuous and trickle feed via microbatching (change data capture). Page-level snapshot versioning: • Allows non-blocking concurrent loads and queries. Load from client machines. Diagram: Extraction, transformation, and load (ETL) in SAP software; ETL projects scale out across the full-mesh interconnect and storage fabric © 2012 SAP AG. All rights reserved. 21
  • 22. Query engine Distributed query processing Query 1 Query 2 5 node DQP 4 node DQP Storage Fabric Massively parallel processing •  Leader node: Receives and initiates queries –  Any node can be a leader –  Leader node may satisfy query within itself •  Worker node: Nodes pick up work units from leader –  Many worker nodes per query –  Same worker node can serve multiple queries ©  2012 SAP AG. All rights reserved. 22
  • 23. Text search and analysis. Diagram labels: File ingestion into blob or clob; Text load; Text filtering (filtering to plain text and formatting); Schema transform (hierarchical to relational); Entity extraction; Categorization; tokenization; Text index; Full text queries; Visualization. Table in SAP Sybase IQ, TextCol: abc, feed, dad, dead, beef, … Text index (ID, Term, Pos Info): 0 a 1,3,4; 1 b 1,5; 2 c 1; 3 d 2,3,4; 4 e 2,4,5; 5 f 2,5; … Full-text queries: SELECT * FROM myTable WHERE CONTAINS (TextCol, 'd'); -- returns rows. SELECT * FROM myTable CONTAINS (TextCol, 'd'); -- returns rows and scoring. SELECT * FROM myTable WHERE CONTAINS (TextCol, 'a AND NOT b'); -- Boolean. SELECT * FROM myTable WHERE CONTAINS (TextCol, 'a NEAR b'); -- proximity © 2012 SAP AG. All rights reserved. 23
  • 24. In-database analytics. No compromise for complex analytics: • Basic to advanced analytical functions available to SQL • Data never leaves the database until results are materialized • Analytics code and models are shareable • Analytics code and models are applicable to the latest data set • Average developer can build in-database analytical models. Diagram: Process in SAP Sybase IQ database = logic and filtering; built-in functions; external DLL "A" applied in database. Analytics simplified: Logic to data = fast and efficient © 2012 SAP AG. All rights reserved. 24
  • 25. Federation with external file systems (Hadoop distributed file system). 1. Client-side federation: Join data from SAP Sybase IQ and Hadoop at a client-application level. 2. ETL: Load Hadoop data into column store of SAP Sybase IQ: Extract, transform, and load data from Hadoop distributed file system (HDFS) into schemas of SAP Sybase IQ. 3. Join HDFS data with data of SAP Sybase IQ on the fly: Fetch and join subsets of HDFS data on demand, using SQL queries from SAP Sybase IQ (data federation technique). 4. Combine results of Hadoop MapReduce (MR) jobs with SAP Sybase IQ data on the fly: Initiate and join results of MR jobs on demand using SQL queries from data in SAP Sybase IQ (query federation technique) © 2012 SAP AG. All rights reserved. 25
  • 26. Native MapReduce Highly distributed processing without Hadoop SELECT (Reducer… (Mapper… OVER PARTITION BY…) OVER PARTITION BY…) … … Parallel mapper TPFs Parallel reducer TPFs Storage Fabric •  TPFs (Table Parameterized Functions) consume/produce data sets in bulk •  TPFs run in parallel •  TPFs are fed with disjoint data sets •  TPFs can be arbitrarily nested to multiple levels via sub-queries •  TPFs currently available in popular, performance efficient C++ ©  2012 SAP AG. All rights reserved. 26
  • 27. SAP Sybase IQ: Continuing Innovation IMPORTANT LEGAL DISCLAIMER CONCERNING PROGRAM DATES, RELEASE-RELATED INFORMATION & CONTENT All forward-looking statements are subject to various risks and uncertainties that could cause actual results to differ materially from expectations. Readers are cautioned not to place undue reliance on these forward-looking statements, which speak only as of their dates, and they should not be relied upon in making purchasing decisions.
  • 28. SAP Sybase IQ: Next Wave. Innovations for extremely large databases (XLDB). Petabytes; Real-time. Storage Architecture: •  New generation column store •  New partitioning and compression. Loading Engine: •  Fully parallel bulk loading •  Real-time loading into delta store. System Reliability: •  Grid resiliency •  Data availability. Query Processing: •  Data affinity •  Aggressively parallel and distributed © 2012 SAP AG. All rights reserved. 28
  • 30. SAP SYBASE IQ: A COMPREHENSIVE PLATFORM FOR BIG DATA ANALYTICS. Eco-System: Sybase PowerDesigner, Sybase Replication Server, SAP BusinessObjects, Bradmark, Symantec, Whitesands, Quest, ZEND, SAS, SPSS, KXEN, Fuzzy Logix, Zementis, Visual Numerics, BMMSoft, SOLIX, PBS, ISYS, Panopticon. Eco-System categories: Optimized BI, EIM, Model, Replicate; Dev and admin tools; Predictive Analytics; Packaged ILM apps. App. Services: Web 2.0 APIs; Comprehensive ANSI SQL w/OLAP; Built-in Full Text Search; InDB Analytics w/ MapReduce + simulator; Big Data OpnSrc APIs (Hadoop, R). DBMS: Most mature column store; Comprehensive lifecycle tiering; MPP queries + Virtual Marts + User scaling; High Speed loads; Structured + Unstructured Store © 2012 SAP AG. All rights reserved. 30
  • 31. © 2012 SAP AG. All rights reserved. No part of this publication may be reproduced or transmitted in any form or for any purpose without the express permission of SAP AG. The information contained herein may be changed without prior notice. Some software products marketed by SAP AG and its distributors contain proprietary software components of other software vendors. Microsoft, Windows, Excel, Outlook, and PowerPoint are registered trademarks of Microsoft Corporation. IBM, DB2, DB2 Universal Database, System i, System i5, System p, System p5, System x, System z, System z10, System z9, z10, z9, iSeries, pSeries, xSeries, zSeries, eServer, z/VM, z/OS, i5/OS, S/390, OS/390, OS/400, AS/400, S/390 Parallel Enterprise Server, PowerVM, Power Architecture, POWER6+, POWER6, POWER5+, POWER5, POWER, OpenPower, PowerPC, BatchPipes, BladeCenter, System Storage, GPFS, HACMP, RETAIN, DB2 Connect, RACF, Redbooks, OS/2, Parallel Sysplex, MVS/ESA, AIX, Intelligent Miner, WebSphere, Netfinity, Tivoli and Informix are trademarks or registered trademarks of IBM Corporation. Linux is the registered trademark of Linus Torvalds in the U.S. and other countries. Adobe, the Adobe logo, Acrobat, PostScript, and Reader are either trademarks or registered trademarks of Adobe Systems Incorporated in the United States and/or other countries. Oracle and Java are registered trademarks of Oracle. UNIX, X/Open, OSF/1, and Motif are registered trademarks of the Open Group. Citrix, ICA, Program Neighborhood, MetaFrame, WinFrame, VideoFrame, and MultiWin are trademarks or registered trademarks of Citrix Systems, Inc. HTML, XML, XHTML and W3C are trademarks or registered trademarks of W3C®, World Wide Web Consortium, Massachusetts Institute of Technology. SAP, R/3, SAP NetWeaver, Duet, PartnerEdge, ByDesign, SAP BusinessObjects Explorer, StreamWork, SAP HANA, and other SAP products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of SAP AG in Germany and other countries. Business Objects and the Business Objects logo, BusinessObjects, Crystal Reports, Crystal Decisions, Web Intelligence, Xcelsius, and other Business Objects products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of Business Objects Software Ltd. Business Objects is an SAP company. Sybase and Adaptive Server, iAnywhere, Sybase 365, SQL Anywhere, and other Sybase products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of Sybase, Inc. Sybase is an SAP company. All other product and service names mentioned are the trademarks of their respective companies. Data contained in this document serves informational purposes only. National product specifications may vary. The information in this document is proprietary to SAP. No part of this document may be reproduced, copied, or transmitted in any form or for any purpose without the express prior written permission of SAP AG. © 2012 SAP AG. All rights reserved. 31
  • 33. The Universe of Big Data The  Bloor  Group  
  • 34. In marketing terms BIG DATA is as big a trend as cloud computing (if you measure the trend in terms of column inches) The  Bloor  Group  
  • 35. The Big Data Trend •  Corporate data volumes grow at about 55% per annum •  VLDB volumes grow at about 55% per annum •  This is exponential •  Data has been growing at this rate for at least 20 years •  As such there is nothing new about big data other than the current data volumes, which follow a well-established trend Twitter Tag: #briefr The Bloor Group
  • 36. So What's New? •  Volume, velocity, variety, verifiability and other words beginning with V - but not all at once •  Hadoop is new •  Big Data in the cloud is new •  And there's a new dynamic in data analytics •  Volume (and velocity) is now mostly about events, not transactions, and the world of embedded processors is going to expand the number of events worth processing The Bloor Group
  • 37. The Analytics Two-Step The  Bloor  Group  
  • 38. The Future? •  The data growth trend is likely to continue •  More and more companies will be drawn into using Big Data technologies •  Will the two-step become a one-step? Not sure. If you gather Big Data, you also need to be able to throw it away •  RDBMS (column store) will remain as the analytics engine The Bloor Group
  • 39. !  Please explain why you believe that the Sybase IQ shared some things architecture is equal to or better than a shared nothing architecture? !  Are you seeing the same trend that I seem to be noticing with Big Data in respect to analytics? !  Roughly how many of your customers are using Hadoop? !  If I were a Sybase customer would you recommend Hadoop as an ETL mechanism or is it your view that Sybase IQ can do it all? The  Bloor  Group  
  • 40. !  Please describe the most extensive use of Sybase IQ (in respect of data volumes, daily ingest, instances, etc.). !  How difficult is it to use (in other words, what are the labor/DBA overheads compared to a traditional RDBMS)? !  Are your competitors always the usual suspects (i.e. other column store products)? Do you ever compete with the NoSQL crowd? !  Explain how you usually fit with HANA in sites where both products are in use. Is HANA promoting sales of Sybase IQ? The  Bloor  Group  
  • 42. !  This Month: Database !  November: Cloud !  December: Innovators !  January: Big Data !  2013 Editorial Calendar (www.insideanalysis.com) Twitter Tag: #briefr
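
The text index table on slide 23 (Term, Pos Info) can be reproduced with a small inverted-index sketch. This is only an illustration, not how SAP Sybase IQ implements its text index. It assumes the five sample rows are abc, feed, dad, dead and beef, numbered 1 to 5 in that order (inferred from the slide's posting lists), and that each letter is treated as a term, as the slide does. The names `ROWS`, `build_index`, `contains` and `contains_and_not` are hypothetical.

```python
from collections import defaultdict

# Assumed sample rows, inferred from the posting lists on slide 23:
# row 1 = "abc", row 2 = "feed", row 3 = "dad", row 4 = "dead", row 5 = "beef".
ROWS = {1: "abc", 2: "feed", 3: "dad", 4: "dead", 5: "beef"}


def build_index(rows):
    """Map each term (here a single letter) to the sorted row ids containing it."""
    postings = defaultdict(set)
    for row_id, text in rows.items():
        for term in text:
            postings[term].add(row_id)
    return {term: sorted(ids) for term, ids in sorted(postings.items())}


def contains(index, term):
    """Row ids matching one term, in the spirit of CONTAINS (TextCol, 'd')."""
    return index.get(term, [])


def contains_and_not(index, wanted, unwanted):
    """Row ids with `wanted` but without `unwanted`, like 'a AND NOT b'."""
    return sorted(set(index.get(wanted, [])) - set(index.get(unwanted, [])))


INDEX = build_index(ROWS)

if __name__ == "__main__":
    for term_id, (term, row_ids) in enumerate(INDEX.items()):
        print(term_id, term, ",".join(map(str, row_ids)))
```

Under these assumptions the printed table matches the slide's six index entries, and the Boolean query `a AND NOT b` reduces to a set difference of two posting lists. A production text index would tokenize words rather than letters and would also store positions for proximity queries such as `a NEAR b`.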