Big Data
Jean-Pierre Dijcks
Agenda

•   Big Data
•   Strategy
•   Technology
•   Use Cases
Big Data




3   Copyright © 2011, Oracle and/or its affiliates. All rights reserved.
Big Data
            React to an Event                    Pro-Actively Change Outcomes




                         “Technology presents the opportunity
                               to transform business”*
                                 Mark Hurd, President, Oracle

* Oracle Profit Magazine, Volume 17, Number 1
Big Data’s Key Ingredient

  “Improvement merely lets you hit the numbers. Creativity is what transforms.”*
  Ron Johnson, CEO, JCPenney

  Chart:
  •   Big Data transforms our business: 5%
  •   Big Data improves our business: 20%
  •   What is Big Data?: 75%


* Fortune Magazine VOL. 165, NO. 4
Big Data Extends the Breadth and Speed of Data

Big Data: Decisions based on all your data
•   Video and Images
•   Documents
•   Social Data
•   Machine-Generated Data

Information Architectures Today: Decisions based on database data
•   Transactions
Big Data Extends the Depth of Analytics




•   Query and Reporting
•   Statistics
•   Data Mining
•   Graph Analytics
•   Spatial Analytics
•   Text Analytics
Big Data Defined


    Big Data: Techniques and
    Technologies that Enable Enterprises
    to Effectively and Economically
    Analyze All of their Data
Strategy
Strategic Transformations


•   From Reporting to Analytics
•   From Rear-view Mirror to Autonomous Actions
•   From Transactional Data to All Data
Oracle’s Big Data solution
Stages: Acquire, Organize & Discover, Analyze, Decide

Components shown in the diagram:
•   Oracle Big Data Appliance
•   Oracle Big Data Connectors
•   Oracle Exadata
•   Oracle Exalytics
•   Endeca Information Discovery
•   CEP
•   Oracle Real-Time Decisions
•   InfiniBand interconnects

Oracle Big Data Strategy

•   BI Tools
•   Data Discovery Tools
•   Data Management
•   Advanced Analytics
•   Semantic
•   Text
•   Graph
•   Spatial
•   CEP & RTD
•   Management Infrastructure

Build      Acquire      Adopt      Engineer
Technology
Big Data Appliance
Hardware:
       • 288 CPU cores with 1152 GB RAM
       • 648 TB of raw disk storage
       • 40 Gb/s InfiniBand
Integrated Software:
   •   Oracle Linux
   •   Oracle Java VM
   •   Cloudera Distribution of Apache Hadoop (CDH)
   •   Cloudera Manager
   •   Open-source distribution of R
   •   NoSQL Database Community Edition
All integrated software (except NoSQL DB CE) is supported as part of Premier Support for Systems and Premier Support for
Operating Systems
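
As a rough illustration of what the raw storage figure means for Hadoop, the sketch below divides the 648 TB of raw disk by an HDFS replication factor. The factor of 3 is HDFS's default `dfs.replication` value, not a setting stated on the slide, and the function name is hypothetical. The estimate ignores space used by the operating system, temporary MapReduce data, and NoSQL Database.

```python
# Raw disk capacity quoted on the slide, in terabytes.
RAW_TB = 648


def usable_capacity_tb(raw_tb: float, replication: int = 3) -> float:
    """Approximate HDFS capacity left after block replication.

    Assumption: every block is stored `replication` times (HDFS default is 3).
    """
    if replication < 1:
        raise ValueError("replication must be at least 1")
    return raw_tb / replication


capacity = usable_capacity_tb(RAW_TB)
print(f"Approximate usable HDFS capacity: {capacity:.0f} TB")
```

Under these assumptions, roughly one third of the raw capacity is available for unique data.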
Oracle Big Data Appliance

•   File System Mount: FUSE-DFS
•   UI Framework: HUE
•   SDK: HUE SDK
•   Workflow: Apache Oozie
•   Scheduling: Apache Oozie
•   Metadata: Apache Hive
•   Languages / Compilers: Apache Pig, Apache Hive, Apache Mahout
•   Data Integration: Apache Flume, Apache Sqoop
•   Fast Read/Write Access: Apache HBase
•   Coordination: Apache ZooKeeper
•   Core: HDFS, MapReduce
Why Cloudera?

• Includes Open Source Apache Hadoop
  – Fast evolution in critical features
  – Proven at very large scale
• Managed Distribution
  – Components certified to work together in regular updates
  – Cloudera Manager provides Management GUI
• Most popular distribution in the market
Oracle and Cloudera

• All Cloudera software pre-installed and pre-configured
  on BDA
  – Engineered with Cloudera
• All Cloudera assets included
  – Single Oracle Product SKU for HW & SW
  – Single Oracle Support SKU for HW & SW (life of the machine)
• Oracle is the single point of contact for the solution
Price comparison

Oracle Big Data Appliance
                         Year 1      Year 2     Year 3     Total
 BDA Cost                $450,000
 Support Cost            $54,000     $54,000    $54,000
 On-site Installation    $14,150
 Total                   $518,150    $54,000    $54,000    $626,150

“Build-Your-Own”: HP hardware and Cloudera
                         Year 1      Year 2     Year 3     Total
 Servers and switches    $428,220
 Support Cost            $136,233    $72,000    $72,000
 Installation &
 configuration           not included
 Total                   $564,453    $72,000    $72,000    $708,453



Full details at https://blogs.oracle.com/datawarehousing/entry/price_comparison_for_big_data
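
The totals in the price comparison can be re-derived from the line items. The short sketch below does that arithmetic; the dictionary layout and function names are illustrative only, and the figures are copied from the slide. Note that the build-your-own column excludes installation and configuration, so its total is a lower bound.

```python
# Yearly cost figures copied from the price comparison slide (US dollars).
bda = {
    "BDA cost": [450_000, 0, 0],
    "Support": [54_000, 54_000, 54_000],
    "On-site installation": [14_150, 0, 0],
}
build_your_own = {
    "Servers and switches": [428_220, 0, 0],
    "Support": [136_233, 72_000, 72_000],
    # Installation and configuration are not included on the slide.
}


def yearly_totals(costs):
    """Sum all line items per year."""
    return [sum(year) for year in zip(*costs.values())]


def three_year_total(costs):
    """Sum the yearly totals over the three-year period."""
    return sum(yearly_totals(costs))


print(yearly_totals(bda), three_year_total(bda))
print(yearly_totals(build_your_own), three_year_total(build_your_own))
print(three_year_total(build_your_own) - three_year_total(bda))
```

The computed totals match the slide ($626,150 and $708,453), a difference of $82,303 over three years.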
Oracle NoSQL Database
A distributed, scalable key-value database
•   Simple Data Model
     •   Key-value pair with major+sub-key paradigm
     •   Read/insert/update/delete operations
•   Scalability
     •   Dynamic data partitioning and distribution
     •   Optimized data access via intelligent driver
•   High availability
     •   One or more replicas
     •   Disaster recovery through location of replicas
     •   Resilient to partition master failures
     •   No single point of failure
•   Transparent load balancing
     •   Reads from master or replicas
     •   Driver is network topology & latency aware

Diagram: Applications, each with a NoSQLDB Driver, access Storage Nodes in Data Center A and Data Center B.
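
To make the major+sub-key idea concrete, here is a minimal in-memory sketch. It is not Oracle NoSQL Database's API: the class, method names, and hashing scheme are hypothetical, chosen only to show how hashing the major key can keep all sub-keys of one record group on the same partition.

```python
import hashlib
from collections import defaultdict


class ToyKVStore:
    """In-memory stand-in for a partitioned key-value store (hypothetical)."""

    def __init__(self, num_partitions: int = 4):
        self.num_partitions = num_partitions
        # partition index -> {(major, sub): value}
        self.partitions = defaultdict(dict)

    def _partition_for(self, major: str) -> int:
        # Assumption: only the major key decides the partition.
        digest = hashlib.sha256(major.encode("utf-8")).hexdigest()
        return int(digest, 16) % self.num_partitions

    def put(self, major: str, sub: str, value) -> None:
        """Insert or update a value."""
        self.partitions[self._partition_for(major)][(major, sub)] = value

    def get(self, major: str, sub: str):
        """Read a value, or None if the key is absent."""
        return self.partitions[self._partition_for(major)].get((major, sub))

    def delete(self, major: str, sub: str) -> bool:
        """Delete a value; return True if it existed."""
        part = self.partitions[self._partition_for(major)]
        return part.pop((major, sub), None) is not None

    def sub_keys(self, major: str):
        """List all sub-keys stored under one major key."""
        part = self.partitions[self._partition_for(major)]
        return sorted(s for (m, s) in part if m == major)


store = ToyKVStore()
store.put("user/jane", "profile", {"age": 32})
store.put("user/jane", "coupons", ["20% off"])
print(store.sub_keys("user/jane"))
```

Because both records share the major key `user/jane`, a single partition lookup returns all of them. Replication and the topology-aware driver described on the slide are outside the scope of this sketch.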
Big Data Connectors
Optimized integration of Hadoop with Oracle Database
and Oracle Exadata
• Oracle Loader for Hadoop
• Oracle Direct Connector for Hadoop Distributed File System
  (HDFS)
• Oracle Data Integrator Application Adapter for Hadoop
• Oracle R Connector for Hadoop


• Does not require Big Data Appliance – can be licensed for
  Hadoop running on non-Oracle hardware
Oracle Loader for Hadoop
Use The Cluster
•   Last stage in MapReduce workflow
•   Partitioned and non-partitioned tables
•   Online and offline loads

Diagram: MAP, SHUFFLE/SORT, and REDUCE stages, labeled Oracle Loader for Hadoop.
Oracle Direct Connector for HDFS
Direct Access from Oracle Database

•   SQL access to HDFS
•   External table view
•   Data query or import

Diagram: SQL Query, External Table, DCH, and HDFS Client inside Oracle Database, connected to HDFS over InfiniBand.
Oracle Data Integrator
Simplifying MapReduce

Oracle Data Integrator with Oracle Loader for Hadoop:
•   Automatically generates MapReduce code
•   Manages the process
•   Loads into Data Warehouse
What is Data Discovery?
  Simplified


  Quickly explore all relevant data

•   Relationships undefined or unknown
•   No pre-defined model required
•   Rapid, iterative change

•   Advanced search
•   Faceted navigation
•   Analytics

•   Structured
•   Semi-structured
•   Unstructured
•   Messy data
•   Beyond the data warehouse
Business Intelligence and Data Discovery
 Complementary Solutions, Integrated Business Processes
Business Intelligence: Known & Clearly Defined Questions (Who, What, When?)
•   Modeled Data: Conforms to a Single Model
•   Proven Answers to Known Questions

Data Discovery: Uncertain or Open-Ended Questions (Why, How, What Else?)
•   Un-modeled Data: Diverse and Changing Models
•   Fast Answers to New Questions

Links between the two:
•   Insights yield mature models and KPIs
•   New questions require new data, exploration
Oracle Endeca Information Discovery
A platform for data discovery applications across the enterprise


                                        Endeca Information Discovery
                                        (EID) helps organizations
                                        quickly explore all relevant data
                                        • Combine structured & unstructured
                                          data from disparate systems
                                        • Rapidly assemble easy to use
                                          analysis applications
                                        • Automatically organize information
                                          for search, discovery & analysis
Big Data: Why Deeper Analytics?
Communications
    Enhanced churn prediction with social network analytics
    Consider each customer’s value as part of their social network
       •   Focus retention campaigns on high-value social networks
       •   Identify new prospective high-value customers
       •   Target promotions for upselling and cross-selling to key social network influencers
       •   Identify rotational churners and exclude from retention offers

Insurance
    Automated deep analytics for fraud and abuse in insurance claims processing
       •   Enhance fraud analytics by considering text data (assessor’s reports, police reports, witness interviews) in addition to transaction data
       •   Investigate claims that have the highest expected risk (based on likelihood of fraud and claim size)
       •   Focus scarce investigative resources and create feedback loop for automated analysis

Retail
    Identify and respond to shifts in behavior
    Combine past and most recent point-of-sale data with customer information
       •   Track and monitor shifts in individual customer behaviors and household purchases
       •   Anticipate new up-sell and cross-sell opportunities


    27   |   © 2012 Oracle Corporation
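
The insurance example ranks claims by expected risk, "based on likelihood of fraud and claim size". One simple reading of that phrase, assumed here and not stated on the slide, is to multiply a fraud probability by the claim amount and investigate the highest products first. The class, field names, and sample figures below are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Claim:
    claim_id: str
    fraud_probability: float  # hypothetical model score in [0, 1]
    claim_size: float         # claimed amount in dollars

    @property
    def expected_risk(self) -> float:
        # Assumption: expected risk = probability of fraud x claim size.
        return self.fraud_probability * self.claim_size


def top_claims(claims, budget):
    """Return the `budget` claims with the highest expected risk."""
    ranked = sorted(claims, key=lambda c: c.expected_risk, reverse=True)
    return ranked[:budget]


# Made-up sample data for illustration only.
claims = [
    Claim("A", 0.90, 1_000),
    Claim("B", 0.10, 50_000),
    Claim("C", 0.50, 4_000),
]
selected = [c.claim_id for c in top_claims(claims, budget=2)]
print(selected)
```

With this made-up data, claim A has the highest fraud probability but the lowest expected risk, so a two-claim investigation budget goes to B and C instead.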
Deeper Analytics:
     Oracle Advanced Analytics
                                     • Oracle Advanced Analytics extends Oracle
                                       Database into a comprehensive analytical platform
                                       – Predictive analytics, data mining, text mining, statistical
                                         analysis, advanced numerical computations
                                     • Scalable and parallel: analyze huge volumes of
                                       data
                                     • Tightly integrated with SQL: share results of
                                       analytics throughout enterprise
                                     • Built for data analysts

Oracle Advanced Analytics: Data Mining
• 12 cutting-edge machine-learning algorithms
       – Parallel model creation
       – Data transformation and preparation for data mining
       – Scalable model creation
       – Efficient scoring for large volumes
       – Data Miner GUI to build and evaluate data mining models
• Data Mining can provide valuable results:
   – Predict customer behavior (Classification)
   – Predict or estimate a value (Regression)
   – Segment a population (Clustering)
   – Identify factors more associated with a business
     problem (Attribute Importance)
   – Find profiles of targeted people or items (Decision Trees)
   – Determine important relationships and “market baskets” within the population (Associations)
   – Find fraudulent or “rare events” (Anomaly Detection)
Oracle Advanced Analytics: Oracle R Enterprise
 • Oracle R Enterprise brings R’s statistical
   functionality closer to the Oracle Database
 1. Eliminate R’s memory constraint by enabling R
    to work directly & transparently on database objects
              – Allows R to run on very large data sets
 2. Architected for Enterprise production infrastructure
              – Automatically exploits database parallelism without requiring
                parallel R programming
              – Build and immediately deploy
 3. Oracle R leverages the latest R algorithms and packages
              – R is an embedded component of the DBMS server
Use Cases
Big Data Architecture Pattern
•   Acquire: Real-time & Batch Feeds
•   Capture: Data Handlers, Real-time Event Detection
•   Organize
     •   Low value density data: Algorithms (Filter, Index, Classify, Correlate)
     •   High value data: ETL
•   Store
     •   Low value density data: HDFS, NoSQL
     •   High value data: Relational, Semantic/Spatial
•   Analyze: Text Analytics, Statistics, Data Mining, Graph Analytics, Spatial Analytics
•   Integrate into Applications: Operational Systems (Front End, Back End)




Big Data Examples
Insurance
    Individualize auto-insurance policies based on newly captured vehicle telemetry data
    Insurer gains insight into customer’s driving habits, delivering
       •   More accurate assessments of risks
       •   Individualized pricing based on actual individual customer driving habits
       •   Guide and motivate individual customers to improve their driving habits

Travel
    Optimize buying experience through web log and social media data analysis
    Travel site gains insight into customer preferences and desires
       •   Up-selling products by correlating current sales with (subsequent) browsing behavior
       •   Increase browse-to-buy conversions via customized offers and packages
       •   Deliver personalized travel recommendations based on social media data

Games
    Collect gaming data to optimize spend within and across games
    Games company gains insight into likes, dislikes and relationships of its users
       •   Enhance games to drive customer spend within games
       •   Recommend other content based on analysis of player connections and similar “likes”
       •   Create special offers or packages based on browsing and (non-)buying behavior


Big Data Use Case: Smart Mall
•   Customer enters mall area (based on cell phone location data)
•   Customer Profile: Jane Doe, 32, Married, 2 kids (2 & 4 yrs), Uses our coupons
•   Send Coupon: 20% off item when used in the next 15 minutes
•   Point of Sale Capture:
     •   Coupon used
     •   3 items bought (up 1)
     •   Increased spend (up $10)

Big Data Technology Pattern
Diagram elements:
•   Inputs: Collection & Decision Points, Streaming, Social Feeds
•   Steps: Filter, Identify User, Enrich, Decision, Deliver Coupon, Map Reduce, Analyze
•   Analysis outputs: Models, Scores
•   Products: CEP, Big Data Appliance, Big Data Connectors, Oracle RTD
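
The filter, identify, enrich, decide, and deliver steps of the Smart Mall use case can be sketched as a single event handler. This is only an illustration of the flow: it does not use any Oracle CEP or RTD API, the profile store and the second customer are hypothetical, and a fixed rule stands in for the scoring models. The 20% discount and 15-minute validity come from the use case slide.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical profile store keyed by phone identifier.
PROFILES = {
    "phone-123": {"name": "Jane Doe", "uses_coupons": True},
    "phone-456": {"name": "John Roe", "uses_coupons": False},  # made-up customer
}


@dataclass
class LocationEvent:
    phone_id: str
    in_mall_area: bool
    timestamp: datetime


@dataclass
class Coupon:
    customer: str
    discount_percent: int
    expires_at: datetime


def handle_event(event: LocationEvent) -> Optional[Coupon]:
    # Filter: ignore location events outside the mall area.
    if not event.in_mall_area:
        return None
    # Identify user and enrich the event with the stored profile.
    profile = PROFILES.get(event.phone_id)
    if profile is None:
        return None
    # Decision: a fixed rule standing in for a model score.
    if not profile["uses_coupons"]:
        return None
    # Deliver coupon: 20% off, valid for the next 15 minutes.
    return Coupon(
        customer=profile["name"],
        discount_percent=20,
        expires_at=event.timestamp + timedelta(minutes=15),
    )


now = datetime(2012, 10, 1, 12, 0)
coupon = handle_event(LocationEvent("phone-123", True, now))
print(coupon)
```

In a real deployment the decision step would consult models and scores produced by the batch analysis side of the pattern rather than a hard-coded rule.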

2012 10 bigdata_overview

Contenu connexe

Tendances

Hadoop World 2011: I Want to Be BIG - Lessons Learned at Scale - David "Sunny...
Hadoop World 2011: I Want to Be BIG - Lessons Learned at Scale - David "Sunny...Hadoop World 2011: I Want to Be BIG - Lessons Learned at Scale - David "Sunny...
Hadoop World 2011: I Want to Be BIG - Lessons Learned at Scale - David "Sunny...Cloudera, Inc.
 
Hadoop in the Enterprise - Dr. Amr Awadallah @ Microstrategy World 2011
Hadoop in the Enterprise - Dr. Amr Awadallah @ Microstrategy World 2011Hadoop in the Enterprise - Dr. Amr Awadallah @ Microstrategy World 2011
Hadoop in the Enterprise - Dr. Amr Awadallah @ Microstrategy World 2011Cloudera, Inc.
 
Building a Modern Data Architecture by Ben Sharma at Strata + Hadoop World Sa...
Building a Modern Data Architecture by Ben Sharma at Strata + Hadoop World Sa...Building a Modern Data Architecture by Ben Sharma at Strata + Hadoop World Sa...
Building a Modern Data Architecture by Ben Sharma at Strata + Hadoop World Sa...Zaloni
 
Modern-Data-Warehouses-In-The-Cloud-Use-Cases-Slim-Baltagi
Modern-Data-Warehouses-In-The-Cloud-Use-Cases-Slim-BaltagiModern-Data-Warehouses-In-The-Cloud-Use-Cases-Slim-Baltagi
Modern-Data-Warehouses-In-The-Cloud-Use-Cases-Slim-BaltagiSlim Baltagi
 
From Hadoop to Enterprise Data Warehouse
From Hadoop to Enterprise Data WarehouseFrom Hadoop to Enterprise Data Warehouse
From Hadoop to Enterprise Data WarehouseBui Ha
 
Big Data Day LA 2015 - Data Lake - Re Birth of Enterprise Data Thinking by Ra...
Big Data Day LA 2015 - Data Lake - Re Birth of Enterprise Data Thinking by Ra...Big Data Day LA 2015 - Data Lake - Re Birth of Enterprise Data Thinking by Ra...
Big Data Day LA 2015 - Data Lake - Re Birth of Enterprise Data Thinking by Ra...Data Con LA
 
Enterprise Data Lake - Scalable Digital
Enterprise Data Lake - Scalable DigitalEnterprise Data Lake - Scalable Digital
Enterprise Data Lake - Scalable Digitalsambiswal
 
Piranha vs. mammoth predator appliances that chew up big data
Piranha vs. mammoth   predator appliances that chew up big dataPiranha vs. mammoth   predator appliances that chew up big data
Piranha vs. mammoth predator appliances that chew up big dataJack (Yaakov) Bezalel
 
5 Steps for Architecting a Data Lake
5 Steps for Architecting a Data Lake5 Steps for Architecting a Data Lake
5 Steps for Architecting a Data LakeMetroStar
 
DataStax GeekNet Webinar - Apache Cassandra: Enterprise NoSQL
DataStax GeekNet Webinar - Apache Cassandra: Enterprise NoSQLDataStax GeekNet Webinar - Apache Cassandra: Enterprise NoSQL
DataStax GeekNet Webinar - Apache Cassandra: Enterprise NoSQLDataStax
 
Kb 40 kevin_klineukug_reading20070717[1]
Kb 40 kevin_klineukug_reading20070717[1]Kb 40 kevin_klineukug_reading20070717[1]
Kb 40 kevin_klineukug_reading20070717[1]shuwutong
 
Why Data Lake should be the foundation of Enterprise Data Architecture
Why Data Lake should be the foundation of Enterprise Data ArchitectureWhy Data Lake should be the foundation of Enterprise Data Architecture
Why Data Lake should be the foundation of Enterprise Data ArchitectureAgilisium Consulting
 
Creating a Next-Generation Big Data Architecture
Creating a Next-Generation Big Data ArchitectureCreating a Next-Generation Big Data Architecture
Creating a Next-Generation Big Data ArchitecturePerficient, Inc.
 
GigaOm-sector-roadmap-cloud-analytic-databases-2017
GigaOm-sector-roadmap-cloud-analytic-databases-2017GigaOm-sector-roadmap-cloud-analytic-databases-2017
GigaOm-sector-roadmap-cloud-analytic-databases-2017Jeremy Maranitch
 
Big data architectures and the data lake
Big data architectures and the data lakeBig data architectures and the data lake
Big data architectures and the data lakeJames Serra
 

Tendances (20)

Hadoop World 2011: I Want to Be BIG - Lessons Learned at Scale - David "Sunny...
Hadoop World 2011: I Want to Be BIG - Lessons Learned at Scale - David "Sunny...Hadoop World 2011: I Want to Be BIG - Lessons Learned at Scale - David "Sunny...
Hadoop World 2011: I Want to Be BIG - Lessons Learned at Scale - David "Sunny...
 
Hadoop in the Enterprise - Dr. Amr Awadallah @ Microstrategy World 2011
Hadoop in the Enterprise - Dr. Amr Awadallah @ Microstrategy World 2011Hadoop in the Enterprise - Dr. Amr Awadallah @ Microstrategy World 2011
Hadoop in the Enterprise - Dr. Amr Awadallah @ Microstrategy World 2011
 
2022 02 Integration Bootcamp
2022 02 Integration Bootcamp2022 02 Integration Bootcamp
2022 02 Integration Bootcamp
 
SQL Server Disaster Recovery Implementation
SQL Server Disaster Recovery ImplementationSQL Server Disaster Recovery Implementation
SQL Server Disaster Recovery Implementation
 
Building a Modern Data Architecture by Ben Sharma at Strata + Hadoop World Sa...
Building a Modern Data Architecture by Ben Sharma at Strata + Hadoop World Sa...Building a Modern Data Architecture by Ben Sharma at Strata + Hadoop World Sa...
Building a Modern Data Architecture by Ben Sharma at Strata + Hadoop World Sa...
 
Modern-Data-Warehouses-In-The-Cloud-Use-Cases-Slim-Baltagi
Modern-Data-Warehouses-In-The-Cloud-Use-Cases-Slim-BaltagiModern-Data-Warehouses-In-The-Cloud-Use-Cases-Slim-Baltagi
Modern-Data-Warehouses-In-The-Cloud-Use-Cases-Slim-Baltagi
 
From Hadoop to Enterprise Data Warehouse
From Hadoop to Enterprise Data WarehouseFrom Hadoop to Enterprise Data Warehouse
From Hadoop to Enterprise Data Warehouse
 
Disaster Recovery Site Implementation with MySQL
Disaster Recovery Site Implementation with MySQLDisaster Recovery Site Implementation with MySQL
Disaster Recovery Site Implementation with MySQL
 
Big Data Day LA 2015 - Data Lake - Re Birth of Enterprise Data Thinking by Ra...
Big Data Day LA 2015 - Data Lake - Re Birth of Enterprise Data Thinking by Ra...Big Data Day LA 2015 - Data Lake - Re Birth of Enterprise Data Thinking by Ra...
Big Data Day LA 2015 - Data Lake - Re Birth of Enterprise Data Thinking by Ra...
 
Enterprise Data Lake - Scalable Digital
Enterprise Data Lake - Scalable DigitalEnterprise Data Lake - Scalable Digital
Enterprise Data Lake - Scalable Digital
 
Piranha vs. mammoth predator appliances that chew up big data
Piranha vs. mammoth   predator appliances that chew up big dataPiranha vs. mammoth   predator appliances that chew up big data
Piranha vs. mammoth predator appliances that chew up big data
 
Data lake
Data lakeData lake
Data lake
 
5 Steps for Architecting a Data Lake
5 Steps for Architecting a Data Lake5 Steps for Architecting a Data Lake
5 Steps for Architecting a Data Lake
 
DataStax GeekNet Webinar - Apache Cassandra: Enterprise NoSQL
DataStax GeekNet Webinar - Apache Cassandra: Enterprise NoSQLDataStax GeekNet Webinar - Apache Cassandra: Enterprise NoSQL
DataStax GeekNet Webinar - Apache Cassandra: Enterprise NoSQL
 
Kb 40 kevin_klineukug_reading20070717[1]
Kb 40 kevin_klineukug_reading20070717[1]Kb 40 kevin_klineukug_reading20070717[1]
Kb 40 kevin_klineukug_reading20070717[1]
 
Why Data Lake should be the foundation of Enterprise Data Architecture
Why Data Lake should be the foundation of Enterprise Data ArchitectureWhy Data Lake should be the foundation of Enterprise Data Architecture
Why Data Lake should be the foundation of Enterprise Data Architecture
 
Creating a Next-Generation Big Data Architecture
Creating a Next-Generation Big Data ArchitectureCreating a Next-Generation Big Data Architecture
Creating a Next-Generation Big Data Architecture
 
GigaOm-sector-roadmap-cloud-analytic-databases-2017
GigaOm-sector-roadmap-cloud-analytic-databases-2017GigaOm-sector-roadmap-cloud-analytic-databases-2017
GigaOm-sector-roadmap-cloud-analytic-databases-2017
 
Big data architectures and the data lake
Big data architectures and the data lakeBig data architectures and the data lake
Big data architectures and the data lake
 
Hadoop Trends
Hadoop TrendsHadoop Trends
Hadoop Trends
 




2012 10 bigdata_overview

  • 2. Agenda • Big Data • Strategy • Technology • Use Cases
  • 3. Big Data 3 Copyright © 2011, Oracle and/or its affiliates. All rights reserved.
  • 4. Big Data: React to an Event; Pro-Actively Change Outcomes. “Technology presents the opportunity to transform business”* (Mark Hurd, President, Oracle). * Oracle Profit Magazine, Volume 17, Number 1
  • 5. Big Data’s Key Ingredient: “Improvement merely lets you hit the numbers. Creativity is what transforms.”* (Ron Johnson, CEO, JCPenney). Big Data transforms our business: 5%. Big Data improves our business: 20%. What is Big Data?: 75%. * Fortune Magazine, Vol. 165, No. 4
  • 6. Big Data Extends the Breadth and Speed of Data. Information Architectures Today: decisions based on database data (Transactions). Big Data: decisions based on all your data (Video and Images, Documents, Social Data, Machine-Generated Data).
  • 7. Big Data Extends the Depth of Analytics: Query and Reporting, Statistics, Data Mining, Graph Analytics, Spatial Analytics, Text Analytics
  • 8. Big Data Defined Big Data: Techniques and Technologies that Enable Enterprises to Effectively and Economically Analyze All of their Data
  • 10. Strategic Transformations Reporting Analytics Autonomous Rear-view Mirror Actions Transactional All Data Data
  • 11. Oracle’s Big Data solution. Stages: Acquire, Organize & Discover, Analyze, Decide. Components: Oracle Big Data Appliance, Oracle Big Data Connectors, Oracle Exadata, Oracle Exalytics, Endeca Information Discovery, Oracle CEP, Real-Time Decisions, with InfiniBand links between the engineered systems.
  • 12. Oracle Big Data Strategy BI Tools Semantic Text CEP Data & Advanced RTD Management Analytics Graph Spatial Data Discovery Tools Management Infrastructure Build Acquire Adopt Engineer
  • 14. Big Data Appliance Hardware: • 288 CPU cores with 1152 GB RAM • 648 TB of raw disk storage • 40 Gb/s InfiniBand Integrated Software: • Oracle Linux • Oracle Java VM • Cloudera Distribution of Apache Hadoop (CDH) • Cloudera Manager • Open-source distribution of R • NoSQL Database Community Edition All integrated software (except NoSQL DB CE) is supported as part of Premier Support for Systems and Premier Support for Operating Systems
  • 15. Oracle Big Data Appliance software stack. File System Mount: FUSE-DFS. UI Framework: HUE. SDK: HUE SDK. Workflow: Apache Oozie. Scheduling: Apache Oozie. Metadata: Apache Hive. Languages / Compilers: Apache Pig, Apache Hive, Apache Mahout. Data Integration: Apache Flume, Apache Sqoop. Fast Read/Write Access: Apache HBase. Storage and processing: HDFS, MapReduce. Coordination: Apache ZooKeeper.
  • 16. Why Cloudera? • Includes Open Source Apache Hadoop – Fast evolution in critical features – Proven at very large scale • Managed Distribution – Components certified to work together in regular updates – Cloudera Manager provides Management GUI • Most popular distribution in the market
  • 17. Oracle and Cloudera • All Cloudera software pre-installed and pre-configured on BDA – Engineered with Cloudera • All Cloudera assets included – Single Oracle Product SKU for HW & SW – Single Oracle Support SKU for HW & SW (life of the machine) • Oracle is the single point of contact for the solution
  • 18. Price comparison
      Oracle Big Data Appliance:
        BDA cost: $450,000 (year 1)
        Support cost: $54,000 (year 1), $54,000 (year 2), $54,000 (year 3)
        Installation: $14,150 (year 1)
        Total: $518,150 (year 1), $54,000 (year 2), $54,000 (year 3); $626,150 overall
      “Build-Your-Own” (HP hardware and Cloudera):
        Servers and switches: $428,220 (year 1)
        Support cost: $136,233 (year 1), $72,000 (year 2), $72,000 (year 3)
        On-site installation and configuration: not included
        Total: $564,453 (year 1), $72,000 (year 2), $72,000 (year 3); $708,453 overall
      Full details at https://blogs.oracle.com/datawarehousing/entry/price_comparison_for_big_data
  • 19. Oracle NoSQL Database: a distributed, scalable key-value database. • Simple data model: key-value pair with major+sub-key paradigm; read/insert/update/delete operations. • Scalability: dynamic data partitioning and distribution; optimized data access via intelligent driver. • High availability: one or more replicas; disaster recovery through location of replicas; resilient to partition master failures; no single point of failure. • Transparent load balancing: reads from master or replicas; driver is network topology & latency aware. [Diagram: applications with NoSQLDB drivers accessing storage nodes in Data Center A and Data Center B]
  • 20. Big Data Connectors Optimized integration of Hadoop with Oracle Database and Oracle Exadata • Oracle Loader for Hadoop • Oracle Direct Connector for Hadoop Distributed File System (HDFS) • Oracle Data Integrator Application Adapter for Hadoop • Oracle R Connector for Hadoop • Does not require Big Data Appliance – can be licensed for Hadoop running on non-Oracle hardware
  • 21. Oracle Loader for Hadoop: Use the Cluster. Last stage in MapReduce workflow; partitioned and non-partitioned tables; online and offline loads. [Diagram: map, shuffle/sort and reduce stages feeding Oracle Loader for Hadoop]
  • 22. Oracle Direct Connector for HDFS: Direct Access from Oracle Database. SQL access to HDFS; external table view; data query or import. [Diagram: a SQL query in Oracle Database reads an external table through DCH processes, connected over InfiniBand to the HDFS client and HDFS]
  • 23. Oracle Data Integrator: Simplifying MapReduce. Automatically generates MapReduce code; manages the process; loads into the data warehouse. [Diagram: Oracle Data Integrator, Oracle Loader for Hadoop]
  • 24. What is Data Discovery? Simplified: quickly explore all relevant data. Data: structured, semi-structured, unstructured, messy data, beyond the data warehouse. Model: relationships undefined or unknown; no pre-defined model required; rapid, iterative change. Exploration: advanced search, faceted navigation, analytics.
  • 25. Business Intelligence and Data Discovery: Complementary Solutions, Integrated Business Processes. Questions: Known & Clearly Defined Questions (Who, What, When?) versus Uncertain or Open-Ended Questions (Why, How, What Else?). Data: Modeled Data (conforms to a single model) versus Un-modeled Data (diverse and changing). Business Intelligence: proven answers to known questions. Data Discovery: fast answers to new questions. Insights yield mature models and KPIs; new questions require new data, exploration.
  • 26. Oracle Endeca Information Discovery A platform for data discovery applications across the enterprise Endeca Information Discovery (EID) helps organizations quickly explore all relevant data • Combine structured & unstructured data from disparate systems • Rapidly assemble easy to use analysis applications • Automatically organize information for search, discovery & analysis
  • 27. Big Data: Why Deeper Analytics? Communications: enhanced churn prediction with social network analytics. Consider each customer’s value as part of their social network; focus retention campaigns on high-value social networks; identify new prospective high-value customers; target promotions for upselling and cross-selling to key social network influencers; identify rotational churners and exclude from retention offers. Insurance: automated deep analytics for fraud and abuse in insurance claims processing. Enhance fraud analytics by considering text data (assessor’s reports, police reports, witness interviews) in addition to transaction data; investigate claims that have the highest expected risk (based on likelihood of fraud and claim size); focus scarce investigative resources and create feedback loop for automated analysis. Retail: identify and respond to shifts in behavior. Combine past and most recent point-of-sale data with customer information; track and monitor shifts in individual customer behaviors and household purchases; anticipate new up-sell and cross-sell opportunities. 27 | © 2012 Oracle Corporation
  • 28. Deeper Analytics: Oracle Advanced Analytics • Oracle Advanced Analytics extends Oracle Database into a comprehensive analytical platform – Predictive analytics, data mining, text mining, statistical analysis, advanced numerical computations • Scalable and parallel: analyze huge volumes of data • Tightly integrated with SQL: share results of analytics throughout enterprise • Built for data analysts
  • 29. Oracle Advanced Analytics: Data Mining • 12 cutting-edge machine-learning algorithms – Parallel model creation – Data transformation and preparation for data mining – Scalable model creation – Efficient scoring for large volumes – Data Miner GUI to build and evaluate data mining models • Data Mining can provide valuable results: – Predict customer behavior (Classification) – Predict or estimate a value (Regression) – Segment a population (Clustering) – Identify factors more associated with a business problem (Attribute Importance) – Find profiles of targeted people or items (Decision Trees) – Determine important relationships and “market baskets” within the population (Associations) – Find fraudulent or “rare events” (Anomaly Detection)
  • 30. Oracle Advanced Analytics: Oracle R Enterprise • Oracle R Enterprise brings R’s statistical functionality closer to the Oracle Database 1. Eliminate R’s memory constraint by enabling R to work directly & transparently on database objects – Allows R to run on very large data sets 2. Architected for Enterprise production infrastructure – Automatically exploits database parallelism without requiring parallel R programming – Build and immediately deploy 3. Oracle R leverages the latest R algorithms and packages – R is an embedded component of the DBMS server
  • 32. Big Data Architecture Pattern [Diagram. Stages: Capture, Acquire, Organize, Analyze, Store. Sources: operational systems (front end, back end), data handlers, real-time & batch feeds. Organize algorithms: filter, index, ETL, classify, correlate, turning low value density data into high value data. Analytics: text analytics, statistics, data mining, graph analytics, spatial analytics; integrate into applications; real-time event detection. Stores: HDFS and NoSQL for low value density data; relational and semantic/spatial for high value data.]
  • 33. Big Data Examples. Insurance: individualize auto-insurance policies based on newly captured vehicle telemetry data. Insurer gains insight into customer’s driving habits, delivering: more accurate assessments of risks; individualized pricing based on actual individual customer driving habits; guide and motivate individual customers to improve their driving habits. Travel: optimize buying experience through web log and social media data analysis. Travel site gains insight into customer preferences and desires: up-selling products by correlating current sales with (subsequent) browsing behavior; increase browse-to-buy conversions via customized offers and packages; deliver personalized travel recommendations based on social media data. Games: collect gaming data to optimize spend within and across games. Games company gains insight into likes, dislikes and relationships of its users: enhance games to drive customer spend within games; recommend other content based on analysis of player connections and similar “likes”; create special offers or packages based on browsing and (non-)buying behavior.
  • 34. Big Data Use Case: Smart Mall. Customer enters mall area, based on cell phone location data. Customer Profile: Jane Doe, 32, married, 2 kids (2 & 4 yrs), uses our coupons. Send Coupon: 20% off item when used in the next 15 minutes. Point of Sale Capture: • Coupon used • 3 items bought (up 1) • Increased spend (up $10). [Diagram: mall floor plan with numbered stores]
  • 35. Big Data Technology Pattern [Diagram. Steps: Identify User, Filter, Enrich, Decision Point, Deliver Coupon, Analyze. Components: Oracle CEP (streaming collection and filtering), Oracle RTD (decision points, models, scores), Big Data Appliance (MapReduce analysis), Big Data Connectors. Inputs include social feeds.]
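
The three-year totals quoted on slide 18 (price comparison) can be re-derived from the line items listed there. The sketch below is only a consistency check on those published figures; the variable names and the dictionary layout are illustrative choices, not something taken from the deck, and the line items are placed in year 1 on the assumption that hardware and installation are one-time costs.

```python
# Line items as quoted on slide 18, in US dollars, for years 1 to 3.
# Assumption: hardware and installation are one-time, year-1 costs.
bda = {
    "bda_cost": [450_000, 0, 0],
    "support": [54_000, 54_000, 54_000],
    "installation": [14_150, 0, 0],
}
build_your_own = {
    "servers_and_switches": [428_220, 0, 0],
    "support": [136_233, 72_000, 72_000],
    # On-site installation and configuration is "not included" on the slide.
}


def yearly_totals(items):
    """Sum all line items column-wise, giving one total per year."""
    return [sum(year) for year in zip(*items.values())]


bda_years = yearly_totals(bda)
byo_years = yearly_totals(build_your_own)
bda_total = sum(bda_years)
byo_total = sum(byo_years)
difference = byo_total - bda_total

print("BDA per year:", bda_years, "total:", bda_total)
print("Build-your-own per year:", byo_years, "total:", byo_total)
print("Three-year difference:", difference)
```

Under these assumptions the computed totals match the slide ($626,150 versus $708,453), so the gap over three years is $82,303, before any installation cost for the build-your-own option.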

Editor's notes

  1. http://a964.g.akamaitech.net/7/964/714/ee880cbf1a3897/www.forrester.com/imagesV2/uplmisc/Big_Data_Webinar__final.pdf
  2. Hadoop, you may want to either access that data from Oracle Database by issuing SQL against HDFS files or by moving the data into Oracle tables.Lets start with the latter -- moving the data into Oracle tables. Oracle Loader for Hadoop (or OLH) is a high performance loader for fast movement of data from any Hadoop cluster into Oracle Database tables. Like all other parts the Big Data Connectors, it is available on any Hadoop cluster based on Apache Hadoop in addition to the Big Data Appliance.If you want to take the results and perform additional analysis using advanced BI and data warehousing technologies or incorporate in other applications, OLH is both fast and reduces the processing load on the Database server. It runs as a map reduce job and uses the Hadoop server’s processing resources to sample, sort and pre-partition the data based on the target database metadata. It can automatically take input in delimited text files (CSV) or Hive tables or you can write your own input format. OLH can either directly load the results into the database using the parallel direct path load interface or JDBC, create Oracle formatted Datapump files. OLH has built into load balancing across the reducer nodes that prevents performance from degrading due to unbalanced loads.
  3. Oracle Direct Connector for HDFS makes it possible to access to data on the Hadoop cluster in HDFS from Oracle using SQL. It provides a virtual table view of the HDFS files and the allows for parallel query access to data using the standard Oracle database external table mechanism. If you are using BDA and Exadata, the connectivity occurs using infiniband network fabric so the database access to HDFS, in the very scientific words of the development manager, “flies”. If you need to import the data in HDFS into Oracle, the Direct Connector does not require a file copy and without using Linux Fuse. Instead it uses the native Oracle Loader interface.
  4. If you already use Oracle Data Integrator (or are familiar with this kind of tool and want to use ODI), then it can simplify the MapReduce process. As long as you can describe the transformation that you need to perform on the data, ODI can generate the MapReduce code for you and run that process. It can even invoke Oracle Loader for Hadoop at the end of the cycle. So if you are not an expert in Java, parallel algorithms and the Hadoop framework, there is still a way to use it all to organize your data. Note: ODI generates SQL code, which is then passed to Hive (a component of many Hadoop distributions), which in turn generates the actual Java MapReduce code. You need the Big Data Connectors, specifically the ODI Application Adapter for Hadoop, to make all this work.
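To make the "describe the transformation, let the tool generate the code" idea concrete, here is a small sketch that turns a declarative description into a HiveQL statement, which Hive would then compile into MapReduce jobs. It does not reproduce what ODI actually emits; the function build_hive_query, the layout of the spec dictionary, and the table and column names are all assumptions made for illustration.

```python
def build_hive_query(spec):
    """Render a simple aggregate transformation as a HiveQL statement."""
    # Grouping columns come first, followed by the aggregate expressions.
    select_parts = list(spec["group_by"])
    select_parts += [
        f"{func}({column}) AS {alias}"
        for alias, (func, column) in spec["aggregates"].items()
    ]
    lines = [
        f"INSERT OVERWRITE TABLE {spec['target']}",
        f"SELECT {', '.join(select_parts)}",
        f"FROM {spec['source']}",
    ]
    if spec.get("where"):
        lines.append(f"WHERE {spec['where']}")
    if spec["group_by"]:
        lines.append(f"GROUP BY {', '.join(spec['group_by'])}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical transformation: count successful sessions per region.
    spec = {
        "source": "web_logs",
        "target": "visits_by_region",
        "group_by": ["region"],
        "aggregates": {"visits": ("COUNT", "session_id")},
        "where": "status = 200",
    }
    print(build_hive_query(spec))
```

The person describing the transformation only supplies the source, target, filter and aggregates; no Java or MapReduce code is written by hand.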
  5. Our view of the BI landscape is that there are fundamentally two dominant types of problems. On one hand, there are questions where we can define up front both the process and the data required to answer them. What are sales forecasts by region? What is my performance relative to expectation? On the other hand, there are questions where either the process or the data cannot be defined ahead of time; these questions are open-ended by nature. What customers should I target? Why are my sales going down? It's also interesting to point out that these questions are far more transient than the other type, and this follows from their open-ended nature. Each question leads to new questions. The interaction model for the former is more like “looking it up”; it’s a report or dashboard. When you don't know exactly what you need or how to ask for it, the necessary interaction model is exploration and discovery: a dialog with the data. <transition> It also follows that, as a matter of practice, some data is modeled and other data is not. We take modeled to mean that there is a single, overarching semantic model. Of course, modeling costs time and money, so we generally only make the investment in cases where the expected return is large enough to justify the effort. The cost of storing un-modeled data has continued to drop but, importantly, with the popularization of Hadoop, the promise of deriving value from un-modeled data is rising rapidly. The result is an explosion in the capture of un-modeled data. Through this view of the BI landscape we can see how Traditional Business Intelligence and Data Discovery fit in. <transition> Traditional Business Intelligence is purpose-built and very strong for known questions and modeled data.
Friction arises when organizations attempt to use these products for new and unpredictable questions, which require similarly new and unpredictable data models to meet the need. <transition> In the other space is the emerging market category of data discovery, where the goal is to provide everyday business users with fast answers to new questions so they can make better, more informed business decisions. Data discovery tools follow several key market trends. First, the growth in data volume, diversity, and complexity. Not much to say here that hasn't already been said. Organizations today are beginning to understand the value inherent in this information and are looking for tools that can unlock that value to give them competitive advantage. And more and more users need to access and understand this information. Second, the consumerization of business software. When IT is unable to deliver, business users are increasingly willing to go outside of IT in order to meet their own needs. Empowered with their choice of tools, and with standards set by the consumer world, their expectations for amazing user experiences have never been higher.
  6. How do we do it? Endeca Information Discovery provides a full-featured platform for creating discovery applications that provide access to all kinds of information. Drilling into the architecture, we accomplish this with three tiers.
  7. Notes: This slide is a logical representation of the scope of a Big Data solution. It provides the basis for describing data flows in each stage of the Big Data process in the following slides. The scope of a Big Data solution includes taking actions and decisions on the results of analysis, hence the integration with Applications. Real-time event detection can be part of a Big Data solution. This is an important point to draw out because IBM claims its Streams capability is a USP; see the book Understanding Big Data: Analytics for Enterprise Class Hadoop and Streaming Data.