7 Steps to Build an Oracle Big Data Strategy
Kurt Lueck
March 2013
Contact Presenter


                                                Kurt Lueck
                                                Managing Director, Business Intelligence & Analytics

                                                Email: Kurt.Lueck@pactera.com
                                                Desk: +1.704.944.3155 x240
                                                6100 Fairview Road, Suite 560, Charlotte, NC 28210
                                                Visit our website: www.pactera.com




© Pactera. Confidential. All Rights Reserved.                                                          2
Pactera Snapshot
    NASDAQ: Symbol PACT
    Based in Charlotte NC & Beijing, China
    35 Offices Globally / 24,000 Employees
    Fortune 500 Clients (Financial Services, High Tech, Retail)
    Focus on Driving Innovation (Big Data, Analytics, Mobility, Cloud Solutions)




Global Footprint and Flexible Delivery Capabilities

        Pactera is a global company strategically headquartered in China, enabling
        partnership with companies seeking to leverage one of the world’s largest and
        fastest-growing technology markets.

        Global FTE: 24,000
        North America & EU: 500
        Asia Pacific: 1,000
        Greater China: 22,500

        Office locations shown on the map: Seattle, San Francisco, Silicon Valley, San Diego, Charlotte, Atlanta, London, Barcelona, Beijing, Changchun, Dalian, Tianjin, Qingdao, Xi’an, Nanjing, Wuxi, Wuhan, Shanghai, Chengdu, Changsha, Hangzhou, Guangzhou, Dongguan, Hong Kong, Shenzhen, Taiwan, Tokyo, Osaka, Malaysia, Singapore, Melbourne, Sydney


Agenda

    1    Definitions (4 Key Functions)
    2    Drivers
    3    Predictions
    4    7 Steps for an Oracle Big Data Strategy
    5    2 Practical Success Stories
    6    Next Steps




Oracle Big Data Solutions – A Preview




What Is Big Data?

     Big Data: Oracle’s Definition
      Large volumes, high change velocity, complex variety, unknown value per byte



     The Four Vs

     Big Data is high-volume, high-velocity and high-variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making

Drivers of Big Data

Leveraging dark data represents the largest
opportunity to transform the business.




Explosion of unstructured data to be
analyzed creates opportunities.
Big Data Predictions

                     Through 2014, 20% of enterprise warehouses will add
                     distributed processes


                     By 2015, 20% of Global 1000 organizations will have a
                     strategic focus on information infrastructure equal to that of
                     application management


                       Beginning in 2015, the term ‘big data’ will no longer be a
                       competitive differentiator for technology providers



                     By 2015, big data demand will reach 4.4 million jobs globally
                     but only one third of those jobs will be filled

                                                                    Source: Gartner
7 Steps To An Oracle Big Data Strategy

               Develop Business Strategy Map

               Align Information Technology to Business

               Identify Resource Needs

               Build Oracle Big Data Technology Stack

               Develop Small Initial Solution

               Evaluate & Correct

               Update Strategy




Step 1: Business Strategy Map




Business Intelligence is about the business




Step 2: Align IT to Business
Current State
• Information to Business Alignment
• Current State – Data Architecture
• Current State – System Architecture

• Future State – Architecture
• Information Management Roadmap
Step 3: Identify Resource Needs

    Align Resources to Projects
    • Business Expertise
    • Technology Expertise
    • New Resources?

    Potential Weaknesses:
    • Big Data Skills
    • Predictive Analytics
    • Data Scientist
    • Strong Business Analyst
    • Agile Methodology
    • Project Managers




Step 4: Build Oracle Big Data Architecture




                                                Oracle Information Management

Step 4: Build Oracle Big Data Architecture
Oracle Information Management - Roadmap

Maturity Model:

•       Stage 1 – Initial

•       Stage 2 – Manage

•       Stage 3 – Advance

•       Stage 4 – Optimize

•       Stage 5 – Innovate




Where are you?
Step 4: Build Oracle Big Data Architecture

     Oracle Big Data Solutions




Step 4: Build Oracle Big Data Architecture

     Why Oracle Big Data Appliance?




Step 4: Build Oracle Big Data Architecture




Step 4: Build Oracle Big Data Architecture

  Why Oracle Exadata?




                            Data storage is 10x smaller, Scanning process is 2000x faster




Step 4: Build Oracle Big Data Architecture

  Why Oracle Exalytics?




                                                Traditional OBIEE




Step 4: Build Oracle Big Data Architecture

  Putting it all together




Step 4: Build Oracle Big Data Architecture

Components shown in the stack diagram:
• Oracle Traditional DB & BI Stack
• Oracle R
• Oracle Data Integrator (ETL)
• Oracle Loader (ETL)
• Oracle NoSQL (DB)
• Apache Cassandra (DB)
• Apache HBase (DB)
• Apache Hive (ETL)
• Storage and Management Capability
• Apache Hadoop (HDFS)
• Cloudera Manager
• Oracle Exalytics
• Oracle Exadata
Critical Mistakes



            Lack of Expertise

             Big Data is treated as an IT project without a business problem

             Lack of technology alignment

             Lack of Long-Term Roadmap

             Lack of critical evaluation

Story #1 – Travel Cloudera Style
Collecting Data
•     Offline explorer, spiders
•     Web server log files and Web UI scripts
•     Data feeds from tools such as Tealeaf and Omniture
•     Data feeds from external sources, such as the Facebook feed
•     Upstream operational database

Analyzing and Exploiting Data
•     Methods such as funnel analysis, shopping cart analysis, and decision trees
•     Tools such as Omniture, Google Analytics, SSAS, Unica, and Weka
•     Search engine analytics, such as SEO and SEM reporting

Empower Business with Intelligence
•     Mini-batch
•     Near real time DW/DB
•     A/B and MVT Testing
•     Recommendation Engine
•     Finance projection

•     High margins come from lodging
•     A high share of merchant hotels are sold from the 1st page of search results
•     Larger families tend to book passenger vans instead of midsize cars

Originally, we implemented a Behavioral Search project intended to capture customer behavior online. It captures customers' search parameters using Tealeaf and persists this data in Hadoop. From it, an analyst can retell the story of what the customer searched for, what he/she saw, and what he/she did based on the response.

Next, we polished a new customer data mart, including full rollout of individualization, customer segmentation, customer lifetime value calculation, and quick lookup of customer purchase details over a longer period.
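
As an illustration of the funnel analysis listed under Analyzing and Exploiting Data, here is a minimal Python sketch. It is not the project's implementation: the stage names, the `(session_id, action)` event shape, and the sample events are all hypothetical, since the slides do not describe the captured data in that detail.

```python
from collections import defaultdict

# Hypothetical funnel stages, in order; the real stages are not given in the slides.
STAGES = ["search", "view_results", "select_hotel", "book"]


def funnel_counts(events, stages=STAGES):
    """Count sessions that reached each stage and every stage before it.

    `events` is an iterable of (session_id, action) pairs.
    Simplification: the order of actions within a session is ignored.
    """
    seen = defaultdict(set)
    for session_id, action in events:
        seen[session_id].add(action)

    counts = {}
    for i, stage in enumerate(stages):
        required = set(stages[: i + 1])
        counts[stage] = sum(1 for actions in seen.values() if required <= actions)
    return counts


# Made-up sample events, for illustration only.
SAMPLE_EVENTS = [
    ("s1", "search"), ("s1", "view_results"), ("s1", "select_hotel"), ("s1", "book"),
    ("s2", "search"), ("s2", "view_results"),
    ("s3", "search"),
    ("s4", "search"), ("s4", "view_results"), ("s4", "select_hotel"),
]

if __name__ == "__main__":
    for stage, sessions in funnel_counts(SAMPLE_EVENTS).items():
        print(f"{stage:<14} {sessions}")
```

The drop-off between consecutive stages is what an analyst would read from such a funnel, for example how many sessions search but never select a hotel.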

Story #1 – Lessons Learned
Query time in seconds, Hive vs. Impala (Data @ Nov. 2012)

Query                      Data            Hive    Impala
One Day Query              21GB-24P          37         4
One Month Query            650GB-744P       224        49
Three Month Query          1.7TB-2047P      431        86
Six Month Query            2.9TB-2920P      667       151
One Year Query             3.8TB-2391P      934       240
Two and half Year Query    5.8TB-3500+P    1556       425



   •       Hadoop use cases are moving to real time
   •       71% move data from Hadoop to an RDBMS for faster, interactive SQL
   •       67% already query Hadoop using Hive
   •       Impala: a real-time SQL query engine for Hadoop, official release in Q1 2013
   •       Query results 4-30x faster than Hive
   •       Supports HQL and is 100% open source
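
The speedup implied by the chart can be checked with a few lines of Python. This is a sketch under one assumption: in each pair of bar values, the larger number is the Hive time and the smaller is the Impala time. The chart legend does not survive clearly enough to confirm that pairing.

```python
# Query timings in seconds, as read from the chart (Data @ Nov. 2012).
# Assumption: in each pair, the larger value is Hive and the smaller is Impala.
TIMINGS = [
    ("One day (21GB)", 37, 4),
    ("One month (650GB)", 224, 49),
    ("Three months (1.7TB)", 431, 86),
    ("Six months (2.9TB)", 667, 151),
    ("One year (3.8TB)", 934, 240),
    ("Two and a half years (5.8TB)", 1556, 425),
]


def speedups(timings):
    """Return (label, hive_seconds / impala_seconds) for each query."""
    return [(label, hive / impala) for label, hive, impala in timings]


if __name__ == "__main__":
    for label, ratio in speedups(TIMINGS):
        print(f"{label:<30} {ratio:4.1f}x")
```

Under that assumption, the ratios for these six queries fall between roughly 3.7x and 9.3x, toward the lower end of the quoted 4-30x range.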

Story #2 – Retail Personalization With Big Data




Story #2 - Retail Big Data




Story #3 – Oracle and Insurance




2013 Pactera Focus Area



1    Voice of Customer:
     Large clients are still struggling with what to do with the other 85% of their data, which is unstructured. This unstructured data is made up of customer surveys, call center discussions, and, most recently, social media data. VOC strategies help companies manage and gain value from this data.
     Example: Creating Customer Buying Behavior Solutions

2    Predict Your Future:
     Nobody can predict their future, but by using advanced predictive analytics, financial services organizations can apply science to understanding fraudulent activity and customer buying behavior, and to managing risk.
     Example: Embedding Predictive Analytics into Risk Management solutions

3    Putting Big Data To Work:
     Data volumes are growing fast. Customers, partners, and now even sensor-based systems are generating data so quickly that organizations across all industries need new technologies to stay ahead. Organizations must analyze this data to understand and improve their business.
     Example: Creating a Big Data Solution to analyze customer relationship and demand data

4    Visual Performance Management Enabled:
     Clients who desire to tie individual accountability to business value drivers can utilize BPM services to identify metrics and BI & Analytics technology to enable the BPM Strategy.
     Example: Enabling BPM through Visual Analytic Mgmt Dashboards




Next Steps

Big Data Consultation
• 1 Hour Free Phone Consultation
• Customized Guidance on How to Proceed

Big Data Navigator
• 2-3 Week Engagement
• High Level Project Plan
• Cost and Timeline




Thank you




Kurt Lueck
Managing Director, Business Intelligence & Analytics

Email: Kurt.Lueck@pactera.com
Desk: +1.704.944.3155 x240
6100 Fairview Road, Suite 560, Charlotte, NC 28210
Visit our website: www.pactera.com

Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 

Dernier (20)

2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 

Big Data Webinar

  • 1. 7 Steps to Build an Oracle Big Data Strategy Kurt Lueck March 2013
  • 2. Contact Presenter: Kurt Lueck, Managing Director, Business Intelligence & Analytics. Email: Kurt.Lueck@pactera.com. Desk: +1.704.944.3155 x240. 6100 Fairview Road, Suite 560, Charlotte, NC 28210. Visit our website: www.pactera.com
  • 3. Pactera Snapshot: NASDAQ: Symbol PACT; Based in Charlotte NC & Beijing, China; 35 Offices Globally / 24,000 Employees; Fortune 500 Clients (Financial Services, High Tech, Retail); Focus on Driving Innovation (Big Data, Analytics, Mobility, Cloud Solutions)
  • 4. Global Footprint and Flexible Delivery Capabilities. Pactera is a global company strategically headquartered in China, enabling partnership with companies seeking to leverage one of the world’s largest and fastest-growing technology markets. Global FTE: 24,000; North America & EU: 500; Asia Pacific: 1,000; Greater China: 22,500. Map locations: London, Seattle, Changchun, San Francisco, Barcelona, Beijing, Dalian, Tokyo, Silicon Valley, Charlotte, Tianjin, Qingdao, Xi’an, Nanjing, Wuxi, Osaka, Atlanta, Wuhan, San Diego, Shanghai, Chengdu, Changsha, Hangzhou, Guangzhou, Taiwan, Dongguan, Hong Kong, Shenzhen, Malaysia, Singapore, Melbourne, Sydney
  • 5. Agenda: 1 Definitions (4 Key Functions); 2 Drivers; 3 Predictions; 4 7 Steps for an Oracle Big Data Strategy; 5 2 Practical Success Stories; 6 Next Steps
  • 6. Oracle Big Data Solutions – A Preview
  • 7. What Is Big Data? Big Data: Oracle’s Definition: large volumes, high change velocity, complex variety, unknown value per byte (The Four Vs). Big Data is high-volume, velocity and variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making
  • 8. Drivers of Big Data: Leveraging dark data represents largest opportunity to transform business. Explosion of unstructured data to be analyzed creates opportunities.
  • 9. Big Data Predictions: Through 2014, 20% of enterprise warehouses will add distributed processes. By 2015, 20% of Global 1000 organizations will have a strategic focus on information infrastructure equal to that of application management. Beginning in 2015, the term ‘big data’ will no longer be a competitive differentiator for technology providers. By 2015, big data demand will reach 4.4 million jobs globally but only one third of those jobs will be filled. Source: Gartner
  • 10. 7 Steps To An Oracle Big Data Strategy: 1. Develop Business Strategy Map; 2. Align Information Technology to Business; 3. Identify Resource Needs; 4. Build Oracle Big Data Technology Stack; 5. Develop Small Initial Solution; 6. Evaluate & Correct; 7. Update Strategy
  • 11. Step 1: Business Strategy Map. Business Intelligence is about the business
  • 12. Step 2: Align IT to Business: Current State Information to Business Alignment; Current State – Data Architecture; Current State – System Architecture; Future State – Architecture; Information Management Roadmap
  • 13. Step 3: Identify Resource Needs. Align Resources to Projects. Potential Weaknesses: Big Data Skills; Predictive Analytics; Data Scientist; Strong Business Analyst; Agile Methodology; Project Managers. Diagram labels: Business Expertise; Technology Expertise; New Resources?
  • 14. Step 4: Build Oracle Big Data Architecture. Oracle Information Management
  • 15. Step 4: Build Oracle Big Data Architecture. Oracle Information Management - Roadmap. Maturity Model: Stage 1 – Initial; Stage 2 – Manage; Stage 3 – Advance; Stage 4 – Optimize; Stage 5 – Innovate. Where are you?
  • 16. Step 4: Build Oracle Big Data Architecture. Oracle Big Data Solutions
  • 17. Step 4: Build Oracle Big Data Architecture. Why Oracle Big Data Appliance?
  • 18. Step 4: Build Oracle Big Data Architecture
  • 19. Step 4: Build Oracle Big Data Architecture. Why Oracle Exadata? Data storage is 10x smaller, Scanning process is 2000x faster
  • 20. Step 4: Build Oracle Big Data Architecture. Why Oracle Exalytics? Traditional OBIEE
  • 21. Step 4: Build Oracle Big Data Architecture. Putting it all together
  • 22. Step 4: Build Oracle Big Data Architecture. Diagram components: Oracle Traditional DB & BI Stack; Oracle R; Oracle Data Integrator (ETL); Oracle Loader (ETL); Apache Hive (ETL); Oracle NoSQL (DB); Apache HBase (DB); Apache Cassandra (DB); Storage and Management Capability: Apache Hadoop (HDFS), Cloudera Manager; Oracle Exalytics; Oracle Exadata
  • 23. Critical Mistakes: Lack of Expertise; Big Data is IT project without a problem; Lack of technology alignment; Lack of Long-Term Roadmap; Lack of critical evaluation
  • 24. Story #1 – Travel Cloudera Style
      Collecting Data: offline explorer, spiders; web server log files and Web UI scripts; data feeds from tools (Tealeaf, Omniture feed, etc.); external data feeds (such as the Facebook feed); upstream operational database.
      Analyzing and Exploiting Data: methods (funnel analysis, shopping cart analysis, decision tree, etc.); tools (such as Omniture, Google Analytics, SSAS, Unica, Weka); search engine analytics (such as SEO and SEM reporting).
      Empower Business with Intelligence: mini-batch; near-real-time DW/DB; A/B and MVT testing; recommendation engine; finance projection.
      Originally, we implemented the Behavioral Search project, intended to capture customer behavior online. It captures search parameters from the customers using Tealeaf and persists this data in Hadoop. From it, an analyst would be able to re-tell a story of what the customer searched for, what he/she saw, and what he/she did based on the response.
      Next, we polished the new customer data mart, including full rollout of individualization, customer segmentations, customer lifetime value calculation, and quick lookup of customer purchase details.
      High margin comes from lodging. A high degree of merchant hotels are sold on the 1st page of search results. Larger families tend to book passenger vans for longer periods instead of midsize cars.
  • 25. Story #1 – Lessons Learned
      [Chart: Hive vs. Impala query time in seconds, data as of Nov. 2012. Queries: One Day (21GB-24P); One Month (650GB-744P); Three Month (1.7TB-2047P); Six Month (2.9TB-2920P); One Year (3.8TB-2391P); Two and a Half Year (5.8TB-3500+P). Bar labels in extraction order, series pairing not recoverable: 1556, 934, 667, 431, 425, 224, 240, 151, 37, 49, 86, 4.]
      Hadoop Use Cases Moving to Real-Time: 71% move data from Hadoop to RDBMS for faster and interactive SQL; 67% already query Hadoop using Hive.
      Impala: real-time SQL query engine for Hadoop, officially released in Q1 2013; query results 4-30x faster than Hive; supports HQL and is 100% open source.
  • 26. Story #2 – Retail Personalization With Big Data
  • 27. Story #2 - Retail Big Data
  • 28. Story #3 – Oracle and Insurance
  • 29. 2013 Pactera Focus Area
      1. Putting Big Data To Work: Data volumes are growing fast. Customers, partners, and now even sensor-based systems are generating data so quickly that organizations across all industries need new technologies to stay ahead. Organizations must analyze this data to understand and improve their business. Example: Creating a Big Data Solution to analyze customer relationship and demand data.
      2. Visual Performance Management Enabled: Clients who desire to tie individual accountability to business value drivers can utilize BPM services to identify metrics and BI & Analytics technology to enable the BPM Strategy. Example: Enabling BPM through Visual Analytic Solutions Mgmt Dashboards.
      3. Voice of Customer: Large clients are still struggling with what to do with the other 85% of their data, which is unstructured. This unstructured data is made up of customer surveys, call center discussions, and most recently social media data. VOC strategies help companies manage and gain value from this data. Example: Creating Customer Buying Behavior solutions.
      4. Predict Your Future: Nobody can predict their future but using advanced predictive analytics financial services organizations can apply science to understanding fraudulent activity, customer buying behavior, and manage risk etc. Example: Embedding Predictive Analytics into Risk Management.
  • 30. Next Steps. Big Data Consultation: 1 Hour Free Phone Consultation; Customized Guidance on How to Proceed. Big Data Navigator: 2-3 Week Engagement; High Level Project Plan; Cost and Timeline
  • 31. Thank you. Kurt Lueck, Managing Director, Business Intelligence & Analytics. Email: Kurt.Lueck@pactera.com. Desk: +1.704.944.3155 x240. 6100 Fairview Road, Suite 560, Charlotte, NC 28210. Visit our website: www.pactera.com

Editor's Notes

  1. Kurt Lueck has over 20 years of experience in the Business Intelligence and Analytics field. During his consulting career he has worked with over 40 organizations in multiple industries on a variety of technologies. In his current role, Mr. Lueck manages the BI & Analytics practice for Pactera.
  2. Good afternoon, and good morning on the West Coast. I appreciate everyone’s attendance and sincerely hope that you gather some valuable information and insights from our presentation today. This is a very exciting topic. As promised, we have our 7 steps, 5 critical mistakes, and 2 success stories, but I also wanted to start with some quick definitions, drivers, and key predictions. This presentation was built as a primer, and we will be having several follow-up presentations in the coming weeks and months that dive deeper into industries (Financial Services and Retail, for example) and particular vendor solutions (for example: What is Oracle’s Big Data Solution?). Let’s get started.
  3. OK, I feel obligated to start with the 4 V’s. The definition of Big Data has been a work in progress over the past few years. The established definition at this point always has the 3 V’s somewhere in the mix. Most recently I have seen another V mentioned, but first the traditional 3 V’s. Volume: This is probably the most mentioned. The sheer volume of data has been the biggest driver. Velocity: As the saying goes, speed kills. Social media put the bullet in most traditional attempts for retail organizations. Other industries, such as our Energy client, are getting overwhelmed by Smart Grid initiatives. Each industry has its own issues arising from some new technology. Variety: If it were just traditional data, there probably would not be a necessity for any of this discussion. However, the fact is that we have all different types of data that are simply not handled correctly in the traditional Oracle/DB2/SQL databases. Sure, they can store them, but they cannot do anything with them very efficiently. What is the fourth V? Value.
  4. There is a general consensus that there are 3 drivers of Big Data. The first is something called Dark Data: data that we stored because we had to or wanted to, but never used. The thought was that we had better store it because at some point we might get some value out of it. We never did. The volume of this data has kept increasing.
  5. I am always doing research and thought these predictions seemed very relevant to our presentation today. These are straight from Gartner. I won’t read all of these predictions, but the bottom line is that Big Data IS in a hype cycle... but it IS here to stay. I was recently at the TDWI Conference on Big Data, and the group was reminded that there have been a number of terms that in the beginning were used in front of every product, such as WEB-ENABLED. Today that capability is simply assumed. Big Data is here to stay for a number of reasons. Enterprise clients MUST engage Big Data as a competitive advantage today and later as an equalizer. The last point that I want to drive home is the number of jobs that will go unfilled in the big data arena. If you have any college-age kids, this is where you should push them. However, I believe it takes a very science-oriented mind to really engage in this profession.
  6. Most IT departments are simply feeling overwhelmed by the amount of data and the amount of pressure from the business to combine data to provide business insight. This can be an incredibly exciting opportunity for IT and business to work together. If you can gain an understanding of what a Big Data solution looks like, then and only then will you be able to determine how Big Data can actually help. The chart on this page shows in general which areas are most positively impacted by Big Data. Financial Services, as usual, is right up front on the overall volume and velocity of data. Media Services, however, has a high variety of data. As an interesting side note, Pactera has worked with Microsoft to develop solutions that read in videos and decipher them into textual, hence searchable, output. This is just one of many examples where EVERYTHING is becoming searchable: pictures, videos, blogs, and the traditional data. Action Item: Look around your enterprise and identify scenarios where combining and analyzing diverse datasets will generate substantial business value.
  7. Most IT departments are simply feeling overwhelmed by the amount of data and the amount of pressure from the business to combine data to provide business insight. This can be an incredibly exciting opportunity for IT and business to work together. If you can gain an understanding of what a Big Data solution looks like, then and only then will you be able to determine how Big Data can actually help. The chart on this page shows in general which areas are most positively impacted by Big Data. Financial Services, as usual, is right up front on the overall volume and velocity of data. Media Services, however, has a high variety of data. As an interesting side note, Pactera has worked with Microsoft to develop solutions that read in videos and decipher them into textual, hence searchable, output. This is just one of many examples where EVERYTHING is becoming searchable: pictures, videos, blogs, and the traditional data. Action Item: Look around your enterprise and identify scenarios where combining and analyzing diverse datasets will generate substantial business value.
  8. The main item that I am worried about for companies is this role called a data scientist. I believe most organizations simply do not have any, or do not have enough. What are the key roles of a data scientist? To make a big data project or any analytics project succeed, you actually need a lot of skills. I think of it as a combination of functional skills and technical skills. When most people think of data scientists, they think of the technical side, and their minds immediately go to analytics, which is important, but it’s not the whole story. To me it has 2 sides: Analytics and Design. On the analytics side, it’s the things around statistics, operations research, and computer science; machine learning in particular is important for data science. But then there’s technology in the sense of being able to understand systems, particularly large systems, because you need to store data all over the place in distributed form, and the ability to program: to write code that acts as a glue to put all these pieces together. The second functional area is Design. There’s the design side of things, which is basically being able to create an interface to the data so people will find it usable, and there’s the data side, which is data manipulation, data modeling, and data cleansing. So if I got the numbers right, there should be kind of two functional skill sets and four technical skill sets, and all of those need to be combined to make a good data science project work. This is a LOT to ask of ONE person. I believe this set of skills comes from teams of individuals who work on projects together and use each other’s strengths.
  9. Stage 1 – Initial: At this stage, organizations have sporadic, inconsistent and uncoordinated information management activities. The main characteristics are:
      • The organization makes decisions based on inaccurate and incomplete information aggregated by various departments/LOBs through inconsistent processes.
      • Information is fragmented and inconsistent across many different applications and data stores under different LOBs.
      • Business and IT organizations view information as a by-product of applications, usually handled on a project-by-project and department-by-department basis. There is no concept of information ownership and stewardship regarding governance, security or accountability of key information assets.
     Stage 2 – Manage: At this stage, organizations perceive enterprise information management as necessary to be more effective and efficient for multiple business processes across LOBs. They are taking actions to improve information management but mostly focus on immediate needs, reactively and inconsistently.
     Stage 3 – Advance: At this stage, organizations identify information-driven activities as critical for business growth and cost reductions. Therefore, organizations formally establish enterprise information management with the support of executive management and actively build these capabilities.
     Stage 4 – Optimize: At this stage, organizations complete significant portions of the Information Architecture Domain components. Enterprise information becomes pervasive and part of the foundation of business processes to drive profitability and organizational effectiveness.
     Stage 5 – Innovate: At this stage, organizations extend the boundary of the entire information ecosystem to external sources and channels to provide innovations in organization growth and drive the market. Information Architecture becomes part of the culture of the organization.
  10. Oracle offers a broad portfolio of products to help enterprises acquire, manage, and integrate big data with existing information, with the goal of achieving a complete view of business in the fastest, most reliable, and most cost-effective way. The Oracle Big Data Appliance is an engineered system of hardware and software designed to help enterprises derive maximum value from their big data strategies. It combines optimized hardware with a comprehensive software stack featuring specialized solutions developed by Oracle to deliver a complete, easy-to-deploy offering for acquiring, organizing and analyzing big data, with enterprise-class performance, availability, supportability, and security. The Oracle Big Data Appliance incorporates Cloudera’s Distribution, including Apache Hadoop with Cloudera Manager, plus an open source distribution of R, all running on Oracle Linux. The Oracle Big Data Appliance comes in a full rack configuration of 18 Oracle Sun servers and scales by connecting multiple racks together via an InfiniBand network, enabling it to acquire, organize, and analyze extreme data volumes. The Oracle Big Data Appliance offers the following benefits:
      • Rapid provisioning of a highly-available and scalable system for managing massive amounts of data
      • A high-performance platform for acquiring, organizing, and analyzing big data in Hadoop and using R on raw-data sources
      • Control of IT costs by pre-integrating all hardware and software components into a single big data solution that complements enterprise data warehouses
     Oracle Big Data Connectors is an optimized software suite to help enterprises integrate data stored in Hadoop or Oracle NoSQL Databases with Oracle Database 11g. It enables very fast data movements between these two environments using Oracle Loader for Hadoop and Oracle Direct Connector for Hadoop Distributed File System (HDFS), while Oracle Data Integrator Application Adapter for Hadoop and Oracle R Connector for Hadoop provide non-Hadoop experts with easier access to HDFS data and MapReduce functionality.
  11. Oracle Big Data Appliance includes a combination of open source software and specialized software developed by Oracle to address enterprise big data requirements. The Oracle Big Data Appliance integrated software includes:
      • Full distribution of Cloudera’s Distribution including Apache Hadoop (CDH)
      • Cloudera Manager to administer all aspects of Cloudera CDH
      • Open source distribution of the statistical package R for analysis of unfiltered data on Oracle Big Data Appliance
      • Oracle NoSQL Database Community Edition
      • Oracle Enterprise Linux operating system and Oracle Java VM
     The Oracle Big Data Appliance incorporates Cloudera’s Distribution, including Apache Hadoop with Cloudera Manager, plus an open source distribution of R, all running on Oracle Linux. The Oracle Big Data Appliance comes in a full rack configuration of 18 Oracle Sun servers and scales by connecting multiple racks together via an InfiniBand network, enabling it to acquire, organize, and analyze extreme data volumes. The Oracle Big Data Appliance offers the following benefits:
      • Rapid provisioning of a highly-available and scalable system for managing massive amounts of data
      • A high-performance platform for acquiring, organizing, and analyzing big data in Hadoop and using R on raw-data sources
      • Control of IT costs by pre-integrating all hardware and software components into a single big data solution that complements enterprise data warehouses
     If you are looking for an Oracle version of Big Data literally in a box, then this is it!
  12. While Hadoop offers many advantages for organizations, Hadoop is not a wholesale replacement for the traditional relational system and other storage and analysis solutions. Rather, Hadoop is a strong complement to many existing systems. The combination of these technologies offers enterprises tremendous opportunities to maximize IT investments and expand business capabilities by aligning IT workloads to the strengths of each system.
  13. The Oracle Big Data Appliance incorporates Cloudera’s Distribution, including Apache Hadoop with Cloudera Manager, plus an open source distribution of R, all running on Oracle Linux. The Oracle Big Data Appliance comes in a full rack configuration of 18 Oracle Sun servers and scales by connecting multiple racks together via an InfiniBand network, enabling it to acquire, organize, and analyze extreme data volumes. The Oracle Big Data Appliance offers the following benefits:
      • Rapid provisioning of a highly-available and scalable system for managing massive amounts of data
      • A high-performance platform for acquiring, organizing, and analyzing big data in Hadoop and using R on raw-data sources
      • Control of IT costs by pre-integrating all hardware and software components into a single big data solution that complements enterprise data warehouses
  14. Oracle Exalytics In-Memory Machine is purpose-built to deliver the fastest performance for business intelligence (BI) and planning applications. It is designed to provide real-time, speed-of-thought visual analysis and to enable new types of analytic applications so organizations can make decisions faster in the context of rapidly shifting business conditions, while broadening user adoption of BI through the introduction of interactive visualization capabilities. Organizations can extend BI initiatives beyond reporting and dashboards to modeling, planning, forecasting, and predictive analytics. The Oracle Exalytics In-Memory Machine is the industry’s first engineered in-memory analytics machine that delivers extreme performance for Business Intelligence and Enterprise Performance Management applications. The Oracle Exalytics In-Memory Machine hardware is a single server that is optimally configured for in-memory analytics for business intelligence workloads and includes powerful compute capacity, abundant memory, and fast networking options. The Oracle Exalytics In-Memory Machine features an optimized Oracle BI Foundation Suite (Oracle BI Foundation) and Oracle TimesTen In-Memory Database for Exalytics. Business Intelligence Foundation takes advantage of the large memory, processors, concurrency, storage, networking, operating system, kernel, and system configuration of the Oracle Exalytics hardware. This optimization results in better query responsiveness, higher user scalability and markedly lower TCO compared to standalone software. The TimesTen In-Memory Database for Exalytics is an optimized in-memory analytic database, with features exclusively available on the Oracle Exalytics platform. How do Exalytics and Exadata go together? InfiniBand: Two quad-data rate (QDR) 40 Gb/s InfiniBand ports are available with each machine expressly for Oracle Exadata. When connected to Oracle Exadata, Oracle Exalytics becomes an integral part of the Oracle Exadata private InfiniBand network and has high-speed, low-latency access to the database servers. When multiple Oracle Exalytics machines are clustered together, the InfiniBand fabric also serves as the high-speed cluster interconnect.
  15. All right, this slide puts it all together. If you are an Oracle shop, you have a lot of choices in designing and implementing your big data architecture. Please hear me: the Oracle Big Data architecture should be aligned with your BI strategy and ultimately your business. Big Data is not a standalone concept. Big Data should fit within your existing BI strategy. At the ground level you can start with
  16. The point of this slide is to begin to show the nuances and decisions that will have to be made when you design and purchase your Oracle Big Data solution. At one extreme, you can go completely open source. At the other, you can go completely Oracle Big Data. Here is a brief outline of Big Data capabilities and their primary technologies:

Storage and Management Capability

Hadoop Distributed File System (HDFS):
• An Apache open source distributed file system, http://hadoop.apache.org
• Expected to run on high-performance commodity hardware
• Known for highly scalable storage and automatic data replication across three nodes for fault tolerance
• Automatic data replication across three nodes eliminates need for backup
• Write once, read many times

Cloudera Manager:
• Cloudera Manager is an end-to-end management application for Cloudera’s Distribution of Apache Hadoop, http://www.cloudera.com
• Cloudera Manager gives a cluster-wide, real-time view of nodes and services running; provides a single, central place to enact configuration changes across the cluster; and incorporates a full range of reporting and diagnostic tools to help optimize cluster performance and utilization.

(Source: An Oracle White Paper in Enterprise Architecture, "Information Architecture: An Architect’s Guide to Big Data")

Database Capability

Oracle NoSQL:
• Dynamic and flexible schema design. High-performance key-value pair database. Key-value pair is an alternative to a pre-defined schema. Used for non-predictive and dynamic data.
• Able to efficiently process data without a row and column structure. Major + Minor key paradigm allows multiple record reads in a single API call
• Highly scalable multi-node, multiple data center, fault tolerant, ACID operations
• Simple programming model, random index reads and writes
• Not Only SQL. Simple pattern queries and custom-developed solutions to access data, such as Java APIs.
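To make the Major + Minor key idea more concrete, here is a minimal in-memory sketch in Python. It is only an illustration of the paradigm: the class `MajorMinorStore` and its methods `put`, `get`, and `multi_get` are hypothetical names invented for this example and are not the Oracle NoSQL Database API. The assumption is that records sharing a major key are kept together, so one call can return all of them.

```python
from collections import defaultdict


class MajorMinorStore:
    """Toy key-value store keyed by (major, minor) key paths.

    Hypothetical illustration only; this is NOT the Oracle NoSQL API.
    """

    def __init__(self):
        # major key -> {minor key -> value}. Records that share a major
        # key live together so they can be fetched in a single call.
        self._data = defaultdict(dict)

    def put(self, major, minor, value):
        """Store one value under a (major, minor) key pair."""
        self._data[major][minor] = value

    def get(self, major, minor):
        """Return one value, or None if the key pair is unknown."""
        return self._data.get(major, {}).get(minor)

    def multi_get(self, major):
        """Return every record stored under one major key."""
        return dict(self._data.get(major, {}))


store = MajorMinorStore()
store.put(("user", "1001"), ("profile",), {"name": "Pat"})
store.put(("user", "1001"), ("clicks", "2013-03-01"), 17)
store.put(("user", "1002"), ("profile",), {"name": "Sam"})

# One call returns the profile and the click count for user 1001.
records = store.multi_get(("user", "1001"))
```

No schema is declared anywhere in the sketch: each value can have a different shape, which is the flexibility the key-value model trades for the lack of SQL-style queries.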

Apache HBase:
• Allows random, real-time read/write access
• Strictly consistent reads and writes
• Automatic and configurable sharding of tables
• Automatic failover support between Region Servers

Apache Cassandra:
• Data model offers column indexes with the performance of log-structured updates, materialized views, and built-in caching
• Fault tolerance capability is designed for every node, replicating across multiple datacenters
• Can choose between synchronous or asynchronous replication for each update

Apache Hive:
• Tools to enable easy data extract/transform/load (ETL) from files stored either directly in Apache HDFS or in other data storage systems such as Apache HBase
• Uses a simple SQL-like query language called HiveQL
• Query execution via MapReduce

Processing Capability

MapReduce:
• Defined by Google in 2004
• Break problem up into smaller sub-problems
• Able to distribute data workloads across thousands of nodes
• Can be exposed via SQL and in SQL-based BI tools

Apache Hadoop:
• Leading MapReduce implementation
• Highly scalable parallel batch processing
• Highly customizable infrastructure
• Writes multiple copies across cluster for fault tolerance

Data Integration Capability

Oracle Big Data Connectors, Oracle Loader for Hadoop, Oracle Data Integrator:
• Exports MapReduce results to RDBMS, Hadoop, and other targets
• Connects Hadoop to relational databases for SQL processing
• Includes a graphical user interface integration designer that generates Hive scripts to move and transform MapReduce results
• Optimized processing with parallel data import/export
• Can be installed on Oracle Big Data Appliance or on a generic Hadoop cluster

Statistical Analysis Capability

Open Source Project R and Oracle R Enterprise:
• Programming language for statistical analysis
• Introduced into Oracle Database as a SQL extension to perform high-performance in-database statistical analysis
• Oracle R Enterprise allows reuse of pre-existing R scripts with no modification
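The MapReduce idea of breaking a problem into smaller sub-problems can be sketched in a few lines of plain Python. This is a single-process illustration of the map, shuffle, and reduce phases using a word count; it does not use Hadoop, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `run_job`) are assumptions chosen for this example rather than any framework's API.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group intermediate values by key between the two phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def run_job(lines):
    """Run map, shuffle, and reduce in sequence on a list of text lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = run_job(["big data big plans", "data drives plans"])
```

In a real cluster, each call to the map function could run on a different node against a different block of the input, and the framework would handle the shuffle over the network. Here everything runs in one process, which is enough to show why each sub-problem can be solved independently.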
  17. This slide discusses the five most common mistakes that we are seeing in the marketplace. In no particular order:
1) Lack of Expertise – I am actually not referring to the Hadoop or Java expertise that is required. If that were the case, most projects would never even get started. I am referring more to data scientist type resources. Various “critical thinking” organizations such as Gartner project a significant shortfall of data scientists. The truth is you may have to develop this expertise internally, and I would suggest you start now. I also suggest looking at your universities and hiring graduates from analytics programs.
2) Big Data projects without a problem – We are certainly in a hype cycle around Big Data. This is natural with any technology that can be a game changer. More than likely, your company does have business problems that Big Data solutions can help with. The alignment between savvy business users and technology-enabled IT departments is still in the works.
3) Lack of technology alignment – By this I mean that it is very easy to begin purchasing point Big Data solutions for one specific problem. Watch out. This same problem has been happening for years with our hype cycles. Let’s get a bit smarter in this cycle.
4) Lack of a longer-term roadmap – This flows directly from the previous caution. If you are going to start a Big Data project, you will be purchasing software and may be hiring resources. Before you start, it may be time for a short Big Data strategy. Understand what happens after the first project. I am absolutely in favor of starting with a POC and starting small. However, before making large investments, think through the two-year plan. Pactera’s Big Data Strategy is a quick engagement that reviews each major business group in an organization and looks for detailed problems that may be solved by Big Data solutions. It’s a great engagement whose outcome is a one- to two-year plan for implementing Big Data.
5) Lack of Critical Evaluation – I feel like this has been missing in most IT projects. At the end of the project, did we achieve the expected business goals? If the answer is no, then let’s figure out why and make improvements.
  18. I now want to present two business cases from real-life projects. The first project is for one of the largest online travel organizations in the world. Let’s call them Acme OnLine Travel (AOL).
Pactera has had a relationship with AOL for over six years. We built the data warehouse. We understand the business very well and, frankly, we understand the weaknesses of the BI solution. The volumes of data were very high and the cost to maintain them was growing.
The data sources for this client ranged from traditional ERP systems and clickstream data to social media such as Facebook. It’s not hard to see why the volumes were high. Petabytes are the norm.
A main driver for this project was to reduce the cost per TB, which was running at ten thousand USD. So a few years ago we suggested to AOL that a Big Data solution was most likely necessary to stay competitive in this industry. It started with some POCs and then moved into building onto the current BI system. We are now beginning to see the natural death of some portions of the traditional BI system. I say natural death because our business users are simply not using some of the old methods. The most interesting and hard-hitting development is the predictive analytic functions that are being built on top of the base Hadoop file system.
One of the most recent changes is our move to near real-time with a newer Big Data product called Impala. Our team has been working with Impala for the past year or so, even before it was officially released. This addresses one of the critical issues with Big Data: the lack of real-time capabilities.
  19. Let’s talk about Impala for a moment. This graph shows our own testing at this client with petabytes of data. As you can see, the performance difference between Hive and Impala is quite stark. If you know anything about traditional ...
What is also interesting, and what I wanted to draw out, is that despite our success with Big Data at this client, a large number of people use Hadoop only to get data so that they can process it in a traditional RDBMS. A lot of this is simply because people are more comfortable there, and end-user tools are more user-friendly on relational/traditional databases. Please keep in mind that when it says FASTER on that third line, it is referring to the much smaller sets of data that we are placing into the RDBMS.
In conclusion, the solution provided faster, more intelligent insights, and the cost is down to less than 2 thousand USD per TB in Hadoop from 10 thousand USD.
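As a quick check on those figures, here is a small Python sketch of the arithmetic. The per-TB costs come from the case study, with 2,000 USD treated as an upper bound because the actual figure was lower. The 1,000 TB volume is a purely hypothetical assumption for illustration, not the client's actual footprint.

```python
def cost_reduction(old_cost_per_tb, new_cost_per_tb):
    """Return the fractional reduction in cost per TB."""
    return (old_cost_per_tb - new_cost_per_tb) / old_cost_per_tb


OLD_COST_PER_TB = 10_000  # USD per TB, from the case study
NEW_COST_PER_TB = 2_000   # USD per TB, upper bound ("less than 2 thousand")

# Hypothetical volume, assumed only to show the scale of the savings.
ASSUMED_VOLUME_TB = 1_000

reduction = cost_reduction(OLD_COST_PER_TB, NEW_COST_PER_TB)
minimum_savings = (OLD_COST_PER_TB - NEW_COST_PER_TB) * ASSUMED_VOLUME_TB

print(f"Cost per TB reduced by at least {reduction:.0%}")
print(f"Savings on {ASSUMED_VOLUME_TB} TB: at least {minimum_savings:,} USD")
```

Because the new cost is only an upper bound, both results are floors: the real reduction was at least 80 percent.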
  20. The final case study that I want to present is around retail. The picture that you are seeing is the goal of most major retailers: to drive marketing, and the eventual sale, down to a level so personalized that it feels like the retailer knows the customer on a one-on-one basis. Oh, and by the way, without crossing the “Creep Factor” line. That is the line where the customer feels violated. This was the case with our client.
Our client had a mix of the following types of data:
• Store POS
• Web clickstream
• Social media
• Financial
• A bunch of spreadsheets
• Customer satisfaction data
• Call center data
Just as in the last case study, the volume of data was growing and the cost to manage it was growing even faster. The project started with a POC and has now reached into several departments. Examples of business problems / projects include:
• Customer buying behaviour
• Price optimization, as in changing prices on the web based on behaviour
• Space planning
All of these projects were accomplished with a theory, a model, and a lot of testing. Eventually, when good models had been built and tested significantly, the models were embedded into the client’s operational systems. What I have taken away from these projects and this research is how much psychology is required to be successful.
This particular client is actually using Big Data solutions combined with SAS and several other traditional BI tools.
  21. The key components in this architecture:
• Oracle Big Data Appliance (or other Hadoop solutions):
  o Powered by the full distribution of Cloudera’s Distribution including Apache Hadoop (CDH) to store logs, reviews, and other related big data
• Oracle Big Data Connectors:
  o Create optimized data sets for efficient loading and analysis in Oracle Database 11g and Oracle R Enterprise
• Oracle Database 11g:
  o External Table: A feature in Oracle Database that presents data stored in a file system in a table format so that it can be used in SQL queries transparently.
• Traditional SQL Tools:
  o Oracle SQL Developer: Development tool with a graphical user interface that allows users to access data stored in a relational database using SQL.
  22. The third use case continues our discussion of the insurance company mentioned in an earlier section of this paper. In a nutshell, the insurance giant needs to capture the large amount of sensor data that tracks its customers’ driving habits, store it in a cost-effective manner, process it to determine trends and identify patterns, and integrate the end results with the existing transactional, master, and reference data it is already capturing. The sensor data needs to be transferred to and stored in a centralized environment that provides a flexible data structure, fast processing, scalability, and parallelism. MapReduce functions are needed to process the low-density data to identify patterns and trending insights. The end results need to be integrated with the structured data in the database management system.
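The flow described here (process raw sensor readings in a map/reduce style, then integrate the summarized results with existing reference data) can be sketched in Python. Everything in the example is a hypothetical assumption made for illustration: the event layout, the customer IDs, the `policies` lookup standing in for master data, and the function names. It runs in a single process and is not the insurer's actual pipeline.

```python
from collections import defaultdict

# Hypothetical sensor events: (customer_id, speed_mph, hard_brake_flag).
events = [
    ("C1", 62, False),
    ("C1", 81, True),
    ("C2", 45, False),
    ("C1", 70, False),
    ("C2", 50, True),
]

# Hypothetical reference data standing in for the existing master records.
policies = {"C1": {"policy": "P-100"}, "C2": {"policy": "P-200"}}


def map_event(event):
    """Map: key each raw reading by customer."""
    customer_id, speed, hard_brake = event
    return customer_id, (speed, int(hard_brake))


def reduce_customer(values):
    """Reduce: condense one customer's readings into a few summary facts."""
    readings = len(values)
    total_speed = sum(speed for speed, _ in values)
    hard_brakes = sum(brake for _, brake in values)
    return {
        "readings": readings,
        "avg_speed": total_speed / readings,
        "hard_brakes": hard_brakes,
    }


def summarize(raw_events):
    """Group mapped readings by customer, then reduce each group."""
    grouped = defaultdict(list)
    for key, value in map(map_event, raw_events):
        grouped[key].append(value)
    return {key: reduce_customer(values) for key, values in grouped.items()}


def integrate(summary, reference):
    """Join the summarized results with structured reference data."""
    return [
        {"customer_id": cid, **reference[cid], **stats}
        for cid, stats in sorted(summary.items())
        if cid in reference
    ]


rows = integrate(summarize(events), policies)
```

The point of the design is that the high-volume, low-density readings are condensed before they ever reach the relational side: only one small summary row per customer needs to be loaded and joined.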
  23. Big Data is not the solution. The solution is some use of technology that enables business answers. The four bullets on this slide represent the four focus areas of our BI & Analytics practice in 2013. I believe Big Data is the foundation for many of these other solutions.
  24. I love this story because it is so hard-hitting ... especially if you have daughters like I do. Most of you have heard the story, so I won’t go into all of the details. The basic gist goes something like this: Target started a predictive analytics project that was so successful and accurate that it actually predicted that a father’s daughter was pregnant before the father knew. Google the story to find the full version if you have not heard it. I wanted to end on this because we all have a corporate responsibility to use our technology without crossing the privacy line with our customers.