Tuesday, May 1, 12
Eric.kavanagh@bloorgroup.com




    Twitter Tag: #briefr

Reveal the essential characteristics of enterprise software, good and bad

Provide a forum for detailed analysis of today’s innovative technologies

Give vendors a chance to explain their product to savvy analysts

Allow audience members to pose serious questions... and get answers!



May: Analytics

                     June: Intelligence

                     July: Governance

                     August: Analytics




Ultimately analytics is about businesses making optimal decisions, although the range of technologies that inhabit this area is wide: statistical analysis, data mining, process mining, predictive analytics, predictive modeling, business process modeling and, additionally, complex event processing.

With the advent of big data, analytics has become “big analytics”, with organizations diving into large heaps of data that previously were not available or usable.

Open source technologies (Hadoop, etc.), in conjunction with the cloud, have expanded the range of what is possible and considerably reduced the price of leveraging new and often very substantial data sources.

Robin Bloor is Chief Analyst at The Bloor Group.



                            Robin.Bloor@Bloorgroup.com




Pervasive Software, a provider of data integration and database software, introduced Pervasive DataRush, a parallel data flow development platform, several years ago.

Aside from marketing that capability, it has been using it to build data integration and data flow enabled BI products that exploit the DataRush capability.

Pervasive RushAnalyzer is one of the new parallel BI products that has been built using DataRush. It is aimed squarely at solving problems in the management and analysis of big data, and delivering new capabilities.


David Inbar is Senior Director, Pervasive Big Data Products & Solutions, leading the business and product management functions for Pervasive’s Big Data Products group. Previously he led the global marketing and international channels teams for Pervasive’s Integration Products group as well as the company’s Innovation Lab. David has driven innovative business models and technology adoption strategies for many application development and data management products.

Jim Falgout is Chief Technologist, Pervasive Big Data Products and Solutions. As Chief Technologist for Pervasive’s Big Data team, Jim is responsible for setting innovative design principles that guide Pervasive engineering teams as they develop new big data-focused releases and products. Jim is responsible for the architectural design of a software development platform for parallel applications that deliver high throughput on big data.




May 1, 2012




   Drinking from the Fire Hose:
   Practical Approaches to
   Big Data Preparation and Analytics

   The Briefing Room




bigdata.pervasive.com
The Internet is the Fuel for the Fire




      Source: IBM Corporation



The Real Culprit: an Internet of Things




      Source: McKinsey Global Institute report on Big Data, May 2011



Big Data Hotspots




Big Data Pain Points

    Collect          Prepare      Analyze      Consume
    monitor          profile      sample       report
    log              match        model        chart
    ingest           cleanse      discover     dashboard
    event capture    aggregate    visualize    alert
    decrypt          audit        predict      closed loop

    Volume/Velocity

    Data Integrators                            Business Analysts
                         Data Scientists        Decision Makers
    App Developers       Data Analysts          Operational Intelligence




Time to Insight Falling Behind Data Growth




Big Data Analytics Software Requirements




    Additional Requirements

    •  Must be usable by business users and analysts
        •  Graphical/visual environment
        •  Option to extend via scripting
    •  Scalable and cross-platform: laptop, desktop, Hadoop cluster



DEMO




Pervasive RushAnalyzer: Big Data Prep & Analytics




Pervasive RushAnalyzer Key Differentiators




    •  Comprehensive ETL and data preparation
    •  Analytics data scientists will love: machine learning
    •  Works with existing toolsets
    •  No cost to get started
    •  Scales from laptop to server to Hadoop clusters
    •  True distributed computing on Hadoop clusters


At the moment Big Data is often managed as “a project on the side” - isolated from the normal data flows associated with data warehousing.

This situation will not last. Either the large data heaps are ephemeral or they are here to stay. But once you start gathering data, you don’t usually stop.

If the big data heaps are here to stay, they require data flow architecture. In that sense the Hadoop-Hive-HBase-Pig arrangement is really just a big prototype.

That data flow architecture must serve both big data analysis and traditional data warehousing.

We not only have the challenges of big data and big data flow, we also have the problem of data pool proliferation and the opportunities provided by data mashup/discovery.

If we extrapolate from now, we run into a complexity of data flows that can no longer be managed by point-to-point thinking.

In effect we get a combinatorial explosion - which dictates the need - in fact the necessity - for data flow architecture and data analysis architecture.
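
The growth in point-to-point data flows described above can be made concrete with a little arithmetic. The sketch below is only an illustration, not part of the presentation: it assumes every pair of data pools might need its own link, and compares that with a hypothetical shared data flow layer to which each pool connects once. The function names are invented for this example.

```python
from math import comb


def point_to_point_links(n_pools: int) -> int:
    """Pairwise links if every data pool may connect directly to every other."""
    return comb(n_pools, 2)  # n * (n - 1) / 2


def shared_layer_links(n_pools: int) -> int:
    """Links if each pool connects once to a shared data flow layer (assumed design)."""
    return n_pools


if __name__ == "__main__":
    for n in (5, 10, 50, 100):
        print(f"{n:>3} pools: {point_to_point_links(n):>5} pairwise links, "
              f"{shared_layer_links(n):>3} links via a shared layer")
```

Even this simplest count grows quadratically with the number of data pools, while the shared-layer count grows linearly; counting multi-step flows between pools would grow faster still.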

If it didn’t deliver value, no one would do it.


The PC revolution, the Internet revolution and the mobile revolution were all surprises, even for those who saw them coming. They all brought more data and more data distribution.

The coming embedded revolution could be characterized as “the web of intelligent things” - things that know their state, report their state, can respond to their state or can respond collectively.

Think of:
    A cup that knows what’s in it
    A house that knows who’s home
    A car that knows how much you had to drink

The Challenge is Speed and Complexity
Big Data has only just begun:

Think of current big data projects as the early spreadsheets

Data flow architecture is already an issue.

Complexity is increasing

Speed is the enabler or the barrier

Questions
It is not clear to me what product classification this falls under. It appears to be a data flow architecture design and implementation capability. Is that the case?

What does RushAnalyzer complement? What does it compete with?

What interfaces does it have to different data sources?

Clearly this is very fast operationally, because of the underlying parallelism. Can you give us some idea of how this compares in speed terms with, for example, a Hadoop arrangement aimed at a similar set of capabilities?

What skills are required to make best use of this capability?



Questions
                     Who have been the early adopters of this kind of capability
                     and what kind of business problems are they trying to solve?
                     Which vertical business sectors have shown most interest
                     and which have shown least interest?
                     Quo vadis?




May: Analytics

             • June: Intelligence
             • July: Governance
             • August: Analytics


Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 

Dernier (20)

IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 

Drinking from the Fire Hose: Practical Approaches to Big Data Preparation and Analytics

  • 2. Eric.kavanagh@bloorgroup.com Twitter Tag: #briefr Tuesday, May 1, 12
  • 3. Reveal the essential characteristics of enterprise software, good and bad. Provide a forum for detailed analysis of today’s innovative technologies. Give vendors a chance to explain their product to savvy analysts. Allow audience members to pose serious questions... and get answers!
  • 4. May: Analytics • June: Intelligence • July: Governance • August: Analytics
  • 5. Ultimately, analytics is about businesses making optimal decisions, although the range of technologies in this area is wide: statistical analysis, data mining, process mining, predictive analytics, predictive modeling, business process modeling and complex event processing. With the advent of big data, analytics has become “big analytics”, with organizations diving into large heaps of data that previously were not available or usable. Open source technologies (Hadoop, etc.) in conjunction with the cloud have expanded the range of what is possible and considerably reduced the price of leveraging new and often very substantial data sources.
  • 6. Robin Bloor is Chief Analyst at The Bloor Group. Robin.Bloor@Bloorgroup.com
  • 7. Pervasive Software, a provider of data integration and database software, introduced Pervasive DataRush, a parallel data flow development platform, several years ago. Aside from marketing that capability, it has been using it to build data integration and data flow enabled BI products that exploit the DataRush capability. Pervasive RushAnalyzer is one of the new parallel BI products that has been built using DataRush. It is aimed squarely at solving problems in the management and analysis of big data, and at delivering new capabilities.
  • 8. David Inbar is Senior Director, Pervasive Big Data Products & Solutions, leading the business and product management functions for Pervasive’s Big Data Products group. Previously he led the global marketing and international channels teams for Pervasive’s Integration Products group as well as the company’s Innovation Lab. David has driven innovative business models and technology adoption strategies for many application development and data management products. Jim Falgout is Chief Technologist, Pervasive Big Data Products and Solutions. As Chief Technologist for Pervasive’s Big Data team, Jim is responsible for setting innovative design principles that guide Pervasive engineering teams as they develop new big data-focused releases and products. He is also responsible for the architectural design of a software development platform for parallel applications that deliver high throughput on big data.
  • 9. May 1, 2012 Drinking from the Fire Hose: Practical Approaches to Big Data Preparation and Analytics The Briefing Room bigdata.pervasive.com
  • 10. The Internet is the Fuel for the Fire. Source: IBM Corporation
  • 11. The Real Culprit: an Internet of Things. Source: McKinsey Global Institute report on Big Data, May 2011
  • 13. Big Data Pain Points. Collect: monitor, log, ingest, event capture, decrypt. Prepare: profile, match, cleanse, aggregate, audit. Analyze: sample, model, discover, visualize, predict. Consume: report, chart, dashboard, alert, closed loop. Volume/Velocity. Data Integrators, Business Analysts, Data Scientists, Decision Makers, App Developers, Data Analysts, Operational Intelligence.
  • 14. Time to Insight Falling Behind Data Growth
  • 15. Big Data Analytics Software Requirements. Additional Requirements: • Must be usable by business users and analysts • Graphical/visual environment • Option to extend via scripting • Scalable and cross-platform: laptop, desktop, Hadoop cluster
  • 18. Pervasive RushAnalyzer: Big Data Prep & Analytics
  • 19. Pervasive RushAnalyzer Key Differentiators: • Comprehensive ETL and data preparation • Analytics data scientists will love: machine learning • Works with existing toolsets • No cost to get started • Scales from laptop to server to Hadoop clusters • True distributed computing on Hadoop clusters
  • 22. At the moment Big Data is often managed as “a project on the side”, isolated from the normal data flows associated with data warehousing. This situation will not last. Either the large data heaps are ephemeral or they are here to stay. But once you start gathering data you don’t usually stop. If the big data heaps are here to stay, they require data flow architecture. In that sense the Hadoop-Hive-HBase-Pig arrangement is really just a big prototype. That data flow architecture must serve both big data analysis and traditional data warehousing.
  • 24. We not only have the challenges of big data and big data flow, we also have the problem of data pool proliferation and the opportunities provided by data mashup/discovery. If we extrapolate from now, we run into a complexity of data flows that can no longer be managed by point-to-point thinking. In effect we get a combinatorial explosion, which dictates the need, in fact the necessity, for data flow architecture and data analysis architecture. If it didn’t deliver value, no one would do it.
  • 25. The PC revolution, the Internet revolution and the mobile revolution were all surprises, even for those who saw them coming. They all brought more data and more data distribution. The coming embedded revolution could be characterized as “the web of intelligent things”: things that know their state, report their state, can respond to their state or can respond collectively. Think of: a cup that knows what’s in it; a house that knows who’s home; a car that knows how much you had to drink.
  • 26. The Challenge is Speed and Complexity. Big Data has only just begun: think of current big data projects as the early spreadsheets. Data flow architecture is already an issue. Complexity is increasing. Speed is the enabler or the barrier.
  • 27. Questions: It is not clear to me what product classification this falls under. It appears to be a data flow architecture design and implementation capability. Is that the case? What does RushAnalyzer complement? What does it compete with? What interfaces does it have to different data sources? Clearly this is very fast operationally, because of the underlying parallelism. Can you give us some idea of how this compares in speed terms with, for example, a Hadoop arrangement aimed at a similar set of capabilities? What skills are required to make best use of this capability?
  • 28. Questions: Who have been the early adopters of this kind of capability and what kind of business problems are they trying to solve? Which vertical business sectors have shown most interest and which have shown least interest? Quo vadis?
  • 30. May: Analytics • June: Intelligence • July: Governance • August: Analytics
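
Slide 24 argues that point-to-point thinking stops scaling as data pools proliferate. The sketch below is only an illustration of that arithmetic, not anything shown in the deck: it counts pairwise links among n data pools, which is the most conservative reading of the growth the slide describes, and compares that with a hypothetical arrangement in which every pool connects once to a shared data flow layer. The function names and the hub model are assumptions made for this example.

```python
def point_to_point_flows(pools: int) -> int:
    """Distinct pairwise links if every data pool may feed every other one."""
    return pools * (pools - 1) // 2


def shared_layer_flows(pools: int) -> int:
    """Links if each pool connects once to a shared data flow layer (assumed model)."""
    return pools


if __name__ == "__main__":
    # Compare how the two counts grow as the number of data pools increases.
    for n in (5, 10, 50):
        print(n, point_to_point_flows(n), shared_layer_flows(n))
```

Even this pairwise count grows quadratically; counting multi-hop routes or subsets of sources combined in a mashup would grow faster still, which is closer to the explosion the slide has in mind.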