Copyright © Prime Dimensions 2013 All rights reserved.
Point-of-View
The Evolving On-Demand Infrastructure for Big Data
Evolving On-Demand Infrastructure
[Diagram components: Structured Source Data; E-T-L; Data Warehouse; OLAP; Dashboards & Visualizations; Data Discovery]
Evolving On-Demand Infrastructure
[Diagram components: Structured Source Data; Multi-structured and Stream Source Data; Hadoop Distributed File System; MapReduce; NoSQL; In-Memory Database; Bi-directional node-to-node integration; E-T-L; E-L-T; Data Warehouse; OLAP; Dashboards & Visualizations; Data Discovery; Analytic Applications; Embedded Analytics]
Questions?
Our Contact Information
Michael Joseph
Managing Partner
Direct: 703.861.9897
Email: mjoseph@primedimensions.com
Richard Rowan
Managing Director
Direct: 703.201.2641
Email: rrowan@primedimensions.com
Prime Dimensions, LLC
www.primedimensions.com
Data Management | Business Intelligence | Advanced Analytics
Follow us @primedimensions


POV on Evolving On-Demand Infrastructure for Big Data


Editor's Notes

  1. Hi, my name is… Prime Dimensions is… Our host today is… Housekeeping…
  2. Store and Analyze Approach

Perceived high value per byte: rigorous cleansing and transformation

The store and analyze approach integrates source data into a consolidated data store before it is analyzed. This approach is used by a traditional data warehousing system to create data analytics. In a data warehousing system, the consolidated data store is usually an enterprise data warehouse or data mart managed by a relational or multidimensional DBMS. The advantages of this approach are improved data integration and data quality management, plus the ability to maintain historical information. The disadvantages are additional data storage requirements and the latency introduced by the data integration task.

What is a data warehouse?

In the 1990s, Bill Inmon defined a design known as a data warehouse. In 2005, Gartner clarified and updated those definitions. From these we summarize that a data warehouse is:

     1. Subject oriented: The data is modeled after business concepts, organizing them into subject areas like sales, finance, and inventory. Each subject area contains detailed data.
     2. Integrated: The logical model is integrated and consistent. Data formats and values are standardized. Thus, dates are in the same format, male/female codes are consistent, etc. More important, all subject areas use the same customer record, not copies.
     3. Nonvolatile: Data is stored in the data warehouse unmodified, and retained for long periods of time.
     4. Time variant: When changes to a record are needed, new versions of the record are captured using effective dates or temporal functions.
     5. Not virtual: The data warehouse is a physical, persistent repository.

http://searchdatamanagement.techtarget.com/definition/OLAP

OLAP (online analytical processing) is computer processing that enables a user to easily and selectively extract and view data from different points of view.
For example, a user can request that data be analyzed to display a spreadsheet showing all of a company's beach ball products sold in Florida in the month of July, compare revenue figures with those for the same products in September, and then see a comparison of other product sales in Florida in the same time period. To facilitate this kind of analysis, OLAP data is stored in a multidimensional database. Whereas a relational database can be thought of as two-dimensional, a multidimensional database considers each data attribute (such as product, geographic sales region, and time period) as a separate "dimension." OLAP software can locate the intersection of dimensions (all products sold in the Eastern region above a certain price during a certain time period) and display them. Attributes such as time periods can be broken down into subattributes.
OLAP can be used for data mining or the discovery of previously undiscerned relationships between data items. An OLAP database does not need to be as large as a data warehouse, since not all transactional data is needed for trend analysis. Using Open Database Connectivity (ODBC), data can be imported from existing relational databases to create a multidimensional database for OLAP.
Two leading OLAP products are Hyperion Solutions' Essbase and Oracle's Express Server. OLAP products are typically designed for multiple-user environments, with the cost of the software based on the number of users.
Data Warehouse Differentiators
After nearly 30 years of investment, refinement and growth, the list of features available in a data warehouse is quite staggering. 
Built upon relational database technology using schemas and integrating Business Intelligence (BI) tools, the major differences in this architecture are:
     • Data warehouse performance
     • Integrated data that provides business value
     • Interactive BI tools for end users
Data Warehouse Performance
Basic indexing, found in open source databases, such as MySQL or Postgres, is a standard feature used to improve query response times or enforce constraints on data. More advanced forms such as materialized views, aggregate join indexes, cube indexes and sparse join indexes enable numerous performance gains in data warehouses. However, the most important performance enhancement to date is the cost-based optimizer. The optimizer examines incoming SQL and considers multiple plans for executing each query as fast as possible. It achieves this by comparing the SQL request to the database design and extensive data statistics that help identify the best combination of execution steps. In essence, the optimizer is like having a genius programmer examine every query and tune it for the best performance. Lacking an optimizer or data demographic statistics, a query that could run in minutes may take hours, even with many indexes. For this reason, database vendors are constantly adding new index types, partitioning, statistics, and optimizer features. For the past 30 years, every software release has been a performance release.
Integrating Data: the Raison d’Être
At the heart of any data warehouse is the promise to answer essential business questions. Integrated data is the unique foundation required to achieve this goal. Pulling data from multiple subject areas and numerous applications into one repository is the raison d’être for data warehouses. Data model designers and ETL architects armed with metadata, data cleansing tools, and patience must rationalize data formats, source systems, and semantic meaning of the data to make it understandable and trustworthy. 
This creates a common vocabulary within the corporation so that critical concepts such as “customer,” “end of month,” or “price elasticity,” are uniformly measured and understood. Nowhere else in the entire IT data center is data collected, cleaned, and integrated as it is in the data warehouse.
Interactive BI Tools
BI tools such as MicroStrategy, Tableau, IBM Cognos, and others provide business users with direct access to data warehouse insights. First, the business user can create reports and complex analysis quickly and easily using these tools. As a result, there is a trend in many data warehouse sites towards end-user self service. Business users can easily demand more reports than IT has staffing to provide. More important than self service, however, is that the users become intimately familiar with the data. They can run a report, discover they missed a metric or filter, make an adjustment, and run their report again, all within minutes. This process results in significant changes in business users’ understanding of the business and their decision-making process. First, users stop asking trivial questions and start asking more complex strategic questions. Generally, the more complex and strategic the report, the more revenue and cost savings the user captures. This leads to some users becoming “power users” in a company. These individuals become wizards at teasing business value from the data and supplying valuable strategic information to the executive staff. Every data warehouse has anywhere from two to 20 power users.
Query performance with BI tools lowers the analytic pain threshold. If it takes 24 hours to ask and get an answer, users only ask once. If it takes minutes, they will ask dozens of questions. For example, a major retailer was comparing stock-on-hand to planned newspaper coupon advertising. Initially they ran an eight-hour report that analyzed hundreds of stores. 
One power user saw they could make more money if the advertising was customized for stores by geographic region. By adding filters and constraints and selecting small groups of regional stores, the by-region query ran in two minutes. They added more constraints and filters and ran it again. They discovered that inventory and regional preferences would sell more and increase profits. Where an eight-hour query was discouraging, two-minute queries were an enabler. The power user was then willing to spend a few hours analyzing each region for the best sales, inventory, and profit mix. The lower pain threshold to analytics was enabled by data warehouse performance and the interactivity of the BI tools.
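The OLAP description in the note above (locating the intersection of product, region, and time-period dimensions) can be illustrated with a small sketch. This is only an in-memory toy under stated assumptions, not how Essbase or any other OLAP product is implemented: the sales figures are invented, and the names `SALES`, `build_cube`, and `slice_cube` are hypothetical.

```python
from collections import defaultdict

# Hypothetical fact rows: (product, region, month, revenue). All values are invented.
SALES = [
    ("beach ball", "Florida", "July", 1200.0),
    ("beach ball", "Florida", "September", 450.0),
    ("beach ball", "Georgia", "July", 300.0),
    ("sunscreen", "Florida", "July", 900.0),
    ("sunscreen", "Florida", "September", 700.0),
]

# Each attribute is treated as a separate dimension of the cube.
DIMENSIONS = ("product", "region", "month")


def build_cube(rows):
    """Aggregate revenue at every (product, region, month) intersection."""
    cube = defaultdict(float)
    for product, region, month, revenue in rows:
        cube[(product, region, month)] += revenue
    return cube


def slice_cube(cube, **fixed):
    """Sum revenue over all cells whose dimensions match the fixed values."""
    total = 0.0
    for key, revenue in cube.items():
        cell = dict(zip(DIMENSIONS, key))
        if all(cell[dim] == value for dim, value in fixed.items()):
            total += revenue
    return total


cube = build_cube(SALES)

# Beach balls sold in Florida, July versus September.
july = slice_cube(cube, product="beach ball", region="Florida", month="July")
september = slice_cube(cube, product="beach ball", region="Florida", month="September")

# All products in Florida in July: leave the product dimension unconstrained.
florida_july_all = slice_cube(cube, region="Florida", month="July")

print(july, september, florida_july_all)
```

Leaving a dimension out of the call is what lets the same function answer both the narrow beach-ball question and the broader "other product sales in Florida" comparison.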
  3. Distributed DW architecture. The issue in a multi-workload environment is whether a single-platform data warehouse can be designed and optimized such that all workloads run optimally, even when concurrent. More DW teams are concluding that a multi-platform data warehouse environment is more cost-effective and flexible. Plus, some workloads receive better optimization when moved to a platform beside the data warehouse. In reaction, many organizations now maintain a core DW platform for traditional workloads but offload other workloads to other platforms. For example, data and processing for SQL-based analytics are regularly offloaded to DW appliances and columnar DBMSs. A few teams offload workloads for big data and advanced analytics to HDFS, discovery platforms, MapReduce, and similar platforms. The result is a strong trend toward distributed DW architectures, where many areas of the logical DW architecture are physically deployed on standalone platforms instead of the core DW platform.
Big Data requires a new generation of scalable technologies designed to extract meaning from very large volumes of disparate, multi-structured data by enabling high velocity capture, discovery, and analysis.
Source of second graphic: http://www.saama.com/blog/bid/78289/Why-large-enterprises-and-EDW-owners-suddenly-care-about-BigData
http://www.cloudera.com/content/dam/cloudera/Resources/PDF/Hadoop_and_the_Data_Warehouse_Whitepaper.pdf
Complex Hadoop jobs can use the data warehouse as a data source, simultaneously leveraging the massively parallel capabilities of two systems. Any MapReduce program can issue SQL statements to the data warehouse. In one context, a MapReduce program is “just another program,” and the data warehouse is “just another database.” Now imagine 100 MapReduce programs concurrently accessing 100 data warehouse nodes in parallel. Both raw processing and the data warehouse scale to meet any big data challenge. 
Inevitably, visionary companies will take this step to achieve competitive advantages.
Promising Uses of Hadoop that Impact DW Architectures
I see a handful of areas in data warehouse architectures where HDFS and other Hadoop products have the potential to play positive roles:
Data staging. A lot of data processing occurs in a DW’s staging area, to prepare source data for specific uses (reporting, analytics, OLAP) and for loading into specific databases (DWs, marts, appliances). Much of this processing is done by homegrown or tool-based solutions for extract, transform, and load (ETL). Imagine staging and processing a wide variety of data on HDFS. Users who prefer to hand-code most of their ETL solutions will most likely feel at home in code-intense environments like Hadoop MapReduce. And they may be able to refactor existing code to run there. For users who prefer to build their ETL solutions atop a vendor tool, the community of vendors for ETL and other data management tools is rolling out new interfaces and functions for the entire Hadoop product family. Note that I’m assuming that, whether you use Hadoop or not, you should physically locate your data staging area(s) on standalone systems outside the core data warehouse, if you haven’t already. That way, you preserve the core DW’s capacity for what it does best: squeaky clean, well modeled data (with an audit trail via metadata and master data) for standard reports, dashboards, performance management, and OLAP. In this scenario, the standalone data staging area(s) offload most of the management of big data, archiving source data, and much of the data processing for ETL, data quality, and so on.
Data archiving. When organizations embrace forms of advanced analytics that require detail source data, they amass large volumes of source data, which taxes areas of the DW architecture where source data is stored. Imagine managing detailed source data as an archive on HDFS. 
You probably already do archiving with your data staging area, though you probably don’t call it archiving. If you think of it as an archive, maybe you’ll adopt the best practices of archiving, especially information lifecycle management (ILM), which I feel is valuable but woefully vacant from most DWs today. Archiving is yet another thing the staging area in a modern DW architecture must do, thus another reason to offload the staging area from the core DW platform. Traditionally, enterprises had three options when it came to archiving data: leave it within a relational database, move it to tape or optical disk, or delete it. Hadoop’s scalability and low cost enable organizations to keep far more data in a readily accessible online environment. An online archive can greatly expand applications in business intelligence, advanced analytics, data exploration, auditing, security, and risk management.
Multi-structured data. Relatively few organizations are getting BI value from semi- and unstructured data, despite years of wishing for it. Imagine HDFS as a special place within your DW environment for managing and processing semi-structured and unstructured data. Another way to put it is: imagine not stretching your RDBMS-based DW platform to handle data types that it’s not all that good with. One of Hadoop’s strongest complements to a DW is its handling of semi- and unstructured data. But don’t go thinking that Hadoop is only for unstructured data: HDFS handles the full range of data, including structured forms, too. In fact, Hadoop can manage just about any data you can store in a file and copy into HDFS.
Processing flexibility. Given its ability to manage diverse multi-structured data, as I just described, Hadoop’s NoSQL approach is a natural framework for manipulating non-traditional data types. Note that these data types are often free of schema or metadata, which makes them challenging for SQL-based relational DBMSs. 
Hadoop supports a variety of programming languages (Java, R, C), thus providing more capabilities than SQL alone can offer. In addition, Hadoop enables the growing practice of “late binding”. Instead of transforming data as it’s ingested by Hadoop (the way you often do with ETL for data warehousing), which imposes an a priori model on data, structure is applied at runtime. This, in turn, enables the open-ended data exploration and discovery analytics that many users are looking for today.
Advanced analytics. Imagine HDFS as a data stage, archive, or twenty-first-century operational data store that manages and processes big data for advanced forms of analytics, especially those based on MapReduce, data mining, statistical analysis, and natural language processing (NLP). There’s much to say about this; in a future blog I’ll drill into how advanced analytics is one of the strongest influences on data warehouse architectures today, whether Hadoop is in use or not.
Analyze and Store Approach (ELT?)
The analyze and store approach analyzes data as it flows through business processes, across networks, and between systems. The analytical results can then be published to interactive dashboards and/or published into a data store (such as a data warehouse) for user access, historical reporting and additional analysis. This approach can also be used to filter and aggregate big data before it is brought into a data warehouse.
There are two main ways of implementing the analyze and store approach:
• Embedding the analytical processing in business processes. This technique works well when implementing business process management and service-oriented technologies because the analytical processing can be called as a service from the process workflow. This technique is particularly useful for monitoring and analyzing business processes and activities in close to real-time; action times of a few seconds or minutes are possible here. 
The process analytics created can also be published to an operational dashboard or stored in a data warehouse for subsequent use.
• Analyzing streaming data as it flows across networks and between systems. This technique is used to analyze data from a variety of different (possibly unrelated) data sources where the volumes are too high for the store and analyze approach, sub-second action times are required, and/or where there is a need to analyze the data streams for patterns and relationships. To date, many vendors have focused on analyzing event streams (from trading systems, for example) using the services of a complex event processing (CEP) engine, but this style of processing is evolving to support a wider variety of streaming technologies and data. It creates stream analytics from many types of streaming data, such as event, video and GPS data.
The benefits of the analyze and store approach are fast action times and lower data storage overheads because the raw data does not have to be gathered and consolidated before it can be analyzed.
using HiveQL to create a load-ready file for a relational database.
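As a rough illustration of the analyze and store approach described in this note, the sketch below filters and aggregates events as they arrive and keeps only the summary that would be published to a dashboard or loaded into a warehouse. It is a minimal sketch in plain Python, not a CEP engine: the event fields, the `trade_events` generator, and the `min_amount` threshold are all hypothetical and the data is invented.

```python
from collections import defaultdict


def analyze_stream(events, min_amount=0.0):
    """Filter and aggregate events one at a time; raw events are never stored."""
    summary = defaultdict(lambda: {"count": 0, "total": 0.0})
    for event in events:
        # Filter step: drop events below the (hypothetical) threshold.
        if event["amount"] < min_amount:
            continue
        # Aggregate step: keep only running totals per symbol.
        bucket = summary[event["symbol"]]
        bucket["count"] += 1
        bucket["total"] += event["amount"]
    return dict(summary)


def trade_events():
    """Hypothetical event source standing in for a live feed."""
    yield {"symbol": "AAA", "amount": 100.0}
    yield {"symbol": "BBB", "amount": 5.0}
    yield {"symbol": "AAA", "amount": 250.0}
    yield {"symbol": "BBB", "amount": 40.0}


# Only these aggregated rows would be stored or published.
warehouse_rows = analyze_stream(trade_events(), min_amount=10.0)
print(warehouse_rows)
```

Because the generator is consumed once and only the per-symbol totals survive, the storage overhead depends on the number of distinct keys, not on the volume of the raw stream.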