Integrating Hadoop into Data Warehousing Architecture
Where is the Wisdom? Lost in the Knowledge.
Where is the Knowledge? Lost in the Information.
T.S. Eliot
© Humza Naseer, University of Melbourne 2014
Outline
• Introduction
• Link Between Hadoop and Data Warehouse
• Related Work: Trends in Data Warehouse Architecture
• Current Work: Hadoop Integration into Data Warehouse Environment
• Findings, Conclusion & Future Work
Intro: The Data Warehouse Mission
• Identify all possible enterprise data assets
• Select those assets that have actionable content and can be accessed
• Bring the data assets into a logically centralized “enterprise data warehouse”
• Expose those data assets most effectively for decision making
(Kimball & Ross, 2013)
Intro: Overview of Hadoop
• Hadoop is an ecosystem of products
  • Open source
  • Vendor distributions
  • Additional tools for development and administration
• Hadoop Benefits
  • Enables big data analytics
  • Supports advanced forms of analytics
  • Scales cost effectively
  • Extends a data warehouse environment
• Hadoop Limitations
  • Low latency queries
  • Ease of access
  • Data integration and integrity
  • Fine grained security
[Figure labels: Unstructured Data, HDFS, Data Nodes, MapReduce, Query Results]
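The processing model named on this slide (data held on HDFS data nodes, processed by MapReduce, returning query results) can be illustrated with a minimal single-process sketch. This is not Hadoop's API: the function names `map_phase`, `shuffle_phase`, `reduce_phase` and `run_job`, and the sample log lines, are hypothetical and only mimic the three phases in plain Python.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(record: str) -> Iterator[Tuple[str, int]]:
    """Emit a (word, 1) pair for every word in one unstructured text record."""
    for word in record.lower().split():
        yield word, 1


def shuffle_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group emitted values by key, as a framework would between map and reduce."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Collapse all values for one key into a single count."""
    return key, sum(values)


def run_job(records: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle and reduce in sequence over the input records."""
    mapped = (pair for record in records for pair in map_phase(record))
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical unstructured input, standing in for files stored on HDFS.
    log_lines = ["error disk full", "warning disk slow", "error network down"]
    print(run_job(log_lines))
```

In a real cluster the map and reduce calls would run in parallel on the nodes that hold the data; here they run sequentially so the data flow is easy to follow.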
Link Between Hadoop and Data Warehouse
• A data warehouse system fetches and unifies data from heterogeneous source systems into a centralized dimensional or normalized data repository
(Rainardi, 2008)
• A data warehouse is not a tool or technology
  • It is a business process which unifies an enterprise through data
(Eckerson, 2012)
• Is Hadoop a problem or an opportunity?
• Where does Hadoop fit into data warehouse architecture?
Why is Integration Happening?
• Traditional RDBMSs cannot handle
  • The new data types
  • Extended analytic processing
  • Terabytes/hour loading with immediate query access
• We want to use SQL, but we don’t want the RDBMS storage constraints
• The disruptive solution: Hadoop (Kimball & Ross, 2013)
[Figure: DB1, DB2, DB3 → Transformation and Load → Central DW → BI App-1, BI App-2, BI App-3 → Decision Making]
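The traditional flow in the diagram (source databases, transformation and load, central DW, BI applications) can be sketched in a few lines. This is an illustrative toy under stated assumptions, not the author's implementation: the three source extracts, their column names, the `sales` table, and the helpers `transform`, `load`, `build_warehouse` and `sales_by_customer` are all hypothetical, and an in-memory SQLite database stands in for the central warehouse.

```python
import sqlite3
from typing import Dict, List, Tuple

# Hypothetical extracts; each source system names its columns differently.
DB1 = [{"cust": "Acme", "amount": "120.50"}]
DB2 = [{"customer_name": "acme", "total": 80}]
DB3 = [{"client": "Globex", "value": 42.0}]

# Assumed mapping from each source's column names to the unified schema.
FIELD_MAP: Dict[str, Tuple[str, str]] = {
    "DB1": ("cust", "amount"),
    "DB2": ("customer_name", "total"),
    "DB3": ("client", "value"),
}


def transform(source: str, rows: List[dict]) -> List[Tuple[str, str, float]]:
    """Unify column names, normalize customer casing, and cast amounts to float."""
    name_col, amount_col = FIELD_MAP[source]
    return [
        (source, str(row[name_col]).strip().title(), float(row[amount_col]))
        for row in rows
    ]


def load(conn: sqlite3.Connection, rows: List[Tuple[str, str, float]]) -> None:
    """Append transformed rows to the central warehouse table."""
    conn.executemany(
        "INSERT INTO sales (source, customer, amount) VALUES (?, ?, ?)", rows
    )
    conn.commit()


def build_warehouse() -> sqlite3.Connection:
    """Create the in-memory warehouse and run transform-and-load for every source."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (source TEXT, customer TEXT, amount REAL)")
    for name, rows in (("DB1", DB1), ("DB2", DB2), ("DB3", DB3)):
        load(conn, transform(name, rows))
    return conn


def sales_by_customer(conn: sqlite3.Connection) -> List[Tuple[str, float]]:
    """A BI-style query against the unified data."""
    return conn.execute(
        "SELECT customer, SUM(amount) FROM sales GROUP BY customer ORDER BY customer"
    ).fetchall()


if __name__ == "__main__":
    print(sales_by_customer(build_warehouse()))
```

Every row has to pass through the schema-bound transform step before it can be queried, which is the kind of constraint the bullets above attribute to RDBMS-centred warehouses.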
Trends in Data Warehouse Architectures
• Ponniah (2011) notes that selection of DW architecture is based on enterprise requirements.
• DW architecture has multiple architectural layers and components
  • Logical architecture
  • Physical architecture
(Moss and Atre, 2013)
• DW architecture overlaps with data integration, business intelligence and enterprise data
(Russom, 2014)
• Inmon vs Kimball dichotomy
(Ariyachandra and Watson, 2010)
Hadoop Integration into Data Warehousing Environment
• Eckerson (2012) notes that reporting and analytics have different workload requirements
• Reporting is based on the entities and facts which are well known
• Advanced analytics empowers the discovery of new facts which are not well known
• Multi-platform unified data architecture
  • Includes enterprise data warehouse (EDW) and several other new data platforms which augment EDW
(Russom, 2013)
Uses of Hadoop that Extend DW Architectures
• Data staging
• Data archiving
• Advanced analytics
• Multi-structured data
[Figure: DB1, DB2, DB3 → Transformation and Load → EDW → BI App-1, BI App-2, BI App-3 → Decision Making]
Findings
• Analytics and reporting have different requirements for DW architectures
• Characterize the DW architecture by counting the number and types of workloads it supports
• Logical DW architecture must integrate multiple physical platforms
• Design of logical DW architecture must be compartmentalized
• Proposed logical architecture for new DW ecosystem (an extension of Eckerson (2012) BI architecture)
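One way to read the second finding, characterizing an architecture by the number and types of workloads it supports, is as a simple inventory over platforms. The sketch below is only an illustration: the `Platform` class, the `ECOSYSTEM` list, and in particular the assignment of workloads to platforms are assumptions made for the example, not a mapping stated in the slides.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class Platform:
    """One physical platform inside the logical DW architecture."""
    name: str
    workloads: FrozenSet[str]


# Assumed workload-to-platform assignment, for illustration only.
ECOSYSTEM: List[Platform] = [
    Platform("Enterprise Data Warehouse", frozenset({"reporting", "analysis"})),
    Platform("Operational Data Store", frozenset({"event-driven alerting"})),
    Platform("Hadoop Cluster", frozenset({"exploration", "discovery"})),
]


def characterize(platforms: List[Platform]) -> Dict[str, object]:
    """Summarize an architecture by its platform count and distinct workload types."""
    workload_types = set().union(*(p.workloads for p in platforms))
    return {
        "platform_count": len(platforms),
        "workload_count": len(workload_types),
        "workload_types": sorted(workload_types),
    }


def platforms_for(workload: str, platforms: List[Platform]) -> List[str]:
    """List the platforms that serve a given workload type."""
    return [p.name for p in platforms if workload in p.workloads]


if __name__ == "__main__":
    print(characterize(ECOSYSTEM))
    print(platforms_for("discovery", ECOSYSTEM))
```

Keeping each platform's workloads in its own record also mirrors the finding that the logical design should be compartmentalized: adding a platform extends the inventory without touching the others.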
Logical Architecture of New DW Ecosystem
[Figure: logical architecture diagram. Labels recoverable from the diagram:]
• Sources: operational systems / online transaction processing systems (relational data); web data, machine data, log files, legacy/external data (non-relational)
• Platforms: operational data store, enterprise data warehouse, subject area data marts, BI server, Hadoop ecosystem cluster (non-relational data)
• Sandboxes: DW-centric sandbox, replicated sandbox, in-memory BI sandbox
• Environments: event driven alerting environment, reporting/analysis environment, exploration/discovery environment
• Data movement: extract, transform and load (batch, real time or near real time); query, ETL, streaming
• Users: power user, casual user
• Annotations: top down architecture, bottom up architecture
BI Assessment Model
[Figure: BI assessment model. Platforms shown: data marts, enterprise data warehouse, workload-specific data platforms, data warehouse ecosystem. Dimensions: workload capacity, degree of integration, degree of standardization, each ranging from low to high.]
Conclusion
• Hadoop enables new types of applications within the DW environment
• Big data analytics, advanced analytics and discovery analytics
• Information exploration and augmenting a data warehouse
• Should be implemented in a multi-platform DW environment
• Future work:
  • Conformed dimensions
  • BI maturity roadmap
Questions
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
 
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service OnlineCALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
CALL ON ➥8923113531 🔝Call Girls Chinhat Lucknow best sexual service Online
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptx
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
 
Call Girls 🫤 Dwarka ➡️ 9711199171 ➡️ Delhi 🫦 Two shot with one girl
Call Girls 🫤 Dwarka ➡️ 9711199171 ➡️ Delhi 🫦 Two shot with one girlCall Girls 🫤 Dwarka ➡️ 9711199171 ➡️ Delhi 🫦 Two shot with one girl
Call Girls 🫤 Dwarka ➡️ 9711199171 ➡️ Delhi 🫦 Two shot with one girl
 
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
 
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptx
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
Halmar dropshipping via API with DroFx
Halmar  dropshipping  via API with DroFxHalmar  dropshipping  via API with DroFx
Halmar dropshipping via API with DroFx
 

Hadoop Integration into Data Warehousing Architectures

  • 1. Integrating Hadoop into Data Warehousing Architecture. “Where is the Wisdom? Lost in the Knowledge. Where is the Knowledge? Lost in the Information.” (T.S. Eliot) © Humza Naseer, University of Melbourne 2014
  • 2. Outline. Introduction; Link Between Hadoop and Data Warehouse; Related Work: Trends in Data Warehouse Architecture; Current Work: Hadoop Integration into Data Warehouse Environment; Findings, Conclusion & Future Work.
  • 3. Intro: The Data Warehouse Mission. Identify all possible enterprise data assets. Select those assets that have actionable content and can be accessed. Bring the data assets into a logically centralized “enterprise data warehouse”. Expose those data assets most effectively for decision making (Kimball & Ross, 2013).
  • 4. Intro: Overview of Hadoop. Hadoop is an ecosystem of products: open source; vendor distributions; additional tools for development and administration. Hadoop benefits: enables big data analytics; supports advanced forms of analytics; scales cost effectively; extends a data warehouse environment. Hadoop limitations: low-latency queries; ease of access; data integration and integrity; fine-grained security. [Diagram labels: Unstructured Data; Query Results; HDFS; Data Nodes; Map Reduce]
  • 5. Link Between Hadoop and Data Warehouse. A data warehouse system fetches and unifies data from heterogeneous source systems into a centralized dimensional or normalized data repository (Rainardi, 2008). A data warehouse is not a tool or technology; it is a business process which unifies an enterprise through data (Eckerson, 2012). Is Hadoop a problem or an opportunity? Where does Hadoop fit into data warehouse architecture?
  • 6. Why is Integration Happening? Traditional RDBMSs cannot handle: the new data types; extended analytic processing; terabytes/hour loading with immediate query access. We want to use SQL, but we don’t want the RDBMS storage constraints. The disruptive solution: Hadoop (Kimball & Ross, 2013). [Diagram labels: DB1, DB2, DB3; Transformation and Load; Central DW; BI App-1, BI App-2, BI App-3; Decision Making]
  • 7. Trends in Data Warehouse Architectures. Ponniah (2011) notes that selection of DW architecture is based on enterprise requirements. DW architecture has multiple architectural layers and components: logical architecture and physical architecture (Moss and Atre, 2013). DW architecture overlaps with data integration, business intelligence and enterprise data (Russom, 2014). Inmon vs Kimball dichotomy (Ariyachandra and Watson, 2010).
  • 8. Hadoop Integration into Data Warehousing Environment. Eckerson (2012) notes that reporting and analytics have different workload requirements. Reporting is based on entities and facts which are well known. Advanced analytics empowers the discovery of new facts which are not well known. Multi-platform unified data architecture: includes the enterprise data warehouse (EDW) and several other new data platforms which augment the EDW (Russom, 2013).
  • 9. Uses of Hadoop that Extend DW Architectures. Data staging; data archiving; advanced analytics; multi-structured data. [Diagram labels: DB1, DB2, DB3; Transformation and Load; EDW; BI App-1, BI App-2, BI App-3; Decision Making]
  • 10. Findings. Analytics and reporting have different requirements for DW architectures. Characterize the DW architecture by counting the number and types of workloads it supports. Logical DW architecture must integrate multiple physical platforms. Design of logical DW architecture must be compartmentalized. Proposed logical architecture for new DW ecosystem (an extension of the Eckerson (2012) BI architecture).
  • 11. Logical Architecture of New DW Ecosystem. [Diagram labels: Enterprise Data Warehouse; Operational System; Operational System; Operational Data Store; Subject Area Data Marts; BI Server; Online Transaction Processing Systems (Relational Data); Event driven alerting environment; Reporting/analysis environment; DW-Centric Sandbox; Web Data; Machine Data; Log files; Legacy/External Data; Replicated Sandbox; In-memory BI Sandbox; Hadoop Ecosystem Cluster (Non-relational Data); Exploration/discovery environment; Non-relational Extract, transform and Load (Batch, real time or near real time); Power User; Casual User; Query; ETL; Streaming; Top down architecture; Bottom up architecture]
  • 12. BI Assessment Model. [Diagram labels: Data Warehouse Ecosystem; Data Marts; Enterprise Data Warehouse; Work Load Specific Data Platforms. Axes, each running from Low to High: Workload Capacity; Degree of Integration; Degree of Standardization]
  • 13. Conclusion. Hadoop enables new types of applications within the DW environment: big data analytics, advanced analytics and discovery analytics; information exploration and augmenting a data warehouse. Should be implemented in a multi-platform DW environment. Future work: conformed dimensions; BI maturity roadmap.
  • 14. Questions
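
As a rough illustration of the Map Reduce step named on slide 4 and the data staging use listed on slide 9, the following is a minimal sketch in plain Python. It does not use any Hadoop API; it only mimics the map, shuffle and reduce phases in memory. The log line layout (timestamp, page, status), the sample data and all function names are assumptions made for this example and do not come from the slides.

```python
from collections import defaultdict

# Hypothetical log layout for this sketch: "<timestamp> <page> <status>".
SAMPLE_LOG = [
    "2014-05-01T10:00:00 /home 200",
    "2014-05-01T10:00:01 /home 200",
    "2014-05-01T10:00:02 /cart 500",
    "malformed",
]


def map_phase(lines):
    """Emit ((page, status), 1) for every well-formed log line."""
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip lines that do not match the assumed layout
        _timestamp, page, status = parts
        yield (page, status), 1


def shuffle(pairs):
    """Group emitted values by key, as the framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the values for each key into one aggregate row."""
    return {key: sum(values) for key, values in grouped.items()}


def stage_log_lines(lines):
    """Turn raw log lines into (page, status) -> hit count rows."""
    return reduce_phase(shuffle(map_phase(lines)))


if __name__ == "__main__":
    for (page, status), hits in sorted(stage_log_lines(SAMPLE_LOG).items()):
        print(page, status, hits)
```

In a staging role, aggregate rows like these would be what gets loaded into the EDW, while the raw log files stay on the cluster; on a real cluster the grouping step is handled by the framework rather than by application code.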