Presented by
EDW Technology & Process Recommendation
Over the last few years, organizations across the public and private sectors have made a
strategic decision to turn big data into a competitive advantage. The challenge of
extracting value from big data is similar in many ways to the age-old problem of
distilling business intelligence from transactional data. At the heart of this challenge
is the process used to extract data from multiple sources, transform it to fit your
analytical needs, and load it into an Enterprise Data Warehouse (EDW) for subsequent
analysis. This process is known as “Extract, Transform & Load” (ETL), and Smartmonk
recommends the Apache Hadoop ecosystem for it.
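The three ETL steps described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the proposed Hadoop implementation: the two sources, their field names, the `orders` table, and the in-memory SQLite database standing in for the warehouse are all assumptions made for the example.

```python
import csv
import io
import sqlite3

# Hypothetical source data: a CSV export and records from a second system.
CSV_SOURCE = "order_id,amount,region\n1,10.50,north\n2,4.25,south\n"
API_SOURCE = [{"id": 3, "total": "7.00", "area": "North"}]


def extract():
    """Extract: read raw rows from each source in its native shape."""
    csv_rows = list(csv.DictReader(io.StringIO(CSV_SOURCE)))
    return csv_rows, API_SOURCE


def transform(csv_rows, api_rows):
    """Transform: map both sources onto one schema (order_id, amount, region)."""
    unified = []
    for row in csv_rows:
        unified.append((int(row["order_id"]), float(row["amount"]), row["region"].lower()))
    for row in api_rows:
        unified.append((int(row["id"]), float(row["total"]), row["area"].lower()))
    return unified


def load(rows, conn):
    """Load: write the unified rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders (order_id INTEGER, amount REAL, region TEXT)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl():
    """Run extract, transform and load, then query the result by region."""
    conn = sqlite3.connect(":memory:")  # stand-in for the Enterprise Data Warehouse
    load(transform(*extract()), conn)
    return conn.execute(
        "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
    ).fetchall()
```

Calling `run_etl()` returns the total order amount per region. Most of the work sits in the transform step, which is the part the recommendation proposes to move onto Hadoop.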
Big Data analytics and the Apache Hadoop open source
project are rapidly emerging as the preferred solution to
address the business and technology trends that are
disrupting traditional data management and processing.
Enterprises can gain a competitive advantage by
being early adopters of big data analytics.
Contact Name: B N Reddy | eMail: bnreddy@smartmonk.co | mobile: 0091-9160000748 | Website: www.smartmonk.co
Big Data is Different than Business Intelligence
Big Data | Traditional BI
Experimental, ad hoc | Repetitive
Mostly semi-structured | Structured
External + operational | Operational
10s of TB to 100s of PB | GBs to 10s of TBs
Questions from Business will Vary
Time axis: Past to Future
Question | Type of analysis
What happened? | Reporting, Dashboards
Why did it happen? | Forensics & Data Mining
What is happening? | Real-Time Analytics
Why is it happening? | Real-Time Data Mining
What is likely to happen? | Predictive Analytics
What should I do about it? | Prescriptive Analytics
Hadoop Adoption in the industry
Timeline: 2007 to 2010
Traditional EDW Architecture
Proposed Hadoop Architecture
LOGICAL ARCHITECTURE
Processing: MapReduce
Storage: HDFS
Proposed Hadoop Architecture
PROCESS FLOW
Proposed Hadoop Architecture
PHYSICAL ARCHITECTURE
Traditional ETL Architecture
Offload ETL with Hadoop
TBSS Proposed EDW architecture
MapReduce Provides
• Automatic parallelization and distribution
• Fault Tolerance
• Status and Monitoring Tools
• A clean abstraction for programmers
• Google Technology Roundtable: MapReduce
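The "clean abstraction" in the list above comes down to writing two functions, a mapper and a reducer, and leaving grouping and distribution to the framework. The sketch below simulates that model in a single Python process with a word count. It does not use the Hadoop API; the function names and the in-process shuffle are assumptions made for illustration.

```python
from collections import defaultdict


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group emitted values by key, as the framework does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reducer: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence over the input lines."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())
```

Because each call to `map_phase` depends only on its own line, and each call to `reduce_phase` only on its own key, a framework can run them on different machines and rerun any that fail. That independence is what makes the automatic parallelization and fault tolerance listed above possible.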
Hadoop Vs RDBMS
Hadoop | RDBMS
Open source | Mostly proprietary
Ecosystem: a suite of (mostly) Java-based projects; a framework | One project with multiple components
Designed to support a distributed architecture | Designed around a client-server architecture
Designed to run on commodity hardware | Heavy usage typically calls for high-end servers
Cost efficient | Costly
High fault tolerance | Legacy procedure
Based on a distributed file system such as GFS or HDFS | Relies on the OS file system
Very good support for unstructured data | Needs structured data
Flexible, evolvable and fast | Needs to follow defined constraints
Still evolving | Has many very good products (e.g. Oracle, SQL)
Suitable for batch processing | Real-time read/write
Sequential write | Arbitrary insert and update
Comparing RDBMS and MapReduce
Attribute | Traditional RDBMS | MapReduce
Data size | Gigabytes (terabytes) | Petabytes (exabytes)
Access | Interactive and batch | Batch
Updates | Read and write many times | Write once, read many times
Structure | Static schema | Dynamic schema
Integrity | High (ACID) | Low
Scaling | Nonlinear | Linear
DBA ratio | 1:40 | 1:3000
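The "Structure" row contrasts a static schema, fixed before data is loaded, with a dynamic schema, applied when the data is read. A minimal sketch of the dynamic case follows, assuming raw JSON records whose fields vary. The record contents, field names, and helper functions are hypothetical and not part of the proposal.

```python
import json

# Hypothetical raw records stored as-is; fields differ from record to record.
RAW_LINES = [
    '{"user": "a", "clicks": 3}',
    '{"user": "b", "clicks": 5, "device": "mobile"}',
    '{"user": "c"}',
]


def read_with_schema(lines, fields, defaults):
    """Apply a schema at read time: pick the wanted fields, fill gaps with defaults."""
    for line in lines:
        record = json.loads(line)
        yield tuple(record.get(name, defaults.get(name)) for name in fields)


def total_clicks(lines):
    """One analysis chooses its own view of the raw data: (user, clicks)."""
    rows = read_with_schema(lines, ["user", "clicks"], {"clicks": 0})
    return sum(clicks for _user, clicks in rows)
```

A second analysis can read the same raw lines with a different field list, for example `["user", "device"]`, without reloading anything. With a static schema, the new column would first have to be added to the table definition.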