Introduction to the Cloudera Platform for Big Data
Ahmed El-Sayed Shouman
CDH (Cloudera's Distribution including Apache Hadoop) is a 100% open source distribution.
• CDH is 100% Apache-licensed open source.
• Cloudera describes CDH as the most complete, tested, and popular distribution of Apache Hadoop and related projects.
• CDH includes the core elements of Hadoop plus several additional open source projects.
Apache YARN (Yet Another Resource Negotiator): the resource management layer of Hadoop, often described as its "data operating system". It lets multiple processing engines work on the same data simultaneously.
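As a rough illustration of what "negotiating resources" means, the sketch below models a cluster as a single pool of memory that several applications request slices of. It is a toy model written for this overview, not YARN's actual API or scheduler: the class `ToyCluster`, its methods, and the application names are all hypothetical, and real YARN allocates containers through a ResourceManager and per-node NodeManagers with far richer scheduling policies.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCluster:
    """A single pool of memory shared by every application (toy model)."""
    total_mb: int
    allocations: dict = field(default_factory=dict)  # app name -> MB granted

    @property
    def free_mb(self) -> int:
        """Memory not currently granted to any application."""
        return self.total_mb - sum(self.allocations.values())

    def request(self, app: str, mb: int) -> bool:
        """Grant the request only if enough memory is still free."""
        if mb <= 0 or mb > self.free_mb:
            return False
        self.allocations[app] = self.allocations.get(app, 0) + mb
        return True

    def release(self, app: str) -> None:
        """Return all memory held by an application to the pool."""
        self.allocations.pop(app, None)


if __name__ == "__main__":
    cluster = ToyCluster(total_mb=8192)
    print(cluster.request("batch-job", 4096))          # granted
    print(cluster.request("interactive-query", 2048))  # granted, runs alongside
    print(cluster.request("ml-training", 4096))        # refused: too little free
    cluster.release("batch-job")
    print(cluster.request("ml-training", 4096))        # granted after the release
```

The point of the sketch is only that different kinds of workloads can share one cluster because a central component decides who gets which resources and when.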
Apache Impala: Impala combines modern, parallel database technology with Hadoop, enabling users to directly query data stored in HDFS and HBase.
• Hive processes data via MapReduce, whereas Impala is a stand-alone MPP (massively parallel processing) engine.
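To make the MapReduce side of that contrast concrete, the following minimal Python sketch mimics the map, shuffle, and reduce stages that a query such as `SELECT page, COUNT(*) ... GROUP BY page` would pass through. It is an in-memory toy, not the actual execution engine of Hive or Impala; `ROWS`, the column names, and the function names are made up for illustration.

```python
from collections import defaultdict

# Hypothetical sample rows standing in for a table stored in HDFS.
ROWS = [
    {"page": "/home", "user": "a"},
    {"page": "/docs", "user": "b"},
    {"page": "/home", "user": "c"},
]


def map_phase(rows):
    """Emit one (page, 1) pair per row, as a mapper for a grouped count would."""
    for row in rows:
        yield row["page"], 1


def shuffle_phase(pairs):
    """Group all values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}


def count_by_page(rows):
    """Run the three stages in order and return the count per page."""
    return reduce_phase(shuffle_phase(map_phase(rows)))


if __name__ == "__main__":
    print(count_by_page(ROWS))
```

In a real cluster each stage runs as distributed batch tasks. An MPP engine such as Impala instead executes the query inside long-running daemons on the cluster nodes, which is why it is usually the choice for interactive queries.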
Hue (Hadoop User Experience): a suite of applications that provides web-based access to CDH components and a platform for building custom applications.
• In addition to the projects above, CDH includes other projects that help you work with and administer your cluster, such as:
Apache Hive. Provides a SQL-like query interface.
Apache Sqoop. Moves data to and from databases.
Apache Pig. Scripting language interface.
Apache Mahout. Machine learning.
Apache Oozie. Schedules Hadoop jobs.
Apache Flume. Server log collector.
• Cloudera Manager is a unified management interface that makes it easy to install, configure, and manage a CDH cluster through a web interface, the "Admin Console".
1. Hortonworks: 100% open source enterprise Apache Hadoop.
Cloudera Manager (C.M), Ambari, and Hue
• C.M and Ambari are the installation managers for Cloudera and Hortonworks, respectively.
• They are used for installing, monitoring, and configuring Hadoop clusters.
• Open source under the Apache License.
• Hue is used for interacting with the services in the cluster and running commands through a web user interface.
2. DataStax: a complete big data platform, built on Apache Cassandra™, architected to provide scalability, continuous availability, and operational simplicity for real-time, analytic, and enterprise search data in the same database cluster.