2. CDH is a 100% open source distribution
that includes Apache Hadoop.
CDH is 100% Apache-licensed open source.
Cloudera describes CDH as the most complete, tested,
and popular distribution of Apache Hadoop
and related projects.
3. CDH includes the core elements of Hadoop plus
several additional open source projects.
4. Apache YARN (Yet Another Resource Negotiator) :
YARN is the resource management layer of Hadoop, often
called its data operating system, which enables
you to process data simultaneously in multiple ways.
5. Apache Impala : Impala combines modern, parallel
database technology with Hadoop, enabling users to
directly query data stored in HDFS and HBase.
Hive processes data via MapReduce, whereas Impala is a
stand-alone MPP (massively parallel processing) framework.
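To make the MapReduce side of that contrast concrete, here is a minimal sketch of the map, shuffle, and reduce stages in plain Python. It is only an illustration of the programming model, not how Hive or Hadoop is implemented; the function names (`map_phase`, `shuffle`, `reduce_phase`, `run_job`) and the sample rows are assumptions made up for this example. The job computes roughly what a SQL-like `SELECT word, COUNT(*) ... GROUP BY word` query would return.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    """Emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    """Group all emitted values by key, as happens between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Collapse the values collected for one key into a single count."""
    return key, sum(values)


def run_job(records):
    """Run the three stages in sequence on an in-memory list of records."""
    mapped = chain.from_iterable(map_phase(r) for r in records)
    grouped = shuffle(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample rows standing in for lines of a file in HDFS.
    rows = ["hadoop hive impala", "hive hadoop", "impala"]
    print(run_job(rows))
```

In a real cluster each stage runs as distributed tasks, with intermediate results written out between stages. An MPP engine such as Impala instead runs the query through its own long-running daemons, which is the distinction the slide draws.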
6. Hue : Hue (Hadoop User Experience) is a suite of
applications that provides web-based access to CDH
components and a platform for building custom applications.
7. In addition to the previous projects, there
are other Apache projects that help you work with
and administer your cluster, such as:
Apache Hive. Provides a SQL-like query interface.
Apache Sqoop. Moves data to and from databases.
Apache Pig. Scripting language interface.
Apache Mahout. Machine learning.
Apache Oozie. Schedules Hadoop jobs.
Apache Flume. Collects server logs.
8. Cloudera Manager is a unified management
interface that makes it easy to install, configure,
and manage a CDH cluster through a web interface,
the “Admin Console”.
10. C.M & AMBARI / HUE
C.M & Ambari : Cloudera Manager (C.M) and Ambari are
the installation managers for Cloudera and
Hortonworks, respectively. They are used for
installing, monitoring, and configuring Hadoop clusters.
Hue : Hue is an Apache-licensed open source project
used for interacting with the services in the cluster
and running commands through a web user interface.
11. 2 – DataStax : a complete big data platform
built on Apache Cassandra™, architected to
provide scalability, continuous availability, and
operational simplicity for real-time, analytic, and
enterprise search data in the same database
cluster.