Cloumon Product Introduction
1. CLOUMON®
ENJOY HADOOP
Hadoop, Hive and Hadoop Ecosystem Monitoring and Management System
The powerful Hadoop open-source software stack requires careful integration,
calibration, and monitoring, which is why Gruter has developed its own in-house cloud
management solution, Cloumon. With a user-friendly interface and management
console, Cloumon enables system administrators to optimize the Hadoop ecosystem
and take control of the cloud across the entire data lifecycle.
CLOUMON® CH (Core Hadoop)
Hadoop (HDFS, MapReduce) and Hive
CLOUMON® PA (Power Analytics)
Advanced Analysis Rule Manager, Streaming Data Processing Manager, and Interactive Analysis Query Manager
CLOUMON® EPs (Extension Packs)
Oozie, HBase, ZooKeeper, and Flume
2. CLOUMON® KEY FEATURES
[Figure: Cloumon component overview. Zones: MANAGEMENT, DATA ANALYSIS, DATA STORAGE, STREAMING DATA PROCESSING, DATA COLLECTION. Components shown: ZooKeeper Node Manager, HDFS File Manager, MapReduce Job Manager, Hive Table Manager, Hive Query Manager, HBase Data Manager, Esper Query Manager, Pig Query, Job Workflow Manager, Real-time Analysis Manager, Flume Data Flow Manager, Data Agent and Collector.]
* Key Cloumon management zones in orange
MONITORING
• Collect and graph metrics from target daemon servers including NameNode, DataNode, JobTracker and TaskTracker
• Create alerts by setting thresholds on target metrics and servers
• Construct highly visible log data management views
• Monitor system resource usage
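The threshold-based alerting described above can be pictured with a minimal sketch. This is not Cloumon code: the `Threshold` type, the `find_alerts` function, the server names and the metric name are all hypothetical, chosen only to illustrate comparing collected metric samples against user-set limits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    """Hypothetical alert rule: fire when `metric` on `server` exceeds `limit`."""
    server: str
    metric: str
    limit: float


def find_alerts(samples, thresholds):
    """Return (server, metric, value) for every sample above its threshold.

    `samples` maps (server, metric) to the most recently collected value.
    Rules with no matching sample are skipped rather than treated as alerts.
    """
    alerts = []
    for rule in thresholds:
        value = samples.get((rule.server, rule.metric))
        if value is not None and value > rule.limit:
            alerts.append((rule.server, rule.metric, value))
    return alerts


# Illustrative values only.
samples = {
    ("datanode-01", "disk_used_pct"): 91.5,
    ("datanode-02", "disk_used_pct"): 40.0,
}
thresholds = [
    Threshold("datanode-01", "disk_used_pct", 85.0),
    Threshold("datanode-02", "disk_used_pct", 85.0),
]
alerts = find_alerts(samples, thresholds)
```

In a real deployment the returned tuples would be handed to whatever notification channel is configured; that step is left out here because the brochure does not describe it.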
CLUSTER MANAGEMENT
• Manage integrated configurations for server groups
• Conveniently access optimized Hive and Oozie functionality
• Remotely control servers to perform stop-start maintenance routines
• Run various Hadoop distributions including Apache Hadoop 1.0.x, Apache Hadoop 2.0.x and CDH Hadoop 4.2.x
• Control multiple Hadoop clusters
DATA MANAGEMENT
• Manage entire data lifecycle from data collection to storage, batch analysis and real-time analysis
• Browse files with Hadoop File Browser; create and execute queries with Hive Query Workbench
• Design and schedule workflows with Oozie Workflow Designer; manage zNodes with ZooKeeper Manager
Enjoy Connecting GRUTER
3. CLOUMON® CH (Core Hadoop)
Cloumon CH provides a streamlined environment for the operation of Hadoop and Hive, the core components of
advanced Big Data platforms. Through enhanced component visibility and task management features, Cloumon CH
gives unprecedented access to and control over Big Data systems.
HDFS Manager
HDFS Cluster Manager
KEY FEATURES
HDFS daemon status monitoring
· Monitor status and failures on NameNode, JournalNode, SecondaryNameNode, DataNode and DFSZKFailoverController
Remote server control
· Use simple pre-configured wizards to add new servers to running clusters (coming release, Q2 2013)
· Start and stop servers remotely
Server group configuration
· Manage configurations in server groups
· Detect servers with asymmetric configurations automatically
· Apply configurations to all clusters or specific target servers
Comprehensive metric monitoring
· Collect HDFS metrics at one-minute intervals
· Track performance history and graph server metrics for thorough system analysis
User-configured server threshold alerts
· Set disk usage thresholds by server and partition
· Set alert thresholds on all HDFS metrics
· Set SMS alerts for critical metrics via Alert Plugin
Integrated log view creation and management
· Create one-stop views of logs from across the distributed system
Multiple cluster commissioning and management
· Commission and manage multiple clusters as system scales out
Major HDFS distribution compatibility
· Compatible with major distributions including Apache Hadoop 0.20.x, Apache Hadoop 1.0.x, Apache Hadoop 2.0.x, CDH 4.1.x, CDH 4.2.x
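One way to think about detecting servers with asymmetric configurations is to compare each server's settings with the most common value in its group. The sketch below assumes that approach; the brochure does not say how Cloumon does it. The function name `find_asymmetric` and the server names are hypothetical, and `dfs.replication` is used only as a familiar HDFS property.

```python
from collections import Counter


def find_asymmetric(configs):
    """Return {server: {key: (actual, majority)}} for out-of-line servers.

    `configs` maps a server name to its dict of configuration key/value
    pairs. For each key, the most common value across the group is taken
    as the reference; a missing key counts as the value None.
    """
    keys = set().union(*(conf.keys() for conf in configs.values()))
    report = {}
    for key in sorted(keys):
        counts = Counter(conf.get(key) for conf in configs.values())
        majority, _ = counts.most_common(1)[0]
        for server, conf in configs.items():
            if conf.get(key) != majority:
                report.setdefault(server, {})[key] = (conf.get(key), majority)
    return report


# Illustrative server group: dn3 disagrees with the other two.
group = {
    "dn1": {"dfs.replication": "3"},
    "dn2": {"dfs.replication": "3"},
    "dn3": {"dfs.replication": "2"},
}
report = find_asymmetric(group)
```

Using the majority value as the reference is a design choice for this sketch; a tool could equally compare against an explicitly approved group configuration.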
HDFS File Browser
KEY FEATURES
HDFS commands
· Execute commands including list, mkdir, delete, chown and chmod
List sorting
· Sort lists by name, size, date and owner to improve search speed
And more: Directory tree views; file block information; file data views; file download/upload capabilities
4. MapReduce Manager
MapReduce Cluster Manager
KEY FEATURES
MapReduce daemon status monitoring
· Monitor status and failures on JobTracker and TaskTracker
Remote server control
· Use simple pre-configured wizards to add new servers to running clusters (coming release, Q2 2013)
· Start and stop servers remotely
Server group configuration
· Manage configurations in server groups
· Detect servers with asymmetric configurations automatically
· Apply configurations to all clusters or specific target servers
Comprehensive metric monitoring and configurable server threshold alerts
· Set disk usage thresholds by server and partition
· Set alert thresholds on all HDFS metrics
· Set SMS alerts for critical metrics via Alert Plugin
Integrated log view creation and management
· Create one-stop views of logs from across the distributed system
Multiple cluster commissioning and management
· Commission and manage multiple clusters as system scales out
MapReduce Job Manager
KEY FEATURES
Job management
· Manage current job information and track job history
· Filter job lists by status and period
Job status monitoring
· Monitor task status and job counter
· Track full execution history
Task profiling
· Profile task execution progress and elapsed execution time
Task control
· Abort processes through stop task functionality
Scheduler monitoring
· Monitor fair scheduler mode queue status
· Manage queues
Hive and Oozie integration
· Monitor Hive query executions
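Filtering a job list by status and period, as mentioned in the job management features, amounts to a pair of simple predicates. The sketch below is illustrative only: the `filter_jobs` function, the record fields (`id`, `status`, `started`) and the sample values are assumptions, not Cloumon's data model.

```python
from datetime import datetime


def filter_jobs(jobs, status=None, start=None, end=None):
    """Keep jobs with the given status whose start time lies in [start, end].

    Any filter left as None is not applied.
    """
    result = []
    for job in jobs:
        if status is not None and job["status"] != status:
            continue
        if start is not None and job["started"] < start:
            continue
        if end is not None and job["started"] > end:
            continue
        result.append(job)
    return result


# Illustrative job records.
jobs = [
    {"id": "job_0001", "status": "SUCCEEDED", "started": datetime(2013, 4, 1, 9, 0)},
    {"id": "job_0002", "status": "FAILED", "started": datetime(2013, 4, 1, 10, 0)},
    {"id": "job_0003", "status": "FAILED", "started": datetime(2013, 4, 2, 8, 30)},
]
failed_on_april_1 = filter_jobs(
    jobs,
    status="FAILED",
    start=datetime(2013, 4, 1),
    end=datetime(2013, 4, 1, 23, 59, 59),
)
```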
5. Hive Manager
Hive Query and Hive Configuration
KEY FEATURES
Hive connection management
· Support multiple connections with built-in Hive delegator (Hive installation not required)
Hive session management
· Manage driver sessions and track query execution status
Table meta viewer and table viewer
· Generate detailed table description views and data tables
Multiple query executor
· Execute multiple simultaneous queries
User-defined jar and script management
· Upload/delete/apply UDF and Custom M/R
Progress viewer and query status inquiry
· Check query execution progress and track execution history
Query management
· Generate saved query and Hive function description views
Table and query wizard
· Use simple pre-configured wizards to create tables and queries
Configuration management
· Edit and dynamically deploy Hive and Hadoop client configurations
· Access comprehensive storage usage, partitioning and bucket information
Versions Supported
HDFS: Apache Hadoop 0.20.x, Apache Hadoop 1.0.x, Apache Hadoop 2.0.x, CDH 4.1.x, CDH 4.2.x
MapReduce: Apache Hadoop 0.20.x, Apache Hadoop 1.0.x, CDH 4.x-mr1
Hive: Apache Hive 0.8.x, Apache Hive 0.9.x, Apache Hive 0.10.x, CDH 4.1.x Hive, CDH 4.2.x Hive
System Requirements
OS: Linux, Windows
Web server: Tomcat 6.x
Database: MySQL 5.x
Java Virtual Machine: JDK6
Service SLA
Web-based support: 8x5
Phone support: 24x7
Initial response time: 2-24 hours
6. CLOUMON® PA (Power Analytics)
CLOUMON PACKAGE
Cloumon PA is a high-performance Big Data system which brings together a powerful set of cutting-edge
technologies and tools to help you perform advanced analytics on Hadoop and Hive.
Smart query building processes and intuitive execution flows generate sophisticated outputs in just a few clicks
without the need for complex query syntax.
Stream Processing Rule Manager
KEY FEATURES
Console for streaming data processing
· Manage entire lifecycle of streaming data processing by registering data type, configuring parser, managing EPL queries for analysis, and storing and querying results, among other functionalities
Data type management and configuration
· Define type name, column, record parser and result table
Analysis result storage management
· Use built-in storage interfaces such as HBase and MySQL
· Extend interface to add and select user-defined storage
Analysis query manager
· Manage EPL queries
· Add and delete queries dynamically in a running environment
· Have results stored automatically in selected storage
Analysis output visualization
· Visualize results using various charts and graphs according to data type
Interactive Analytics (Impala, Tajo)
KEY FEATURES
Impala
· Manage metadata such as table schemas for integration with Hive
· Use Impala query workbench
· Monitor status of Impala clusters
Tajo
· Manage metadata such as table schemas
· Use Tajo query workbench
· Monitor status of Tajo clusters
· Manage Hive/Tajo/Impala queries in an integrated fashion and choose optimal execution platform
· Manage queries in concert with Advanced Analysis Rule Manager
7. Advanced Analysis Rule Manager
KEY FEATURES
Hive Query Based Analysis Rule Management
Analysis target object management
· Select analysis targets such as Hive table and existing queries, among others
· Manage aliases down to fields and rules via user-friendly UI
Hive query builder
· Build complex queries with multiple “join”, “group by”, and “order by” functions/clauses simply and quickly
· Employ variables for high re-usability and productivity
Analysis target querying
· Define materialized views as analysis targets without burden of generating actual views
· Create fresh results at each execution or conveniently reuse previous results at point of execution
Rule charting
· Visualize usage of individual rules and their interrelationships through charting and graphing tools
Execution and Result Management
Scheduling
· Manage execution start time
· Track execution history
Query optimization
· Create fresh results at each execution or conveniently reuse previous results at point of execution
Dynamic variable binding at execution
· Bind actual values to variables dynamically at point of execution
Multiple storage options
· Use Hive Table, HDFS Directory and HBase Table
Powerful viewer and APIs to access analysis outputs
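The idea of a reusable rule whose variables are bound at the point of execution can be sketched with Python's standard `string.Template`. This is only an illustration of the concept: the query text, the table name `web_logs`, the column names and the `bind` helper are hypothetical, and Cloumon's own variable syntax is not documented here.

```python
from string import Template

# Hypothetical analysis rule: a Hive query with two placeholders.
rule = Template(
    "SELECT page, COUNT(*) AS hits FROM ${table} "
    "WHERE dt = '${dt}' GROUP BY page ORDER BY hits DESC"
)


def bind(template, **values):
    """Substitute values into the rule text.

    Template.substitute raises KeyError if a placeholder has no value,
    so an incompletely bound rule is never sent for execution.
    """
    return template.substitute(**values)


# The same rule can be re-run for a different table or date.
query = bind(rule, table="web_logs", dt="2013-04-01")
```

Plain text substitution does no escaping, so a production tool would need to validate or quote the bound values before running the query.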
8. CLOUMON® EPs (Extension Packs)
CLOUMON PACKAGE
Cloumon EPs provide additional monitoring and management capabilities for other key components of the Hadoop
ecosystem including Oozie, HBase, Flume and ZooKeeper, granting comprehensive control of the entire data lifecycle
from data collection, storage and workflow design to task scheduling and distributed system role management.
Oozie Workflow Manager
KEY FEATURES
WYSIWYG job designer
Library file management
· Upload jar files
· Manage mapper, reducer, and writable classes
· Manage job libraries (distributed cache)
Job execution management
· Schedule job execution
· Track job execution history
· Monitor jobs via integrated Cloumon MapReduce Job Manager
HBase Manager
KEY FEATURES
HBase Cluster Management
HMaster and RegionServer alerts
Single server metric monitoring
· Collect information at one-minute intervals
· Track history of metric changes over time and chart in time-series
Table and Region status monitoring
· Fetch lists of tables and look up table schemas
· Manage table region lists and region lists on RegionServers
· Monitor detailed region metrics
· Create and drop table (Q3 2013)
Manage Region (Q3 2013)
· Perform Region compaction, split, and merge
· Execute and schedule jobs according to user-configured rules
HBase Data Management
Table data scanning
Column data fetching by row query
Long type transformation
· Automatically convert byte array long to numeric long for readability
Web-based HBase shell (Q3 2013)
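To show what the long type transformation involves: HBase stores cell values as raw bytes, and a Java long written with `Bytes.toBytes(long)` occupies 8 big-endian bytes. A minimal sketch of the conversion, assuming that encoding, follows; the function name `bytes_to_long` is hypothetical and is not part of Cloumon or HBase.

```python
import struct


def bytes_to_long(raw: bytes) -> int:
    """Decode an 8-byte big-endian two's-complement value into an int."""
    if len(raw) != 8:
        raise ValueError("expected exactly 8 bytes, got %d" % len(raw))
    # ">q" = big-endian signed 64-bit integer.
    return struct.unpack(">q", raw)[0]


# A counter cell that would otherwise display as \x00\x00\x00\x00\x00\x00\x04\xd2
value = bytes_to_long(b"\x00\x00\x00\x00\x00\x00\x04\xd2")
```

Because the stored bytes carry no type information, a viewer can only apply this conversion where the user indicates that a column holds long values.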
9. ZooKeeper Manager
KEY FEATURES
ZooKeeper Cluster Management
Monitor server status and set alerts
View detailed ZooKeeper server metrics
· Collect metrics at one-minute intervals
Monitor ZooKeeper connections
· Monitor and inspect all connections to ZooKeeper servers
· Track metric change history
Manage multiple clusters simultaneously
ZooKeeper Node Management
Easily manage zNodes by accessing detailed information and manipulating data through convenient file browser interface
Manage ACLs for each zNode
Manage zNode watcher registration
Flume Manager for Flume-OG (v0.9.4)
KEY FEATURES
Data flow management
· Inspect data flow between agent and collector
· Monitor workloads of each node using workload indicators
Powerful configuration tool
· Design data processing flows via powerful tool which allocates source, decorator and sink
· Easily set parameters with pre-configured forms and help tips
· Reuse and edit existing configurations
Physical/logical node status monitoring
· Check overview of node status and drill down to analyze specific details
· List logical nodes on specific physical nodes
· Create integrated views by combining data from Flume masters and ZooKeeper
Map/unmap/decommission/purgeAll
· Control the entire lifecycle of logical nodes with minimal clicks
· Use smart proxies to complete complex jobs in a single click
Multiple cluster management
· Manage multiple clusters
10. Gruter: Your Partner in the Big Data Revolution
Phone: +82-2-508-5911
Fax: +82-2-508-5912
E-mail: inquiries@gruter.com
Web: www.gruter.com
For demo videos, please visit: www.gruter.com/products/cloumon#video
GRUTER, INC.
5F Sehwa Office Building 889-70 Daechi-dong, Gangnam-gu, Seoul, South Korea 135-839