Streamline Hadoop DevOps with
Apache Ambari
Alejandro Fernandez
Melbourne
Speaker
Alejandro Fernandez
Sr. Software Engineer @ Hortonworks
Apache Ambari PMC
alejandro@apache.org
What is Apache Ambari?
Apache Ambari is the open-source platform to
provision, manage and monitor Hadoop clusters
4 years old
Exciting Enterprise Features in Ambari 2.4
• New Services: Log Search, Zeppelin, Hive LLAP
• Role Based Access Control
• Management Packs
• Grafana UI for Ambari Metrics System
• New Views: Zeppelin, Storm
More in Ambari 2.4
Core Features
• Alerts: Customizable props and thresholds (AMBARI-14898)
• Alerts: Retry tolerance (AMBARI-15686)
• Alerts: New HDFS Alerts (AMBARI-14800)
• New Host Page Filtering (AMBARI-15210)
• Remove Service from UI (AMBARI-14759)
• Support for SLES 12 (AMBARI-16007)
• Stability: Database Consistency Checking (AMBARI-16258)
• Customizable Ambari Log + PID Dirs (AMBARI-15300)
• New Version Registration Experience (AMBARI-15724)
• Log Search Technical Preview (AMBARI-14927)
Security Features
• Operational Audit Logging (AMBARI-15241)
• Role-Based Access Control (AMBARI-13977)
• Automated Setup of Ambari Kerberos through Blueprints (AMBARI-15561)
• Automated Setup of Ambari Proxy User (AMBARI-15561)
• Customizable Host Reg. SSH Port (AMBARI-13450)
Views Framework Features
• View URLs for bookmarks (AMBARI-15821), View Refresh (AMBARI-15682)
• Inherit Cluster Permissions (AMBARI-16177)
• Remote Cluster Registration (AMBARI-16274)
Simplify Operations - Lifecycle
Deploy | Secure/LDAP | Smart Configs | Upgrade | Monitor | Scale, Extend, Analyze
Ease-of-Use Deploy
Deploy On Premise
The Ambari UI wizard handles all of these combinations and makes recommendations based on host specs.
Deploy On The Cloud
Certified environments
Sysprepped VMs
Hundreds of similar clusters
Deploy with Blueprints
• Systematic way of defining a cluster
• Export existing cluster into blueprint
/api/v1/clusters/:clusterName?format=blueprint
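A minimal sketch of how the export call above could be prepared from Python, using only the standard library. The server address, cluster name, and credentials are hypothetical placeholders; only the `/api/v1/clusters/:clusterName?format=blueprint` path comes from the slide. HTTP Basic authentication is assumed here.

```python
import base64
import urllib.request


def blueprint_export_url(base_url: str, cluster_name: str) -> str:
    """Build the export URL shown on the slide for a given cluster."""
    return f"{base_url.rstrip('/')}/api/v1/clusters/{cluster_name}?format=blueprint"


def build_export_request(base_url: str, cluster_name: str,
                         user: str, password: str) -> urllib.request.Request:
    """Prepare (but do not send) a GET request with HTTP Basic auth."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        blueprint_export_url(base_url, cluster_name),
        headers={"Authorization": f"Basic {token}"},
        method="GET",
    )


# Hypothetical server and credentials, for illustration only.
request = build_export_request(
    "http://ambari.example.com:8080", "my-cluster", "admin", "admin"
)
print(request.full_url)
```

Passing `request` to `urllib.request.urlopen` and parsing the body with `json.load` would return the exported blueprint, which could then be edited and reused.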
Configs | Topology | Hosts | Cluster
Create a cluster with Blueprints
1. POST /api/v1/blueprints/my-blueprint
{
  "configurations" : [
    {
      "hdfs-site" : {
        "dfs.datanode.data.dir" : "/hadoop/1,/hadoop/2,/hadoop/3"
      }
    }
  ],
  "host_groups" : [
    {
      "name" : "master-host",
      "components" : [
        { "name" : "NAMENODE" },
        { "name" : "RESOURCEMANAGER" },
        …
      ],
      "cardinality" : "1"
    },
    {
      "name" : "worker-host",
      "components" : [
        { "name" : "DATANODE" },
        { "name" : "NODEMANAGER" },
        …
      ],
      "cardinality" : "1+"
    }
  ],
  "Blueprints" : {
    "stack_name" : "HDP",
    "stack_version" : "2.5"
  }
}
2. POST /api/v1/clusters/my-cluster
{
  "blueprint" : "my-blueprint",
  "host_groups" : [
    {
      "name" : "master-host",
      "hosts" : [
        {
          "fqdn" : "master001.ambari.apache.org"
        }
      ]
    },
    {
      "name" : "worker-host",
      "hosts" : [
        {
          "fqdn" : "worker001.ambari.apache.org"
        },
        {
          "fqdn" : "worker002.ambari.apache.org"
        },
        …
        {
          "fqdn" : "worker099.ambari.apache.org"
        }
      ]
    }
  ]
}
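The two-step flow above could be scripted roughly as follows. This is a sketch under stated assumptions: the server address and the short host lists are hypothetical, only the components named on the slide are included (a real blueprint needs more, which the slide elides), and authentication is left out. Ambari expects an `X-Requested-By` header on POST requests.

```python
import json
import urllib.request


def make_blueprint(stack_name: str, stack_version: str) -> dict:
    """Blueprint body: configs plus topology, without any host names."""
    return {
        "configurations": [
            {"hdfs-site": {"dfs.datanode.data.dir": "/hadoop/1,/hadoop/2,/hadoop/3"}}
        ],
        "host_groups": [
            {
                "name": "master-host",
                "components": [{"name": "NAMENODE"}, {"name": "RESOURCEMANAGER"}],
                "cardinality": "1",
            },
            {
                "name": "worker-host",
                "components": [{"name": "DATANODE"}, {"name": "NODEMANAGER"}],
                "cardinality": "1+",
            },
        ],
        "Blueprints": {"stack_name": stack_name, "stack_version": stack_version},
    }


def make_cluster_template(blueprint_name: str, masters: list, workers: list) -> dict:
    """Cluster creation body: maps real hosts onto the blueprint's host groups."""
    return {
        "blueprint": blueprint_name,
        "host_groups": [
            {"name": "master-host", "hosts": [{"fqdn": h} for h in masters]},
            {"name": "worker-host", "hosts": [{"fqdn": h} for h in workers]},
        ],
    }


def post_request(url: str, payload: dict) -> urllib.request.Request:
    """Prepare (but do not send) a POST carrying the JSON payload."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-Requested-By": "ambari"},
        method="POST",
    )


# Hypothetical Ambari server; host names follow the pattern on the slide.
base = "http://ambari.example.com:8080/api/v1"
steps = [
    post_request(f"{base}/blueprints/my-blueprint", make_blueprint("HDP", "2.5")),
    post_request(
        f"{base}/clusters/my-cluster",
        make_cluster_template(
            "my-blueprint",
            ["master001.ambari.apache.org"],
            ["worker001.ambari.apache.org", "worker002.ambari.apache.org"],
        ),
    ),
]
for step in steps:
    print(step.get_method(), step.full_url)
```

Keeping the blueprint free of host names is what lets the same definition be reused across many similar clusters; only the second payload changes.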
Blueprints for Large Scale
• Kerberos, secure out-of-the-box
• High Availability is set up initially for NameNode, YARN, Hive, Oozie, etc.
• Host Discovery allows Ambari to automatically install services for a host when it comes online
• Stack Advisor recommendations
Blueprint Host Discovery
POST /api/v1/clusters/MyCluster/hosts
[
  {
    "blueprint" : "single-node-hdfs-test2",
    "host_groups" : [
      {
        "host_group" : "slave",
        "host_count" : 3,
        "host_predicate" : "Hosts/cpu_count>1"
      }, {
        "host_group" : "super-slave",
        "host_count" : 5,
        "host_predicate" : "Hosts/cpu_count>2&Hosts/total_mem>3000000"
      }
    ]
  }
]
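To illustrate what a host predicate such as `Hosts/cpu_count>2&Hosts/total_mem>3000000` selects, here is a small stand-alone matcher. It is not Ambari's predicate parser: it only handles numeric comparisons joined by `&`, and the host records and the `select_hosts` helper are hypothetical.

```python
import operator
import re

# One comparison term, e.g. "Hosts/cpu_count>2".
_TERM = re.compile(r"^Hosts/(\w+)\s*(>=|<=|>|<|=)\s*(\d+)$")
_OPS = {
    ">": operator.gt,
    "<": operator.lt,
    ">=": operator.ge,
    "<=": operator.le,
    "=": operator.eq,
}


def host_matches(predicate: str, host: dict) -> bool:
    """Return True if the host satisfies every '&'-joined term."""
    for term in predicate.split("&"):
        match = _TERM.match(term.strip())
        if match is None:
            raise ValueError(f"unsupported predicate term: {term!r}")
        field, op, value = match.groups()
        if field not in host or not _OPS[op](host[field], int(value)):
            return False
    return True


def select_hosts(hosts: list, predicate: str, host_count: int) -> list:
    """Pick up to host_count host names that satisfy the predicate."""
    names = [h["host_name"] for h in hosts if host_matches(predicate, h)]
    return names[:host_count]


# Hypothetical registered hosts.
hosts = [
    {"host_name": "h1", "cpu_count": 1, "total_mem": 2000000},
    {"host_name": "h2", "cpu_count": 2, "total_mem": 4000000},
    {"host_name": "h3", "cpu_count": 4, "total_mem": 8000000},
    {"host_name": "h4", "cpu_count": 4, "total_mem": 2500000},
]
print(select_hosts(hosts, "Hosts/cpu_count>1", 3))
print(select_hosts(hosts, "Hosts/cpu_count>2&Hosts/total_mem>3000000", 5))
```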
Comprehensive Security
LDAP/AD
• User auth
• Sync
Kerberos
• MIT KDC
• Keytab management
Atlas
• Governance
• Compliance
• Lineage & history
• Data classification
Ranger
• Security policies
• Audit
• Authorization
Knox
• Perimeter security
• Supports LDAP/AD
• Sec. for REST/HTTP
• SSL
Kerberos
Ambari manages Kerberos principals and keytabs
Works with existing MIT KDC or Active Directory
Once Kerberized, Ambari handles:
1. Adding hosts
2. Adding components to existing hosts
3. Adding services
4. Moving components to different hosts
Management Packs
• Improved Release Management: Decouple Ambari core from stack releases
• Support Add-ons: Release vehicle for 3rd party services, views
Self-contained release artifacts
Stack is an overlay of multiple management packs
Overlay of Management Packs
inherits from 2.3
inherits from 2.4
inherits from 2.5
Management Pack++
Short Term Goals (Ambari 2.4)
• Retrofit in Stack Processing Framework
• Enable 3rd party to ship add-on services
Future Goals
• Management Pack Framework
• Deliver Views
Role Based Access Control (RBAC)
As Ambari & organizations grow, so do security needs
Ambari integrates with external authentication systems & LDAP
RBAC Terms
Users belong to groups
A group has a role
Users can also have additional roles
Roles are applied to Resources, e.g., Ambari, a particular Cluster, a particular View
Roles have permissions, e.g., add services to cluster
New RBAC Roles
Role | Permissions (↑ = same as the role above)
Ambari Admin | all
Cluster Admin | ↑, except manage permissions
Cluster Op | ↑, except add services, Kerberos, manage alerts & upgrades
Service Admin | ↑, except alter cluster topology or install components
Service Op | ↑, except change configs
Read-Only | only view
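The role ladder can be read as a cumulative model: each role has everything the role below it has, plus a few extra permissions. The sketch below encodes that reading. The permission strings are illustrative labels taken from the slide wording, not Ambari's internal permission identifiers, and the "operate services" entry for Service Op is an assumption, since the slide does not say what that role adds over Read-Only.

```python
# Roles from least to most privileged, with the permissions each one ADDS.
ROLE_ADDITIONS = [
    ("Read-Only", {"view"}),
    ("Service Op", {"operate services"}),  # assumption: not stated on the slide
    ("Service Admin", {"change configs"}),
    ("Cluster Op", {"alter cluster topology", "install components"}),
    ("Cluster Admin", {"add services", "Kerberos", "manage alerts", "manage upgrades"}),
    ("Ambari Admin", {"manage permissions"}),
]


def permissions_for(role: str) -> set:
    """Accumulate permissions from the bottom of the ladder up to `role`."""
    granted = set()
    for name, added in ROLE_ADDITIONS:
        granted |= added
        if name == role:
            return granted
    raise KeyError(f"unknown role: {role}")


def is_allowed(role: str, permission: str) -> bool:
    """Check whether a role includes a given permission label."""
    return permission in permissions_for(role)


print(is_allowed("Service Op", "change configs"))     # Service Op cannot change configs
print(is_allowed("Service Admin", "change configs"))  # Service Admin can
```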
Service Layout
Common Services Stack Override
Stack Advisor
Kerberos
HTTPS
Zookeeper Servers
Memory Settings
…
High Availability
Example Configurations
atlas.rest.address = http(s)://host:port
# Atlas Servers
atlas.enableTLS = true|false
atlas.server.http.port = 21000
atlas.server.https.port = 21443
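As an illustration of the kind of derived value a Stack Advisor can recommend, the following sketch computes `atlas.rest.address` from the TLS flag, the two ports, and the list of Atlas server hosts. It is not the Stack Advisor's actual code; the function name and host names are hypothetical, and joining multiple servers with commas is an assumption about the high-availability case.

```python
def atlas_rest_address(hosts: list, enable_tls: bool,
                       http_port: int = 21000, https_port: int = 21443) -> str:
    """Derive atlas.rest.address from the TLS setting, ports, and server hosts."""
    scheme, port = ("https", https_port) if enable_tls else ("http", http_port)
    # Assumption: with several Atlas servers, addresses are comma-separated.
    return ",".join(f"{scheme}://{host}:{port}" for host in hosts)


print(atlas_rest_address(["atlas1.example.com"], enable_tls=False))
print(atlas_rest_address(["atlas1.example.com", "atlas2.example.com"], enable_tls=True))
```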
Background: Upgrade Terminology
Manual Upgrade
• The user follows instructions to upgrade the stack
• Incurs downtime
Rolling Upgrade
• Automated
• Upgrades one component per host at a time
• Preserves cluster operation and minimizes service impact
Express Upgrade
• Automated
• Runs in parallel across hosts
• Incurs downtime
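The scheduling difference between the two automated methods can be sketched as batches of hosts. This is a conceptual illustration only, not Ambari's orchestration logic; the function names and host names are hypothetical.

```python
def rolling_batches(hosts: list) -> list:
    """Rolling: one host per batch, so the rest of the cluster keeps serving."""
    return [[host] for host in hosts]


def express_batches(hosts: list) -> list:
    """Express: a single batch covering all hosts at once (implies downtime)."""
    return [list(hosts)] if hosts else []


workers = ["worker001", "worker002", "worker003"]
print(rolling_batches(workers))  # three batches of one host each
print(express_batches(workers))  # one batch of three hosts
```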
Automated Upgrade: Rolling or Express
Check Prerequisites: Review the prereqs to confirm your cluster configs are ready
Prepare: Take backups of critical cluster metadata
Register + Install: Register the HDP repository and install the target HDP version on the cluster
Perform Upgrade: Perform the HDP upgrade. The steps depend on upgrade method: Rolling or Express
Finalize: Finalize the upgrade, making the target version the current version
Process: Rolling Upgrade
Stages: ZooKeeper, Ranger, Hive, Oozie, Falcon, Kafka, Knox, Storm, Slider, Flume
Core Masters, Core Slaves: HDFS, YARN, HBase
Clients: HDFS, YARN, MR, Tez, HBase, Pig, Hive, etc.
Finalize or Downgrade
Alerting Framework
Alert Type | Description | Thresholds (units)
WEB | Connects to a Web URL. Alert status is based on the HTTP response code | Response Code (n/a), Connection Timeout (seconds)
PORT | Connects to a port. Alert status is based on response time | Response (seconds)
METRIC | Checks the value of a service metric. Units vary, based on the metric being checked | Metric Value (units vary), Connection Timeout (seconds)
AGGREGATE | Aggregates the status for another alert | % Affected (percentage)
SCRIPT | Executes a script to handle the alert check | Varies
SERVER | Executes a server-side runnable class to handle the alert check | Varies
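A PORT-style check can be sketched as "time a TCP connect, then compare against thresholds in seconds". The code below is an illustration of that idea, not Ambari's alert implementation; the function names, the threshold values in the usage lines, and the choice to treat a failed connection as CRITICAL are all assumptions.

```python
import socket
import time


def measure_connect_seconds(host: str, port: int, timeout: float = 5.0):
    """Return the TCP connect time in seconds, or None if the connect fails."""
    started = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - started
    except OSError:
        return None


def port_alert_status(response_seconds, warning_after: float, critical_after: float) -> str:
    """Map a measured response time to an alert status."""
    if response_seconds is None:
        return "CRITICAL"  # assumption: an unreachable port is critical
    if response_seconds >= critical_after:
        return "CRITICAL"
    if response_seconds >= warning_after:
        return "WARNING"
    return "OK"


# Hypothetical thresholds: warn after 1.5 s, critical after 5 s.
print(port_alert_status(0.2, warning_after=1.5, critical_after=5.0))
print(port_alert_status(2.0, warning_after=1.5, critical_after=5.0))
```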
Alert Check Counts
• Customize the number of times an alert is checked before dispatching a notification
• Avoid dispatching an alert notification (email, SNMP) in case of transient issues
Alerts - Configuring the Check Count
Set globally for all alerts, or override for a specific alert
Global Setting | Alert Override
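The check-count idea can be sketched as a counter of consecutive non-OK results: a notification goes out only once the count reaches the configured value, and an OK result resets it. This is a simplified model, not Ambari's implementation; the class, function, and alert names are hypothetical.

```python
def resolve_check_count(global_count: int, overrides: dict, alert_name: str) -> int:
    """Use the per-alert override if present, otherwise the global setting."""
    return overrides.get(alert_name, global_count)


class AlertNotifier:
    """Dispatch a notification only after N consecutive non-OK checks."""

    def __init__(self, check_count: int):
        self.check_count = check_count
        self.consecutive_failures = 0
        self.notified = False

    def record(self, status: str) -> bool:
        """Record one check result; return True if a notification should go out."""
        if status == "OK":
            # A healthy check clears the streak, so transient blips never notify.
            self.consecutive_failures = 0
            self.notified = False
            return False
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.check_count and not self.notified:
            self.notified = True
            return True
        return False


# Hypothetical settings: global count of 1, with one alert overridden to 3.
count = resolve_check_count(1, {"namenode_process": 3}, "namenode_process")
notifier = AlertNotifier(count)
checks = ["CRITICAL", "OK", "CRITICAL", "CRITICAL", "CRITICAL", "CRITICAL"]
results = [notifier.record(status) for status in checks]
print(results)
```

In this run the single early failure is treated as transient, and only the third consecutive failure triggers a notification.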
Storm Monitoring View
Grafana for Ambari Metrics
FEATURES
• Grafana as a “Native UI” for Ambari Metrics
• Pre-built Dashboards: Host-level, Service-level
• Supports HTTPS
DASHBOARDS
• System Home, Servers
• HDFS Home, NameNodes, DataNodes
• YARN Home, Applications, Job History Server
• HBase Home, Performance
Grafana includes pre-built dashboards for visualizing the most important cluster metrics.
The HDFS NameNode dashboard highlights file system activity.
Log Search
Search and index HDP logs!
Capabilities
• Rapid Search of all HDP component logs
• Search across time ranges, log levels, and for keywords
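Since the logs are indexed in Solr, a search by keyword, log level, and time range maps naturally onto Solr query parameters. The sketch below builds such parameters with standard Solr syntax (`q`, `fq`, range queries). The field names `log_message`, `level`, and `logtime` are assumptions about the index schema, and the helper name is hypothetical.

```python
from urllib.parse import urlencode


def log_search_params(keyword: str, levels: list, start: str, end: str,
                      rows: int = 50) -> list:
    """Build Solr query parameters for a keyword, level, and time-range search."""
    return [
        ("q", f"log_message:{keyword}"),                # keyword match (assumed field)
        ("fq", "level:(" + " OR ".join(levels) + ")"),  # log level filter (assumed field)
        ("fq", f"logtime:[{start} TO {end}]"),          # time range filter (assumed field)
        ("rows", str(rows)),
        ("wt", "json"),
    ]


params = log_search_params(
    "timeout", ["ERROR", "WARN"], "2016-08-01T00:00:00Z", "2016-08-02T00:00:00Z"
)
print(urlencode(params))
```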
Log Search
Components: Log Feeder (on each worker node), Solr, Log Search UI, Ambari
Log Feeder: Java process, multi-output support, Grok filters
Solr: Solr Cloud, local disk storage
Future of Ambari
• Cloud features
• Service multi-instance (two ZK quorums)
• Service multi-versions (Spark 1.6 & Spark 2.0)
• YARN assemblies
• Patch Upgrades: upgrade individual components in the same stack version, e.g., just DN and RM in HDP 2.5.*.* with zero downtime
• Ambari High Availability
As good as

More Related Content

What's hot

Hive2.0 sql speed-scale--hadoop-summit-dublin-apr-2016
Hive2.0 sql speed-scale--hadoop-summit-dublin-apr-2016Hive2.0 sql speed-scale--hadoop-summit-dublin-apr-2016
Hive2.0 sql speed-scale--hadoop-summit-dublin-apr-2016alanfgates
 
Troubleshooting Kerberos in Hadoop: Taming the Beast
Troubleshooting Kerberos in Hadoop: Taming the BeastTroubleshooting Kerberos in Hadoop: Taming the Beast
Troubleshooting Kerberos in Hadoop: Taming the BeastDataWorks Summit
 
Apache Ambari - HDP Cluster Upgrades Operational Deep Dive and Troubleshooting
Apache Ambari - HDP Cluster Upgrades Operational Deep Dive and TroubleshootingApache Ambari - HDP Cluster Upgrades Operational Deep Dive and Troubleshooting
Apache Ambari - HDP Cluster Upgrades Operational Deep Dive and TroubleshootingDataWorks Summit/Hadoop Summit
 
S3Guard: What's in your consistency model?
S3Guard: What's in your consistency model?S3Guard: What's in your consistency model?
S3Guard: What's in your consistency model?Hortonworks
 
Hadoop, Hive, Spark and Object Stores
Hadoop, Hive, Spark and Object StoresHadoop, Hive, Spark and Object Stores
Hadoop, Hive, Spark and Object StoresSteve Loughran
 
An Apache Hive Based Data Warehouse
An Apache Hive Based Data WarehouseAn Apache Hive Based Data Warehouse
An Apache Hive Based Data WarehouseDataWorks Summit
 
Low latency high throughput streaming using Apache Apex and Apache Kudu
Low latency high throughput streaming using Apache Apex and Apache KuduLow latency high throughput streaming using Apache Apex and Apache Kudu
Low latency high throughput streaming using Apache Apex and Apache KuduDataWorks Summit
 
LLAP: Building Cloud First BI
LLAP: Building Cloud First BILLAP: Building Cloud First BI
LLAP: Building Cloud First BIDataWorks Summit
 
An Overview on Optimization in Apache Hive: Past, Present, Future
An Overview on Optimization in Apache Hive: Past, Present, FutureAn Overview on Optimization in Apache Hive: Past, Present, Future
An Overview on Optimization in Apache Hive: Past, Present, FutureDataWorks Summit
 
Large-Scale Stream Processing in the Hadoop Ecosystem
Large-Scale Stream Processing in the Hadoop Ecosystem Large-Scale Stream Processing in the Hadoop Ecosystem
Large-Scale Stream Processing in the Hadoop Ecosystem DataWorks Summit/Hadoop Summit
 
Hive on spark berlin buzzwords
Hive on spark berlin buzzwordsHive on spark berlin buzzwords
Hive on spark berlin buzzwordsSzehon Ho
 
Apache Ambari: Past, Present, Future
Apache Ambari: Past, Present, FutureApache Ambari: Past, Present, Future
Apache Ambari: Past, Present, FutureHortonworks
 
Near Real-Time Network Anomaly Detection and Traffic Analysis using Spark bas...
Near Real-Time Network Anomaly Detection and Traffic Analysis using Spark bas...Near Real-Time Network Anomaly Detection and Traffic Analysis using Spark bas...
Near Real-Time Network Anomaly Detection and Traffic Analysis using Spark bas...DataWorks Summit/Hadoop Summit
 
Apache Phoenix and HBase: Past, Present and Future of SQL over HBase
Apache Phoenix and HBase: Past, Present and Future of SQL over HBaseApache Phoenix and HBase: Past, Present and Future of SQL over HBase
Apache Phoenix and HBase: Past, Present and Future of SQL over HBaseDataWorks Summit/Hadoop Summit
 
May 2013 HUG: Apache Sqoop 2 - A next generation of data transfer tools
May 2013 HUG: Apache Sqoop 2 - A next generation of data transfer toolsMay 2013 HUG: Apache Sqoop 2 - A next generation of data transfer tools
May 2013 HUG: Apache Sqoop 2 - A next generation of data transfer toolsYahoo Developer Network
 
Running a container cloud on YARN
Running a container cloud on YARNRunning a container cloud on YARN
Running a container cloud on YARNDataWorks Summit
 
Flexible and Real-Time Stream Processing with Apache Flink
Flexible and Real-Time Stream Processing with Apache FlinkFlexible and Real-Time Stream Processing with Apache Flink
Flexible and Real-Time Stream Processing with Apache FlinkDataWorks Summit
 
Sqoop2 refactoring for generic data transfer - NYC Sqoop Meetup
Sqoop2 refactoring for generic data transfer - NYC Sqoop MeetupSqoop2 refactoring for generic data transfer - NYC Sqoop Meetup
Sqoop2 refactoring for generic data transfer - NYC Sqoop Meetupgethue
 
Hive analytic workloads hadoop summit san jose 2014
Hive analytic workloads hadoop summit san jose 2014Hive analytic workloads hadoop summit san jose 2014
Hive analytic workloads hadoop summit san jose 2014alanfgates
 

What's hot (20)

Hive2.0 sql speed-scale--hadoop-summit-dublin-apr-2016
Hive2.0 sql speed-scale--hadoop-summit-dublin-apr-2016Hive2.0 sql speed-scale--hadoop-summit-dublin-apr-2016
Hive2.0 sql speed-scale--hadoop-summit-dublin-apr-2016
 
Troubleshooting Kerberos in Hadoop: Taming the Beast
Troubleshooting Kerberos in Hadoop: Taming the BeastTroubleshooting Kerberos in Hadoop: Taming the Beast
Troubleshooting Kerberos in Hadoop: Taming the Beast
 
Apache Ambari - HDP Cluster Upgrades Operational Deep Dive and Troubleshooting
Apache Ambari - HDP Cluster Upgrades Operational Deep Dive and TroubleshootingApache Ambari - HDP Cluster Upgrades Operational Deep Dive and Troubleshooting
Apache Ambari - HDP Cluster Upgrades Operational Deep Dive and Troubleshooting
 
S3Guard: What's in your consistency model?
S3Guard: What's in your consistency model?S3Guard: What's in your consistency model?
S3Guard: What's in your consistency model?
 
Hadoop, Hive, Spark and Object Stores
Hadoop, Hive, Spark and Object StoresHadoop, Hive, Spark and Object Stores
Hadoop, Hive, Spark and Object Stores
 
An Apache Hive Based Data Warehouse
An Apache Hive Based Data WarehouseAn Apache Hive Based Data Warehouse
An Apache Hive Based Data Warehouse
 
Low latency high throughput streaming using Apache Apex and Apache Kudu
Low latency high throughput streaming using Apache Apex and Apache KuduLow latency high throughput streaming using Apache Apex and Apache Kudu
Low latency high throughput streaming using Apache Apex and Apache Kudu
 
LLAP: Building Cloud First BI
LLAP: Building Cloud First BILLAP: Building Cloud First BI
LLAP: Building Cloud First BI
 
An Overview on Optimization in Apache Hive: Past, Present, Future
An Overview on Optimization in Apache Hive: Past, Present, FutureAn Overview on Optimization in Apache Hive: Past, Present, Future
An Overview on Optimization in Apache Hive: Past, Present, Future
 
Large-Scale Stream Processing in the Hadoop Ecosystem
Large-Scale Stream Processing in the Hadoop Ecosystem Large-Scale Stream Processing in the Hadoop Ecosystem
Large-Scale Stream Processing in the Hadoop Ecosystem
 
Hive on spark berlin buzzwords
Hive on spark berlin buzzwordsHive on spark berlin buzzwords
Hive on spark berlin buzzwords
 
Apache Ambari: Past, Present, Future
Apache Ambari: Past, Present, FutureApache Ambari: Past, Present, Future
Apache Ambari: Past, Present, Future
 
Near Real-Time Network Anomaly Detection and Traffic Analysis using Spark bas...
Near Real-Time Network Anomaly Detection and Traffic Analysis using Spark bas...Near Real-Time Network Anomaly Detection and Traffic Analysis using Spark bas...
Near Real-Time Network Anomaly Detection and Traffic Analysis using Spark bas...
 
Apache Phoenix and HBase: Past, Present and Future of SQL over HBase
Apache Phoenix and HBase: Past, Present and Future of SQL over HBaseApache Phoenix and HBase: Past, Present and Future of SQL over HBase
Apache Phoenix and HBase: Past, Present and Future of SQL over HBase
 
May 2013 HUG: Apache Sqoop 2 - A next generation of data transfer tools
May 2013 HUG: Apache Sqoop 2 - A next generation of data transfer toolsMay 2013 HUG: Apache Sqoop 2 - A next generation of data transfer tools
May 2013 HUG: Apache Sqoop 2 - A next generation of data transfer tools
 
The Heterogeneous Data lake
The Heterogeneous Data lakeThe Heterogeneous Data lake
The Heterogeneous Data lake
 
Running a container cloud on YARN
Running a container cloud on YARNRunning a container cloud on YARN
Running a container cloud on YARN
 
Flexible and Real-Time Stream Processing with Apache Flink
Flexible and Real-Time Stream Processing with Apache FlinkFlexible and Real-Time Stream Processing with Apache Flink
Flexible and Real-Time Stream Processing with Apache Flink
 
Sqoop2 refactoring for generic data transfer - NYC Sqoop Meetup
Sqoop2 refactoring for generic data transfer - NYC Sqoop MeetupSqoop2 refactoring for generic data transfer - NYC Sqoop Meetup
Sqoop2 refactoring for generic data transfer - NYC Sqoop Meetup
 
Hive analytic workloads hadoop summit san jose 2014
Hive analytic workloads hadoop summit san jose 2014Hive analytic workloads hadoop summit san jose 2014
Hive analytic workloads hadoop summit san jose 2014
 

Viewers also liked

DevOps for Big Data - Data 360 2014 Conference
DevOps for Big Data - Data 360 2014 ConferenceDevOps for Big Data - Data 360 2014 Conference
DevOps for Big Data - Data 360 2014 ConferenceGrid Dynamics
 
Building Hadoop with Chef
Building Hadoop with ChefBuilding Hadoop with Chef
Building Hadoop with ChefJohn Martin
 
CIS13: Managing the Keys to the Kingdom: Next-Gen Role-based Access Control a...
CIS13: Managing the Keys to the Kingdom: Next-Gen Role-based Access Control a...CIS13: Managing the Keys to the Kingdom: Next-Gen Role-based Access Control a...
CIS13: Managing the Keys to the Kingdom: Next-Gen Role-based Access Control a...CloudIDSummit
 
San Francisco Best Places to Work Roadshow | Centrify
San Francisco Best Places to Work Roadshow | CentrifySan Francisco Best Places to Work Roadshow | Centrify
San Francisco Best Places to Work Roadshow | CentrifyGlassdoor
 
Deploying Hadoop-Based Bigdata Environments
Deploying Hadoop-Based Bigdata EnvironmentsDeploying Hadoop-Based Bigdata Environments
Deploying Hadoop-Based Bigdata EnvironmentsPuppet
 
Hadoop Security Today and Tomorrow
Hadoop Security Today and TomorrowHadoop Security Today and Tomorrow
Hadoop Security Today and TomorrowDataWorks Summit
 
HDP Advanced Security: Comprehensive Security for Enterprise Hadoop
HDP Advanced Security: Comprehensive Security for Enterprise HadoopHDP Advanced Security: Comprehensive Security for Enterprise Hadoop
HDP Advanced Security: Comprehensive Security for Enterprise HadoopHortonworks
 
Simplify and Secure your Hadoop Environment with Hortonworks and Centrify
Simplify and Secure your Hadoop Environment with Hortonworks and CentrifySimplify and Secure your Hadoop Environment with Hortonworks and Centrify
Simplify and Secure your Hadoop Environment with Hortonworks and CentrifyHortonworks
 
Apache Ambari - What's New in 2.0.0
Apache Ambari - What's New in 2.0.0Apache Ambari - What's New in 2.0.0
Apache Ambari - What's New in 2.0.0Hortonworks
 
Big Data - Marrying Service Management With Service Delivery - #Pink13
Big Data - Marrying Service Management With Service Delivery - #Pink13Big Data - Marrying Service Management With Service Delivery - #Pink13
Big Data - Marrying Service Management With Service Delivery - #Pink13TeamQuest Corporation
 
How the Big Data of APM can Supercharge DevOps
How the Big Data of APM can Supercharge DevOpsHow the Big Data of APM can Supercharge DevOps
How the Big Data of APM can Supercharge DevOpsCA Technologies
 
What Big Data Folks Need to Know About DevOps
What Big Data Folks Need to Know About DevOpsWhat Big Data Folks Need to Know About DevOps
What Big Data Folks Need to Know About DevOpsMatt Ray
 
Hortonworks Technical Workshop: Apache Ambari
Hortonworks Technical Workshop:   Apache AmbariHortonworks Technical Workshop:   Apache Ambari
Hortonworks Technical Workshop: Apache AmbariHortonworks
 
Hdp security overview
Hdp security overview Hdp security overview
Hdp security overview Hortonworks
 
Dev ops for big data cluster management tools
Dev ops for big data  cluster management toolsDev ops for big data  cluster management tools
Dev ops for big data cluster management toolsRan Silberman
 
Welcome - Keynote - AWSome Day Helsinki 2017
Welcome - Keynote - AWSome Day Helsinki 2017Welcome - Keynote - AWSome Day Helsinki 2017
Welcome - Keynote - AWSome Day Helsinki 2017Amazon Web Services
 

Viewers also liked (20)

Protecting Enterprise Data in Apache Hadoop
Protecting Enterprise Data in Apache HadoopProtecting Enterprise Data in Apache Hadoop
Protecting Enterprise Data in Apache Hadoop
 
DevOps for Big Data - Data 360 2014 Conference
DevOps for Big Data - Data 360 2014 ConferenceDevOps for Big Data - Data 360 2014 Conference
DevOps for Big Data - Data 360 2014 Conference
 
Building Hadoop with Chef
Building Hadoop with ChefBuilding Hadoop with Chef
Building Hadoop with Chef
 
CIS13: Managing the Keys to the Kingdom: Next-Gen Role-based Access Control a...
CIS13: Managing the Keys to the Kingdom: Next-Gen Role-based Access Control a...CIS13: Managing the Keys to the Kingdom: Next-Gen Role-based Access Control a...
CIS13: Managing the Keys to the Kingdom: Next-Gen Role-based Access Control a...
 
San Francisco Best Places to Work Roadshow | Centrify
San Francisco Best Places to Work Roadshow | CentrifySan Francisco Best Places to Work Roadshow | Centrify
San Francisco Best Places to Work Roadshow | Centrify
 
Deploying Hadoop-Based Bigdata Environments
Deploying Hadoop-Based Bigdata EnvironmentsDeploying Hadoop-Based Bigdata Environments
Deploying Hadoop-Based Bigdata Environments
 
Hadoop Security Today and Tomorrow
Hadoop Security Today and TomorrowHadoop Security Today and Tomorrow
Hadoop Security Today and Tomorrow
 
HDP Advanced Security: Comprehensive Security for Enterprise Hadoop
HDP Advanced Security: Comprehensive Security for Enterprise HadoopHDP Advanced Security: Comprehensive Security for Enterprise Hadoop
HDP Advanced Security: Comprehensive Security for Enterprise Hadoop
 
Simplify and Secure your Hadoop Environment with Hortonworks and Centrify
Simplify and Secure your Hadoop Environment with Hortonworks and CentrifySimplify and Secure your Hadoop Environment with Hortonworks and Centrify
Simplify and Secure your Hadoop Environment with Hortonworks and Centrify
 
Apache Ambari - What's New in 2.0.0
Apache Ambari - What's New in 2.0.0Apache Ambari - What's New in 2.0.0
Apache Ambari - What's New in 2.0.0
 
BI + Big Data
BI + Big DataBI + Big Data
BI + Big Data
 
Big Data - Marrying Service Management With Service Delivery - #Pink13
Big Data - Marrying Service Management With Service Delivery - #Pink13Big Data - Marrying Service Management With Service Delivery - #Pink13
Big Data - Marrying Service Management With Service Delivery - #Pink13
 
How the Big Data of APM can Supercharge DevOps
How the Big Data of APM can Supercharge DevOpsHow the Big Data of APM can Supercharge DevOps
How the Big Data of APM can Supercharge DevOps
 
Simplified Cluster Operation & Troubleshooting
Simplified Cluster Operation & TroubleshootingSimplified Cluster Operation & Troubleshooting
Simplified Cluster Operation & Troubleshooting
 
What Big Data Folks Need to Know About DevOps
What Big Data Folks Need to Know About DevOpsWhat Big Data Folks Need to Know About DevOps
What Big Data Folks Need to Know About DevOps
 
Hortonworks Technical Workshop: Apache Ambari
Hortonworks Technical Workshop:   Apache AmbariHortonworks Technical Workshop:   Apache Ambari
Hortonworks Technical Workshop: Apache Ambari
 
Hdp security overview
Hdp security overview Hdp security overview
Hdp security overview
 
Securing Hadoop in an Enterprise Context
Securing Hadoop in an Enterprise ContextSecuring Hadoop in an Enterprise Context
Securing Hadoop in an Enterprise Context
 
Dev ops for big data cluster management tools
Dev ops for big data  cluster management toolsDev ops for big data  cluster management tools
Dev ops for big data cluster management tools
 
Welcome - Keynote - AWSome Day Helsinki 2017
Welcome - Keynote - AWSome Day Helsinki 2017Welcome - Keynote - AWSome Day Helsinki 2017
Welcome - Keynote - AWSome Day Helsinki 2017
 

Similar to Streamline Hadoop DevOps with Apache Ambari

Streamline Hadoop DevOps with Apache Ambari
Streamline Hadoop DevOps with Apache AmbariStreamline Hadoop DevOps with Apache Ambari
Streamline Hadoop DevOps with Apache AmbariAlejandro Fernandez
 
Streamline Hadoop DevOps with Apache Ambari
Streamline Hadoop DevOps with Apache AmbariStreamline Hadoop DevOps with Apache Ambari
Streamline Hadoop DevOps with Apache AmbariAlejandro Fernandez
 
Apache Ambari - What's New in 1.6.0
Apache Ambari - What's New in 1.6.0Apache Ambari - What's New in 1.6.0
Apache Ambari - What's New in 1.6.0Hortonworks
 
Puppet and Apache CloudStack
Puppet and Apache CloudStackPuppet and Apache CloudStack
Puppet and Apache CloudStackPuppet
 
Introduction To Apache Mesos
Introduction To Apache MesosIntroduction To Apache Mesos
Introduction To Apache MesosJoe Stein
 
Zero to Sixty: AWS CloudFormation (DMG201) | AWS re:Invent 2013
Zero to Sixty: AWS CloudFormation (DMG201) | AWS re:Invent 2013Zero to Sixty: AWS CloudFormation (DMG201) | AWS re:Invent 2013
Zero to Sixty: AWS CloudFormation (DMG201) | AWS re:Invent 2013Amazon Web Services
 
Ansible with oci
Ansible with ociAnsible with oci
Ansible with ociDonghuKIM2
 
How I Learned to Stop Worrying and Love the Cloud - Wesley Beary, Engine Yard
How I Learned to Stop Worrying and Love the Cloud - Wesley Beary, Engine YardHow I Learned to Stop Worrying and Love the Cloud - Wesley Beary, Engine Yard
How I Learned to Stop Worrying and Love the Cloud - Wesley Beary, Engine YardSV Ruby on Rails Meetup
 
Bare Metal to OpenStack with Razor and Chef
Bare Metal to OpenStack with Razor and ChefBare Metal to OpenStack with Razor and Chef
Bare Metal to OpenStack with Razor and ChefMatt Ray
 
Building a Dev/Test Cloud with Apache CloudStack
Building a Dev/Test Cloud with Apache CloudStackBuilding a Dev/Test Cloud with Apache CloudStack
Building a Dev/Test Cloud with Apache CloudStackke4qqq
 
Apache Ambari Stack Extensibility
Apache Ambari Stack ExtensibilityApache Ambari Stack Extensibility
Apache Ambari Stack ExtensibilityJayush Luniya
 
Software as a Service workshop / Unlocked: the Hybrid Cloud 12th May 2014
Software as a Service workshop / Unlocked: the Hybrid Cloud 12th May 2014Software as a Service workshop / Unlocked: the Hybrid Cloud 12th May 2014
Software as a Service workshop / Unlocked: the Hybrid Cloud 12th May 2014Rackspace Academy
 
Automation with Packer and TerraForm
Automation with Packer and TerraFormAutomation with Packer and TerraForm
Automation with Packer and TerraFormWesley Charles Blake
 
10 things I learned building Nomad packs
10 things I learned building Nomad packs10 things I learned building Nomad packs
10 things I learned building Nomad packsBram Vogelaar
 
Introduction to Apache CloudStack by David Nalley
Introduction to Apache CloudStack by David NalleyIntroduction to Apache CloudStack by David Nalley
Introduction to Apache CloudStack by David Nalleybuildacloud
 
fog or: How I Learned to Stop Worrying and Love the Cloud
fog or: How I Learned to Stop Worrying and Love the Cloudfog or: How I Learned to Stop Worrying and Love the Cloud
fog or: How I Learned to Stop Worrying and Love the CloudWesley Beary
 
LF_APIStrat17_REST API Microversions
LF_APIStrat17_REST API Microversions LF_APIStrat17_REST API Microversions
LF_APIStrat17_REST API Microversions LF_APIStrat
 






Streamline Hadoop DevOps with Apache Ambari

  • 1. Streamline Hadoop DevOps with Apache Ambari Alejandro Fernandez Melbourne
  • 2. Speaker Alejandro Fernandez Sr. Software Engineer @ Hortonworks Apache Ambari PMC alejandro@apache.org
  • 3. What is Apache Ambari? Apache Ambari is the open-source platform to provision, manage and monitor Hadoop clusters
  • 4.
  • 6. Exciting Enterprise Features in Ambari 2.4 • New Services: Log Search, Zeppelin, Hive LLAP • Role Based Access Control • Management Packs • Grafana UI for Ambari Metrics System • New Views: Zeppelin, Storm
  • 7. More in Ambari 2.4 • Alerts: Customizable props and thresholds (AMBARI-14898) • Alerts: Retry tolerance (AMBARI-15686) • Alerts: New HDFS Alerts (AMBARI-14800) • New Host Page Filtering (AMBARI-15210) • Remove Service from UI (AMBARI-14759) • Support for SLES 12 (AMBARI-16007) • Stability: Database Consistency Checking (AMBARI-16258) • Customizable Ambari Log + PID Dirs (AMBARI-15300) • New Version Registration Experience (AMBARI-15724) • Log Search Technical Preview (AMBARI-14927) • Operational Audit Logging (AMBARI-15241) • Role-Based Access Control (AMBARI-13977) • Automated Setup of Ambari Kerberos through Blueprints (AMBARI-15561) • Automated Setup of Ambari Proxy User (AMBARI-15561) • Customizable Host Reg. SSH Port (AMBARI-13450) Core Features Security Features • View URLs for bookmarks (AMBARI-15821), View Refresh (AMBARI-15682) • Inherit Cluster Permissions (AMBARI-16177) • Remote Cluster Registration (AMBARI-16274) Views Framework Features
  • 9. Deploy On Premise Ambari UI wizard handles all of these combinations and makes recommendations based on host specs.
  • 10. Deploy On The Cloud Certified environments Sysprepped VMs Hundreds of similar clusters
  • 11. Deploy with Blueprints • Systematic way of defining a cluster • Export existing cluster into blueprint /api/v1/clusters/:clusterName?format=blueprint Configs Topology Hosts Cluster
  • 12. Create a cluster with Blueprints 1. POST /api/v1/blueprints/my-blueprint { "configurations" : [ { "hdfs-site" : { "dfs.datanode.data.dir" : "/hadoop/1,/hadoop/2,/hadoop/3" } } ], "host_groups" : [ { "name" : "master-host", "components" : [ { "name" : "NAMENODE" }, { "name" : "RESOURCEMANAGER" }, … ], "cardinality" : "1" }, { "name" : "worker-host", "components" : [ { "name" : "DATANODE" }, { "name" : "NODEMANAGER" }, … ], "cardinality" : "1+" } ], "Blueprints" : { "stack_name" : "HDP", "stack_version" : "2.5" } } 2. POST /api/v1/clusters/my-cluster { "blueprint" : "my-blueprint", "host_groups" : [ { "name" : "master-host", "hosts" : [ { "fqdn" : "master001.ambari.apache.org" } ] }, { "name" : "worker-host", "hosts" : [ { "fqdn" : "worker001.ambari.apache.org" }, { "fqdn" : "worker002.ambari.apache.org" }, … { "fqdn" : "worker099.ambari.apache.org" } ] } ] }
  • 16. Blueprints for Large Scale • Kerberos, secure out-of-the-box • High Availability is set up initially for NameNode, YARN, Hive, Oozie, etc. • Host Discovery allows Ambari to automatically install services for a host when it comes online • Stack Advisor recommendations
  • 17. Blueprint Host Discovery POST /api/v1/clusters/MyCluster/hosts [ { "blueprint" : "single-node-hdfs-test2", "host_groups" : [ { "host_group" : "slave", "host_count" : 3, "host_predicate" : "Hosts/cpu_count>1" }, { "host_group" : "super-slave", "host_count" : 5, "host_predicate" : "Hosts/cpu_count>2&Hosts/total_mem>3000000" } ] } ]
  • 18. Comprehensive Security LDAP/AD: • User auth • Sync Kerberos: • MIT KDC • Keytab management Atlas: • Governance • Compliance • Lineage & history • Data classification Ranger: • Security policies • Audit • Authorization Knox: • Perimeter security • Supports LDAP/AD • Sec. for REST/HTTP • SSL
  • 19. Kerberos Ambari manages Kerberos principals and keytabs Works with existing MIT KDC or Active Directory Once Kerberized, handles 1. Adding hosts 2. Adding components to existing hosts 3. Adding services 4. Moving components to different hosts
  • 20. Management Packs • Improved Release Management: Decouple Ambari core from stacks releases • Support Add-ons: Release vehicle for 3rd party services, views Self-contained release artifacts Stack is an overlay of multiple management packs
  • 21. Overlay of Management Packs (diagram labels: inherits from 2.3, inherits from 2.4, inherits from 2.5)
  • 22. Management Pack++ Short Term Goals (Ambari 2.4) • Retrofit in Stack Processing Framework • Enable 3rd party to ship add-on services Future Goals • Management Pack Framework • Deliver Views
  • 23. Role Based Access Control (RBAC) As Ambari & organizations grow, so do security needs Ambari integrates with external authentication systems & LDAP
  • 24. RBAC Terms Users belong to groups A group has a role Users can also have additional roles Roles are applied to Resources. E.g., Ambari, particular Cluster, particular View Roles have permissions e.g., add services to cluster
  • 25. New RBAC Roles (↑ = same permissions as the role listed before it) Ambari Admin: all. Cluster Admin: ↑, except manage permissions. Cluster Op: ↑, except add services, Kerberos, manage alerts & upgrades. Service Admin: ↑, except alter cluster topology or install components. Service Op: ↑, except change configs. Read-Only: only view.
  • 27. Stack Advisor: Kerberos, HTTPS, ZooKeeper Servers, Memory Settings, …, High Availability. Example Configurations: atlas.rest.address = http(s)://host:port # Atlas Servers atlas.enableTLS = true|false atlas.server.http.port = 21000 atlas.server.https.port = 21443
  • 30. Background: Upgrade Terminology. Manual Upgrade: the user follows instructions to upgrade the stack; incurs downtime. Rolling Upgrade: automated; upgrades one component per host at a time; preserves cluster operation and minimizes service impact. Express Upgrade: automated; runs in parallel across hosts; incurs downtime.
  • 31. Automated Upgrade: Rolling or Express. 1. Check Prerequisites: review the prereqs to confirm your cluster configs are ready. 2. Prepare: take backups of critical cluster metadata. 3. Register + Install: register the HDP repository and install the target HDP version on the cluster. 4. Perform Upgrade: perform the HDP upgrade; the steps depend on the upgrade method, Rolling or Express. 5. Finalize: finalize the upgrade, making the target version the current version.
  • 32. Process: Rolling Upgrade ZooKeeper Ranger Hive Oozie Falcon Kafka Knox Storm Slider Flume Finalize or Downgrade Clients (HDFS, YARN, MR, Tez, HBase, Pig, Hive, etc.) Core Masters, Core Slaves (HDFS, YARN, HBase)
  • 33. Alerting Framework
      Alert Type | Description | Thresholds (units)
      WEB | Connects to a Web URL. Alert status is based on the HTTP response code | Response Code (n/a), Connection Timeout (seconds)
      PORT | Connects to a port. Alert status is based on response time | Response (seconds)
      METRIC | Checks the value of a service metric. Units vary, based on the metric being checked | Metric Value (units vary), Connection Timeout (seconds)
      AGGREGATE | Aggregates the status for another alert | % Affected (percentage)
      SCRIPT | Executes a script to handle the alert check | Varies
      SERVER | Executes a server-side runnable class to handle the alert check | Varies
  • 34. Alert Check Counts • Customize the number of times an alert is checked before dispatching a notification • Avoid dispatching an alert notification (email, SNMP) in case of transient issues
  • 35. Alerts - Configuring the Check Count Set globally for all alerts, or override for a specific alert Global Setting Alert Override
  • 37. Grafana for Ambari Metrics. FEATURES • Grafana as a “Native UI” for Ambari Metrics • Pre-built Dashboards: Host-level, Service-level • Supports HTTPS. DASHBOARDS • System: Home, Servers • HDFS: Home, NameNodes, DataNodes • YARN: Home, Applications, Job History Server • HBase: Home, Performance
  • 38. Grafana includes pre-built dashboards for visualizing the most important cluster metrics.
  • 39. The HDFS NameNode dashboard highlights file system activity.
  • 40. Log Search Search and index HDP logs! Capabilities • Rapid Search of all HDP component logs • Search across time ranges, log levels, and for keywords Solr Logsearch Ambari
  • 41. Log Search (diagram labels: Worker Node, Log Feeder, Solr, Log Search UI, Ambari; Java Process, Multi-output Support, Grok filters, Solr Cloud, Local Disk Storage)
  • 42. Future of Ambari • Cloud features • Service multi-instance (two ZK quorums) • Service multi-versions (Spark 1.6 & Spark 2.0) • YARN assemblies • Patch Upgrades: upgrade individual components in the same stack version, e.g., just DN and RM in HDP 2.5.*.* with zero downtime • Ambari High Availability As good as
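
The two-call Blueprint flow from slide 12 and the host-discovery request from slide 17 can be sketched in Python as below. This is a minimal illustration, not the presenter's code: the server address, the helper names (`blueprint`, `cluster_template`, `discovery_request`, `post`), and the `X-Requested-By` header value are assumptions, authentication is left out, and the component lists contain only the names the slide spells out (the slide elides the rest with "…"). The requests are built but never sent.

```python
import json
from urllib.request import Request

AMBARI = "http://ambari.example.org:8080"  # hypothetical server address


def blueprint(stack_name="HDP", stack_version="2.5"):
    """Blueprint body: configs, host groups with components, stack version."""
    return {
        "configurations": [
            {"hdfs-site": {"dfs.datanode.data.dir": "/hadoop/1,/hadoop/2,/hadoop/3"}}
        ],
        "host_groups": [
            {"name": "master-host",
             # Only the components named on the slide; a real blueprint lists more.
             "components": [{"name": "NAMENODE"}, {"name": "RESOURCEMANAGER"}],
             "cardinality": "1"},
            {"name": "worker-host",
             "components": [{"name": "DATANODE"}, {"name": "NODEMANAGER"}],
             "cardinality": "1+"},
        ],
        "Blueprints": {"stack_name": stack_name, "stack_version": stack_version},
    }


def cluster_template(blueprint_name, masters, workers):
    """Cluster creation body: assigns concrete hosts to each host group."""
    return {
        "blueprint": blueprint_name,
        "host_groups": [
            {"name": "master-host", "hosts": [{"fqdn": h} for h in masters]},
            {"name": "worker-host", "hosts": [{"fqdn": h} for h in workers]},
        ],
    }


def discovery_request(blueprint_name, group, count, predicate):
    """Host-discovery body: a host count plus a predicate instead of host names."""
    return [{
        "blueprint": blueprint_name,
        "host_groups": [
            {"host_group": group, "host_count": count, "host_predicate": predicate}
        ],
    }]


def post(path, body):
    """Build (but do not send) a JSON POST request against the Ambari REST API."""
    return Request(
        AMBARI + path,
        data=json.dumps(body).encode("utf-8"),
        # Header value is an assumption; credentials are omitted entirely.
        headers={"X-Requested-By": "ambari", "Content-Type": "application/json"},
        method="POST",
    )


workers = [f"worker{i:03d}.ambari.apache.org" for i in range(1, 100)]
requests_in_order = [
    post("/api/v1/blueprints/my-blueprint", blueprint()),
    post("/api/v1/clusters/my-cluster",
         cluster_template("my-blueprint", ["master001.ambari.apache.org"], workers)),
    post("/api/v1/clusters/MyCluster/hosts",
         discovery_request("single-node-hdfs-test2", "slave", 3, "Hosts/cpu_count>1")),
]
```

Passing each request to `urllib.request.urlopen` would submit it. The order matters for the first two calls, since the cluster template refers to the blueprint by name.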

Editor's Notes

  1. Single pane of glass. Provision on the cloud, Metrics, Services, Config, Security, Alerts, Host management, Views framework.
  2. 0.9 in Sep 2012; 1.5 in April 2014; 1.6 in July 2014; 2.3.0 was not used; 2.4.0 is slated with a ton of new features. 2179 Jiras. Cadence is 2-3 major releases per year, with follow-up maintenance releases in the months after. http://jsfiddle.net/mp8rqq5x/2/
  3. Log Search: Solr, Logfeeder (similar to Logstash), and Grafana UI. Zeppelin for data exploration and visualization that can plug in to multiple data backends. Role Based Access Control.
  4. Alerts, Stability EU/RU experience LogSearch Security automation Views Framework ease of use
  5. Deploy: Blueprints with Host Discovery. Secure: Kerberos, LDAP sync. Smart Configs: stack advisor, since it is painful to configure a thousand related knobs. E.g., changing the ZooKeeper quorum has an effect on several services; changing the log folder affects Log Search. Upgrade: Rolling and Express Upgrade, get patches. Monitor: Ambari Alerts, Ambari Metrics. Analyze, Scale, Extend: Views, Management Packs.
  6. Cloudbreak can install on Amazon EC2, MSFT Azure. Cluster install takes 5-10 mins, mostly downloading packages, installing bits, and starting services.
  7. Used by HDInsight (Microsoft Azure) and Hortonworks QA. Allow cluster creation or scaling to be started via the REST API prior to all/any hosts being available. As hosts register with Ambari server, they will be matched to request host groups and provisioned according to the requested topology. Allow host predicates to be specified along with host count to provide more flexibility in matching hosts to host groups. This will allow for host flavors, where different host groups are matched to different host flavors. Break up the current monolithic provisioning request into a request for each host operation. For example, install on host A, start on host A, install on host B, etc. This will allow hosts to make progress even when another host encounters a failure. Allow a host count to be specified in the cluster creation template instead of host names. This is documented in https://issues.apache.org/jira/browse/AMBARI-6275. Install a cluster with two API calls.
  8. The blueprint contains the configs, the assignment of topology to host groups, and the stack version. The cluster creation request is what actually assigns hosts to each host group.
  9. The blueprint contains the configs, the assignment of topology to host groups, and the stack version. The cluster creation request is what actually assigns hosts to each host group.
  10. The blueprint contains the configs, the assignment of topology to host groups, and the stack version. The cluster creation request is what actually assigns hosts to each host group.
  11. The blueprint contains the configs, the assignment of topology to host groups, and the stack version. The cluster creation request is what actually assigns hosts to each host group.
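
The split described in these notes (a blueprint holding configs, topology, and stack version, followed by a creation request that assigns hosts) can be sketched as two request bodies. This is a minimal sketch, assuming Ambari's `/api/v1/blueprints/{name}` and `/api/v1/clusters/{name}` endpoints. The server URL, blueprint name, cluster name, host names, stack version, and component list are hypothetical, the topology is not meant to be a complete deployable cluster, and nothing is sent over the network.

```python
import json
import urllib.request

AMBARI = "http://ambari.example.com:8080/api/v1"  # hypothetical server

# Call 1: the blueprint holds configs, topology per host group, and stack version.
blueprint = {
    "Blueprints": {"stack_name": "HDP", "stack_version": "2.5"},
    "configurations": [{"hdfs-site": {"properties": {"dfs.replication": "3"}}}],
    "host_groups": [
        {"name": "master", "cardinality": "1",
         "components": [{"name": "NAMENODE"}, {"name": "ZOOKEEPER_SERVER"}]},
        {"name": "worker", "cardinality": "2",
         "components": [{"name": "DATANODE"}]},
    ],
}

# Call 2: the cluster creation template assigns real hosts to each host group.
cluster_template = {
    "blueprint": "demo-blueprint",
    "host_groups": [
        {"name": "master", "hosts": [{"fqdn": "m1.example.com"}]},
        {"name": "worker", "hosts": [{"fqdn": "w1.example.com"},
                                     {"fqdn": "w2.example.com"}]},
    ],
}

def build_post(path, body):
    """Build (but do not send) a POST request for the Ambari REST API."""
    return urllib.request.Request(
        url=f"{AMBARI}/{path}",
        data=json.dumps(body).encode("utf-8"),
        headers={"X-Requested-By": "ambari"},
        method="POST",
    )

requests_to_send = [
    build_post("blueprints/demo-blueprint", blueprint),
    build_post("clusters/demo-cluster", cluster_template),
]
```

Passing each request to `urllib.request.urlopen` with credentials for a real server would issue the two calls. Keeping the blueprint free of host names is what lets the same blueprint be reused across many similar clusters.
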
  12. Dynamic availability: allow host_count to be specified instead of host_names. As hosts register, they will be matched to the request host groups and provisioned according to the requested topology. When specifying a host_count, a predicate can also be specified for finer-grained control.
  13. Dynamic availability: allow host_count to be specified instead of host_names. As hosts register, they will be matched to the request host groups and provisioned according to the requested topology. When specifying a host_count, a predicate can also be specified for finer-grained control. 3 Terabytes, since units are in MB.
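
The matching behaviour in these notes can be illustrated with a toy model. This is a sketch under stated assumptions, not Ambari's implementation: the `host_count` and `host_predicate` field names follow the notes, while the predicate expression `Hosts/cpu_count>1`, the host names, the blueprint name, and the tiny predicate parser are hypothetical and only handle a single "greater than" comparison.

```python
# Cluster creation template using host_count instead of host names.
# The predicate expression is illustrative only.
template = {
    "blueprint": "demo-blueprint",
    "host_groups": [
        {"name": "worker", "host_count": 2, "host_predicate": "Hosts/cpu_count>1"},
    ],
}

def matches(host, predicate):
    """Evaluate a single 'Hosts/<property>><number>' predicate (toy parser)."""
    prop, threshold = predicate.split(">")
    return host[prop.split("/", 1)[1]] > int(threshold)

def assign(registering_hosts, template):
    """Match hosts to host groups in registration order until each count is met."""
    assigned = {g["name"]: [] for g in template["host_groups"]}
    for host in registering_hosts:
        for group in template["host_groups"]:
            slots = assigned[group["name"]]
            if len(slots) < group["host_count"] and matches(host, group["host_predicate"]):
                slots.append(host["host_name"])
                break  # a host joins at most one host group
    return assigned

# Hosts in the order they register with the server.
hosts = [
    {"host_name": "a.example.com", "cpu_count": 1},
    {"host_name": "b.example.com", "cpu_count": 4},
    {"host_name": "c.example.com", "cpu_count": 8},
    {"host_name": "d.example.com", "cpu_count": 8},
]
result = assign(hosts, template)
```

Here the first host fails the predicate, the next two fill the `worker` group, and the fourth is left unassigned because the requested count has already been reached.
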
  14. Kerberos: LDAP/AD. Services: Ranger, Atlas, Knox. Ranger: set up security policies on who can access what. Authorization of audit files, plugins for other services like HDFS, Hive, Storm, etc. Atlas: lineage of data and compliance, especially in health care and financial institutions. Knox: perimeter security for HTTP and REST calls in the Hadoop services. Works with SSL, Kerberos. Kerberos Key Distribution Center so we can define service principals and keytabs.
  15. Can use an existing KDC (key distribution center) or install one for Hadoop. Hadoop uses a rule-based system to create mappings between service principals and their related UNIX usernames.
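
The idea of mapping service principals to UNIX usernames through ordered rules can be sketched as follows. This is a simplified illustration, not Hadoop's actual rule syntax or implementation: the realm `EXAMPLE.COM`, the two rules, and the `to_unix_user` helper are hypothetical.

```python
import re

# Ordered (pattern, username) rules; the first match wins.
# Toy rules for illustration, not Hadoop's own rule format.
RULES = [
    (re.compile(r"^nn/[^@]+@EXAMPLE\.COM$"), "hdfs"),
    (re.compile(r"^rm/[^@]+@EXAMPLE\.COM$"), "yarn"),
]

def to_unix_user(principal, default_realm="EXAMPLE.COM"):
    """Map a Kerberos principal to a local UNIX username."""
    for pattern, user in RULES:
        if pattern.match(principal):
            return user
    # Fallback: for principals in the default realm, strip host and realm.
    name, _, realm = principal.partition("@")
    if realm == default_realm:
        return name.split("/", 1)[0]
    raise ValueError(f"no rule matches {principal}")

namenode_user = to_unix_user("nn/host1.example.com@EXAMPLE.COM")
end_user = to_unix_user("alice@EXAMPLE.COM")
```

Rule order matters in a scheme like this: specific service rules come first, and the generic fallback only applies when none of them match.
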
  16. As Ambari grows and organizations grow, so do security needs. Users have fine-grained roles over the cluster and individual views. Granular authorization checks distribute the responsibilities and privileges of authenticated users.
  17. Configuration files; package with Python scripts and templates; alert definitions; Kerberos configurations, principals, identities, keytabs; metadata; metric details; UI controls and widgets.
  18. Stack Advisor: can now ship the recommendations for a service with the service itself, instead of a monolithic stack advisor for the entire stack. Makes it easier to integrate customer services.
  19. Express Upgrade: fastest method to upgrade the stack, since it upgrades an entire component in batches of 100 hosts at a time. Rolling Upgrade: one component at a time per host, which can take up to 1 min. For a 100 node cluster with
  20. Express Upgrade: fastest method to upgrade the stack, since it upgrades an entire component in batches of 100 hosts at a time. Rolling Upgrade: one component at a time per host, which can take up to 1 min. For a 100 node cluster with
  21. Express Upgrade: fastest method to upgrade the stack, since it upgrades an entire component in batches of 100 hosts at a time. Rolling Upgrade: one component at a time per host, which can take up to 1 min. For a 100 node cluster with
  22. This Grafana instance is specifically for AMS, not meant to be general-purpose. If the customer is already using Grafana, this is not a replacement. Grafana will support read-only access for anonymous users, and HTTPS. Aggregates across the entire cluster, filter by host, top/bottom x, functions like avg/sum/min/max, filter by date range.
  23. This is not HDP Search, and it is not something that the customer has to license separately; it is an embedded Solr instance.
  24. Agent/collection process running on each host. Written in Java. Tails all service log files. Parses logs using Grok/regex. Can merge multi-line logs, e.g. a stack trace. On restart, can resume from the last read line. Uses checkpoint files to maintain state. Extendable design to send logs to multiple destination types. Currently can send logs to Solr and Kafka.
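
The parse, merge, and resume steps in this note can be sketched in a few lines. This is an illustrative sketch only, not the Logfeeder implementation (which the note says is written in Java): the log line layout, the `ENTRY_START` pattern, and the `parse_entries` helper are hypothetical, and the checkpoint is kept in memory as a line index rather than in a checkpoint file.

```python
import re

# Hypothetical log layout: a new entry starts with "YYYY-MM-DD HH:MM:SS,mmm LEVEL".
ENTRY_START = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) (?P<level>[A-Z]+) (?P<msg>.*)$"
)

def parse_entries(lines, start_line=0):
    """Parse lines from start_line on, folding continuation lines (for example
    a stack trace) into the previous entry. Returns (entries, next_checkpoint)."""
    entries = []
    for line in lines[start_line:]:
        m = ENTRY_START.match(line)
        if m:
            entries.append(m.groupdict())
        elif entries:
            entries[-1]["msg"] += "\n" + line
        # A continuation line seen before any entry start is skipped in this sketch.
    return entries, len(lines)

lines = [
    "2016-08-01 10:00:00,123 INFO Starting NameNode",
    "2016-08-01 10:00:01,456 ERROR Failed to bind",
    "java.net.BindException: Address already in use",
    "\tat sun.nio.ch.Net.bind(Net.java:433)",
]
entries, checkpoint = parse_entries(lines)

# After a "restart", only lines past the checkpoint are read again.
lines.append("2016-08-01 10:00:02,000 WARN Retrying")
new_entries, checkpoint2 = parse_entries(lines, start_line=checkpoint)
```

The first pass yields two entries, with the stack trace folded into the ERROR entry, and the second pass picks up only the new WARN line.
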
  25. Major themes. Goal is to make it as good as Australian pies