1 © Hortonworks Inc. 2011–2018. All rights reserved
Running Enterprise Workloads in the
Cloud
Richard Doktorics
Peter Darvasi
Who are we?
⬢ Peter Darvasi
- Partner Engineer at Hortonworks
- @pdarvasi
⬢ Richard Doktorics
- Software Engineer
- @doktoric
Agenda
⬢ What is Cloudbreak?
⬢ Enterprise checklist for big data in the cloud
⬢ Cloudbreak in da house
⬢ Questions
What is Cloudbreak?
Cloudbreak is a tool for provisioning Hadoop
clusters on any cloud infrastructure
Simplified Cluster Provisioning - prescriptive setup,
simple automation
Deploy on Public or Private
Clouds
Dynamically configure and manage
clusters on public or private clouds
(Amazon Web Services, Microsoft
Azure, Google Cloud Platform and
OpenStack)
Automated Scaling
Seamlessly manage elasticity
requirements as cluster workloads
change (Ambari Metrics / Prometheus)
Secured Cluster Access
Supports configuration defining
network boundaries and configuring
security groups
Highly Extensible
Recipes to run custom commands
Custom images
⬢ Cloudbreak Deployer (CBD)
– Written in Go and Bash
– Compiled into single binary
⬢ Micro-service architecture
– Each service runs in a Docker container
– Each container is replaceable with custom ones
– Services are handled with docker-compose
Single node deployment
Enterprise checklist for big data in cloud
Checklist for enterprises in the cloud
✓ Control and Automation
✓ Cloudy Services
✓ Security
✓ Enterprise-Grade Support
Checklist for enterprises in the cloud
✓ Control and Automation
  ✓ Simple UX
  ✓ Powerful CLI
  ✓ Autoscaling
✓ Cloudy Services
✓ Security
✓ Enterprise-Grade Support
Simplified UX
Create Credential Experience
Built-In Blueprints
Basic and Advanced Cluster Creation Experiences
New Network and Security Group Choices
⬢ Network
– Create new Network and new Subnet
– Choose existing Network and existing Subnet
⬢ Security Groups
– Create new SGs
• Choose default SGs (minimal set of ports)
• Create customized
– Choose existing SGs
Powerful CLI
Cloudbreak CLI: Designed for DevOps
“Show cli command” for every request
Auto-scaling
Auto-Scaling
⬢ Alerts: Create metric or time-based alerts for cluster scaling
⬢ Policies: Scaling policies adjust cluster size based on activity and workload alerts
⬢ General Configurations: Boundaries and cooldown period
Auto-Scaling Time-Based Alert
Fire at 10:15 am every day
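A minimal sketch of what a daily time-based alert such as the one above implies. The helper `next_daily_fire` is hypothetical and is not part of Cloudbreak; it only illustrates how the next firing time of a "10:15 am every day" rule could be computed, assuming naive local timestamps.

```python
from datetime import datetime, time, timedelta


def next_daily_fire(now: datetime, fire_at: time) -> datetime:
    """Return the next moment a daily alert fires, at or after `now`.

    Hypothetical helper for illustration; Cloudbreak's own scheduling
    mechanism is not described in the slides.
    """
    candidate = datetime.combine(now.date(), fire_at)
    if candidate < now:
        # Today's slot has already passed, so the alert fires tomorrow.
        candidate += timedelta(days=1)
    return candidate
```

Calling `next_daily_fire(datetime(2018, 4, 18, 11, 0), time(10, 15))` moves the alert to the following day, because 10:15 has already passed at 11:00.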
Auto-Scaling Metric-Based Alert
Fire after NodeManagers are in CRITICAL state for 10 minutes
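The "CRITICAL for 10 minutes" condition can be sketched as a check over timestamped state samples. This is an illustration only: the `Sample` shape, the `should_fire` name and the "any non-CRITICAL sample resets the clock" rule are assumptions, not Cloudbreak's or Ambari's actual evaluation logic.

```python
from datetime import datetime, timedelta
from typing import Iterable, Optional, Tuple

Sample = Tuple[datetime, str]  # (timestamp, alert state), hypothetical shape


def should_fire(samples: Iterable[Sample], now: datetime,
                hold: timedelta = timedelta(minutes=10)) -> bool:
    """Fire only if the state has been CRITICAL continuously for `hold`."""
    critical_since: Optional[datetime] = None
    for timestamp, state in sorted(samples):
        if state == "CRITICAL":
            if critical_since is None:
                # Start of a new uninterrupted CRITICAL period.
                critical_since = timestamp
        else:
            # Any other state resets the period (assumed behaviour).
            critical_since = None
    return critical_since is not None and now - critical_since >= hold
```

The hold period keeps a short-lived spike from triggering a scaling action.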
Auto-Scaling Policies
⬢ Define the Scale Adjustment (Node Count/Percentage/Exact size)
⬢ Select the HostGroup (to Scale)
⬢ Select Alert (which, when fired, executes the Policy)
Auto-Scaling General Configurations
⬢ Cooldown Period (time interval between two autoscale events)
⬢ Minimum and Maximum Cluster size (cluster size boundaries)
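The interplay of scale adjustment, cluster size boundaries and cooldown can be sketched as follows. All names here (`ScalingPolicy`, `target_size`, `in_cooldown`, the adjustment type strings) are hypothetical, and rounding percentage adjustments away from zero is an assumption; the slides do not specify how Cloudbreak rounds.

```python
import math
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class ScalingPolicy:
    adjustment_type: str  # "NODE_COUNT", "PERCENTAGE" or "EXACT" (assumed labels)
    adjustment: int


def target_size(current: int, policy: ScalingPolicy, minimum: int, maximum: int) -> int:
    """Apply the policy, then clamp the result to the configured boundaries."""
    if policy.adjustment_type == "NODE_COUNT":
        desired = current + policy.adjustment
    elif policy.adjustment_type == "PERCENTAGE":
        # Round the node delta away from zero (an assumption).
        delta = math.ceil(abs(current * policy.adjustment) / 100)
        desired = current + delta if policy.adjustment >= 0 else current - delta
    elif policy.adjustment_type == "EXACT":
        desired = policy.adjustment
    else:
        raise ValueError(f"unknown adjustment type: {policy.adjustment_type}")
    return max(minimum, min(maximum, desired))


def in_cooldown(last_scaling: Optional[datetime], now: datetime, cooldown: timedelta) -> bool:
    """True while the cooldown period since the last scaling action is still running."""
    return last_scaling is not None and now - last_scaling < cooldown
```

A fired alert would only lead to a scaling action when `in_cooldown` returns `False`, and the resulting size never leaves the minimum and maximum boundaries.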
Checklist for enterprises in the cloud
✓ Control and Automation
✓ Cloudy Services
  ✓ Cloud Resources
  ✓ Hortonworks DataFlow
  ✓ Custom Images
✓ Security
✓ Enterprise-Grade Support
Cloud Resources: RDBMS + LDAP
Cloud Resources: RDBMS and LDAP/AD = Dynamic Blueprints
⬢ Background:
– Cluster configuration often includes external database (for Hive, Ranger, etc) and LDAP/AD configs
– It’s a challenge to know the different Blueprint configuration choices per service across the stack
⬢ Dynamic Blueprints:
– Ability to manage External Sources (e.g. RDBMS and LDAP/AD) outside of your Blueprint
– Cloudbreak will inject the configurations into your Blueprint
– Simplifies reuse of external cloud resources
– Simplifies your Blueprints -> don’t have to know all the configurations for each component
Dynamic Blueprints: RDBMS/LDAP
⬢ Built-In Components:
– Atlas, Ranger, Hadoop, Hive LLAP, Hive, Ambari, Oozie, Druid, SuperSet
⬢ JDBC/LDAP properties in Blueprint for the Component?
– Yes: Use Blueprint as-is, no Component configuration property injection
– No: Inject Component configuration properties
⬢ Perform property variable replacement
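The decision flow above can be sketched in a few lines. This is a simplified model, not Cloudbreak's implementation: the blueprint is reduced to a dictionary of component properties, `apply_external_source` is a hypothetical name, the `{{name}}` placeholder syntax is an assumption, and running variable replacement on both branches is also an assumption, since the flattened diagram does not show where the branches rejoin.

```python
from typing import Dict

Blueprint = Dict[str, Dict[str, str]]  # component -> {property: value} (simplified model)


def apply_external_source(blueprint: Blueprint, component: str,
                          injected: Dict[str, str],
                          variables: Dict[str, str]) -> Blueprint:
    """Inject external-source properties unless the blueprint already defines them."""
    # Work on a copy so the caller's blueprint stays untouched.
    result = {name: dict(props) for name, props in blueprint.items()}
    props = result.setdefault(component, {})

    # "Yes" branch: the blueprint already carries these properties, use it as-is.
    has_own_config = any(key in props for key in injected)
    if not has_own_config:
        # "No" branch: inject the component configuration properties.
        props.update(injected)

    # Property variable replacement (hypothetical {{name}} placeholders).
    for key, value in list(props.items()):
        for name, replacement in variables.items():
            value = value.replace("{{" + name + "}}", replacement)
        props[key] = value
    return result
```

Keeping the user's own properties untouched on the "Yes" branch mirrors the "Use Blueprint as-is" box in the flow.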
At-Motion Workloads: Hortonworks DataFlow
Hortonworks DataFlow in Cloudbreak
⬢ Default blueprint: “Flow Management: Apache NiFi”
HDF 3.1: NiFi, Ambari, Ambari Metrics, ZooKeeper
HDF - cluster creation
Custom Images
Background: Cloudbreak
1. Cloudbreak creates VM instances using a default base image.
2. Cloudbreak installs Ambari on a VM instance.
3. Cloudbreak instructs Ambari to install an HDP Cluster on other VM instances.
[Diagram: Cloudbreak, RHEL 7, and an HDP Cluster made up of HDP Node VMs]
Background: Cloudbreak Default Images
⬢ By default, Cloudbreak uses default base public images when creating VM instances.
Cloud                    Standard Image Operating System
AWS                      Amazon Linux 2017
Azure                    CentOS 7.x
Google Cloud Platform    CentOS 7.x
OpenStack                CentOS 7.x
Support for Custom Images provides a way for Cloudbreak users to leverage their own custom image (not the default image) when creating VM instances.
Making a Custom Image: Overview
1. Create the Custom Image
2. Register the Custom Image in Cloudbreak
3. Use the Custom Image when Creating a Cluster
Creating the Image: Code Repository
⬢ Instructions, Packer scripts and Salt states in public GitHub repository
– https://github.com/hortonworks/cloudbreak-images
⬢ An understanding of Packer and Salt is useful
– Packer creates infrastructure
– Packer runs Salt provisioner
⬢ Customer should clone the repository and build on it
Creating the Image: Example Scenarios
SCENARIO: For AWS: I don’t want Amazon Linux and instead want RHEL 7
APPROACH:
1. Set up repository and AWS environment
2. Use the repository tools to build a RHEL 7 image
make build-aws-rhel7
SCENARIO: I don’t want OpenJDK and instead want Oracle JDK
APPROACH:
1. Set up repository and environment
2. Turn on Oracle optional state
3. Use the repository tools to build an image
SCENARIO: For AWS: I don’t want Amazon Linux and instead want MY RHEL 7
** This is an advanced scenario **
APPROACH:
1. Set up repository and AWS environment
2. Change the source base image
3. Use the repository tools to build a RHEL 7 image
make build-aws-rhel7
Use the Custom Image: Create Cluster (UI)
⬢ Create Cluster > General Configuration > Advanced
– Choose image catalog
– Adjust the Ambari + HDP repos (if you want)
– Choose image you registered
Pre-Warmed Images
⬢ Base images: OS only
– Pros: Can change the Ambari or HDP versions, or use local repositories
– Cons: Cluster installs take longer
⬢ Prewarmed images: OS + pre-installed Ambari and HDP
– Pros: Cluster installs are faster, no internet connection is needed
– Cons: Cannot change the Ambari or HDP versions, cannot use local repositories
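The trade-off between base and prewarmed images reduces to a simple rule, sketched here with a hypothetical helper `choose_image_type`. It encodes only the constraints listed on the slide and is not a Cloudbreak API.

```python
def choose_image_type(custom_stack_versions: bool, local_repositories: bool) -> str:
    """Pick an image type from the pros and cons listed on the slide.

    Prewarmed images pin the Ambari and HDP versions and cannot use local
    repositories, so either requirement forces a base image.
    """
    if custom_stack_versions or local_repositories:
        return "base"
    # Otherwise prefer the faster install that needs no internet connection.
    return "prewarmed"
```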
Checklist for enterprises in the cloud
✓ Control and Automation
✓ Cloudy Services
✓ Security
  ✓ Kerberos support
  ✓ LDAP integration
  ✓ Proxy configuration
✓ Enterprise-Grade Support
Cluster Security: Kerberos
What is Kerberos?
⬢ Strongly authenticating and establishing a user’s identity is the basis for secure access in Hadoop. Users need to be able to reliably “identify” themselves and then have that identity propagated throughout the Hadoop cluster.
⬢ Kerberos is the de facto system for authenticating access to distributed services
Cloudbreak: Support for Enabling Kerberos
Goal
Provide a way for Cloudbreak users to create clusters that are Kerberos enabled
Approach
Ambari exposes a lot of Kerberos options
Leverage Ambari Kerberos options and avoid re-creating the Ambari Kerberos experience
Pragmatic prescriptive options on top
Cloudbreak: Enable Kerberos Security
⬢ Create Cluster > Security > Advanced
⬢ [ ] Enable Kerberos Security
Options: Use Existing KDC or Use Test KDC
⬢ Use Test KDC
– Not for production use. For testing and evaluation purposes only.
– Installs and configures an MIT KDC on the master node.
– Configures the cluster to leverage that KDC.
⬢ Use Existing KDC (Basic)
– Provide basic information about your existing KDC.
– Ambari Kerberos descriptors are generated automatically.
⬢ Use Existing KDC (Advanced)
– Provide basic information about your existing KDC.
– Provide your own Ambari Kerberos descriptors.
Cloudbreak + LDAP/AD
Cloudbreak User AuthN
⬢ Goal: Configure Cloudbreak to provide for external User AuthN to LDAP/AD
– CloudFoundry UAA (User Account and Authentication Server) is the foundation
https://github.com/cloudfoundry/uaa
⬢ Two parts:
1. Configure Cloudbreak to talk to external LDAP/AD
2. Configure which group(s) can access Cloudbreak
Step 1: Configure Cloudbreak to talk to LDAP/AD
⬢ On the Cloudbreak host, create:
/var/lib/cloudbreak-deployment/uaa-changes.yml
⬢ Define LDAP profile for users and groups
Step 2: Configure which group(s) can access Cloudbreak
⬢ Configure which group(s) are authorized to access Cloudbreak:
cbd util execute-ldap-mapping [group]
cbd util delete-ldap-mapping [group]
⬢ To authorize users in the “Analysts” group to access Cloudbreak:
cbd util execute-ldap-mapping cn=Analysts,ou=Groups,dc=hortonworks,dc=local
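For scripting the mapping step, the group DN and the command line can be assembled as in the sketch below. The helpers `group_dn` and `mapping_command` are hypothetical; only the two `cbd util` subcommands shown on the slide are used, and no escaping of special characters inside DN values is attempted.

```python
from typing import List, Sequence


def group_dn(group: str, org_units: Sequence[str], domain: str) -> str:
    """Build a group DN such as cn=Analysts,ou=Groups,dc=hortonworks,dc=local."""
    parts = [f"cn={group}"]
    parts += [f"ou={unit}" for unit in org_units]
    parts += [f"dc={label}" for label in domain.split(".")]
    return ",".join(parts)


def mapping_command(dn: str, remove: bool = False) -> List[str]:
    """Return the argument list for adding or deleting an LDAP group mapping."""
    action = "delete-ldap-mapping" if remove else "execute-ldap-mapping"
    return ["cbd", "util", action, dn]
```

The resulting list could be handed to `subprocess.run` on the Cloudbreak host; passing arguments as a list avoids shell quoting issues if a DN contains spaces.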
Proxy configuration
Limited Outbound Internet Access
⬢ Handle enterprise scenarios where:
– Limited (or restricted) outbound internet access, and/or
– Required use of a Proxy to obtain internet access
Cloudbreak
• Docker Hub
• Cloudbreak dependencies
• Default Image Catalog
Cloudbreak and Cluster Hosts
• Cloud Provider APIs
• HDP or HDF platform repositories
http/s proxy (optional)
Internet Access via Proxy
⬢ Cloudbreak Proxy Setup: How does Cloudbreak communicate through a proxy to get to the internet (and to the cluster hosts)?
⬢ Clusters Proxy Setup: How do the Cluster Hosts communicate through a proxy to get to the internet?
Cloudbreak: Proxy Setup
⬢ Set up Docker Environment to use Proxy
– Modify the Docker service to set HTTP_PROXY and HTTPS_PROXY (and NO_PROXY)
https://docs.docker.com/config/daemon/systemd/#httphttps-proxy
⬢ Set up Cloudbreak to use Proxy in Profile
HTTP_PROXY_HOST=your-proxy-host
HTTPS_PROXY_HOST=your-proxy-host
PROXY_PORT=your-proxy-port
PROXY_USER=your-proxy-user
PROXY_PASSWORD=your-proxy-password
⬢ Advanced Profile option “HTTPS_PROXYFORCLUSTERCONNECTION=true|false”
– Defaults to “false”
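The Profile settings on this slide can be generated rather than typed by hand. The sketch below uses a hypothetical helper, `proxy_profile_lines`, and emits lines in the `KEY=value` form shown on the slide; whether the real Profile file expects an `export` prefix or other syntax is not stated here, so treat the output format as an assumption.

```python
from typing import List, Optional


def proxy_profile_lines(host: str, port: int,
                        user: Optional[str] = None,
                        password: Optional[str] = None,
                        force_for_clusters: bool = False) -> List[str]:
    """Render the proxy-related Profile variables named on the slide."""
    settings = {
        "HTTP_PROXY_HOST": host,
        "HTTPS_PROXY_HOST": host,
        "PROXY_PORT": str(port),
    }
    if user is not None:
        # Credentials are only emitted when the proxy requires authentication.
        settings["PROXY_USER"] = user
        settings["PROXY_PASSWORD"] = password or ""
    # The slide states that this advanced option defaults to "false".
    settings["HTTPS_PROXYFORCLUSTERCONNECTION"] = "true" if force_for_clusters else "false"
    return [f"{key}={value}" for key, value in settings.items()]
```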
Cloudbreak: Advanced Proxy Scenarios
SCENARIO #1: Proxy for internet, not clusters
SCENARIO #2: Proxy for internet and clusters
Clusters: Register Proxy Configuration
⬢ External Sources > Proxy Configurations
(optional, if proxy requires authentication)
Clusters: Configure Proxy for Cluster Hosts
⬢ Create Cluster > Advanced > External Sources > Configure Proxy
• Configures yum with “proxy” settings
• Configures Ambari Server with “httpProxy” settings
Checklist for enterprises in the cloud
✓ Control and Automation
✓ Cloudy Services
✓ Security
✓ Enterprise-Grade Support
  ✓ SmartSense integration
  ✓ Flex support
Cloudbreak in da house
Main use cases
⬢ Have an internal hosted Cloudbreak service for…
– our CI/CD pipeline
– testing and prototyping HDP and HDF services
– self-service clusters for QE/SE/PS teams
Our technical goals
⬢ Run Cloudbreak in HA (High Availability) mode
– Ability to recover flows in case of node failure
– Avoid master-slave design / leader election problems
⬢ Scale Cloudbreak as we desire
– Distribute each cluster related flow
– Cannot run 2 flows for the same cluster at the same time (e.g. 2 upscale flows)
– Flow cancellation must be handled
⬢ Scale the Web UI
– Had to introduce a Redis cluster for the session store
⬢ Scale every other service as well
⬢ Find a tool that makes it easy to deploy these services to multiple nodes
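The rule that two flows must never run for the same cluster at the same time can be illustrated with a small in-process registry. `FlowRegistry` is a hypothetical class for illustration; a real multi-node HA deployment would need this state in a store shared by all nodes, which the sketch does not attempt.

```python
import threading
from typing import Dict


class FlowRegistry:
    """Tracks at most one running flow per cluster (single-process sketch)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._running: Dict[str, str] = {}  # cluster id -> flow name

    def try_start(self, cluster_id: str, flow: str) -> bool:
        """Register the flow, or refuse if the cluster already has one running."""
        with self._lock:
            if cluster_id in self._running:
                return False
            self._running[cluster_id] = flow
            return True

    def finish(self, cluster_id: str) -> None:
        """Release the cluster after the flow completes or is cancelled."""
        with self._lock:
            self._running.pop(cluster_id, None)
```

A second upscale request for a cluster is rejected until `finish` is called, while flows for other clusters can start freely and be distributed across nodes.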
Why Kubernetes?
⬢ Not because it’s fancy...
⬢ Evaluated Kubernetes, Swarm, Mesos, Rancher
⬢ Open source / Active community with hands-on experience
⬢ Many cloud providers already support it
⬢ Lots of tooling behind it / API / CLI / Helm / Ansible / Salt
⬢ Integration with most of the cloud providers
– Provision Load Balancer (GCP, AWS, Azure)
– Use object stores to share data (Ceph, S3, GCP bucket, Azure Storage Account)
– Dynamic volume provisioning / Persistent disk (EBS, Azure Blob)
Thank you

Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerSecuring Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerDataWorks Summit
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...DataWorks Summit
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouDataWorks Summit
 
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkBig Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkDataWorks Summit
 

More from DataWorks Summit (20)

Data Science Crash Course
Data Science Crash CourseData Science Crash Course
Data Science Crash Course
 
Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisFloating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache Ratis
 
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiTracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
 
Managing the Dewey Decimal System
Managing the Dewey Decimal SystemManaging the Dewey Decimal System
Managing the Dewey Decimal System
 
Practical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExamplePractical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist Example
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at Uber
 
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixScaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
 
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiBuilding the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability Improvements
 
Security Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureSecurity Framework for Multitenant Architecture
Security Framework for Multitenant Architecture
 
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EnginePresto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything Engine
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
 
Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudExtending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google Cloud
 
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiEvent-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
 
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerSecuring Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near You
 
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkBig Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
 

Recently uploaded

Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 

Recently uploaded (20)

Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 

Running Enterprise Workloads in the Cloud

  • 1. Running Enterprise Workloads in the Cloud. Richard Doktorics, Peter Darvasi. © Hortonworks Inc. 2011–2018. All rights reserved.
  • 2. Who are we? ⬢ Peter Darvasi - Partner Engineer at Hortonworks - @pdarvasi ⬢ Richard Doktorics - Software Engineer - @doktoric
  • 3. Agenda ⬢ What is Cloudbreak? ⬢ Enterprise checklist for big data in the cloud ⬢ Cloudbreak in da house ⬢ Questions
  • 4. What is Cloudbreak?
  • 5. Cloudbreak is a tool for provisioning Hadoop clusters on any cloud infrastructure. Simplified Cluster Provisioning - prescriptive setup, simple automation
  • 6. ⬢ Deploy on Public or Private Clouds: Dynamically configure and manage clusters on public or private clouds (Amazon Web Services, Microsoft Azure, Google Cloud Platform and OpenStack) ⬢ Automated Scaling: Seamlessly manage elasticity requirements as cluster workloads change (Ambari Metrics / Prometheus) ⬢ Secured Cluster Access: Supports configuration defining network boundaries and configuring security groups ⬢ Highly Extensible: Recipes to run custom commands; custom images
  • 7. Single node deployment ⬢ Cloudbreak Deployer (CBD) – Written in Go and Bash – Compiled into a single binary ⬢ Micro-service architecture – Each service runs in a Docker container – Each container is replaceable with a custom one – Services are handled with docker-compose
  • 8. Enterprise checklist for big data in the cloud
  • 9. Checklist for enterprises in the cloud ✓ Control and Automation ✓ Cloudy Services ✓ Security ✓ Enterprise-Grade Support
  • 10. Checklist for enterprises in the cloud ✓ Control and Automation ✓ Cloudy Services ✓ Security ✓ Enterprise-Grade Support. Control and Automation: ✓ Simple UX ✓ Powerful CLI ✓ Autoscaling
  • 11. Simplified UX
  • 12. Create Credential Experience
  • 13. Built-In Blueprints
  • 14. Basic and Advanced Cluster Creation Experiences: BASIC | ADVANCED
  • 15. New Network and Security Group Choices ⬢ Network – Create new Network and new Subnet – Choose existing Network and existing Subnet ⬢ Security Groups – Create new SGs • Choose default SGs (minimal set of ports) • Create customized SGs – Choose existing SGs
  • 16. Powerful CLI
  • 17. Cloudbreak CLI: Designed for DevOps
  • 18. “Show CLI command” for every request
  • 19. Auto-scaling
  • 20. Auto-Scaling ⬢ Alerts: Create metric or time-based alerts for cluster scaling ⬢ Policies: Scaling policies adjust cluster size based on activity and workload alerts ⬢ General Configurations: Boundaries and cooldown period
  • 21. Auto-Scaling: Time-Based Alert. Fire at 10:15 am every day
  • 22. Auto-Scaling: Metric-Based Alert. Fire after NodeManagers are in CRITICAL state for 10 minutes
  • 23. Auto-Scaling Policies ⬢ Define the Scale Adjustment (Node Count/Percentage/Exact size) ⬢ Select the HostGroup (to scale) ⬢ Select the Alert (which, when fired, executes the Policy)
  • 24. Auto-Scaling General Configurations ⬢ Cooldown Period: time interval between two autoscale events ⬢ Minimum and Maximum Cluster size: cluster size boundaries
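The interaction between a scaling policy, the cluster size boundaries and the cooldown period can be sketched as follows. This is a minimal illustration, not Cloudbreak's implementation: the names `AdjustmentType`, `ScalingPolicy`, `ClusterLimits`, `target_size` and `cooldown_elapsed` are hypothetical, and rounding a percentage adjustment up to a whole node is an assumption.

```python
import math
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class AdjustmentType(Enum):
    NODE_COUNT = "node_count"   # add or remove a fixed number of nodes
    PERCENTAGE = "percentage"   # change size by a percentage of the current size
    EXACT = "exact"             # set the host group to an exact size


@dataclass
class ScalingPolicy:
    host_group: str             # host group to scale
    adjustment_type: AdjustmentType
    adjustment: int             # meaning depends on adjustment_type


@dataclass
class ClusterLimits:
    min_size: int               # lower cluster size boundary
    max_size: int               # upper cluster size boundary
    cooldown_seconds: int       # minimum time between two scaling events


def target_size(current: int, policy: ScalingPolicy, limits: ClusterLimits) -> int:
    """Apply the policy's adjustment, then clamp to the configured boundaries."""
    if policy.adjustment_type is AdjustmentType.NODE_COUNT:
        desired = current + policy.adjustment
    elif policy.adjustment_type is AdjustmentType.PERCENTAGE:
        # Assumption: fractional nodes are rounded up.
        desired = current + math.ceil(current * policy.adjustment / 100)
    else:
        desired = policy.adjustment
    return max(limits.min_size, min(limits.max_size, desired))


def cooldown_elapsed(now: float, last_scaling_at: Optional[float],
                     limits: ClusterLimits) -> bool:
    """A fired alert is acted on only if the cooldown period has passed."""
    if last_scaling_at is None:
        return True
    return now - last_scaling_at >= limits.cooldown_seconds
```

With boundaries of 3 to 12 nodes, a "+5 nodes" policy on a 10-node host group is capped at 12, and an alert that fires 100 seconds after the previous scaling event is ignored when the cooldown is 300 seconds.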
  • 25. Checklist for enterprises in the cloud ✓ Control and Automation ✓ Cloudy Services ✓ Security ✓ Enterprise-Grade Support. Cloudy Services: ✓ Cloud Resources ✓ Hortonworks DataFlow ✓ Custom Images
  • 26. Cloud Resources: RDBMS + LDAP
  • 27. Cloud Resources: RDBMS and LDAP/AD = Dynamic Blueprints ⬢ Background: – Cluster configuration often includes an external database (for Hive, Ranger, etc.) and LDAP/AD configs – It’s a challenge to know the different Blueprint configuration choices per service across the stack ⬢ Dynamic Blueprints: – Ability to manage External Sources (e.g. RDBMS and LDAP/AD) outside of your Blueprint – Cloudbreak will inject the configurations into your Blueprint – Simplifies reuse of external cloud resources – Simplifies your Blueprints: you don’t have to know all the configurations for each component
  • 28. Dynamic Blueprints: RDBMS/LDAP ⬢ Built-In Components: – Atlas, Ranger, Hadoop, Hive LLAP, Hive, Ambari, Oozie, Druid, SuperSet ⬢ Decision flow: are JDBC/LDAP properties in the Blueprint for the Component? – Yes: use the Blueprint as-is, no Component configuration property injection – No: inject Component configuration properties ⬢ Perform property variable replacement
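The decision flow above can be sketched in a few lines. This is an illustration under stated assumptions, not Cloudbreak's code: the blueprint is reduced to a dictionary with a `configurations` list of `{config_type: {property: value}}` entries, the function names are hypothetical, and the `${name}` placeholder syntax is chosen only because Python's `string.Template` supports it.

```python
import copy
from string import Template


def inject_component_config(blueprint: dict, config_type: str, injected: dict) -> dict:
    """Inject properties for a component unless the blueprint already sets them."""
    result = copy.deepcopy(blueprint)
    configs = result.setdefault("configurations", [])
    for entry in configs:
        existing = entry.get(config_type, {})
        if any(key in existing for key in injected):
            # Properties are already in the blueprint: use it as-is.
            return result
    # No such properties found: inject the component configuration.
    configs.append({config_type: dict(injected)})
    return result


def replace_variables(node, variables: dict):
    """Recursively substitute ${name} placeholders in every string value."""
    if isinstance(node, dict):
        return {key: replace_variables(value, variables) for key, value in node.items()}
    if isinstance(node, list):
        return [replace_variables(value, variables) for value in node]
    if isinstance(node, str):
        # safe_substitute leaves unknown placeholders untouched.
        return Template(node).safe_substitute(variables)
    return node


# Hypothetical external database registered outside the blueprint.
hive_db = {"javax.jdo.option.ConnectionURL": "jdbc:postgresql://${db_host}:5432/hive"}
blueprint = {"configurations": []}
rendered = replace_variables(
    inject_component_config(blueprint, "hive-site", hive_db),
    {"db_host": "db.example.internal"},
)
```

Working on a deep copy keeps the registered blueprint reusable across clusters, which matches the stated goal of reusing external resources without editing the blueprint itself.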
  • 29. At-Motion Workloads: Hortonworks DataFlow
  • 30. Hortonworks DataFlow in Cloudbreak ⬢ Default blueprint: “Flow Management: Apache NiFi” (HDF 3.1: NiFi, Ambari, Ambari Metrics, ZooKeeper)
  • 31. HDF - cluster creation
  • 32. HDF - cluster creation
  • 33. Custom Images
  • 34. Background: Cloudbreak 1. Cloudbreak creates VM instances using a default base image. 2. Cloudbreak installs Ambari on a VM instance. 3. Cloudbreak instructs Ambari to install an HDP Cluster on other VM instances. [Diagram: Cloudbreak, RHEL 7, HDP Cluster with six HDP Node VMs]
  • 35. Background: Cloudbreak Default Images ⬢ By default, Cloudbreak uses default base public images when creating VM instances. Standard image operating system per cloud: AWS: Amazon Linux 2017; Azure: CentOS 7.x; Google Cloud Platform: CentOS 7.x; OpenStack: CentOS 7.x. ⬢ Support for Custom Images provides a way for Cloudbreak users to leverage their own custom image (not the default image) when creating VM instances.
  • 36. Making a Custom Image: Overview 1. Create the Custom Image 2. Register the Custom Image in Cloudbreak 3. Use the Custom Image when Creating a Cluster
  • 37. Creating the Image: Code Repository ⬢ Instructions, Packer scripts and Salt states in a public GitHub repository – https://github.com/hortonworks/cloudbreak-images ⬢ An understanding of Packer and Salt is useful – Packer creates infrastructure – Packer runs the Salt provisioner ⬢ Customers should clone the repository and build on it
  • 38. Creating the Image: Example Scenarios (scenario, then approach) ⬢ For AWS: I don’t want Amazon Linux and instead want RHEL 7: 1. Set up the repository and AWS environment 2. Use the repository tools to build a RHEL 7 image (make build-aws-rhel7) ⬢ I don’t want OpenJDK and instead want Oracle JDK: 1. Set up the repository and environment 2. Turn on the Oracle optional state 3. Use the repository tools to build an image ⬢ For AWS: I don’t want Amazon Linux and instead want MY RHEL 7 (**This is an advanced scenario**): 1. Set up the repository and AWS environment 2. Change the source base image 3. Use the repository tools to build a RHEL 7 image (make build-aws-rhel7)
  • 39. Use the Custom Image: Create Cluster (UI) ⬢ Create Cluster > General Configuration > Advanced – Choose image catalog – Adjust the Ambari + HDP repos (if you want) – Choose the image you registered
  • 40. Pre-Warmed Images ⬢ Prewarmed images (OS + pre-installed Ambari and HDP) – PROS: Cluster installs are faster; no internet connection is needed – CONS: Cannot change the Ambari or HDP versions; cannot use local repositories ⬢ Base images (OS only) – PROS: Can change the Ambari or HDP versions, or use local repositories – CONS: Cluster installs take longer
  • 41. Checklist for enterprises in the cloud ✓ Control and Automation ✓ Cloudy Services ✓ Security ✓ Enterprise-Grade Support. Security: ✓ Kerberos support ✓ LDAP integration ✓ Proxy configuration
  • 42. Cluster Security: Kerberos
  • 43. What is Kerberos? ⬢ Strongly authenticating and establishing a user’s identity is the basis for secure access in Hadoop. Users need to be able to reliably “identify” themselves and then have that identity propagated throughout the Hadoop cluster. ⬢ Kerberos is the de facto system for authenticating access to distributed services
  • 44. Cloudbreak: Support for Enabling Kerberos ⬢ Goal: Provide a way for Cloudbreak users to create clusters that are Kerberos enabled ⬢ Approach: – Ambari exposes a lot of Kerberos options – Leverage Ambari Kerberos options and avoid re-creating the Ambari Kerberos experience – Pragmatic prescriptive options on top
  • 45. Cloudbreak: Enable Kerberos Security ⬢ Create Cluster > Security > Advanced ⬢ [ ] Enable Kerberos Security
  • 46. Options: Use Existing KDC or Use Test KDC ⬢ Use Test KDC – Not for production use. For testing and evaluation purposes only. – Installs and configures an MIT KDC on the master node. – Configures the cluster to leverage that KDC. ⬢ Use Existing KDC (Basic) – Provide basic information about your existing KDC. – Ambari Kerberos descriptors are generated automatically. ⬢ Use Existing KDC (Advanced) – Provide basic information about your existing KDC. – Provide your own Ambari Kerberos descriptors.
  • 47. Cloudbreak + LDAP/AD
  • 48. Cloudbreak User AuthN ⬢ Goal: Configure Cloudbreak to authenticate users against an external LDAP/AD – CloudFoundry UAA (User Account and Authentication Server) is the foundation: https://github.com/cloudfoundry/uaa ⬢ Two parts: 1. Configure Cloudbreak to talk to the external LDAP/AD 2. Configure which group(s) can access Cloudbreak
  • 49. Step 1: Configure Cloudbreak to talk to LDAP/AD ⬢ On the Cloudbreak host, create: /var/lib/cloudbreak-deployment/uaa-changes.yml ⬢ Define the LDAP profile for users and groups [Diagram: Cloudbreak, LDAP/AD]
  • 50. Step 2: Configure which group(s) can access Cloudbreak ⬢ Configure which group(s) are authorized to access Cloudbreak: cbd util execute-ldap-mapping [group] cbd util delete-ldap-mapping [group] ⬢ To authorize users in the “Analysts” group to access Cloudbreak: cbd util execute-ldap-mapping cn=Analysts,ou=Groups,dc=hortonworks,dc=local
  • 51. Proxy configuration
  • 52. Limited Outbound Internet Access ⬢ Handle enterprise scenarios where: – Outbound internet access is limited (or restricted), and/or – A proxy is required to obtain internet access ⬢ Outbound destinations, optionally through an HTTP/S proxy: – Cloudbreak: Docker Hub, Cloudbreak dependencies, Default Image Catalog – Cloudbreak and Cluster Hosts: Cloud Provider APIs, HDP or HDF platform repositories
  • 53. Internet Access via Proxy ⬢ Cloudbreak Proxy Setup: How does Cloudbreak communicate through a proxy to get to the internet (and to the cluster hosts)? ⬢ Clusters Proxy Setup: How do the Cluster Hosts communicate through a proxy to get to the internet?
  • 54. Cloudbreak: Proxy Setup ⬢ Set up the Docker environment to use the proxy – Modify the Docker service to set HTTP_PROXY and HTTPS_PROXY (and NO_PROXY) https://docs.docker.com/config/daemon/systemd/#httphttps-proxy ⬢ Set up Cloudbreak to use the proxy in the Profile: HTTP_PROXY_HOST=your-proxy-host HTTPS_PROXY_HOST=your-proxy-host PROXY_PORT=your-proxy-port PROXY_USER=your-proxy-user PROXY_PASSWORD=your-proxy-password ⬢ Advanced Profile option “HTTPS_PROXYFORCLUSTERCONNECTION=true|false” – Defaults to “false”
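The Profile settings listed above can be generated rather than typed by hand. The helper below is a hypothetical sketch: `render_proxy_profile` is not part of Cloudbreak, the variable names are taken from the slide, and treating the user and password as optional is an assumption. Whether the Profile expects an `export` prefix on each line is not stated here, so the lines follow the slide's `NAME=value` form.

```python
import shlex
from typing import List, Optional


def render_proxy_profile(host: str, port: int,
                         user: Optional[str] = None,
                         password: Optional[str] = None,
                         proxy_for_cluster_connection: bool = False) -> List[str]:
    """Build the proxy-related Profile lines using the names shown on the slide."""
    if not 0 < port < 65536:
        raise ValueError(f"invalid proxy port: {port}")
    settings = {
        "HTTP_PROXY_HOST": host,
        "HTTPS_PROXY_HOST": host,
        "PROXY_PORT": str(port),
    }
    # Assumption: credentials are only needed when the proxy requires authentication.
    if user is not None:
        settings["PROXY_USER"] = user
    if password is not None:
        settings["PROXY_PASSWORD"] = password
    # The slide gives "false" as the default for this advanced option.
    settings["HTTPS_PROXYFORCLUSTERCONNECTION"] = (
        "true" if proxy_for_cluster_connection else "false"
    )
    # Quote values so spaces or shell metacharacters survive in a shell-style file.
    return [f"{name}={shlex.quote(value)}" for name, value in settings.items()]


# Hypothetical proxy host and port.
profile_lines = render_proxy_profile("proxy.example.internal", 3128)
```

Appending the returned lines to the Profile in the deployment directory is left to the operator, since the exact file handling is outside what the slides describe.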
  • 55. Cloudbreak: Advanced Proxy Scenarios ⬢ SCENARIO #1: Proxy for internet, not clusters ⬢ SCENARIO #2: Proxy for internet and clusters
  • 56. Clusters: Register Proxy Configuration ⬢ External Sources > Proxy Configurations – (optional) if the proxy requires authentication
  • 57. Clusters: Configure Proxy for Cluster Hosts ⬢ Create Cluster > Advanced > External Sources > Configure Proxy • Configures yum with “proxy” settings • Configures Ambari Server with “httpProxy” settings
  • 58. Checklist for enterprises in the cloud ✓ Control and Automation ✓ Cloudy Services ✓ Security ✓ Enterprise-Grade Support. Enterprise-Grade Support: ✓ SmartSense integration ✓ Flex support
  • 59. Cloudbreak in da house
  • 60. Main use cases ⬢ Have an internally hosted Cloudbreak service for… – our CI/CD pipeline – testing and prototyping HDP and HDF services – self-service clusters for QE/SE/PS teams
  • 61. Our technical goals ⬢ Run Cloudbreak in HA (High Availability) mode – Ability to recover flows in case of node failure – Avoid master-slave design / leader election problems ⬢ Scale Cloudbreak as we desire – Distribute each cluster-related flow – Cannot run 2 flows for the same cluster at the same time (e.g. 2 upscale flows) – Flow cancellation must be handled ⬢ Scale the Web UI – Had to introduce a Redis cluster for the session store ⬢ Scale every other service as well ⬢ Find a tool that makes it easy to deploy these services to multiple nodes
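The constraint that two flows must never run for the same cluster at the same time can be illustrated with a small in-process registry. This is only a sketch of the idea: `FlowRegistry` and its methods are hypothetical names, and a real multi-node HA deployment would need the same check against a store shared by all nodes rather than a lock inside one process.

```python
import threading
from typing import Dict, Optional


class FlowRegistry:
    """Tracks at most one running flow per cluster."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._running: Dict[str, str] = {}   # cluster id -> name of the running flow

    def try_start(self, cluster_id: str, flow: str) -> bool:
        """Register a flow; refuse if the cluster already has one running."""
        with self._lock:
            if cluster_id in self._running:
                return False
            self._running[cluster_id] = flow
            return True

    def finish(self, cluster_id: str) -> Optional[str]:
        """Release the cluster after completion or cancellation of its flow."""
        with self._lock:
            return self._running.pop(cluster_id, None)


registry = FlowRegistry()
first = registry.try_start("cluster-1", "upscale")     # accepted
second = registry.try_start("cluster-1", "upscale")    # rejected: flow already running
other = registry.try_start("cluster-2", "upscale")     # different cluster, accepted
released = registry.finish("cluster-1")                # cluster-1 is free again
```

Flows for different clusters proceed independently, which is what allows them to be distributed across nodes; only flows for the same cluster are serialized.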
  • 62. Why Kubernetes? ⬢ Not because it’s fancy... ⬢ Evaluated Kubernetes, Swarm, Mesos, Rancher ⬢ Open source / Active community with hands-on experience ⬢ Many cloud providers already support it ⬢ Lots of tooling behind it / API / CLI / Helm / Ansible / Salt ⬢ Integration with most of the cloud providers – Provision Load Balancer (GCP, AWS, Azure) – Use object stores to share data (Ceph, S3, GCP bucket, Azure Storage Account) – Dynamic volume provisioning / Persistent disk (EBS, Azure Blob)
  • 63. Thank you