Joe Kaiser, System Engineer at StackIQ at the Seattle Scalability Meetup on April 27, 2016
This presentation was followed by a demo of Kubernetes on Stacki
2. Open Source Stack Installer
Stacki is a very fast and ultra-reliable Linux server provisioning tool … at scale.
It has zero prerequisites for taking systems from bare metal to a ping and prompt.
4. History
Roots in Open Source
Started life as the Rocks Project at UCSD
Started in May ’00.
6 month project going on ~16 years
Roots in the HPC world
6. Problem
OS Provisioning
Disk Configuration
Disk Controller Configuration
Disk Partitioning
Network Configuration
Services configuration
Application Deployment
Life-cycle management of the cluster
Server Provisioning
7. Problem – Contd. …
Datacenter Provisioning
Server Provisioning
Heterogeneous Hardware
Complex Network Configuration
Bonding
Bridging
VLANs
Combinations of the above
13. From Bare Metal Up
Take complete control of the Stack
Modified CentOS Installer
Parallel package sharing installer
Database to keep persistent data about the System
Command Line to interact with Stacki
Dynamic Kickstart File Generation
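As a rough illustration of dynamic Kickstart generation, the sketch below renders a small per-host Kickstart fragment from a host record. The `HostRecord` fields, the template text, and `render_kickstart` are hypothetical names chosen for this example; it does not reproduce how Stacki itself generates Kickstart files.

```python
from dataclasses import dataclass
from string import Template


@dataclass
class HostRecord:
    """Hypothetical per-host data, as it might be stored in a database."""
    hostname: str
    ip: str
    netmask: str
    gateway: str


# A deliberately tiny Kickstart fragment; a real file has many more sections.
KICKSTART_TEMPLATE = Template(
    "network --bootproto=static --ip=$ip --netmask=$netmask "
    "--gateway=$gateway --hostname=$hostname\n"
    "%packages\n"
    "@core\n"
    "%end\n"
)


def render_kickstart(host: HostRecord) -> str:
    """Fill the template with the values stored for one host."""
    return KICKSTART_TEMPLATE.substitute(
        hostname=host.hostname,
        ip=host.ip,
        netmask=host.netmask,
        gateway=host.gateway,
    )


if __name__ == "__main__":
    example = HostRecord("backend-0-0", "10.1.1.10", "255.255.255.0", "10.1.1.1")
    print(render_kickstart(example))
```

Because the file is produced on request from stored data, changing a host's record changes the next Kickstart file it receives without editing any static file.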
14. Frontend Services
Services to build backend nodes
DHCP – MAC to IP address Mapping
TFTP – Serve out PXE files, Installation Kernel, and RAM Disk
Apache – Serve Kickstart files
DNS (optional)
Services to access backend nodes
SSH key management
Parallel execution shell
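The MAC to IP mapping served by DHCP can be pictured as one fixed-address entry per backend node. The sketch below renders such entries in ISC dhcpd `host` syntax from a small table. The function names and the example hosts are assumptions made for illustration, not Stacki's own code or output.

```python
from typing import Dict, Tuple


def dhcp_host_stanza(name: str, mac: str, ip: str) -> str:
    """Render one ISC dhcpd host entry that pins a MAC address to an IP."""
    return (
        f"host {name} {{\n"
        f"    hardware ethernet {mac.lower()};\n"
        f"    fixed-address {ip};\n"
        f"}}\n"
    )


def render_dhcp_hosts(hosts: Dict[str, Tuple[str, str]]) -> str:
    """Render entries for all hosts, sorted by name for stable output."""
    return "\n".join(
        dhcp_host_stanza(name, mac, ip)
        for name, (mac, ip) in sorted(hosts.items())
    )


if __name__ == "__main__":
    # Hypothetical backend nodes: name -> (MAC, IP).
    print(render_dhcp_hosts({
        "backend-0-0": ("AA:BB:CC:00:00:01", "10.1.1.10"),
        "backend-0-1": ("AA:BB:CC:00:00:02", "10.1.1.11"),
    }))
```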
15. Stacki Positioning
[Diagram labels]
DevOps / Configuration Tool
DHCP / DNS / TFTP
Network, Disk, OS
In-house developed deployment tools
- Disk Array Controller Configuration
- Disk Partitioning Configuration
16. Download and Boot the ISO
Download the ISO from www.stacki.com
It’s 1.5 GB
stacki pallet
Subset of CentOS 6.7
Boot the ISO on the host that will be your frontend
26. Adding Hosts
Method 1: Discovery
Advantages
Prior knowledge of MAC addresses not required
Automatic, sensible hostname and IP address assignment
Disadvantages
Automatic hostname and IP address assignment (no control over the values chosen)
Complex network configuration has to be done post-installation
Run
# insert-ethers
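The idea behind discovery can be sketched as follows: each new MAC address seen on the network gets the next free hostname and IP address. This is only an illustration. The `backend-<rack>-<rank>` naming pattern, the starting offset, and the function name are assumptions for the example, not a description of what `insert-ethers` does internally.

```python
import ipaddress
from typing import Dict, Iterable, Tuple


def assign_discovered(
    macs: Iterable[str],
    subnet: str,
    rack: int = 0,
    first_offset: int = 10,
) -> Dict[str, Tuple[str, str]]:
    """Give each newly seen MAC the next hostname and IP, in discovery order."""
    network = ipaddress.ip_network(subnet)
    assigned: Dict[str, Tuple[str, str]] = {}
    for mac in macs:
        mac = mac.lower()
        if mac in assigned:
            continue  # a repeated DHCP request from a known host changes nothing
        rank = len(assigned)
        ip = network.network_address + first_offset + rank
        if ip not in network or ip == network.broadcast_address:
            raise ValueError(f"subnet {subnet} has no free address for {mac}")
        assigned[mac] = (f"backend-{rack}-{rank}", str(ip))
    return assigned


if __name__ == "__main__":
    seen = ["AA:BB:CC:00:00:01", "aa:bb:cc:00:00:02", "aa:bb:cc:00:00:01"]
    for mac, (name, ip) in assign_discovered(seen, "10.1.1.0/24").items():
        print(mac, name, ip)
```

The sketch also shows why this method limits control: names and addresses follow from the order in which machines happen to boot.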
30. Adding Hosts
Method 2: Host Configuration Spreadsheet
Advantages
Complete control of Hostname, IP address, and network assignments
Easy to make changes
Fits very well with existing datacenter management processes.
Lots and lots of Error Checking
Disadvantages
A little tedious the first time around
Requires prior knowledge of
MAC addresses,
IP address assignments
Physical location of machines (Rack & Position)
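To give a feel for the kind of error checking a host spreadsheet makes possible, here is a minimal sketch that validates a CSV export before it is loaded. The column names (`NAME`, `IP`, `MAC`) and the checks are assumptions for illustration; the checks Stacki actually performs are not listed in the slides.

```python
import csv
import io
import ipaddress
import re
from typing import Dict, List, Set

MAC_RE = re.compile(r"^[0-9a-f]{2}(:[0-9a-f]{2}){5}$")


def check_hostfile(csv_text: str) -> List[str]:
    """Return a list of human-readable problems found in the host CSV."""
    errors: List[str] = []
    seen: Dict[str, Set[str]] = {"NAME": set(), "IP": set(), "MAC": set()}
    # The header is line 1, so the first data row is line 2.
    for lineno, row in enumerate(csv.DictReader(io.StringIO(csv_text)), start=2):
        name = (row.get("NAME") or "").strip()
        ip = (row.get("IP") or "").strip()
        mac = (row.get("MAC") or "").strip().lower()
        if not name:
            errors.append(f"line {lineno}: missing NAME")
        try:
            ipaddress.ip_address(ip)
        except ValueError:
            errors.append(f"line {lineno}: bad IP {ip!r}")
        if not MAC_RE.match(mac):
            errors.append(f"line {lineno}: bad MAC {mac!r}")
        # Names, IPs, and MACs must each be unique across the file.
        for column, value in (("NAME", name), ("IP", ip), ("MAC", mac)):
            if value and value in seen[column]:
                errors.append(f"line {lineno}: duplicate {column} {value}")
            seen[column].add(value)
    return errors


if __name__ == "__main__":
    sample = (
        "NAME,RACK,RANK,IP,MAC\n"
        "backend-0-0,0,0,10.1.1.10,aa:bb:cc:00:00:01\n"
        "backend-0-1,0,1,10.1.1.10,aa:bb:cc:00:00:02\n"
    )
    for problem in check_hostfile(sample):
        print(problem)
```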
32. Backend Installation
Save your Host Configuration spreadsheet as a CSV
Import CSV on frontend
# stack load hostfile file=hosts.csv
Tell backend nodes to install on their next PXE boot
# stack set host boot backend action=install
PXE boot all backend nodes
Go!
36. Advanced Networking
Advanced Network Configuration
Bonded interfaces
VLANs
Bridging
Any combo of the above
Multiple Subnets
Build a single cluster from hosts in multiple subnets
Manage hosts in multiple datacenters
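A single cluster spanning several subnets implies that each host can be matched to the network it lives on. The sketch below does that matching with Python's standard `ipaddress` module. The host names, addresses, and the `group_by_subnet` helper are hypothetical and only illustrate the idea.

```python
import ipaddress
from collections import defaultdict
from typing import Dict, List


def group_by_subnet(hosts: Dict[str, str], subnets: List[str]) -> Dict[str, List[str]]:
    """Group host names by the first listed subnet that contains their IP."""
    networks = [ipaddress.ip_network(s) for s in subnets]
    groups: Dict[str, List[str]] = defaultdict(list)
    for name, ip in sorted(hosts.items()):
        addr = ipaddress.ip_address(ip)
        match = next((net for net in networks if addr in net), None)
        groups[str(match) if match else "unassigned"].append(name)
    return dict(groups)


if __name__ == "__main__":
    cluster = {
        "backend-0-0": "10.1.1.5",
        "backend-1-0": "10.2.0.7",
        "stray-host": "192.168.0.1",
    }
    print(group_by_subnet(cluster, ["10.1.1.0/24", "10.2.0.0/16"]))
```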
38. Disk Controller Configuration
Disk Controller Support
LSI MegaRAID controller & derivatives
Intel MegaRAID
Dell MegaRAID
Cisco SAS MegaRAID
Any controller that supports the “storcli” or “megacli” command
HP Smart Storage Controller support
Supports RAID 0, 1, 5, 6, 10, 50, etc.
Configure Controllers using Spreadsheets
# stack load storage controller
40. Disk Partitioning
Sensible Default Disk partitioning
Support for multiple disks
Support for file system options and mount options
Support for Software RAID configuration
Disk Partitioning through spreadsheets
# stack load storage partition
43. Pallets
Software Entity
Contains RPMs
Contains Configuration in the form of XML
Used for installation and configuration of an Application
Can be applied during Frontend installation or after the fact.
Each pallet is functionally equivalent to a YUM repo with extra configuration
Example: Cloudera Pallet
Contains RPMs required to install the Cloudera Distribution of Hadoop
Contains scripts to configure and start CDH
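Since a pallet behaves like a YUM repo with extra configuration, one way to picture the repo half is a `.repo` file pointing at the pallet's packages on the frontend. The sketch below writes such a file with Python's `configparser`. The URL layout, the version string, and the function name are assumptions for illustration and do not reflect Stacki's real directory structure.

```python
import configparser
import io


def pallet_repo_file(pallet: str, version: str, frontend_url: str) -> str:
    """Render a YUM .repo file for one pallet (hypothetical URL layout)."""
    repo_id = f"{pallet}-{version}"
    parser = configparser.ConfigParser(interpolation=None)
    parser[repo_id] = {
        "name": f"{pallet} {version}",
        "baseurl": f"{frontend_url.rstrip('/')}/pallets/{pallet}/{version}/",
        "enabled": "1",
        "gpgcheck": "0",
    }
    buffer = io.StringIO()
    # Write key=value without spaces, the form usually seen in .repo files.
    parser.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()


if __name__ == "__main__":
    print(pallet_repo_file("cloudera", "5.7", "http://frontend.example/install"))
```

The "extra configuration" half of a pallet (the XML and scripts mentioned above) is not modeled here.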
44. Example: Stacki with Cloudera Pallet
[Diagram: tasks grouped as App Config, Site Config, and HW Install, comparing "Without Stacki" with "Stacki w/ Hadoop Pallet"]
Check namenodes are empty
Format/start HDFS
Create all directories
Create all metastores
Start services (Hbase, Hive, Oozie, Sqoop, Impala, etc)
Deploy client configuration
Configure database
Setup/assign monitors (activity, services, and host)
Test database connections
Validate/resolve hostnames
Consistent host timezones
No bad kernel versions running
(CDH) version consistency
Java version consistency
Daemons versions consistency
Mgmt Agents versions consistency
Host specification/SSH ports
MUCH MORE …
DHCP Server/Client setup
TFTP/PXE configuration
Server OS installation
Node OS Install
RAID configuration
Boot configuration
System/data disk partitioning
Monitoring system setup and config
Lights Out/IPMI setup
User accounts added and synced
SSH keys on all hosts
Network node configuration
Config Mgmt install and configuration
Route configuration
OS upgrades/updates
Site specific software and configuration
Host specification/SSH ports
Security
Firewall setup
Cluster Mgmt utility
Database install and config
Multiple network config
Package installation
MUCH MORE …
45. Carts
Site Specific Pallets
Contains site-specific RPMs
Contains site-specific configurations
Structurally and functionally equivalent to a Pallet
Example: Client Cart
Contains RPMs to install DevOps tools
Contains custom post-install scripts to configure DevOps tools
Contains custom post-install scripts to run DevOps tools to bring the system up to the requisite configuration.
46. Boxes
Logical Entity
Loose collection of Pallets and Carts
One-to-Many mapping to Backend Hosts
[Diagram: pallets (OS, RedHat, Cloudera, Stacki) and carts (PayPal, Ansible) are combined into boxes, including "Default" and "Application"]
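The relationship between boxes, pallets, carts, and hosts can be sketched as a small data structure: a box names a set of pallets and carts, and each backend host points at exactly one box while a box may serve many hosts. The box contents and host names below are illustrative assumptions, not the exact compositions from the slide's diagram.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Box:
    """A loose collection of pallets and carts, identified by name."""
    name: str
    pallets: Tuple[str, ...]
    carts: Tuple[str, ...] = ()


def software_for_host(host: str, host_to_box: Dict[str, str], boxes: Dict[str, Box]) -> List[str]:
    """List the pallets and carts a host installs from, via its box."""
    box = boxes[host_to_box.get(host, "default")]  # unmapped hosts use "default"
    return list(box.pallets) + list(box.carts)


def hosts_in_box(box_name: str, host_to_box: Dict[str, str]) -> List[str]:
    """One box maps to many hosts: list every host assigned to it."""
    return sorted(h for h, b in host_to_box.items() if b == box_name)


if __name__ == "__main__":
    boxes = {
        "default": Box("default", ("os", "stacki")),
        "hadoop": Box("hadoop", ("os", "stacki", "cloudera"), ("site",)),
    }
    mapping = {"backend-0-0": "hadoop", "backend-0-1": "hadoop", "backend-0-2": "default"}
    print(hosts_in_box("hadoop", mapping))
    print(software_for_host("backend-0-0", mapping, boxes))
```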
48. Multiple Distributions
Default Distribution
Based on stripped down CentOS 6.7 or 7.2
Used to build backend nodes
Multiple Distributions on Frontend
E.g., RHEL 6.x based distribution, CentOS 6.7, etc.
Backend Nodes Distribution Mapping
Any Node can be mapped to any distribution
49. In Conclusion
Production Ready
Deploy large scale Big Data & OpenStack clusters very fast.
Deploy test systems to evaluate multiple applications with very short turn-around times
Deploy several small datacenters-in-a-rack that are shipped out to customer sites.
50. Try it Out!
Website
www.stacki.com
Source Code
github.com/stackiq/stacki
Google Groups
groups.google.com/forum/#!forum/stacki
Editor's Notes
Linux – Focused on RedHat-ish (Kickstart/Anaconda)
Provisioning – Bare Metal (total stack control)
Scale – solve 1000+ servers problem then scale down
Ping and Prompt – Get the machine up to a known base OS, fully configured: RAID / disk / networking / SSH access on
Nothing else … No agent left on the server
We started out as an open source project at UCSD.
We saw a lot of faculty trying to stand up and maintain their own clusters, which cost a lot of time and effort.
PXE Boot, TFTP installation kernel, https system profile, locally translate, go