Speaker: Chris Deutsch - Systems Administrator, RightScale
As a systems administrator, what is the best way to ensure that you don’t get paged in your sleep or on your days off? The RightScale operations team manages hundreds of cloud servers, as well as a host of other cloud services, to deliver always-on production applications. The RightScale Ops Team will share tips as power users of RightScale, including running batch updates, automating scaling, adding custom monitoring graphs, and troubleshooting configuration and performance issues.
2. Cloud Management Platform
What I'll be talking about
• Meet RightScale Operations
• Monitoring
o How monitoring works on RightScale
o How to build a custom monitor
o How we monitor web servers and Cassandra
• Automation
o The RightScale API
o The chimp command line tool
o How we automate releases
• Tips from Ops
RightScale Operations
• Deployed across 5 continents
• Over 700 cloud servers administered
• RightScale runs on RightScale
collectd: what is it?
• open source metric collection tool
• modular architecture
• uses the ubiquitous rrdtool
• more information: http://collectd.org/
collectd: built-in plugins
• host monitoring
o cpu
o disk space
o disk I/O
o memory
o network
• application monitoring
o process state
o memory use
o cpu usage
collectd: custom plugins
• Custom plugins written using the Exec plugin
• Can be written in any language
• Ruby, Python, and Perl are common
• Simple
collectd: custom plugins
What we're going to look at:
• building an example monitor using the Exec plugin
• HTTP error code monitor
• Cassandra database server monitor
collectd: custom plugins: example
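/etc/init.d/collectd/example.conf:
<Plugin exec>
  Exec "nobody" "example.rb"
</Plugin>
example.rb:
#!/usr/bin/ruby
# Emit one value every 20 seconds in collectd's PUTVAL format.
$stdout.sync = true # flush each line so collectd reads it immediately
while true do
  time = Time.now.to_i
  # Inner quotes are escaped so the identifier stays inside the string.
  puts "PUTVAL \"host/cpu-0/cpu_overview\" interval=20 #{time}:1"
  sleep 20
end
More information: https://collectd.org/wiki/index.php/Plugin:Exec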
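The HTTP error code monitor mentioned earlier is not shown in the slides, so what follows is only a minimal sketch of how one might look with the same Exec approach. The log format (status code right after the quoted request), the sample log lines, and the "http_status" plugin name are all assumptions made for illustration.

```ruby
# Sketch of an HTTP status monitor for the collectd Exec plugin.
# Assumptions (not from the talk): the log format, the sample lines,
# and the "http_status" plugin name are hypothetical.

# Count responses per status class ("2xx", "4xx", "5xx", ...).
def count_status_classes(lines)
  counts = Hash.new(0)
  lines.each do |line|
    # Matches the status code that follows the quoted request,
    # for example: "GET /path HTTP/1.1" 404
    match = line.match(/" (\d)\d\d /)
    counts["#{match[1]}xx"] += 1 if match
  end
  counts
end

# Build one PUTVAL line per status class, sorted by class name.
def status_putval_lines(host, counts, interval, time)
  counts.sort.map do |status_class, n|
    "PUTVAL \"#{host}/http_status/gauge-#{status_class}\" " \
      "interval=#{interval} #{time}:#{n}"
  end
end

# Made-up log lines standing in for a real access log.
sample_log = [
  '10.0.0.5 - - [01/Jan/2013:00:00:01 +0000] "GET / HTTP/1.1" 200 512',
  '10.0.0.5 - - [01/Jan/2013:00:00:02 +0000] "GET /missing HTTP/1.1" 404 128',
  '10.0.0.6 - - [01/Jan/2013:00:00:03 +0000] "POST /api HTTP/1.1" 500 64'
]

counts = count_status_classes(sample_log)
puts status_putval_lines("host", counts, 20, Time.now.to_i)
```

A real plugin would wrap the last two lines in the same sleep loop as example.rb and read only the log lines written since the previous pass.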
collectd: custom plugins: cassandra
• Cassandra is a key-value (NoSQL) data store
• data is stored on a ring
• a ring consists of nodes
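Because the ring is made of nodes, one natural thing to graph is how many nodes are up or down. The slides do not show the actual Cassandra monitor, so this is only a sketch under stated assumptions: node lines are assumed to start with a two-letter code in the style of `nodetool status` output (U or D for up or down, then N, L, J, or M for the node state), and the sample text and "cassandra" plugin name are made up.

```ruby
# Sketch: turn ring status text into up/down node counts for collectd.
# Assumption: node lines begin with a two-letter code such as "UN" or "DN",
# in the style of `nodetool status`. The sample below is invented.

# Count nodes by their first status letter.
def ring_health(status_text)
  health = { "up" => 0, "down" => 0 }
  status_text.each_line do |line|
    case line
    when /^U[NLJM]\s/ then health["up"] += 1
    when /^D[NLJM]\s/ then health["down"] += 1
    end
  end
  health
end

# Build one PUTVAL line per state ("cassandra" is a hypothetical plugin name).
def ring_putval_lines(host, health, interval, time)
  health.map do |state, n|
    "PUTVAL \"#{host}/cassandra/gauge-nodes_#{state}\" " \
      "interval=#{interval} #{time}:#{n}"
  end
end

SAMPLE_RING_STATUS = <<~TEXT
  Datacenter: dc1
  --  Address    Load
  UN  10.0.0.1   1.2 GB
  UN  10.0.0.2   1.1 GB
  DN  10.0.0.3   1.3 GB
TEXT

health = ring_health(SAMPLE_RING_STATUS)
puts ring_putval_lines("host", health, 20, Time.now.to_i)
```

In a real plugin the text would come from running the status command on the node rather than from a constant, and the output format should be checked against the Cassandra version in use.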
automation: the rightscale api
• RightScale API is RESTful and easy to traverse
• right_api_client - ruby client library
• CloudFlows - the future
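"Easy to traverse" can be illustrated with a small sketch of following links from one resource to another. This does not call the real API or right_api_client. It assumes a response has already been parsed into a hash carrying a "links" array of "rel"/"href" pairs, and the server data and paths shown are invented for the example.

```ruby
# Sketch: follow a named link on an already-parsed API resource.
# Assumption: resources carry a "links" array of { "rel", "href" } pairs.
# The sample resource below is made up.

# Return the href for the given relation, or nil if there is none.
def href_for(resource, rel)
  link = resource.fetch("links", []).find { |l| l["rel"] == rel }
  link && link["href"]
end

sample_server = {
  "name" => "web1",
  "links" => [
    { "rel" => "self",       "href" => "/api/servers/1" },
    { "rel" => "deployment", "href" => "/api/deployments/7" }
  ]
}

puts href_for(sample_server, "deployment")
```

The next request would then go to the returned href, so a script can move from a server to its deployment without building URLs by hand.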
automation: the command line
• needed a tool that would let us be lazy
• the "chimp" executes commands on servers
• let's jump into a demo
automation: chimp
• select what to update using tags
• update across multiple deployments
• update one server at a time so service isn't disrupted
• track success/failure
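The behaviour listed above (select by tag, update one server at a time, track success and failure) can be sketched in a few lines. This is not chimp's implementation or its command line. The `Server` struct, the tag strings, and the block standing in for the remote command are all hypothetical.

```ruby
# Sketch of a tag-scoped rolling update. Not chimp itself: the Server
# struct, the tags, and the "update" block are stand-ins for illustration.

Server = Struct.new(:name, :tags)

# Keep only the servers carrying the given tag.
def select_by_tag(servers, tag)
  servers.select { |s| s.tags.include?(tag) }
end

# Run the block on one server at a time and record the outcome.
# A falsy return value or a raised error counts as a failure.
def rolling_run(servers)
  results = { succeeded: [], failed: [] }
  servers.each do |server|
    begin
      ok = yield(server)
      (ok ? results[:succeeded] : results[:failed]) << server.name
    rescue StandardError
      results[:failed] << server.name
    end
  end
  results
end

fleet = [
  Server.new("web1", ["app:role=web"]),
  Server.new("db1",  ["app:role=db"]),
  Server.new("web2", ["app:role=web"])
]

targets = select_by_tag(fleet, "app:role=web")
results = rolling_run(targets) { |server| !server.name.empty? } # pretend update
puts "succeeded: #{results[:succeeded].join(', ')}"
puts "failed: #{results[:failed].join(', ')}"
```

Processing servers sequentially is what keeps the service available: at most one server in the tagged set is being changed at any moment.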
automation: scripting languages
• having a command line tool lets us use scripting languages like bash or Ruby to automate common tasks
• we ended up using Rake files to tie it all together
automation: a RightScale release
• chimp is used to run commands on servers
• supports "rolling" operations
• uses tag service to scope operations
• we use rake to organise tasks that make up a release
• developed chimpd so we could run more commands in parallel
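As a rough illustration of using rake to organise the steps of a release, here is a minimal sketch. The task names and their order are hypothetical, since the slides do not list the actual release steps, and each task only records that it ran where a real one would shell out to a command line tool.

```ruby
# Sketch: a release expressed as rake tasks. Task names are hypothetical
# and each body only logs its name instead of doing real work.
require "rake"
extend Rake::DSL # make `task` and `desc` available in a plain script

RELEASE_LOG = []

desc "Put the new code on the servers"
task :stage_code do
  RELEASE_LOG << "stage_code"
end

desc "Restart services one server at a time"
task :rolling_restart do
  RELEASE_LOG << "rolling_restart"
end

desc "Check that the release is healthy"
task :verify do
  RELEASE_LOG << "verify"
end

desc "Run the whole release in order"
task release: [:stage_code, :rolling_restart, :verify]

Rake::Task["release"].invoke
puts RELEASE_LOG.join(" -> ")
```

In a Rakefile the `require`, `extend`, and `invoke` lines are unnecessary: running `rake release` from the shell does the same thing.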
tips
• assume instances will die eventually
• always reboot-test ServerTemplates
• use tags. everywhere. all the time.
• use chimp to make ad-hoc queries
• monitor not just host metrics but system metrics
• design everything to be runnable in a server array