Sharing Sensu with Multiple Teams using Ansible

Sharing Sensu with Multiple Teams
Deployment & Configuration using Ansible
David Schroeder
August 23, 2018

Overview
Short story shorter

Team Requirements
"Can Sensu do #{this_thing}?"
› Environment segregation
– Access limits
– Contacts
› Different deployment strategies
› Different thresholds
– Both keepalive and other checks
› Different checks, different platforms (even Windows)
› API calls
– Creating silences
– Gathering check results

Team Requirements
"Sensu can do #{this_thing}!"
› Sensu Enterprise RBAC!
› Contact routing!
› Check parameter tokenization!
› API tokens!
› Custom configuration anywhere and everywhere!

Ansible Roles

sensu-client
› Installs & configures
› Satisfies dependencies
› Creates client.json
– Maintenance mode

sensu-server
› Configures checks
– Pub/sub
– Aggregate
– API endpoint
– Ping
› Installs handlers & standalone check scripts
› Configures handlers
› Configures contacts

sensu-enterprise
› Installs Sensu Enterprise
› Configures API
› Configures dashboard
– RBAC through LDAP

rabbitmq-server
› Installs and configures RabbitMQ cluster
› Fetches certificates

sensu-winclient
› Generates configuration
› Bundles installer & dependencies

sensu-standalone
› Subrepo of community sensu-ansible role

redis-server
Redis Sentinel
Redis
Graphite
› Shared role, "galaxy" style
› Included as 'subrepo'

Ansible Structure
Drilling Down
› sensu/
– group_vars/
▪ framework_pdx_dev/ (team environment)
▪ framework_pdx_stage/ (team environment)
▪ framework_pdx_prod/ (team environment)
▪ sensu_one/ (Sensu cluster)
▪ sensu_two/ (Sensu cluster)
– roles/
▪ sensu_client/
▪ sensu_winclient/
▪ sensu_server/
▪ sensu_enterprise/

Ansible Structure
Per Environment
› sensu/
– group_vars/
▪ infrastructure_pdx_dev/
– main.yml
– vault.yml

---
### Environment Definitions ###########################################
host_subscriptions:
  - "basic"
  - "framework"
  - "framework_pdx_dev"
host_environment: "framework_pdx_dev"
host_contact: "framework"
# Keepalive thresholds: number of seconds before warning or alerting
keepalive_warn: 150
keepalive_crit: 210
# Set re-notification time (in seconds) for keepalive alarms. Default is 300.
keepalive_refresh: 3600
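As a rough illustration of where these variables might end up, the sketch below builds a dictionary shaped like a Sensu 1.x client definition from the values on this slide. The function name `build_client_config`, the `address` parameter, and the placement of `contact` directly under the client scope are assumptions; the slides do not show the role's actual template.

```python
import json


def build_client_config(name, address, subscriptions, environment, contact,
                        keepalive_warn, keepalive_crit, keepalive_refresh):
    """Return a dict shaped like a Sensu 1.x client.json (hypothetical helper)."""
    # A warning threshold at or above the critical one would never fire first.
    if keepalive_warn >= keepalive_crit:
        raise ValueError("keepalive_warn must be lower than keepalive_crit")
    return {
        "client": {
            "name": name,
            "address": address,
            "environment": environment,
            "subscriptions": list(subscriptions),
            # Assumption: the contact is stored as a custom client attribute.
            "contact": contact,
            "keepalive": {
                # Seconds without a keepalive before warning / critical.
                "thresholds": {
                    "warning": keepalive_warn,
                    "critical": keepalive_crit,
                },
                # Seconds between repeated keepalive notifications.
                "refresh": keepalive_refresh,
            },
        }
    }


def render_client_json(config):
    """Serialize the definition the way it could be written to client.json."""
    return json.dumps(config, indent=2, sort_keys=True)
```

Calling `build_client_config("web01", "10.0.0.5", ["basic", "framework", "framework_pdx_dev"], "framework_pdx_dev", "framework", 150, 210, 3600)` reproduces the per-environment values above; the host name and address are made up for the example.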

Ansible Structure
Per Environment
› sensu/
– group_vars/
– main.yml
– vault.yml

# To add a subscription based on the server role included in the hostname,
# use the subscription name as the key and the hostname pattern as the
# value. Be sure to escape backslashes.
role_patterns:
  framework_zeromq: "-mq00d"
  framework_utility: "^utly"
# Enable Sensu client socket commands
enable_client_socket: true
# Custom client-side configuration
custom_client_configs:
  checks:
    check_ram:
      warning: 101
      critical: 100
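The hostname-to-subscription mapping described in the comment can be sketched as follows. This assumes the patterns are regular expressions searched anywhere in the hostname, which the slide suggests but does not state; `subscriptions_for_host` and the example hostnames are hypothetical.

```python
import re


def subscriptions_for_host(hostname, base_subscriptions, role_patterns):
    """Add role-based subscriptions to a host's base subscription list.

    role_patterns maps a subscription name to a hostname pattern, as in the
    role_patterns variable on the slide.
    """
    subscriptions = list(base_subscriptions)
    for subscription, pattern in role_patterns.items():
        # Assumption: patterns are regexes matched anywhere in the hostname.
        if re.search(pattern, hostname) and subscription not in subscriptions:
            subscriptions.append(subscription)
    return subscriptions


ROLE_PATTERNS = {
    "framework_zeromq": "-mq00d",
    "framework_utility": "^utly",
}
```

With these patterns, a host named `pdx-mq00d1` would pick up `framework_zeromq`, a host named `utly01` would pick up `framework_utility`, and any other host keeps only its base subscriptions.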

Ansible Structure
Per Environment
› sensu/
– group_vars/
– main.yml
– vault.yml

### Communicating with Sensu ##########################################
# Hostname or IP address of the graphite API server for graph rendering
graphite_server: "172.16.20.100"
rabbitmq_params:
  port: 5671
  user: "sensu"
  pass: "{{ vault_rabbitmq['password'] }}"
  host1: "172.16.20.101"
  host1_cert: "{{ vault_rabbitmq['host1_cert'] }}"
  host1_key: "{{ vault_rabbitmq['host1_key'] }}"
  host2: "172.16.20.102"
  host3: "172.16.20.103"

Ansible Structure
Sensu Clusters
› sensu/
– group_vars/
▪ sensu_one/
– main.yml
– vault.yml
– aggregatechecks.yml
– endpoints.yml
– handlers.yml
– pingchecks.yml
– site_checks.yml

ldap:
  server: "auth.somewhere.out.there"
  port: 636
  roles:
    framework_team:
      name: "framework_team"
      readonly: "false"
      members:
        - "framework"
      datacenters: []
      subscriptions:
        - "framework"

Ansible Structure
Sensu Clusters
› sensu/
– group_vars/
▪ sensu_one/
– main.yml
– vault.yml
– endpoints.yml
– handlers.yml
– pingchecks.yml
– site_checks.yml

ldap:
  roles:
    jenkins_api:
      name: "jenkins_api"
      readonly: "false"
      token: "{{ vault_ldap.jenkins_api.token }}"
      members: []
      datacenters: []
      subscriptions: []
      methods:
        get:
          - aggregates
          - clients
          - silenced
        post:
          - silenced
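To show how a role like `jenkins_api` might be used, here is a minimal sketch that builds (but does not send) a POST request to the `/silenced` endpoint. The payload fields follow the Sensu 1.x silencing API; the base URL, the helper name `build_silence_request`, and the `Authorization: token ...` header form are assumptions that should be checked against the Sensu Enterprise documentation for your version.

```python
import json
import urllib.request


def build_silence_request(api_url, token, subscription=None, check=None,
                          expire=3600, reason="", creator="jenkins_api"):
    """Build a POST /silenced request for a subscription and/or a check."""
    if subscription is None and check is None:
        raise ValueError("a silence needs a subscription, a check, or both")
    payload = {"expire": expire, "reason": reason, "creator": creator}
    if subscription is not None:
        # A single host can be targeted with the "client:<name>" subscription.
        payload["subscription"] = subscription
    if check is not None:
        payload["check"] = check
    return urllib.request.Request(
        url=api_url.rstrip("/") + "/silenced",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: the RBAC layer accepts the role token in this form.
            "Authorization": "token " + token,
        },
        method="POST",
    )
```

A deployment job could pass the returned object to `urllib.request.urlopen` before a release and rely on `expire` to lift the silence afterwards. Since the role above grants only `get` and `post` on `silenced`, deleting a silence early would need an additional method.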

Ansible Structure
Sensu Clusters
› sensu/
– group_vars/
▪ sensu_one/
– main.yml
– vault.yml
– endpoints.yml
– handlers.yml
– pingchecks.yml
– site_checks.yml

handler_contacts:
  - contacts.json:
      contacts:
        framework:
          hipchatter:
            api_token: ChahL8XeiphohBi2eiceiseehaele5eu1aesahyuu
            room: 1234
          mailer:
            mail_to: frameworkteam.dl@wherever.com
        sensu_admin:
          hipchatter:
            api_token: Aivoubah0iexi6eyioQu0eeThee2Aenu6kohw4qui
            room: 2345
          mailer:
            mail_to: sensuteam.dl@wherever.com
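The idea behind contact routing is that a contact supplies per-team overrides for a handler's settings. The sketch below illustrates that lookup only; it is not how Sensu implements it. The helper `settings_for_contact`, the `mail_from` default, and the fallback behaviour for unknown contacts are assumptions made for the example.

```python
import copy


def settings_for_contact(handler, contact, handler_defaults, contacts):
    """Merge a contact's overrides for one handler over that handler's defaults."""
    settings = copy.deepcopy(handler_defaults.get(handler, {}))
    # An unknown contact, or one without an entry for this handler, simply
    # leaves the defaults untouched (an assumption for this sketch).
    overrides = contacts.get(contact, {}).get(handler, {})
    settings.update(copy.deepcopy(overrides))
    return settings


# Contact definitions mirroring the slide (API tokens omitted here).
CONTACTS = {
    "framework": {
        "hipchatter": {"room": 1234},
        "mailer": {"mail_to": "frameworkteam.dl@wherever.com"},
    },
    "sensu_admin": {
        "hipchatter": {"room": 2345},
        "mailer": {"mail_to": "sensuteam.dl@wherever.com"},
    },
}

# Hypothetical handler defaults; mail_from is not taken from the slides.
HANDLER_DEFAULTS = {
    "mailer": {"mail_from": "sensu@wherever.com",
               "mail_to": "sensuteam.dl@wherever.com"},
}
```

With these definitions, an alert from a host whose `host_contact` is `framework` would be mailed to the framework team's list, while a host with no matching contact would fall back to the default recipient.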

Ansible Structure
Sensu Server Role
› sensu/
– roles/sensu-server/
▪ vars/
– main.yml
– checks.yml
– filters.yml
– mutators.yml

pubsub_checks:
  # Basic Checks
  - check_ram.json:
      checks:
        check_ram:
          command: "check-memory-percent.rb -w :::custom.checks.check_ram.warning|95::: -c :::custom.checks.check_ram.critical|98:::"
          interval: "{{ default_interval }}"
          subscribers:
            - basic
          handlers: "{{ default_handlers }}"
          occurrences: 5
          refresh: "{{ default_renotify }}"
          runbook: "{{ runbook_base_url }}/check_ram"
          graph: "http://{{ graphite_server }}/render?from={{ graph_time }}&until=now&{{ graph_size }}&target=:::environment:::.:::graphname:::.memory.usedWOBuffersCaches&title=Memory+Used+Without+Buffers+and+Caches&uchiwa_force_image=.jpg"
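The `:::path|default:::` tokens in the check command are resolved on the client against its own attributes, which is what lets each team override thresholds in its environment file. The following is a simplified sketch of that substitution, not Sensu's own implementation. It assumes the role writes `custom_client_configs` under a `custom` key in the client definition, as the token paths suggest.

```python
import re

# Matches :::dotted.path::: or :::dotted.path|default:::
TOKEN_PATTERN = re.compile(r":::([^:|]+?)(?:\|([^:]*))?:::")


def substitute_tokens(command, client):
    """Replace tokens in a check command with client attribute values."""
    unmatched = []

    def lookup(path):
        # Walk the nested client attributes along the dotted path.
        value = client
        for key in path.split("."):
            if not isinstance(value, dict) or key not in value:
                return None
            value = value[key]
        return value

    def replace(match):
        path, default = match.group(1), match.group(2)
        value = lookup(path)
        if value is None:
            value = default
        if value is None:
            unmatched.append(path)
            return ""
        return str(value)

    result = TOKEN_PATTERN.sub(replace, command)
    if unmatched:
        # A token with neither a client value nor a default is an error.
        raise KeyError("unmatched tokens: " + ", ".join(unmatched))
    return result
```

A client carrying the `check_ram` overrides from the per-environment file would run the check with `-w 101 -c 100`, while a client without them falls back to the defaults after the `|`, giving `-w 95 -c 98`.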

Sensu Change Workflow
Pull Request → Code Review → Client Deployment → Server Deployment → Win!

Problems? Let's be honest: yes.

Ongoing Challenges
01 API calls: Limited availability in RBAC
02 Dashboard: Missing hosts in Events list
03 Cleanup: Old checks, forgotten hosts
04 Bottlenecks

Ongoing Challenges
01 API calls: Limited availability in LDAP RBAC
› Works through RBAC, but without subscription limitations:
– /clients
– /clients/:client/history (deprecated)
– /events (returns all events)
– /silenced (POST ignores 'begin' field)
› Does not work at all through the RBAC layer:
– /results
– /events/:client/
– /silenced/subscriptions/:subscription
– /silenced/checks/:check
– ?filter
› Good news: support in Sensu 2.0!
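Because `/events` returns every event regardless of the role's subscriptions, a consumer that needs a per-team view has to filter on its own side. The sketch below is one possible workaround, not something shown in the talk. It assumes each event carries the client's `subscriptions` and the check's `subscribers`, as Sensu 1.x event data normally does; `events_for_subscriptions` is a hypothetical helper.

```python
def events_for_subscriptions(events, allowed_subscriptions):
    """Keep events whose client or check matches one of the allowed subscriptions."""
    allowed = set(allowed_subscriptions)
    visible = []
    for event in events:
        client_subs = set(event.get("client", {}).get("subscriptions", []))
        check_subs = set(event.get("check", {}).get("subscribers", []))
        # Matching on either side also keeps events where the host belongs
        # to the team but the alerting check does not.
        if allowed & (client_subs | check_subs):
            visible.append(event)
    return visible
```

Matching on the client's subscriptions as well as the check's subscribers is a deliberate choice here, so that a team still sees alerts on its own hosts from checks defined for other subscriptions.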

Ongoing Challenges
02 Dashboard: Missing hosts in Events list
› If a host matches a subscription in RBAC but the alerting check does not, the event is not visible on the Events page

Ongoing Challenges
03 Cleanup: Old checks, forgotten hosts

Ongoing Challenges
04 Bottlenecks
This guy!
