Here are the slides from John Jawed's PuppetConf 2016 presentation called Multi-Tenant Puppet at Scale. Watch the videos at https://www.youtube.com/playlist?list=PLV86BgbREluVjwwt-9UL8u2Uy8xnzpIqa
12. Classification
a little dash of bash
node_terminus = /enc_handler.sh
$ cat enc_handler.sh
...
echo "$1" | nc -U /unix.sock
...
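The hand-off in the handler above can be spelled out in a few lines of Python for readers who want to see the socket traffic. This is an illustrative sketch, not the script from the talk: the function name query_enc, the newline-terminated request, and the read-until-close reply framing are assumptions. Only the /unix.sock path and the "node name in, classification out" flow come from the slide. Puppet calls an ENC with the node name as its first argument and reads YAML from stdout.

```python
import socket
import sys


def query_enc(certname, sock_path="/unix.sock", timeout=5.0):
    """Send one node name to the ENC server and return its full reply as text."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        sock.connect(sock_path)
        # Assumed wire format: the node name followed by a newline.
        sock.sendall(certname.encode("utf-8") + b"\n")
        # Half-close so the server sees end-of-request, then read until it hangs up.
        sock.shutdown(socket.SHUT_WR)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8")


if __name__ == "__main__" and len(sys.argv) > 1:
    # Print whatever the server returned so Puppet can read it from stdout.
    sys.stdout.write(query_enc(sys.argv[1]))
```

Like the shell version, the client does no classification itself, so nothing heavy is bootstrapped on each call.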
13. Classification
a little go go
William Kennedy’s workpool
(github.com/goinggo/workpool)
go server listening on /unix.sock
workpool routes requests to an idle
worker
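The Go server itself is not shown on the slides. As a rough stand-in, the sketch below shows the same shape in Python: one long-lived process listens on a Unix socket and hands each connection to an idle worker from a fixed pool. It uses Python's ThreadPoolExecutor rather than the workpool package, and the names classify, handle, and serve, the YAML payload, and the one-line request format are all assumptions made for illustration.

```python
import os
import socket
from concurrent.futures import ThreadPoolExecutor


def classify(certname):
    """Hypothetical classifier; a real one would look the node up in a data store."""
    return "---\nclasses: []\nparameters:\n  certname: %s\n" % certname


def handle(conn):
    """Read one node name from the connection and write back its YAML."""
    with conn, conn.makefile("rb") as reader:
        certname = reader.readline().decode("utf-8").strip()
        conn.sendall(classify(certname).encode("utf-8"))


def serve(sock_path, workers=8, max_requests=None):
    """Accept connections and queue each one for the next idle worker."""
    if os.path.exists(sock_path):
        os.unlink(sock_path)  # remove a stale socket left by a previous run
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as server, \
            ThreadPoolExecutor(max_workers=workers) as pool:
        server.bind(sock_path)
        server.listen(128)
        served = 0
        # max_requests exists only so the sketch can be stopped in a demo.
        while max_requests is None or served < max_requests:
            conn, _ = server.accept()
            pool.submit(handle, conn)
            served += 1
```

The workers are started once and reused, so each classification request costs a socket round trip instead of a fresh interpreter start.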
16. Classification
end result
gets close to 100ms goal – 110ms
CPU usage – no constant bootstrapping
frees up resources, puppet master process
at scale, 200ms per run adds up quickly (30 for
every 60 seconds of CPU time)
18. agents
everything is SSL, that is good
everything is SSL, that is expensive
use yum.puppetlabs.com or apt.puppetlabs.com
to make sure you run 3.7+
runtime savings: 40%
Catalog
19. post run woes
after agent runs, the real fun begins
puppetmaster and agent both wait for
report processors to finish
slow report collection will cause your
infrastructure to fall over – some just avoid it
Reports/Facts
20. foreman
foreman report/fact processing – need to spread
read I/O
fact processing is read heavy, reports are write
heavy
ruby activerecord: makara
postgresql: local read slaves, pg_shard
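The read/write split described above can be sketched as a small router that sends read-only statements to local replicas and everything else to the primary. This is a simplified Python illustration of the idea, not makara's implementation: the class name ReadWriteRouter, the prefix-based detection of reads, and the round-robin choice of replica are assumptions.

```python
import itertools


class ReadWriteRouter:
    """Route read-only statements to replicas and everything else to the primary."""

    READ_PREFIXES = ("select", "show")

    def __init__(self, primary, replicas):
        self.primary = primary
        # Fall back to the primary when no replica is configured.
        self._replicas = itertools.cycle(replicas or [primary])

    def target_for(self, sql):
        """Return the connection that should run this statement."""
        if sql.lstrip().lower().startswith(self.READ_PREFIXES):
            return next(self._replicas)  # spread reads round-robin
        return self.primary


router = ReadWriteRouter("primary", ["replica-a", "replica-b"])
print(router.target_for("SELECT * FROM facts"))             # replica-a
print(router.target_for("INSERT INTO reports VALUES (1)"))  # primary
```

A production router also has to handle transactions and reads that must see a just-committed write, which this sketch ignores.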
21. reports
4k run reports per minute
using pg_shard:
psql> SELECT master_create_distributed_table(table_name := 'reports',
partition_column := 'report_id');
psql> SELECT master_create_worker_shards(table_name := 'reports',
shard_count := 365);
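To make the effect of the two statements above concrete: each report_id is hashed, and the hash decides which of the 365 shards receives the row. The Python sketch below illustrates that idea only. The function name shard_for and the use of CRC32 are assumptions; pg_shard uses PostgreSQL's own hash functions, so real shard placement will differ.

```python
import zlib


def shard_for(report_id, shard_count=365):
    """Map a report_id to a shard index in the range 0 .. shard_count - 1."""
    # Hash the key into a 32-bit token space (CRC32 is a stand-in hash here).
    token = zlib.crc32(str(report_id).encode("utf-8"))
    # Split the token space into shard_count equal ranges and pick the range.
    return token * shard_count // 2**32


for report_id in (1, 2, 3):
    print(report_id, "-> shard", shard_for(report_id))
```

Because placement depends only on the key, writes for different reports spread across shards without any coordination between writers.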
22. facts
most of the workload is read I/O, kept local
facts updated immediately after puppet runs
Master DB loadavg 2
27. simple is hard
“Simple can be harder than complex: You have
to work hard to get your thinking clean to make
it simple. But it’s worth it in the end because
once you get there, you can move mountains.”
- Steve Jobs
31. osquery
services, files, and any resource that can be
tracked as a host event
event information can also be recorded (doorman,
zentral, etc)
event info is stored in tables (sqlite)
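Because events land in tables, "has anything I care about changed?" becomes a plain SQL query. The Python sketch below shows this against an in-memory SQLite table that mimics a small subset of osquery's file_events table (target_path, action, time). The helper names changed_paths and demo_connection and the sample rows are made up for illustration; a real deployment would query osquery itself.

```python
import sqlite3


def changed_paths(conn, since):
    """Return the paths updated or deleted after the given timestamp."""
    rows = conn.execute(
        "SELECT DISTINCT target_path FROM file_events "
        "WHERE time > ? AND action IN ('UPDATED', 'DELETED') "
        "ORDER BY target_path",
        (since,),
    )
    return [row[0] for row in rows]


def demo_connection():
    """Build a stand-in file_events table with a few sample events."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE file_events (target_path TEXT, action TEXT, time INTEGER)"
    )
    conn.executemany(
        "INSERT INTO file_events VALUES (?, ?, ?)",
        [
            ("/etc/ssh/sshd_config", "UPDATED", 100),
            ("/etc/motd", "CREATED", 120),
            ("/etc/hosts", "DELETED", 150),
            ("/etc/ssh/sshd_config", "UPDATED", 160),
        ],
    )
    return conn


print(changed_paths(demo_connection(), since=0))
```

A non-empty result is the kind of signal that could justify an immediate Puppet run instead of waiting for the next interval.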
34. pvc and foreman
foreman’s puppetrun API to set flag
pvc queries foreman to trigger run
logical separation with host groups
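The flag-and-poll flow above can be sketched as a small loop: something sets a flag in Foreman, and the agent side periodically asks whether a run has been requested. The slides do not show pvc's actual protocol or the Foreman endpoint involved, so the Python sketch below takes both as injected callables. The names poll_for_run, should_run, and trigger_run are hypothetical.

```python
import time


def poll_for_run(should_run, trigger_run, interval=30.0, max_polls=None,
                 sleep=time.sleep):
    """Poll should_run() and call trigger_run() whenever it reports a pending run.

    should_run: callable returning True when a run has been requested
                (for example, by querying Foreman for the host's flag).
    trigger_run: callable that starts a Puppet run on this host.
    max_polls: stop after this many polls; None means poll forever.
    Returns the number of runs triggered.
    """
    polls = 0
    triggered = 0
    while max_polls is None or polls < max_polls:
        if should_run():
            trigger_run()
            triggered += 1
        polls += 1
        sleep(interval)
    return triggered


# Demo with fake callables: the second and fourth polls report a pending run.
flags = iter([False, True, False, True])
runs = poll_for_run(lambda: next(flags), lambda: print("puppet run"),
                    interval=0, max_polls=4, sleep=lambda seconds: None)
print(runs)  # 2
```

Keeping the Foreman query and the run trigger behind callables leaves the loop testable without a Foreman server.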
35. runinterval is an afterthought
puppet runs instantly when it needs to
runinterval can be 3 minutes or 3 hours
frees up puppet masters, allows more resources
for other things
your infrastructure is still kept honest
38. I pummel people with questions, because I need to know
what they're thinking, what they're trying to achieve, what
they believe the final outcome is going to be.
- Tim Gunn