systemd @ FB – a year later

Davide Cavalca
Production Engineer

• Recap
• Tracking upstream
• Resource management
• Service monitoring
• Case studies
• Advocacy
Agenda

• 100% of the bare metal fleet on CentOS 7!
• Migrated countless services to systemd
• libsystemd integration in our build system
• Containers: see Zeal’s talk later today!
Recap
CentOS 7 migration

• systemd 231 → 232 → 233 → (234 → 235)
• Also tracking util-linux, dbus, etc.
• Published our Rawhide-based backports on:
https://github.com/facebookincubator/rpm-backports
• Binary RPMs based on it on:
https://copr.fedorainfracloud.org/coprs/jsynacek/systemd-backports-for-centos-7/
Tracking upstream
Staying up to date

• Not specific to systemd
• Duplicate systemd RPMs: package-cleanup wrapper
• rpmdb corruption: dcrpm
• Mismatch between systemd and systemd-libs
Tracking upstream
RPM issues
# Reinstall systemd when its binary links against a missing systemd library
if ldd /usr/lib/systemd/systemd | grep 'systemd.*not found$'; then
  yum reinstall -y $systemd_packages
fi
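The mismatch and duplicate-package cases can also be caught before the binary fails to link. The following is a minimal sketch, not the check used in production: the function names are hypothetical, and it assumes that comparing the version-release strings of systemd and systemd-libs is a good enough signal.

```shell
# Succeeds only when every non-empty line on stdin is identical,
# i.e. all queried packages report the same single version-release.
all_same_version() {
  sort -u | awk 'NF { n++ } END { exit (n == 1 ? 0 : 1) }'
}

# Hypothetical wrapper: compare the installed systemd and systemd-libs.
# A duplicate or missing package produces an extra distinct line and
# therefore counts as a mismatch.
systemd_versions_match() {
  rpm -q --qf '%{VERSION}-%{RELEASE}\n' systemd systemd-libs | all_same_version
}
```

A caller could run `systemd_versions_match || yum reinstall -y $systemd_packages` as an alternative trigger for the same remediation.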

• Rebuilt packaging for the Meson transition
• Backported meson, ninja-build in CentOS
• Standalone systemd-compat-libs
https://github.com/facebookincubator/systemd-compat-libs
Tracking upstream
Meson and compat-libs

Tracking upstream
tty woes with 234
• When rolling 234 we discovered a race in the kernel tty
subsystem (repros all the way back to 4.0)
• Turns out both systemd and Tupperware use the real tty0
• Investigation still in progress, likely a use-after-free bug
• Tupperware should probably just use a pty here

• See Chris’s talk tomorrow for all things cgroup2!
• Using systemd to partition services and apply limits
• Lightweight daemon to collect metrics from /sys/fs/cgroup
• Chef API to apply configurations and manage experiments
Resource management
Rolling out cgroup2
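To illustrate what collecting metrics from /sys/fs/cgroup can look like, here is a small sketch. It is not the daemon mentioned above: the function name and the output format are made up for illustration, and it only reads two standard cgroup2 files, memory.current and the usage_usec field of cpu.stat.

```shell
# Print a few cgroup2 metrics for one cgroup directory as
# "<cgroup name> <metric> <value>" lines. Hypothetical output format.
cgroup_metrics() (
  dir=$1
  name=${dir##*/}
  if [ -r "$dir/memory.current" ]; then
    printf '%s memory_current_bytes %s\n' "$name" "$(cat "$dir/memory.current")"
  fi
  if [ -r "$dir/cpu.stat" ]; then
    # cpu.stat holds one "key value" pair per line
    awk -v n="$name" '$1 == "usage_usec" { print n, "cpu_usage_usec", $2 }' "$dir/cpu.stat"
  fi
)
```

On a cgroup2 host this could be driven by a loop such as `for d in /sys/fs/cgroup/system.slice/*.service; do cgroup_metrics "$d"; done`, assuming the memory and cpu controllers are enabled for those cgroups.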

Service monitoring
• systemd exposes lots of useful metrics over dbus
• Unit properties (e.g. *Timestamp*, NRestarts)
• Status events (e.g. unit state changes)
• Options: python-dbus, sd-bus, coreos/go-systemd/dbus
Getting metrics out of systemd
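For quick experiments, the same unit properties can be read from the shell, since systemctl itself queries systemd over dbus. The sketch below uses hypothetical helper names and assumes a systemd version recent enough to expose the requested property (NRestarts, for example, is not available on older releases).

```shell
# Extract the value of one property from "Key=Value" lines on stdin,
# the format printed by "systemctl show".
property_value() {
  sed -n "s/^$1=//p"
}

# Hypothetical wrapper: read a single property of a unit.
unit_property() (
  unit=$1
  prop=$2
  systemctl show --property="$prop" "$unit" | property_value "$prop"
)
```

For example, `unit_property sshd.service NRestarts` would print the restart counter on a host where that unit and property exist.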

Service monitoring
• Lightweight daemon to feed systemd metrics to various
monitoring systems
• Polling for unit properties, subscriptions for status events
• Initial implementation in golang
systemdmon

Service monitoring
• Thin Cython wrapper on top of sd-bus
• Expose systemd dbus object model
• ipython REPL for prototyping
• Will be opensourced together with systemdmon
pystemd

Case studies
dbus reliability
• Issues with dbus-daemon or the system bus affect systemd
• systemctl hanging or failing → Chef failing
• Easy to DoS the bus, especially with user services
• Hard to remediate without a reboot
• Looking forward to dbus-broker!

Case studies
rpm macros for systemd services
• By default RPM macros will restart units on upgrade...
• …which is a problem if you’ve also set up Chef to restart them
• Solution: knob in our internal packaging tool to optionally
disable the restart macro

Case studies
Logging
• Journald setup: 10 MB of in-memory logging feeding rsyslog
• journalctl is awesome
• Double writing problem
• No way to set per-unit limits

Case studies
Unit loops
• Easy to create loops with x-systemd.requires in fstab
• systemd will delete a random unit to break loops
• Solution: add _netdev to the fstab entry
• systemd-analyze to help debugging
systemd-tmpfiles-setup.service: Job systemd-tmpfiles-setup.service/start deleted to break ordering cycle starting with smc_proxy.service/start
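Because the dropped unit is effectively arbitrary, it helps to scan logs for messages like the one above. This is a minimal sketch with a hypothetical function name; it assumes the message wording matches the example shown here, which may differ between systemd versions.

```shell
# Read log lines on stdin and print the unit whose job was deleted
# to break an ordering cycle (one unit name per matching line).
cycle_victims() {
  sed -n 's/.*: Job \([^ /]*\)\/[^ ]* deleted to break ordering cycle.*/\1/p'
}
```

Feeding it the current boot's journal, for instance with `journalctl -b | cycle_victims`, lists the units that were skipped.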

Case studies
Transient unit creep
• systemd-run creates units in /run/systemd/transient
• If the unit fails, it sticks around in ‘failed’ state
• 10k failed units → 50% CPU usage for pid 1
• 30k failed units → 100% CPU usage for pid 1
• Fix: call systemctl reset-failed periodically
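One way to make the periodic cleanup less blunt is to reset only once the number of failed units crosses a threshold. The sketch below is an assumption about how such a job could look, not the actual fix deployed; the function names and the threshold are hypothetical.

```shell
# Count non-empty lines on stdin (one failed unit per line).
count_lines() {
  awk 'NF { n++ } END { print n + 0 }'
}

# Hypothetical periodic job: clear failed units once there are more
# than $1 of them.
reset_failed_if_over() (
  threshold=$1
  n=$(systemctl list-units --state=failed --no-legend --plain | count_lines)
  if [ "$n" -gt "$threshold" ]; then
    systemctl reset-failed
  fi
)
```

Run from a timer or cron entry as, say, `reset_failed_if_over 1000`. Note that reset-failed clears the failed state for all units, so anything that inspects failures needs to run first.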

Case studies
KillMode=process
• KillMode=process may leave stray processes in the cgroup
• Changes to unit slices don’t apply unless the old slice is
empty
• Fix: switch to KillMode=control-group
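To check whether stray processes are still holding a cgroup open, one can look at its cgroup.procs file. This is a small sketch with a hypothetical function name; it reads the file rather than testing its size, because cgroupfs reports a size of zero even when PIDs are listed.

```shell
# Succeeds when the cgroup directory $1 has a readable cgroup.procs
# file that lists no PIDs.
cgroup_is_empty() {
  [ -r "$1/cgroup.procs" ] && ! grep -q . "$1/cgroup.procs"
}
```

For example, `cgroup_is_empty /sys/fs/cgroup/system.slice/foo.service || cat /sys/fs/cgroup/system.slice/foo.service/cgroup.procs` would print the leftover PIDs for a hypothetical foo.service on a cgroup2 host.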

Case studies
Unit escaping
• Escape logic relies on shell control characters:
/dev/dm-1 → dev-dm\x2d1.swap
• Chef fix: https://github.com/chef/chef/pull/6230
• path_to_unit wrapper in fb_systemd
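The escaping rule itself is simple enough to sketch: slashes become dashes, and other special characters (including a literal dash) become `\xXX` hex escapes. The function below is an illustrative reimplementation for ASCII paths only, not the fb_systemd wrapper; it reuses the path_to_unit name purely for illustration and does not normalize duplicate slashes or `..` components the way `systemd-escape --path` does.

```shell
# Convert an absolute path into a unit name with the given suffix,
# following systemd's path escaping rules (ASCII-only sketch).
path_to_unit() (
  path=$1
  suffix=$2
  out=""
  # Trim leading and trailing slashes
  while :; do case $path in /*) path=${path#/} ;; *) break ;; esac; done
  while :; do case $path in */) path=${path%/} ;; *) break ;; esac; done
  # The root directory maps to a single dash
  if [ -z "$path" ]; then
    printf '%s\n' "-.$suffix"
    return 0
  fi
  while [ -n "$path" ]; do
    rest=${path#?}
    c=${path%"$rest"}   # first character of the remaining path
    path=$rest
    case $c in
      /) out="$out-" ;;
      [A-Za-z0-9:_]) out="$out$c" ;;
      # A dot is kept as-is except at the very start of the name
      .) if [ -n "$out" ]; then out="$out."; else out="$out\\x2e"; fi ;;
      # Everything else becomes a \xXX escape of its byte value
      *) out="$out$(printf '\\x%02x' "'$c")" ;;
    esac
  done
  printf '%s\n' "$out.$suffix"
)
```

Where systemd is installed, `systemd-escape --path --suffix=swap /dev/dm-1` performs the same conversion and is the safer choice; the backslash in the result still needs quoting when passed through a shell.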

• Announce core package updates widely
• Tailor documentation to customer use cases
• Encourage people to engage upstream directly
• Tech talks
Advocacy
