SlideShare une entreprise Scribd logo
1  sur  22
QEMU Sandboxing for dummies
Eduardo Otubo <otubo@redhat.com>
Senior Software Engineer
27/Jan/2018
2
1. Secure Computing: The basics
2. Libseccomp
3. Qemu sandboxing v1
4. Qemu sandboxing v2 and more options
Agenda
3
Secure Computing: the basics
● Kernel support first version dated from March, 8th 2005 (2.6.12)
Commit by: Andrea Arcangeli
● The main purpose is to call prctl() with PR_SET_SECCOMP on the
process which will allow only: exit(), sigreturn(), read()
and write()
○ Otherwise SIGKILL or SIGSYS are issued
4
Secure Computing: the basics
● Second kernel implementation with dynamic seccomp policies:
January, 11th 2011; Commit by: Will Drewry <wad@chromium.org>
● Now uses with seccomp() system call
● Uses BPF (Berkeley Packet Filter)
○ An in-kernel data link layer packet filter that has an abstracted API that
also works as a generic filter
5
struct sock_filter filter[] = {
/* Grab the system call number */
BPF_STMT(BPF_LD+BPF_W+BPF_ABS, syscall_nr),
/* Jump table for the allowed syscalls */
BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, __NR_rt_sigreturn, 0, 1),
BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_ALLOW),
#ifdef __NR_sigreturn
BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, __NR_sigreturn, 0, 1),
BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_ALLOW),
#endif
BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, __NR_exit_group, 0, 1),
BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_ALLOW),
BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, __NR_exit, 0, 1),
BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_ALLOW),
BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, __NR_read, 1, 0),
BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, __NR_write, 3, 2),
6
Libseccomp
● Paul Moore (2011)
● Userspace layer to make life easier:
○ Abstract complex BPF constructions
○ Abstract differences between architectures and its ABIs
○ Optimize filter construction for best performance
○ Kill (sigkill), trap (sigsys), Allow in case of matched filter (among
other actions)
7
struct sock_filter filter[] = {
/* Grab the system call number */
BPF_STMT(BPF_LD+BPF_W+BPF_ABS, syscall_nr),
/* Jump table for the allowed syscalls */
BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, __NR_rt_sigreturn, 0, 1),
BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_ALLOW),
#ifdef __NR_sigreturn
BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, __NR_sigreturn, 0, 1),
BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_ALLOW),
#endif
BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, __NR_exit_group, 0, 1),
BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_ALLOW),
BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, __NR_exit, 0, 1),
BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_ALLOW),
BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, __NR_read, 1, 0),
BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, __NR_write, 3, 2),
8
struct sock_filter filter[] = {
/* Grab the system call number */
BPF_STMT(BPF_LD+BPF_W+BPF_ABS, syscall_nr),
/* Jump table for the allowed syscalls */
BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, __NR_rt_sigreturn, 0, 1),
BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_ALLOW),
#ifdef __NR_sigreturn
BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, __NR_sigreturn, 0, 1),
BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_ALLOW),
#endif
BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, __NR_exit_group, 0, 1),
BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_ALLOW),
BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, __NR_exit, 0, 1),
BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_ALLOW),
BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, __NR_read, 1, 0),
BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, __NR_write, 3, 2),
9
Qemu sandboxing v1
static const struct QemuSeccompSyscall seccomp_whitelist[] = {
{ SCMP_SYS(timer_settime), 255 },
{ SCMP_SYS(timer_gettime), 254 },
{ SCMP_SYS(futex), 253 },
{ SCMP_SYS(select), 252 },
{ SCMP_SYS(recvfrom), 251 },
{ SCMP_SYS(sendto), 250 },
{ SCMP_SYS(read), 249 },
{ SCMP_SYS(brk), 248 },
{ SCMP_SYS(clone), 247 },
{ SCMP_SYS(mmap), 247 },
{ SCMP_SYS(mprotect), 246 },
{ SCMP_SYS(execve), 245 },
{ SCMP_SYS(open), 245 },
{ SCMP_SYS(ioctl), 245 },
{ SCMP_SYS(recvmsg), 245 },
{ SCMP_SYS(sendmsg), 245 },
10
Qemu sandboxing v1
11
● Basic whitelist approach (--sandbox=on)
○ Every system call is blocked, except for the ones that are explicitly
whitelisted
● Various compatibility problems, requires lots of testing and
different workloads
● It’s safe right?
12
Qemu sandboxing v1
Not actually!
● QEMU links to too many different shared libraries and there is no way
to determine which code paths QEMU triggers in these libraries and
thus identify which syscalls will be genuinely needed.
● Sometimes you miss a syscall and it aborts right at the beginning
before boot (which is good?) but sometimes your VM is running for
days and it could suddenly abort (which is terrible)
13
Qemu sandboxing v2
● Extended blacklist approach (--sandbox=on,...)
● Everything is allowed except for a few sets that are definitely not
allowed
○ Default system calls: basic set of forbidden system calls (kexec,swapon,
swapoff, mount, umount, etc)
○ obsolete
○ elevateprivileges
○ spawn
○ resourcecontrol
14
Obsolete system calls
● Old system calls that were usefull in the past but became obsolete or
replaced by new version
○ Like readdir() being replaced by getdents()
● Should be by default blocked, but left an option to enabled it by
--sandbox on,obsolete=allow
15
Elevated Privileges
● This option would block all set*uid|gid system calls, this is known
to be required by some features like bridge helpers
● This option also does prctl(PR_SET_NO_NEW_PRIVS) which will
avoid new threads to escalate privilege as well
● This mode could be switched on or off by the option:
--sandbox on,elevatedprivileges=allow|deny|children
16
Spawn
● This option provides a fair way to disable new fork() or exec()
processes to be created at all, privileged or not.
● Things like bridge helper, SMB server, ifup/down scripts, migration
exec: protocol would all be disabled.
● This mode could be switched on or off by the option:
--sandbox on,spawn=allow|deny
17
Resource Control
● Avoids QEMU to set process affinity, scheduler priority, etc
● This shouldn’t be QEMU’s responsability to do this, but rather management
software like libvirt.
● This mode could be switched on or off by the option:
--sandbox on,resourcecontrol=allow|deny
18
Qemu sandboxing v2
static const struct QemuSeccompSyscall blacklist[] = {
/* default set of syscalls to blacklist */
{ SCMP_SYS(reboot), QEMU_SECCOMP_SET_DEFAULT },
{ SCMP_SYS(swapon), QEMU_SECCOMP_SET_DEFAULT },
{ SCMP_SYS(swapoff), QEMU_SECCOMP_SET_DEFAULT },
{ SCMP_SYS(syslog), QEMU_SECCOMP_SET_DEFAULT },
{ SCMP_SYS(mount), QEMU_SECCOMP_SET_DEFAULT },
{ SCMP_SYS(umount), QEMU_SECCOMP_SET_DEFAULT },
{ SCMP_SYS(kexec_load), QEMU_SECCOMP_SET_DEFAULT },
{ SCMP_SYS(afs_syscall), QEMU_SECCOMP_SET_DEFAULT },
{ SCMP_SYS(break), QEMU_SECCOMP_SET_DEFAULT },
{ SCMP_SYS(ftime), QEMU_SECCOMP_SET_DEFAULT },
{ SCMP_SYS(getpmsg), QEMU_SECCOMP_SET_DEFAULT },
{ SCMP_SYS(gtty), QEMU_SECCOMP_SET_DEFAULT },
{ SCMP_SYS(lock), QEMU_SECCOMP_SET_DEFAULT },
{ SCMP_SYS(mpx), QEMU_SECCOMP_SET_DEFAULT },
{ SCMP_SYS(prof), QEMU_SECCOMP_SET_DEFAULT },
{ SCMP_SYS(profil), QEMU_SECCOMP_SET_DEFAULT },
19
Some thoughts on Qemu sandboxing
20
● Sandboxing is not your definitive solution for security on virtualization.
But rather a good solution to be stacked on others like:
○ MAC/DAC (Mandatory Access Control and Discretionary Access Control)
○ SELinux
○ Remote Management using SSH/TLS/SSL
○ Guest Image cryptography
○ Virtual Trusted Platform Module (vTPM)
● Sandbox v2 are not low level knobs to control system calls but rahter a high
level knobs to controls concepts.
Questions?
21
THANK YOU
plus.google.com/+RedHat
linkedin.com/company/red-hat
youtube.com/user/RedHatVideos
facebook.com/redhatinc
twitter.com/RedHatNews

Contenu connexe

Tendances

Kernel_Crash_Dump_Analysis
Kernel_Crash_Dump_AnalysisKernel_Crash_Dump_Analysis
Kernel_Crash_Dump_Analysis
Buland Singh
 

Tendances (20)

Kernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven Rostedt
Kernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven RostedtKernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven Rostedt
Kernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven Rostedt
 
Launch the First Process in Linux System
Launch the First Process in Linux SystemLaunch the First Process in Linux System
Launch the First Process in Linux System
 
Android Things : Building Embedded Devices
Android Things : Building Embedded DevicesAndroid Things : Building Embedded Devices
Android Things : Building Embedded Devices
 
Basic Linux Internals
Basic Linux InternalsBasic Linux Internals
Basic Linux Internals
 
BPF Internals (eBPF)
BPF Internals (eBPF)BPF Internals (eBPF)
BPF Internals (eBPF)
 
Making Linux do Hard Real-time
Making Linux do Hard Real-timeMaking Linux do Hard Real-time
Making Linux do Hard Real-time
 
Unix signals
Unix signalsUnix signals
Unix signals
 
SFO15-202: Towards Multi-Threaded Tiny Code Generator (TCG) in QEMU
SFO15-202: Towards Multi-Threaded Tiny Code Generator (TCG) in QEMUSFO15-202: Towards Multi-Threaded Tiny Code Generator (TCG) in QEMU
SFO15-202: Towards Multi-Threaded Tiny Code Generator (TCG) in QEMU
 
Namespaces and cgroups - the basis of Linux containers
Namespaces and cgroups - the basis of Linux containersNamespaces and cgroups - the basis of Linux containers
Namespaces and cgroups - the basis of Linux containers
 
Qemu device prototyping
Qemu device prototypingQemu device prototyping
Qemu device prototyping
 
Android Multimedia Framework
Android Multimedia FrameworkAndroid Multimedia Framework
Android Multimedia Framework
 
syzkaller: the next gen kernel fuzzer
syzkaller: the next gen kernel fuzzersyzkaller: the next gen kernel fuzzer
syzkaller: the next gen kernel fuzzer
 
Kernel Recipes 2019 - ftrace: Where modifying a running kernel all started
Kernel Recipes 2019 - ftrace: Where modifying a running kernel all startedKernel Recipes 2019 - ftrace: Where modifying a running kernel all started
Kernel Recipes 2019 - ftrace: Where modifying a running kernel all started
 
from Binary to Binary: How Qemu Works
from Binary to Binary: How Qemu Worksfrom Binary to Binary: How Qemu Works
from Binary to Binary: How Qemu Works
 
Linux Kernel Booting Process (1) - For NLKB
Linux Kernel Booting Process (1) - For NLKBLinux Kernel Booting Process (1) - For NLKB
Linux Kernel Booting Process (1) - For NLKB
 
Introduction to Makefile
Introduction to MakefileIntroduction to Makefile
Introduction to Makefile
 
Kernel_Crash_Dump_Analysis
Kernel_Crash_Dump_AnalysisKernel_Crash_Dump_Analysis
Kernel_Crash_Dump_Analysis
 
Linux Porting
Linux PortingLinux Porting
Linux Porting
 
Introduction to open_sbi
Introduction to open_sbiIntroduction to open_sbi
Introduction to open_sbi
 
Introduction to Perf
Introduction to PerfIntroduction to Perf
Introduction to Perf
 

Similaire à QEMU Sandboxing for dummies

HKG18-TR14 - Postmortem Debugging with Coresight
HKG18-TR14 - Postmortem Debugging with CoresightHKG18-TR14 - Postmortem Debugging with Coresight
HKG18-TR14 - Postmortem Debugging with Coresight
Linaro
 
Building Network Functions with eBPF & BCC
Building Network Functions with eBPF & BCCBuilding Network Functions with eBPF & BCC
Building Network Functions with eBPF & BCC
Kernel TLV
 

Similaire à QEMU Sandboxing for dummies (20)

Kernel debug log and console on openSUSE
Kernel debug log and console on openSUSEKernel debug log and console on openSUSE
Kernel debug log and console on openSUSE
 
Microkernel Development
Microkernel DevelopmentMicrokernel Development
Microkernel Development
 
Linux Capabilities - eng - v2.1.5, compact
Linux Capabilities - eng - v2.1.5, compactLinux Capabilities - eng - v2.1.5, compact
Linux Capabilities - eng - v2.1.5, compact
 
Alexander Reelsen - Seccomp for Developers
Alexander Reelsen - Seccomp for DevelopersAlexander Reelsen - Seccomp for Developers
Alexander Reelsen - Seccomp for Developers
 
Chromium Sandbox on Linux (NDC Security 2019)
Chromium Sandbox on Linux (NDC Security 2019)Chromium Sandbox on Linux (NDC Security 2019)
Chromium Sandbox on Linux (NDC Security 2019)
 
Linux seccomp(2) vs OpenBSD pledge(2)
Linux seccomp(2) vs OpenBSD pledge(2)Linux seccomp(2) vs OpenBSD pledge(2)
Linux seccomp(2) vs OpenBSD pledge(2)
 
HKG18-TR14 - Postmortem Debugging with Coresight
HKG18-TR14 - Postmortem Debugging with CoresightHKG18-TR14 - Postmortem Debugging with Coresight
HKG18-TR14 - Postmortem Debugging with Coresight
 
Linux scheduler
Linux schedulerLinux scheduler
Linux scheduler
 
Generic Synchronization Policies in C++
Generic Synchronization Policies in C++Generic Synchronization Policies in C++
Generic Synchronization Policies in C++
 
CONFidence 2017: Escaping the (sand)box: The promises and pitfalls of modern ...
CONFidence 2017: Escaping the (sand)box: The promises and pitfalls of modern ...CONFidence 2017: Escaping the (sand)box: The promises and pitfalls of modern ...
CONFidence 2017: Escaping the (sand)box: The promises and pitfalls of modern ...
 
Linux Kernel Debugging
Linux Kernel DebuggingLinux Kernel Debugging
Linux Kernel Debugging
 
Implementing of classical synchronization problem by using semaphores
Implementing of classical synchronization problem by using semaphoresImplementing of classical synchronization problem by using semaphores
Implementing of classical synchronization problem by using semaphores
 
Building Network Functions with eBPF & BCC
Building Network Functions with eBPF & BCCBuilding Network Functions with eBPF & BCC
Building Network Functions with eBPF & BCC
 
OSSNA 2017 Performance Analysis Superpowers with Linux BPF
OSSNA 2017 Performance Analysis Superpowers with Linux BPFOSSNA 2017 Performance Analysis Superpowers with Linux BPF
OSSNA 2017 Performance Analysis Superpowers with Linux BPF
 
Chromium Sandbox on Linux (BlackHoodie 2018)
Chromium Sandbox on Linux (BlackHoodie 2018)Chromium Sandbox on Linux (BlackHoodie 2018)
Chromium Sandbox on Linux (BlackHoodie 2018)
 
Linux kernel debugging
Linux kernel debuggingLinux kernel debugging
Linux kernel debugging
 
Roll your own toy unix clone os
Roll your own toy unix clone osRoll your own toy unix clone os
Roll your own toy unix clone os
 
BPF Tools 2017
BPF Tools 2017BPF Tools 2017
BPF Tools 2017
 
bcc/BPF tools - Strategy, current tools, future challenges
bcc/BPF tools - Strategy, current tools, future challengesbcc/BPF tools - Strategy, current tools, future challenges
bcc/BPF tools - Strategy, current tools, future challenges
 
Tracer Evaluation
Tracer EvaluationTracer Evaluation
Tracer Evaluation
 

Dernier

%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
masabamasaba
 
Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
Medical / Health Care (+971588192166) Mifepristone and Misoprostol tablets 200mg
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
Health
 
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
masabamasaba
 
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
chiefasafspells
 
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
masabamasaba
 

Dernier (20)

Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdfPayment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
 
%in Benoni+277-882-255-28 abortion pills for sale in Benoni
%in Benoni+277-882-255-28 abortion pills for sale in Benoni%in Benoni+277-882-255-28 abortion pills for sale in Benoni
%in Benoni+277-882-255-28 abortion pills for sale in Benoni
 
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
Devoxx UK 2024 - Going serverless with Quarkus, GraalVM native images and AWS...
 
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
%+27788225528 love spells in Huntington Beach Psychic Readings, Attraction sp...
 
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
%in Bahrain+277-882-255-28 abortion pills for sale in Bahrain
 
%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand
 
Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
Abortion Pill Prices Tembisa [(+27832195400*)] 🏥 Women's Abortion Clinic in T...
 
%in Harare+277-882-255-28 abortion pills for sale in Harare
%in Harare+277-882-255-28 abortion pills for sale in Harare%in Harare+277-882-255-28 abortion pills for sale in Harare
%in Harare+277-882-255-28 abortion pills for sale in Harare
 
What Goes Wrong with Language Definitions and How to Improve the Situation
What Goes Wrong with Language Definitions and How to Improve the SituationWhat Goes Wrong with Language Definitions and How to Improve the Situation
What Goes Wrong with Language Definitions and How to Improve the Situation
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
 
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Toronto Psychic Readings, Attraction spells,Brin...
 
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
Love witchcraft +27768521739 Binding love spell in Sandy Springs, GA |psychic...
 
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
WSO2CON 2024 - Cloud Native Middleware: Domain-Driven Design, Cell-Based Arch...
 
WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...
WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...
WSO2Con2024 - From Code To Cloud: Fast Track Your Cloud Native Journey with C...
 
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Knoxville Psychic Readings, Attraction spells,Br...
 
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
%in kaalfontein+277-882-255-28 abortion pills for sale in kaalfontein
 
WSO2CON2024 - It's time to go Platformless
WSO2CON2024 - It's time to go PlatformlessWSO2CON2024 - It's time to go Platformless
WSO2CON2024 - It's time to go Platformless
 
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
 
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
Direct Style Effect Systems -The Print[A] Example- A Comprehension AidDirect Style Effect Systems -The Print[A] Example- A Comprehension Aid
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
 
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park %in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
 

QEMU Sandboxing for dummies

  • 1. QEMU Sandboxing for dummies Eduardo Otubo <otubo@redhat.com> Senior Software Engineer 27/Jan/2018
  • 2. 2
  • 3. 1. Secure Computing: The basics 2. Libseccomp 3. Qemu sandboxing v1 4. Qemu sandboxing v2 and more options Agenda 3
  • 4. Secure Computing: the basics ● Kernel support first version dated from March, 8th 2005 (2.6.12) Commit by: Andrea Arcangeli ● The main purpose is to call prctl() with PR_SET_SECCOMP on the process which will allow only: exit(), sigreturn(), read() and write() ○ Otherwise SIGKILL or SIGSYS are issued 4
  • 5. Secure Computing: the basics ● Second kernel implementation with dynamic seccomp policies: January, 11th 2011; Commit by: Will Drewry <wad@chromium.org> ● Now uses with seccomp() system call ● Uses BPF (Berkeley Packet Filter) ○ An in-kernel data link layer packet filter that has an abstracted API that also works as a generic filter 5
  • 6. struct sock_filter filter[] = { /* Grab the system call number */ BPF_STMT(BPF_LD+BPF_W+BPF_ABS, syscall_nr), /* Jump table for the allowed syscalls */ BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, __NR_rt_sigreturn, 0, 1), BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_ALLOW), #ifdef __NR_sigreturn BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, __NR_sigreturn, 0, 1), BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_ALLOW), #endif BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, __NR_exit_group, 0, 1), BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_ALLOW), BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, __NR_exit, 0, 1), BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_ALLOW), BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, __NR_read, 1, 0), BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, __NR_write, 3, 2), 6
  • 7. Libseccomp ● Paul Moore (2011) ● Userspace layer to make life easier: ○ Abstract complex BPF constructions ○ Abstract differences between architectures and its ABIs ○ Optimize filter construction for best performance ○ Kill (sigkill), trap (sigsys), Allow in case of matched filter (among other actions) 7
  • 8. struct sock_filter filter[] = { /* Grab the system call number */ BPF_STMT(BPF_LD+BPF_W+BPF_ABS, syscall_nr), /* Jump table for the allowed syscalls */ BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, __NR_rt_sigreturn, 0, 1), BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_ALLOW), #ifdef __NR_sigreturn BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, __NR_sigreturn, 0, 1), BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_ALLOW), #endif BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, __NR_exit_group, 0, 1), BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_ALLOW), BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, __NR_exit, 0, 1), BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_ALLOW), BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, __NR_read, 1, 0), BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, __NR_write, 3, 2), 8
  • 9. struct sock_filter filter[] = { /* Grab the system call number */ BPF_STMT(BPF_LD+BPF_W+BPF_ABS, syscall_nr), /* Jump table for the allowed syscalls */ BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, __NR_rt_sigreturn, 0, 1), BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_ALLOW), #ifdef __NR_sigreturn BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, __NR_sigreturn, 0, 1), BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_ALLOW), #endif BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, __NR_exit_group, 0, 1), BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_ALLOW), BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, __NR_exit, 0, 1), BPF_STMT(BPF_RET+BPF_K, SECCOMP_RET_ALLOW), BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, __NR_read, 1, 0), BPF_JUMP(BPF_JMP+BPF_JEQ+BPF_K, __NR_write, 3, 2), 9
  • 10. Qemu sandboxing v1 static const struct QemuSeccompSyscall seccomp_whitelist[] = { { SCMP_SYS(timer_settime), 255 }, { SCMP_SYS(timer_gettime), 254 }, { SCMP_SYS(futex), 253 }, { SCMP_SYS(select), 252 }, { SCMP_SYS(recvfrom), 251 }, { SCMP_SYS(sendto), 250 }, { SCMP_SYS(read), 249 }, { SCMP_SYS(brk), 248 }, { SCMP_SYS(clone), 247 }, { SCMP_SYS(mmap), 247 }, { SCMP_SYS(mprotect), 246 }, { SCMP_SYS(execve), 245 }, { SCMP_SYS(open), 245 }, { SCMP_SYS(ioctl), 245 }, { SCMP_SYS(recvmsg), 245 }, { SCMP_SYS(sendmsg), 245 }, 10
  • 11. Qemu sandboxing v1 11 ● Basic whitelist approach (--sandbox=on) ○ Every system call is blocked, except for the ones that are explicitly whitelisted ● Various compatibility problems, requires lots of testing and different workloads ● It’s safe right?
  • 12. 12
  • 13. Qemu sandboxing v1 Not actually! ● QEMU links to too many different shared libraries and there is no way to determine which code paths QEMU triggers in these libraries and thus identify which syscalls will be genuinely needed. ● Sometimes you miss a syscall and it aborts right at the beginning before boot (which is good?) but sometimes your VM is running for days and it could suddenly abort (which is terrible) 13
  • 14. Qemu sandboxing v2 ● Extended blacklist approach (--sandbox=on,...) ● Everything is allowed except for a few sets that are definitely not allowed ○ Default system calls: basic set of forbidden system calls (kexec,swapon, swapoff, mount, umount, etc) ○ obsolete ○ elevateprivileges ○ spawn ○ resourcecontrol 14
  • 15. Obsolete system calls ● Old system calls that were usefull in the past but became obsolete or replaced by new version ○ Like readdir() being replaced by getdents() ● Should be by default blocked, but left an option to enabled it by --sandbox on,obsolete=allow 15
  • 16. Elevated Privileges ● This option would block all set*uid|gid system calls, this is known to be required by some features like bridge helpers ● This option also does prctl(PR_SET_NO_NEW_PRIVS) which will avoid new threads to escalate privilege as well ● This mode could be switched on or off by the option: --sandbox on,elevatedprivileges=allow|deny|children 16
  • 17. Spawn ● This option provides a fair way to disable new fork() or exec() processes to be created at all, privileged or not. ● Things like bridge helper, SMB server, ifup/down scripts, migration exec: protocol would all be disabled. ● This mode could be switched on or off by the option: --sandbox on,spawn=allow|deny 17
  • 18. Resource Control ● Avoids QEMU to set process affinity, scheduler priority, etc ● This shouldn’t be QEMU’s responsability to do this, but rather management software like libvirt. ● This mode could be switched on or off by the option: --sandbox on,resourcecontrol=allow|deny 18
  • 19. Qemu sandboxing v2 static const struct QemuSeccompSyscall blacklist[] = { /* default set of syscalls to blacklist */ { SCMP_SYS(reboot), QEMU_SECCOMP_SET_DEFAULT }, { SCMP_SYS(swapon), QEMU_SECCOMP_SET_DEFAULT }, { SCMP_SYS(swapoff), QEMU_SECCOMP_SET_DEFAULT }, { SCMP_SYS(syslog), QEMU_SECCOMP_SET_DEFAULT }, { SCMP_SYS(mount), QEMU_SECCOMP_SET_DEFAULT }, { SCMP_SYS(umount), QEMU_SECCOMP_SET_DEFAULT }, { SCMP_SYS(kexec_load), QEMU_SECCOMP_SET_DEFAULT }, { SCMP_SYS(afs_syscall), QEMU_SECCOMP_SET_DEFAULT }, { SCMP_SYS(break), QEMU_SECCOMP_SET_DEFAULT }, { SCMP_SYS(ftime), QEMU_SECCOMP_SET_DEFAULT }, { SCMP_SYS(getpmsg), QEMU_SECCOMP_SET_DEFAULT }, { SCMP_SYS(gtty), QEMU_SECCOMP_SET_DEFAULT }, { SCMP_SYS(lock), QEMU_SECCOMP_SET_DEFAULT }, { SCMP_SYS(mpx), QEMU_SECCOMP_SET_DEFAULT }, { SCMP_SYS(prof), QEMU_SECCOMP_SET_DEFAULT }, { SCMP_SYS(profil), QEMU_SECCOMP_SET_DEFAULT }, 19
  • 20. Some thoughts on Qemu sandboxing 20 ● Sandboxing is not your definitive solution for security on virtualization. But rather a good solution to be stacked on others like: ○ MAC/DAC (Mandatory Access Control and Discretionary Access Control) ○ SELinux ○ Remote Management using SSH/TLS/SSL ○ Guest Image cryptography ○ Virtual Trusted Platform Module (vTPM) ● Sandbox v2 are not low level knobs to control system calls but rahter a high level knobs to controls concepts.