1
Linux rumpkernel
a librarified monolithic kernel
Hajime Tazaki
IIJ Research Laboratory
March, 2018, AsiaBSDCon 2018
slide source
https://github.com/thehajime/asiabsdcon-1803/
2
Intro
I'm going to talk about how great Linux is (sorry)
but whether it is Linux or xxxBSD doesn't matter
a re-composable, re-usable, flexible operating system kernel
should make everyone happy
3
Who am I?
Researcher at IIJ Research Laboratory
working for the Internet
4
5
The original Internet
packet switching network
the basis of the end-to-end principle
the basis of the largest platform
P. Baran, On Distributed Communications Networks, IEEE Transactions on Communications Systems, 1964
6
Today's internet
not yesterday's Internet
various stakeholders / controlled system / security first
refs:
https://justimagine.aurecongroup.com/solving-complex-problems-forget-what-you-currently-know/
https://kentforliberty.liberty.me/letting-government-control-you/
7
Today's internet (cont'd)
it is hard to deliver a packet to the other end without any modification
ref: https://www.slideshare.net/obonaventure/innovation-is-back-in-the-transport-and-network-layers
8
End of evolution/innovation?
the Internet is mature enough (so we don't have to modify it)
we can create another universe
are we satisfied?
people want to innovate but the system is not ready
9
Questions
why do you want to extend your system?
want to introduce a new idea (I have a great protocol)
want to refresh the design (socket API sucks)
want to optimize implementations (too slow for me)
want to secure the code (security first)
10
What's the matter?
11
Problems
ossification/no innovation
no more end-to-end
no more experimental platform
more low-quality code
more waste of time
https://pixabay.com/en/questio
problem-think-thinking-62216
12
Protocol ossification
two obstacles to deploying new protocols
1. middlebox
2. host operating system
13
Ossification: middlebox
TCP segments processed by a router
ref:
https://www.slideshare.net/obonaventure/innovation-is-back-in-the-transport-and-network-layers
14
Ossification: middlebox (cont'd)
TCP segments processed by a NAT router
ref:
https://www.slideshare.net/obonaventure/innovation-is-back-in-the-transport-and-network-layers
15
Ossification: middlebox (cont'd)
how TCP segments may be processed by a typical middlebox today
ref:
https://www.slideshare.net/obonaventure/innovation-is-back-in-the-transport-and-network-layers
16
Ossification: host OS
Deploying protocol extensions takes a long time
Standardized
WS, TS: 1992 (RFC1323)
SACK: 1996 (RFC2018)
OS
WS, TS: Win 2000/Linux (1999)
SACK: default since 1999 (Linux), 2004 (Win)
Fukuda, Kensuke. "An Analysis of Longitudinal TCP Passive Measurements (Short Paper)." Traffic Monitoring and Analysis 40: 29.
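The gap between standardization and OS support can be worked out directly from the years on this slide. A minimal sketch: reading "(1999)" as the Linux date for WS/TS is an assumption, and the names `DEPLOYMENTS` and `deployment_lag` are illustrative only.

```python
# Years taken from the slide: (extension, standardized, OS, year supported/default).
# Assumption: the slide's "(1999)" for WS/TS is read as the Linux date.
DEPLOYMENTS = [
    ("WS/TS", 1992, "Linux", 1999),
    ("SACK", 1996, "Linux", 1999),
    ("SACK", 1996, "Windows", 2004),
]

def deployment_lag(deployments):
    """Return {(extension, os): years between the RFC and OS support}."""
    return {
        (ext, os_name): shipped - standardized
        for ext, standardized, os_name, shipped in deployments
    }

for (ext, os_name), years in deployment_lag(DEPLOYMENTS).items():
    print(f"{ext} on {os_name}: {years} years after the RFC")
```

Under these readings, every extension took several years to reach hosts by default, which is the point of the slide.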
17
Ossification: host OS (cont'd)
updating the base kernel is not an easy task
Android still uses older kernels
container guests use the host kernel (for the network stack)
Android OS distribution with the base Linux kernel version (taken Nov. 2017)
https://developer.android.com/about/dashboards/index.html
18
Design pattern
Multipath TCP (mptcp)
an extension to (traditional) TCP
multipath communication
RFC6824 (experimental)
application compatibility (unlike SCTP)
Good design?
middlebox friendly => OK
unmodified application => OK
http://blog.multipath-tcp.org/blog/html/2015/12/25/commercial_usage_of_multipath_tcp.html
19
Ossification: Google's answer
QUIC (Quick UDP Internet Connections)
a transport protocol over UDP
7% of Internet traffic *1
why UDP?
middlebox friendly
with an encrypted payload, middleboxes can't intercept it
why UDP (cont'd)?
can be implemented in userspace
no need to upgrade the host OS
*1 The QUIC Transport Protocol: Design and Internet-Scale Deployment, ACM SIGCOMM 2017
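Because UDP datagrams go through ordinary sockets, transport logic such as sequence numbers and acknowledgements can live in the application. A minimal sketch of that idea, not QUIC itself: the 5-byte header, the flag values and all function names are hypothetical, and both endpoints run in one process over loopback.

```python
import socket
import struct

# Hypothetical wire format: 32-bit sequence number + 8-bit flags, then payload.
HEADER = struct.Struct("!IB")
FLAG_DATA = 0x0
FLAG_ACK = 0x1

def encode(seq, flags, payload=b""):
    """Build one datagram: header followed by the payload bytes."""
    return HEADER.pack(seq, flags) + payload

def decode(datagram):
    """Split a datagram into (seq, flags, payload)."""
    seq, flags = HEADER.unpack_from(datagram)
    return seq, flags, datagram[HEADER.size:]

def loopback_exchange(message):
    """Send one data segment over loopback UDP and wait for its ACK."""
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        server.settimeout(2.0)
        client.settimeout(2.0)
        client.sendto(encode(1, FLAG_DATA, message), server.getsockname())
        datagram, peer = server.recvfrom(2048)
        seq, _flags, payload = decode(datagram)
        server.sendto(encode(seq, FLAG_ACK), peer)  # acknowledge in user space
        ack_seq, ack_flags, _ = decode(client.recvfrom(2048)[0])
        return payload, ack_seq, ack_flags
    finally:
        client.close()
        server.close()
```

Changing the header or the acknowledgement rule here only requires shipping a new application, not a new kernel.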
20
Ossification: can others deploy in such a way?
no
only by creating another universe
21
If you face obstacles...
you would implement from scratch
in the name of specialization
without the maturity that comes from a long OS history
more low-quality code
more waste of time (reinventing the wheel)
22
Summary of problems
today's internet is not the original Internet
no more end-to-end
to make a breakthrough
be part of a giant
or think differently?
23
Alternatives
Userspace stack
lwip (2002~)
Arrakis [OSDI '14]
IX [OSDI '14]
MegaPipe [OSDI '12]
mTCP [NSDI '14]
SandStorm [SIGCOMM '14]
uTCP [CCR '14]
FastSocket [ASPLOS '16]
SolarFlare (2007~?)
libuinet (2013~)
SeaStar (2014~)
Snabb Switch (2012~)
lightweight VM
MirageOS [ASPLOS '13]
OSv [USENIX '14]
ClickOS [NSDI '14]
Most of them lack rich features, or are one-shot ports without the latest feature updates
24
Alternatives (cont'd)
MegaPipe [OSDI '12]
outperforms baseline Linux .. 582% (for short connections).
new API for applications (existing applications get no benefit for free)
mTCP [NSDI '14]
improves the performance ... by a factor of 25 compared to the latest Linux TCP stack
implemented with very limited TCP extensions
SandStorm [SIGCOMM '14]
our approach with the FreeBSD and Linux stacks ..., demonstrating 2-10x improvements
specialized (existing applications get no benefit for free)
Arrakis [OSDI '14]
improvements of 2-5x in latency and 9x in throughput .. to a well-tuned Linux implementation.
uses a simplified TCP/IP stack (lwip) (loses feature-rich extensions)
25
Does speed matter?
nope, it's only one metric of a system
improving numbers often sacrifices features/functions
As the old joke goes, writing a TCP/IP stack from scratch over the weekend is easy, but making it work on the real-world Internet is more difficult [1].
[1] Antti Kantee, Rump Kernels: No OS? No Problem!, USENIX ;login:, October 2014
26
Our goal
Respect the implementation (and experience) of past decades
Accelerate innovation in the network stack
discover new value through past studies
27
The project
28
Anykernel
Anykernel: originally from the NetBSD rump kernel
using the (unmodified) high-quality code base of a monolithic kernel
in different environments and in different shapes
by gluing on additional pieces
We define an anykernel to be an organization of kernel code which allows the
kernel's unmodified drivers to be run in various configurations such as
application libraries and microkernel style servers, and also as part of a
monolithic kernel. -- Kantee 2012.
29
transforming monolithic kernel code into an anykernel
30
Linux Kernel Library (LKL)
a library (liblkl.{so,a})
out-of-tree architecture (h/w-independent)
run Linux code in various ways
with a reusable library
h/w dependent layer
on Linux/Windows/FreeBSD/Android userspace, unikernel, on UEFI
network simulator (ns-3)
code
2.4KLoC (h/w independent)
6.6KLoC (h/w dep)
31
LKL: internals
core design
outsource machine dependent code
keep application and kernel code untouched
components
1. host backend (host_ops)
2. CPU independent arch. (arch/lkl)
3. application interface
32
1. host backend
environment dependent part
unify an interface across different platforms (rump-hypercall like)
device interface with Virtio
block device <=> disk image
networking <=> TAP, raw socket, DPDK, VDE
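The host backend idea can be sketched as one interface with interchangeable implementations. This is an illustration only, not LKL's actual host_ops structure: the class and method names (`HostOps`, `time_ns`, `mem_alloc`, `console_write`) are hypothetical.

```python
import time
from abc import ABC, abstractmethod

class HostOps(ABC):
    """Hypothetical host backend interface; method names are illustrative."""

    @abstractmethod
    def time_ns(self) -> int: ...

    @abstractmethod
    def mem_alloc(self, size: int) -> bytearray: ...

    @abstractmethod
    def console_write(self, message: str) -> None: ...

class UserspaceHost(HostOps):
    """Backend for an ordinary userspace process."""

    def time_ns(self):
        return time.monotonic_ns()

    def mem_alloc(self, size):
        return bytearray(size)

    def console_write(self, message):
        print(message, end="")

class FixedClockHost(HostOps):
    """Backend with virtual time, e.g. for a simulator or a test."""

    def __init__(self, now_ns):
        self.now_ns = now_ns
        self.log = []

    def time_ns(self):
        return self.now_ns

    def mem_alloc(self, size):
        return bytearray(size)

    def console_write(self, message):
        self.log.append(message)

def report_uptime(host: HostOps, boot_ns: int) -> str:
    """Stand-in for environment-independent code: it only talks to the host ops."""
    elapsed_ms = (host.time_ns() - boot_ns) // 1_000_000
    message = f"up {elapsed_ms} ms\n"
    host.console_write(message)
    return message
```

`report_uptime` never changes when the environment does; only the backend passed in is swapped.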
33
2. CPU independent architecture
architecture (arch/lkl)
transparent architecture binding (as a CPU arch)
requires no modification to the rest of the implementation
thread information (struct thread_info)
irq, timer, syscall handler
access to the underlying layer via host_ops
34
3. Application interface
1. use exposed API (LKL syscall)
2. use host libc (LD_PRELOAD)
3. extend (alternative) libc
35
API 1: use exposed API (LKL syscall)
call entry points of the LKL kernel
lkl_sys_open(), lkl_sys_socket()
almost the same as ordinary syscalls
return values and errno notification are different
can use LKL syscalls and host syscalls simultaneously
read an ext4 file by lkl_sys_read() => write into the host (Windows) by write()
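The difference in return values and errno notification can be sketched as follows. This assumes the library-style entry points report failure as a negative error number in the return value (the kernel's internal convention), while libc wrappers return -1 and set errno separately; the helper names `to_libc_convention` and `describe` are hypothetical.

```python
import errno
import os

def to_libc_convention(ret: int):
    """Map a kernel-style return value to a libc-style (value, errno) pair.

    Kernel style: a negative value is -errno. libc style: -1 plus errno.
    """
    if ret < 0:
        return -1, -ret
    return ret, 0

def describe(ret: int) -> str:
    """Render a kernel-style return value for a human reader."""
    value, err = to_libc_convention(ret)
    if err != 0:
        name = errno.errorcode.get(err, "unknown")
        return f"failed: {name} ({os.strerror(err)})"
    return f"ok: {value}"
```

A caller mixing both kinds of syscalls would apply such a translation only to the library-side results and keep using errno for the host side.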
36
API 2: hijack host standard library
dynamically replace symbols of host syscalls (of libc)
LD_PRELOAD
socket() => lkl_sys_socket()
can use a host binary (executable) as-is
limited to replaceable symbols
needs syscall translation on non-Linux hosts
API 3: extend (alternative) libc
only call LKL syscalls with our own libc
also introduced as a virtual CPU architecture
a program can link against this instead of the host libc
can't access (underlying) host resources directly via the LKL syscalls
as a patch for musl libc
38
Usages
39
NUSE (Network Stack in UserspacE)
What?
install/use an alternate network stack (i.e., TCP/IP)
but it is full-fledged code (Linux)
the host network stack isn't involved
Why?
because the kernel is hard to touch
Android (long delivery time)
container (e.g., docker: shared by others)
40
Demo (Android+mptcp)
41
unikernels
What?
OS instance w/ a single process
on bare-metal
on a hypervisor
as a userspace program
cross-compile with alt-libc
rumprun (by Antti Kantee)
frankenlibc (by Justin Cormack)
Why?
small footprint
quick instantiation
- http://www.linux.com/news/enterprise/cloud-computing/751156-are-cloud-operating-systems-the-next-big-thing-
42
rumprun/frankenlibc unikernel
43
Demo (frankenlibc)
44
Service Function Chain (SFC)
What?
SFC by Unix pipe and LKL
NF in a shell command
ping.sh | nat.sh | pfilter.sh
Why?
a chain w/ VMs is heavyweight
a Unix pipe is useful enough (e.g., packet filter by grep)
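The chaining idea can be sketched with two stages that each consume and produce a stream of text lines, the way stages joined by a Unix pipe read stdin and write stdout. This is an in-process illustration, not the actual nat.sh or pfilter.sh: the text packet format (`src=... dst=... proto=...`) and the function names are hypothetical.

```python
def nat(lines, private, public):
    """Rewrite the source address field, leaving every other field untouched."""
    for line in lines:
        fields = [
            f"src={public}" if field == f"src={private}" else field
            for field in line.split()
        ]
        yield " ".join(fields) + "\n"

def packet_filter(lines, blocked):
    """Drop packets whose text contains the blocked pattern (like `grep -v`)."""
    for line in lines:
        if blocked not in line:
            yield line

# Hypothetical text-encoded packets, one per line.
packets = [
    "src=10.0.0.2 dst=8.8.8.8 proto=icmp\n",
    "src=10.0.0.3 dst=8.8.8.8 proto=tcp\n",
]

# Equivalent in spirit to: generator | nat | filter
chain = packet_filter(nat(packets, "10.0.0.2", "192.0.2.1"), "proto=tcp")
for line in chain:
    print(line, end="")
```

Each stage is lazy, so packets flow through the chain one at a time, much as they would between processes connected by pipes.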
45
Demo
46
What does ping look like?
generates raw data on stdout
the next program can receive it from stdin
https://github.com/thehajime/blog/issues/3
47
grep command as a firewall
https://github.com/thehajime/blog/issues/3
48
49
Chaining (NAT + packet filter)
microbenchmark
netperf, iptables (NAT/ACL)
measure boot latency and TCP goodput
Boot latency
quick boot, reasonable fwd/filter performance, w/o optimizations
50
Network simulation
What?
network simulation (ns-3)
with the Linux network stack
Why?
less abstraction
more realistic
fully reproducible
51
Your experiment
easy to create on your laptop with VMs (UML/Docker/Xen/KVM)
only IF that setup is enough to describe the test
52
Your experiment (cont'd)
huge resources needed to conduct a test
unlikely to be reproducible
tons of configuration scripts
running on different machines/OSes
controlling them is troublesome
distributed debugger...
53
Debugging/Testing
54
Testing with Continuous Integration
Detected bugs (Linux net-next tree)
[net-next,v2] ipv6: Do not iterate over all interfaces when finding source address on specific interface. (v4.2-rc0, during VRF)
[v3] ipv6: Fix protocol resubmission (v4.1-rc7, expanded from v4 stack)
[net-next] ipv6: Check RTF_LOCAL on rt->rt6i_flags instead of rt->dst.flags (v4.1-rc1, during v6 improvement)
[net-next] xfrm6: Fix a offset value for network header in _decode_session6 (v3.19-rc7?, regression only in mip6)
55
Lessons learned
Proof: Anykernel isn't only for NetBSD
Can be applied to other kernels
New (standard?) layers of software stack
56
Summary
I've talked about Linux,
but whether it is Linux or xxxBSD doesn't matter
because an operating system kernel can be replaced / reused
57
References
Code
https://github.com/lkl/linux (LKL)
https://github.com/libos-nuse/frankenlibc
https://github.com/libos-nuse/rumprun
Articles
https://github.com/thehajime/blog/issues/3 (blog)
http://www.iij-ii.co.jp/en/lab/researchers/tazaki/ (my info)
Linux rumpkernel
Hajime Tazaki (tazaki at iij.ad.jp)
@thehajime
58
Backups
(a bit of) History
rump: 2007 (NetBSD)
LKL: 2007 (Linux)
DCE/LibOS: 2008 (Linux/FreeBSD)
LibOS/LKL revival: 2015
LibOS merged to LKL
http://news.mynavi.jp/news/2015/03/25/285/
https://news.ycombinator.com/item?id=9259292
http://www.phoronix.com/scan.php?page=news_item&px=Linux-Library-LibOS
http://lwn.net/Articles/639333/
LKL vs. LibOS
LKL LibOS
LKL vs. LibOS (cont'd)
LoC:
arch/lkl (LKL) < arch/lib (LibOS)
diff: the amount of stub code
in common
no modification to the original Linux code
description of kernel context (by POSIX thread)
outsourced resources (clock, memory, scheduler)
CPU independent architecture
differences
LibOS: implemented with higher-level APIs (timer, irq, kthread) by pthread
LKL: implements IRQ, kthread, timer with pthread in a lower layer
59
LKL: current status
Sent RFC (Nov. 2015)
no update on LKML since then
have evolved a lot
fast syscall path
offload (csum, TSO/LRO)
CONFIG_SMP (WIP)
json config
qemu baremetal (unikernel)
on UEFI
https://github.com/lkl/linux
  • 1. 1 Linux rumpkernelLinux rumpkernel a librarified monolithic kernela librarified monolithic kernel Hajime Tazaki IIJ Research Laboratory March, 2018, AsiaBSDCon 2018 slide source https://github.com/thehajime/asiabsdcon-1803/
  • 2. 2 IntroIntro i'm going to talk about Linux is great (sorry) but Linux or xxxBSD doesn't matter re-composable, re-usable, flexible operating system kernel should make everyone happy
  • 3. 3 Who I am ?Who I am ? Researcher at IIJ Research Laboratory working for the Internet
  • 4.
  • 5. 4 5 The original InternetThe original Internet packet switching network a basis of end-to-end principle a basis of the hugest platform P. Baran, On Distributed Communications Networks, IEEE Transactions on Communications Systems, 1964
  • 6. 6 Today's internetToday's internet not yesterday's Internet various stake holders / controlled system / security fast refs: https://justimagine.aurecongroup.com/solving-complex-problems-forget-what-you-currently-know/ https://kentforliberty.liberty.me/letting-government-control-you/
  • 7. 7 Today's internet (cont'd)Today's internet (cont'd) a packet is hard to deliver to the others without any modifications ref: https://www.slideshare.net/obonaventure/innovation-is-back-in-the-transport-and-network-layers
  • 8. 8 End of evolution/innovation ??End of evolution/innovation ?? internet is mature enough (that we don't have to modify) we can create another universe are we satisfied ? people want to innovate but the system is not ready
  • 9. 9 QuestionsQuestions why do you want to extend your system ? want to put new idea (I have a great protocol) want to refresh design (socket API sucks) want to optimize implementations (too slow for me) want to secure codes (security fast)
  • 11. 11 ProblemsProblems ossification/no innovation no more end-to-end no more experimental platform more low-quality codes more waste of time https://pixabay.com/en/questio problem-think-thinking-62216
  • 12. 12 Protocol ossificationProtocol ossification two obstacles for new protocol deployment 1. middlebox 2. host operating system
  • 13. Ossification: middlebox. TCP segments processed by a router. Ref: https://www.slideshare.net/obonaventure/innovation-is-back-in-the-transport-and-network-layers
  • 14. Ossification: middlebox (cont'd). TCP segments processed by a NAT router. Ref: https://www.slideshare.net/obonaventure/innovation-is-back-in-the-transport-and-network-layers
  • 15. Ossification: middlebox (cont'd). Possible processing of TCP segments by a typical middlebox today. Ref: https://www.slideshare.net/obonaventure/innovation-is-back-in-the-transport-and-network-layers
  • 16. Ossification: host OS. The deployment of protocol extensions takes a long time. Standardized: WS (window scale), TS (timestamps): 1992 (RFC1323); SACK: 1996 (RFC2018). In OSes: WS, TS: Win 2000 / Linux (1999); SACK: enabled by default in 1999 (Linux), 2004 (Win). Ref: Fukuda, Kensuke. "An Analysis of Longitudinal TCP Passive Measurements (Short Paper)." Traffic Monitoring and Analysis 40: 29.
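The lag between standardization and deployment on this slide can be made explicit with a little arithmetic. The sketch below uses only the years quoted on the slide; reading "Win 2000" as the year 2000 is an assumption, and the dictionary layout and function name are illustrative.

```python
# Years quoted on the slide. "Win 2000" is assumed to mean the year 2000.
standardized = {"WS/TS": 1992, "SACK": 1996}
deployed = {
    ("WS/TS", "Linux"): 1999,
    ("WS/TS", "Windows"): 2000,  # assumption: Windows 2000 release year
    ("SACK", "Linux"): 1999,
    ("SACK", "Windows"): 2004,
}


def deployment_lag(deployed, standardized):
    """Return the years between standardization and OS deployment."""
    return {key: year - standardized[key[0]] for key, year in deployed.items()}


lags = deployment_lag(deployed, standardized)
for (extension, os_name), years in sorted(lags.items()):
    print(f"{extension} on {os_name}: {years} years after standardization")
```

Under these assumptions every extension took between three and eight years to reach a mainstream OS.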
  • 17. Ossification: host OS (cont'd). Updating the base kernel is not an easy task: Android still uses older kernels; container guests use the host kernel (for the network stack). Figure: Android OS distribution with the base Linux kernel version (taken Nov. 2017), https://developer.android.com/about/dashboards/index.html
  • 18. Design pattern. Multipath TCP (mptcp): an extension to (traditional) TCP for multipath communication; RFC6824 (experimental); application compatibility (unlike SCTP). Good design? Middlebox friendly => OK; unmodified applications => OK. http://blog.multipath-tcp.org/blog/html/2015/12/25/commercial_usage_of_multipath_tcp.html
  • 19. Ossification: Google's answer. QUIC (Quick UDP Internet Connections): a transport protocol over UDP; 7% of Internet traffic *1. Why UDP? Middlebox friendly: with an encrypted payload, middleboxes can't intercept. Why UDP (cont'd)? It can be implemented in userspace, so there is no need to upgrade the host OS. *1 The QUIC Transport Protocol: Design and Internet-Scale Deployment, ACM SIGCOMM 2017.
  • 20. Ossification: can others deploy in such a way? No; only by creating another universe.
  • 21. If you face obstacles... you would implement from scratch, in the name of specialization: lacking the maturity of an OS's history; more low-quality code; more waste of time (reinventing the wheel).
  • 22. Summary of problems. Today's Internet is not the original Internet: no more end-to-end. To make a breakthrough: be a part of a giant, or think differently?
  • 23. Alternatives. Userspace stacks: lwip (2002~), Arrakis [OSDI '14], IX [OSDI '14], MegaPipe [OSDI '12], mTCP [NSDI '14], SandStorm [SIGCOMM '14], uTCP [CCR '14], FastSocket [ASPLOS '16], SolarFlare (2007~?), libuinet (2013~), SeaStar (2014~), Snabb Switch (2012~). Lightweight VMs: MirageOS [ASPLOS '13], OSv [USENIX '14], ClickOS [NSDI '14]. Most of them lack feature-richness, or are one-shot ports without the latest feature updates.
  • 24. Alternatives (cont'd). MegaPipe [OSDI '12]: "outperforms baseline Linux .. 582% (for short connections)"; new API for applications (existing applications get no benefit for free). mTCP [NSDI '14]: "improves the performance ... by a factor of 25 compared to the latest Linux TCP stack"; implemented with very limited TCP extensions. SandStorm [SIGCOMM '14]: "our approach with the FreeBSD and Linux stacks ..., demonstrating 2-10x improvements"; specialized (existing applications get no benefit for free). Arrakis [OSDI '14]: "improvements of 2-5x in latency and 9x in throughput .. to a well-tuned Linux implementation"; uses a simplified TCP/IP stack (lwip) (loses feature-rich extensions).
  • 25. Does speed matter? Nope, it's just one metric of a system; improving the numbers often sacrifices features/functions. "As the old joke goes, writing a TCP/IP stack from scratch over the weekend is easy, but making it work on the real-world Internet is more difficult" [1]. [1] Antti Kantee, Rump Kernels: No OS? No Problem!, USENIX ;login:, October 2014.
  • 26. Our goal. Respect the implementation (and experience) of past decades; accelerate the innovation of the network stack; discover new value through past studies.
  • 28. Anykernel. Anykernel: originally from the NetBSD rump kernel; using the (unmodified) high-quality code base of a monolithic kernel in different environments, in different shapes, by gluing on additional pieces. "We define an anykernel to be an organization of kernel code which allows the kernel's unmodified drivers to be run in various configurations such as application libraries and microkernel style servers, and also as part of a monolithic kernel." -- Kantee 2012.
  • 29. Transforming monolithic kernel code into an Anykernel.
  • 30. Linux Kernel Library (LKL). A library (liblkl.{so,a}); out-of-tree architecture (h/w-independent); run Linux code in various ways with a reusable library; h/w-dependent layer on Linux/Windows/FreeBSD/Android userspace, unikernel, on UEFI, network simulator (ns-3). Code: 2.4 KLoC (h/w independent), 6.6 KLoC (h/w dependent).
  • 31. LKL: internals. Core design: outsource machine-dependent code; keep application and kernel code untouched. Components: 1. host backend (host_ops), 2. CPU-independent arch. (arch/lkl), 3. application interface.
  • 32. 1. Host backend. The environment-dependent part: unifies an interface across different platforms (rump-hypercall like); device interface with Virtio: block device <=> disk image; networking <=> TAP, raw socket, DPDK, VDE.
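The idea of a host backend, one table of callbacks that hides the environment from the kernel code, can be sketched as follows. This is a minimal illustration, not LKL's actual host_ops structure: the field names (`print_msg`, `time_ns`, `mem_alloc`) and the functions are hypothetical, and the Python runtime stands in for a POSIX host.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class HostOps:
    """Illustrative table of host callbacks (field names are hypothetical)."""
    print_msg: Callable[[str], None]
    time_ns: Callable[[], int]
    mem_alloc: Callable[[int], bytearray]


def posix_like_host_ops(log):
    """One possible backend; another host would supply different callables."""
    return HostOps(print_msg=log.append,
                   time_ns=time.monotonic_ns,
                   mem_alloc=bytearray)


def kernel_boot(ops):
    """'Kernel' code touches its environment only through the ops table."""
    start = ops.time_ns()
    page = ops.mem_alloc(4096)  # zero-filled buffer from the host
    ops.print_msg(f"booted, allocated {len(page)} bytes")
    assert ops.time_ns() >= start
    return page


log = []
page = kernel_boot(posix_like_host_ops(log))
```

Porting to another environment then means writing a new `HostOps` instance, while `kernel_boot` stays untouched.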
  • 33. 2. CPU independent architecture. Architecture (arch/lkl): transparent architecture binding (as a CPU arch); requires no modification to the rest of the implementation; provides thread information (struct thread_info), irq, timer, syscall handler; accesses the underlying layer via host_ops.
  • 34. 3. Application interface. 1. use the exposed API (LKL syscall); 2. use the host libc (LD_PRELOAD); 3. extend an (alternative) libc.
  • 35. API 1: use exposed API (LKL syscall). Call the entry points of the LKL kernel: lkl_sys_open(), lkl_sys_socket(); almost the same as ordinary syscalls, but the return value and errno notification differ; LKL syscalls and host syscalls can be used simultaneously: read an ext4 file by lkl_sys_read() => write into the host (Windows) by write().
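One way the return value and errno notification can differ: raw Linux syscalls report failure as a negative error code in the return value, while libc wrappers return -1 and set errno separately. Assuming the LKL entry points follow the raw kernel convention (the slide does not spell this out), a thin adapter might look like the sketch below. `fake_lkl_sys_open` is a hypothetical stand-in, not the real API.

```python
import errno
import os

MAX_ERRNO = 4095  # Linux reserves the range [-4095, -1] for error codes


def to_libc_style(raw_ret):
    """Map a kernel-style return value to a libc-style (ret, errno) pair."""
    if -MAX_ERRNO <= raw_ret < 0:
        return -1, -raw_ret
    return raw_ret, 0


def fake_lkl_sys_open(path):
    """Hypothetical stand-in for an LKL entry point (not the real API)."""
    return 3 if path == "/etc/hostname" else -errno.ENOENT


ret, err = to_libc_style(fake_lkl_sys_open("/missing"))
if ret == -1:
    print("open failed:", os.strerror(err))
```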
  • 36. API 2: hijack host standard library. Dynamically replace the symbols of host syscalls (of libc) via LD_PRELOAD: socket() => lkl_sys_socket(); can use a host binary (executable) as-is; limitations: only some symbols are replaceable; needs syscall translation on a non-Linux host.
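The symbol replacement behind LD_PRELOAD comes down to search order: the preloaded library is consulted before the host libc, and the first definition found wins. The sketch below models that with plain dictionaries; the "libraries" and their return strings are hypothetical stand-ins, and no real dynamic linking takes place.

```python
def resolve(symbol, libraries):
    """Return the first definition of symbol, searching libraries in order."""
    for library in libraries:
        if symbol in library:
            return library[symbol]
    raise LookupError(symbol)


# Hypothetical stand-ins: each "library" maps symbol names to callables.
host_libc = {
    "socket": lambda *args: "host socket",
    "getpid": lambda: "host getpid",
}
lkl_hijack = {
    "socket": lambda *args: "lkl_sys_socket",
}

# As with LD_PRELOAD, the hijack library is searched before the host libc.
search_order = [lkl_hijack, host_libc]
print(resolve("socket", search_order)())  # served by the hijack library
print(resolve("getpid", search_order)())  # falls through to the host libc
```

The fall-through for `getpid` also illustrates the limitation noted on the slide: only the symbols the hijack library defines are redirected.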
  • 37. API 3: extend (alternative) libc. Only call LKL syscalls, with our own libc; also introduced as a virtual CPU architecture; a program can link this instead of the host libc; can't access (underlying) host resources directly via this; LKL syscalls as a patch for musl libc.
  • 39. NUSE (Network Stack in UserspacE). What? Install/use an alternate network stack (i.e., TCP/IP), but with full-fledged code (Linux); the host network stack isn't involved. Why? Because the kernel is hard to touch: Android (long delivery time), containers (e.g., docker: shared with others).
  • 41. Unikernels. What? An OS instance w/ a single process: on bare metal, on a hypervisor, or as a userspace program; cross-compiled with an alt-libc: rumprun (by Antti Kantee), frankenlibc (by Justin Cormack). Why? Small footprint; quick instantiation. http://www.linux.com/news/enterprise/cloud-computing/751156-are-cloud-operating-systems-the-next-big-thing-
  • 44. Service Function Chain (SFC). What? SFC by Unix pipe and LKL: an NF in a shell command: ping.sh | nat.sh | pfilter.sh. Why? A chain w/ VMs is heavyweight; the Unix pipe is useful enough (e.g., packet filter by grep).
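The shape of the chain ping.sh | nat.sh | pfilter.sh can be mimicked with generators, each stage consuming the previous stage's output as a pipe would. This is only a sketch: the text record format, the addresses, and the function names are made up for illustration, whereas the stages on the slide exchange raw packet data between LKL-based commands.

```python
def ping(count, src="10.0.0.1", dst="10.0.0.2"):
    """Emit one text record per packet (record format is made up)."""
    for seq in range(count):
        yield f"ICMP {src} > {dst} seq={seq}"


def nat(records, inside, outside):
    """Rewrite the inside source address to the outside address."""
    for record in records:
        yield record.replace(f" {inside} >", f" {outside} >")


def pfilter(records, deny):
    """Drop records containing the deny pattern, like `grep -v`."""
    return (record for record in records if deny not in record)


# Equivalent in spirit to: ping.sh | nat.sh | pfilter.sh
chain = pfilter(nat(ping(3), "10.0.0.1", "192.0.2.1"), deny="seq=1")
out = list(chain)
for record in out:
    print(record)
```

Each stage is lazy, so records flow through one at a time, much as they would through a Unix pipe.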
  • 46. What does ping look like? It generates raw data to stdout; the next program can receive it from stdin. https://github.com/thehajime/blog/issues/3
  • 47. grep command as firewall
  • 50. Chaining (NAT + packet filter). Microbenchmark: netperf, iptables (NAT/ACL); measure boot latency. Figure panels: TCP goodput; boot latency. Result: quick boot, reasonable fwd/filter performance, w/o optimizations.
  • 51. Network simulation. What? Network simulation (ns-3) with the Linux network stack. Why? Less abstraction; more realistic; fully reproducible.
  • 52. Your experiment. Easy to create on your laptop with VMs (UML/Docker/Xen/KVM), but only IF that is enough to describe the test.
  • 53. Your experiment (cont'd). Huge resources are needed to conduct a test; not likely to be reproducible; tons of configuration scripts; running on different machines/OSes; controlling them is troublesome; distributed debugger...
  • 55. Testing with Continuous Integration. Detected bugs (Linux net-next tree): [net-next,v2] ipv6: Do not iterate over all interfaces when finding source address on specific interface. (v4.2-rc0, during VRF); [v3] ipv6: Fix protocol resubmission (v4.1-rc7, expanded from v4 stack); [net-next] ipv6: Check RTF_LOCAL on rt->rt6i_flags instead of rt->dst.flags (v4.1-rc1, during v6 improvement); [net-next] xfrm6: Fix a offset value for network header in _decode_session6 (v3.19-rc7?, regression only in mip6).
  • 56. Lessons learned. Proof: Anykernel isn't only for NetBSD; it can be applied to other kernels; new (standard?) layers of the software stack.
  • 57. Summary. I've talked about Linux, but Linux or xxxBSD doesn't matter, because an operating system kernel can be replaced / reused.
  • 59. Linux rumpkernel. Hajime Tazaki (tazaki at iij.ad.jp), @thehajime
  • 60. Backups. (A bit of) history: rump: 2007 (NetBSD); LKL: 2007 (Linux); DCE/LibOS: 2008 (Linux/FreeBSD); LibOS/LKL revival: 2015; LibOS merged to LKL.
  • 62. LKL vs. LibOS (cont'd). Figure labels: LKL, LibOS.
  • 63. LoC: arch/lkl (LKL) < arch/lib (LibOS); diff: the amount of stub code. Commons: no modification to the original Linux code; description of kernel context (by POSIX thread); outsourced resources (clock, memory, scheduler); CPU-independent architecture. Diffs: LibOS: implemented with a higher-level API (timer, irq, kthread) by pthread; LKL: implements IRQ, kthread, timer with pthread in a lower layer.
  • 64. LKL: current status. Sent RFC (Nov. 2015); no update on LKML since then; has evolved a lot: fast syscall path, offload (csum, TSO/LRO), CONFIG_SMP (WIP), json config, qemu baremetal (unikernel), on UEFI. https://github.com/lkl/linux