3. Isolation
● From in-guest kernel/userspace
• Provided by Xen
• Buggy emulation blurs the line
● From trusted computing base (TCB)
• Possible via Xen Security Modules
• Move the introspection system out of dom0!
4. Xen Security Modules (XSM)
● Usable since Xen 4.3
and Linux 3.8
● Disaggregate the TCB
● Available on both
x86 and ARM
● Not enabled by default
6. Interposition
● Trap to Xen when something of interest
happens within the guest
• Enable optional hardware traps
• CLTS, HLT, LGDT, LIDT, LLDT, LTR, SGDT, MOV from
CR3, MOV from CR8, MOV to CR0, MOV to CR3, MOV
to CR4, MOV to CR8, MOV DR, MWAIT, INT3, INTO,
MTF, etc.
• See full list in Intel SDM 3c 25.1.3
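A minimal sketch of what "enabling optional hardware traps" amounts to: setting bits in a VM-execution control word. The bit positions below are my reading of the primary processor-based VM-execution controls in the Intel SDM and should be checked against the manual. The TRAPS table and the helper names are hypothetical and are not Xen code.

```python
# Illustrative bit positions in the primary processor-based
# VM-execution controls (verify against the Intel SDM before relying on them).
TRAPS = {
    "HLT": 1 << 7,            # HLT exiting
    "MOV to CR3": 1 << 15,    # CR3-load exiting
    "MOV from CR3": 1 << 16,  # CR3-store exiting
    "MOV DR": 1 << 23,        # MOV-DR exiting
    "MTF": 1 << 27,           # monitor trap flag
}


def enable_traps(controls: int, names) -> int:
    """Return the control word with the named optional traps switched on."""
    for name in names:
        controls |= TRAPS[name]
    return controls


def enabled_traps(controls: int):
    """List which of the known optional traps are set in the control word."""
    return sorted(name for name, bit in TRAPS.items() if controls & bit)


if __name__ == "__main__":
    word = enable_traps(0, ["MOV to CR3", "MTF"])
    print(hex(word), enabled_traps(word))
```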
8. EPT caveats
“An EPT violation that occurs as a
result of execution of a read-modify-write
operation sets bit 1 (data write). Whether it
also sets bit 0 (data read) is implementation-
specific and, for a given implementation,
may differ for different kinds of read-modify-
write operations.” - Intel SDM 3c
9. EPT caveats
● “Why can't the hardware report the true
characteristics right away?” - Jan Beulich
● “when spec says so, there is a reason but I
can't tell here. :-)” - Kevin Tian
● Well... let's just mark all write violations as
read violations too...
● Patched in Xen 4.5
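The workaround above can be sketched as a small decoder for the EPT-violation exit qualification. Bits 0, 1 and 2 (data read, data write, instruction fetch) follow the SDM text quoted earlier. The function name and the returned dictionary are hypothetical, and this is not the actual Xen 4.5 patch.

```python
# Low bits of the EPT-violation exit qualification.
EPT_QUAL_READ = 1 << 0   # data read
EPT_QUAL_WRITE = 1 << 1  # data write
EPT_QUAL_EXEC = 1 << 2   # instruction fetch


def decode_ept_violation(qualification: int) -> dict:
    """Decode the access type, treating every write as a read as well.

    Hardware may or may not set the read bit for a read-modify-write
    access, so the sketch normalises the result: write implies read.
    """
    read = bool(qualification & EPT_QUAL_READ)
    write = bool(qualification & EPT_QUAL_WRITE)
    execute = bool(qualification & EPT_QUAL_EXEC)
    if write:
        read = True  # the "mark all write violations as read too" rule
    return {"r": read, "w": write, "x": execute}


if __name__ == "__main__":
    # A read-modify-write that only reported the write bit.
    print(decode_ept_violation(EPT_QUAL_WRITE))
```

With this normalisation a listener sees the same access flags regardless of which implementation-specific behaviour the CPU exhibits.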
10. EPT caveats
● Requires relaxing the
EPT permissions
● Requires singlestepping
the vCPU
● Many VMEXITs not
shown in picture!
● Fixed for Xen 4.6
11. EPT caveats
● Race-condition if VM
has multiple vCPU
● No solution for this
problem prior to Xen 4.6
● New method introduced
in Xen 4.6 that solves
this: altp2m
12. altp2m
● Add support for
multiple EPTs for
second stage lookup!
● One table for
“restricted view”
● One table for “normal
view”
13. altp2m
● EPT pointer can be
swapped in the
VMCS
● No need to change
EPT PTE permissions
all the time
● No race condition
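A toy model of the idea, assuming each vCPU simply holds an index into a list of views (standing in for the EPT pointer in its VMCS). All class and method names are hypothetical. The point is only that switching one vCPU to the normal view leaves the restricted view, and every other vCPU using it, untouched.

```python
from dataclasses import dataclass, field

DEFAULT_PERMS = "rwx"  # pages without an explicit entry are unrestricted


@dataclass
class View:
    """One second-stage table: gfn -> permission string."""
    perms: dict = field(default_factory=dict)

    def allows(self, gfn: int, access: str) -> bool:
        return access in self.perms.get(gfn, DEFAULT_PERMS)


@dataclass
class VCPU:
    view_id: int = 0  # stands in for the EPT pointer in the VMCS


class Domain:
    def __init__(self, n_vcpus: int):
        self.views = [View()]  # view 0 is the "normal view"
        self.vcpus = [VCPU() for _ in range(n_vcpus)]

    def create_view(self, perms: dict) -> int:
        self.views.append(View(dict(perms)))
        return len(self.views) - 1

    def switch_view(self, vcpu_id: int, view_id: int) -> None:
        # Swapping the pointer; no page-table entry is modified.
        self.vcpus[vcpu_id].view_id = view_id

    def access(self, vcpu_id: int, gfn: int, access: str) -> str:
        view = self.views[self.vcpus[vcpu_id].view_id]
        return "ok" if view.allows(gfn, access) else "violation"


if __name__ == "__main__":
    dom = Domain(n_vcpus=2)
    restricted = dom.create_view({0x1000: "r"})  # trap writes and execution
    for vcpu_id in range(2):
        dom.switch_view(vcpu_id, restricted)
    dom.switch_view(0, 0)  # let vCPU 0 step over the access
    print(dom.access(0, 0x1000, "w"), dom.access(1, 0x1000, "w"))
```

While vCPU 0 runs on the normal view, vCPU 1 still faults on the protected page, which is the window the permission-relaxing approach leaves open.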
14. Interposition
● Once trapped to Xen, forward events
• Formerly known as mem_event
• Renamed and reworked as vm_event in 4.6
● Request/response via shared memory ring
• Monitor page used for VMI related events
• Two additional pages: memory sharing and
paging
15. vm_event & mem_access & monitor
● Let's keep track of subsystem names
● vm_event is the underlying request/response
mechanism
● mem_access memops control EPT
● monitor_op domctls control all other optional
VM execution traps
16. Event delivery structures in 4.6
● Defined in xen/vm_event.h public header
● Easily extendable and versioned
● No more hackery
● Event response can trigger specific behavior
without additional hypercalls
• Trigger emulation, singlestepping, swap altp2m...
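A rough sketch of the request/response pattern, assuming a bounded queue in place of the shared memory ring. The flag names and values, the Request/Response fields and the policy in handle_event are all made up for illustration and do not mirror the definitions in xen/vm_event.h.

```python
from collections import deque
from dataclasses import dataclass

# Hypothetical response flags (not the values from the public header).
FLAG_EMULATE = 1 << 0
FLAG_TOGGLE_SINGLESTEP = 1 << 1
FLAG_SWITCH_ALTP2M = 1 << 2


@dataclass
class Request:
    vcpu_id: int
    reason: str
    gfn: int


@dataclass
class Response:
    vcpu_id: int
    flags: int = 0
    altp2m_idx: int = 0


class EventRing:
    """Bounded request queue plus response queue, standing in for the ring."""

    def __init__(self, capacity: int = 4):
        self.capacity = capacity
        self.requests = deque()
        self.responses = deque()

    def put_request(self, req: Request) -> bool:
        if len(self.requests) >= self.capacity:
            return False  # ring full; the producer has to wait
        self.requests.append(req)
        return True

    def get_request(self):
        return self.requests.popleft() if self.requests else None

    def put_response(self, rsp: Response) -> None:
        self.responses.append(rsp)

    def get_response(self):
        return self.responses.popleft() if self.responses else None


def handle_event(req: Request) -> Response:
    """Monitor-side policy: pick the follow-up behaviour via response flags."""
    if req.reason == "mem_access":
        return Response(req.vcpu_id, flags=FLAG_EMULATE)
    return Response(req.vcpu_id,
                    flags=FLAG_TOGGLE_SINGLESTEP | FLAG_SWITCH_ALTP2M,
                    altp2m_idx=0)


def apply_response(rsp: Response):
    """Hypervisor-side: act on the flags without any extra hypercall."""
    actions = []
    if rsp.flags & FLAG_EMULATE:
        actions.append("emulate")
    if rsp.flags & FLAG_TOGGLE_SINGLESTEP:
        actions.append("toggle singlestep")
    if rsp.flags & FLAG_SWITCH_ALTP2M:
        actions.append(f"switch to altp2m view {rsp.altp2m_idx}")
    return actions


if __name__ == "__main__":
    ring = EventRing()
    ring.put_request(Request(vcpu_id=0, reason="mem_access", gfn=0x1000))
    ring.put_response(handle_event(ring.get_request()))
    print(apply_response(ring.get_response()))
```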
17. VMI with Xen on ARM
● ARM has two-stage paging similar to EPT
● mem_access implemented for 4.6
● Some caveats:
• No singlestepping?
• Can be worked around but it's a pain
• Split-TLB ambiguities
18. ARM mem_access
● ARM PTEs have fewer software-programmable
bits than EPT entries
● ARM mem_access requires maintaining a
Radix-tree to keep track of PTEs with
custom permissions
● Radix-tree keyed with GPA
19. ARM mem_access
● For a 2nd-stage violation ARM provides the
faulting GVA
● GPA only provided if fault happened during
1st-stage pagetable walk
● Xen needs to translate GVA to GPA to
perform Radix-tree lookup
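The lookup path can be sketched as follows, with plain dictionaries standing in for both the guest's 1st-stage page table and the Radix-tree. Keying by guest frame number (GPA >> 12) and falling back to "rwx" for untracked pages are assumptions of this sketch. The function names are hypothetical.

```python
PAGE_SHIFT = 12
PAGE_MASK = (1 << PAGE_SHIFT) - 1


def gva_to_gpa(stage1: dict, gva: int):
    """Translate a guest virtual address through a toy 1st-stage table.

    stage1 maps virtual page number -> guest frame number.
    Returns None when the page is not mapped.
    """
    gfn = stage1.get(gva >> PAGE_SHIFT)
    if gfn is None:
        return None
    return (gfn << PAGE_SHIFT) | (gva & PAGE_MASK)


def lookup_access(custom_perms: dict, stage1: dict, gva: int, default="rwx"):
    """Find the custom permission recorded for the page behind a faulting GVA.

    custom_perms stands in for the Radix-tree: guest frame number -> perms.
    """
    gpa = gva_to_gpa(stage1, gva)
    if gpa is None:
        return None  # translation failed, nothing to look up
    return custom_perms.get(gpa >> PAGE_SHIFT, default)


if __name__ == "__main__":
    stage1 = {0x400: 0x80}    # virtual page 0x400 -> guest frame 0x80
    custom_perms = {0x80: "r"}  # frame 0x80 is tracked as read-only
    print(lookup_access(custom_perms, stage1, 0x400123))
```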
20. ARM mem_access
● Native CPU instructions to perform GVA to
GPA translation
● Performs lookup as data-fetch access
● What if we trapped an instruction-fetch
access?
• In-guest translation hits iTLB
• Xen hits dTLB
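A toy model of why this matters, assuming two independent translation caches that are each filled from the page table on a miss. The class and the frame labels are hypothetical. It only shows that a data-style lookup can return a different frame than the one the guest actually executes from.

```python
class SplitTLB:
    """Separate instruction and data translation caches over one page table."""

    def __init__(self, page_table: dict):
        self.page_table = page_table  # virtual page number -> frame
        self.itlb = {}
        self.dtlb = {}

    def translate(self, vpn: int, kind: str):
        """kind is 'fetch' for instruction fetches, anything else is data."""
        tlb = self.itlb if kind == "fetch" else self.dtlb
        if vpn not in tlb:
            tlb[vpn] = self.page_table[vpn]  # miss: walk the page table
        return tlb[vpn]

    def flush(self) -> None:
        self.itlb.clear()
        self.dtlb.clear()


if __name__ == "__main__":
    page_table = {0x10: "frame_A"}
    tlb = SplitTLB(page_table)
    tlb.translate(0x10, "fetch")    # guest executes: iTLB caches frame_A
    page_table[0x10] = "frame_B"    # guest rewrites its page table
    print(tlb.translate(0x10, "fetch"), tlb.translate(0x10, "data"))
```

After the page table is rewritten, the instruction side still resolves to the old frame while a data-side translation (the kind Xen performs) sees the new one. Calling flush() makes both sides agree again, but only by discarding the cached instruction translation.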
21. ARM Split-TLB problem
● Split-TLB is a real rootkit problem
• ShadowWalker, MoRE, etc.
● Guest can load the iTLB with rootkit page
and dTLB with benign page
● Flushing the TLB does not help: the iTLB
translation may be lost if the PT no longer
represents the cached translation
22. ARM future work
● Execution tracing with mem_access may be
problematic
● Use Secure Monitor Call (SMC) instruction
injection!
● Similar to 0xCC injection on x86
● TODO
23. x86 future work
● altp2m is primarily designed to be used with
Intel #VE
● VMCALL instruction to perform EPTP
switching from the guest
● Hybrid VMI
● KVM events
24. Lessons learnt
● Why aren't we using git pulls?
• Patches on the mailing list without branch-off point
specified
• Carving patches from mbox is a pain
• Start providing a public git branch for your
series!!
25. Lessons learnt
● Provide build-testing for the community
• It's a waste of time to wait for review on
something that's broken
• Check for style issues automatically?
• Travis-CI is OK but can time-out on large series
• https://github.com/tklengyel/xen/tree/travis