This document discusses improving boot-up time on MeeGo handsets. It describes experiments to optimize boot time by implementing hibernation with swsusp and TuxOnIce on an ARM-based reference board similar to the N900. Boot graph and memory usage analyses were performed. Implementing hibernation with TuxOnIce and shrinking memory usage before hibernation reduced boot time from over 120 seconds to around 20 seconds. Further optimization ideas include using faster storage, decreasing image size by killing bloated apps, and lazy image/page loading.
8. Impact of boot-up time
For consumer client device
User experience
TV, IVI, Camera
Immediate action is preferable right after power on.
Tablet, netbook, handset
Is cold start really necessary?
More complicated S/W stacks, more memory consumed.
Mass Production test
The more time a device spends on the production line, the more expensive.
9. Boot-Up time definition
Until when?
When Login prompt appears.
When Desktop shows up.
When Network is available.
When Browser is ready.
When it can take a picture.
When CPU goes into idle.
This depends on:
Your H/W configuration.
Your S/W configuration.
Your system requirements.
The shortest isn’t always the best.
11. Measurement method (userland)
uptime
/ # cat /proc/uptime
18.73 14.24
/ # cat /proc/uptime
20.55 16.05
bootchart
A newer version is released in MeeGo
The SVG is created directly; no additional tool is needed.
entire measurement
Including bootloader, kernel and userland
grabserial
show_delta, again
oprofile
ETM, Embedded Trace Macrocell, H/W assisted
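The uptime readings on this slide can be turned into deltas with a few lines of Python. In /proc/uptime the first field is seconds since boot and the second is cumulative idle seconds. This is only a minimal sketch; the helper names parse_uptime and delta are hypothetical and not part of any tool named above.

```python
def parse_uptime(text):
    """Parse the contents of /proc/uptime into (uptime_s, idle_s)."""
    up, idle = text.split()[:2]
    return float(up), float(idle)


def delta(before, after):
    """Elapsed and idle seconds between two /proc/uptime readings."""
    up0, idle0 = parse_uptime(before)
    up1, idle1 = parse_uptime(after)
    return round(up1 - up0, 2), round(idle1 - idle0, 2)


if __name__ == "__main__":
    # Sample readings copied from the slide.
    elapsed, idle = delta("18.73 14.24", "20.55 16.05")
    print(f"elapsed {elapsed} s, idle {idle} s")
```

Taking one reading at a fixed point in the boot scripts (for example when the desktop is up) gives a cheap userland boot-time figure without extra tooling.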
12. Existing Optimization techniques
kernel optimization
asynchronous initcall
asynchronous resume/suspend
misc: preset lpj, no probe, no console, deferred module loading
userland optimization
initscript: upstart or systemd. Do it in parallel
readahead
prelink
hibernation based optimization
snapshot boot
InstantBoot
Warp2
QuickBoot
BIOS/bootloader assisted.
13. Is cold start still necessary?
Do we need cold start so often?
Flashing a hibernation image in advance could reduce the time spent on the production line.
Optimization may depend on your product specific part
S/W configuration
H/W configuration
Your system requirement
Wouldn’t hibernation be ok in most cases?
15. Handset requirement
Responsiveness of device/applications
Quick response could improve UX, especially on handsets.
One touch can choose a friend from "contact list".
One touch can start camera. Same as digital camera.
One touch can start web browsing.
A call has to be processed within a short time, as required by the operator spec.
Resolving dynamic libraries takes more time than swapping in pages.
All major applications can be started but invisible
Then, visible upon request.
RAM is occupied with started applications/daemons.
16. Handset Boot-Up time
N900 boot-up takes ~40 sec
Until Desktop shows up.
Number of applications 137
Swap status
22. Target spec
OMAP3 based reference board
Similar to N900
512MB RAM
MeeGo Handset
Number of applications ~161
~120 sec with all application boot-up done
Swap status
23. No hibernation support for ARM
There was no hibernation support for ARM.
Picked up old patch, and upgraded to v2.6.35.
Rejected by RMK because:
Needs to be synchronized with suspend-to-RAM
Lack of PXA support
coprocessor differences between ARM versions
mrc p15, 0, %0, c2, c0, 0
At least, it works!
Let’s proceed.
24. Which hibernation method to use?
Three implementations of hibernation
1. swsusp
Included in mainline kernel as default.
2. uswsusp
Userland implementation
3. tuxonice
Out of kernel, but many features
Compression of images
Multithreaded I/O
readahead
LVM support
28. Use mtdblock rather than eMMC
mtdblock is much faster than eMMC.
mtdblock
~23 MB/sec/READ
eMMC
~20 MB/sec/READ
~15 MB/sec/READ
This is a HACK since:
mtdblock itself is bogus without wear-leveling support.
mtdswap is *volatile*.
Good performance
But cannot be used for hibernation.
Need non-volatile mtdswap!!
31. Port TuxOnIce on ARM
TuxOnIce has many optimization features:
Compression of images
Multithreaded I/O
readahead
LVM support
To drop pagecache
echo -2 > /sys/power/tuxonice/image_size_limit
To start hibernation
echo disk > /sys/power/tuxonice/do_hibernation
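The two echo commands above can be wrapped in a small script so that the pagecache drop always precedes the hibernation request. This is a minimal sketch: the sysfs entry names are copied from this slide as written, while the helper names toi_write and hibernate, and the base parameter (used only to allow a dry run against a scratch directory), are hypothetical.

```python
import os

# Directory and entry names are taken from the slide, not verified here.
TOI_SYSFS = "/sys/power/tuxonice"


def toi_write(name, value, base=TOI_SYSFS):
    """Write one value to a TuxOnIce sysfs entry, like `echo value > entry`."""
    path = os.path.join(base, name)
    with open(path, "w") as f:
        f.write(f"{value}\n")
    return path


def hibernate(base=TOI_SYSFS):
    """Drop pagecache via the image size limit, then request hibernation."""
    toi_write("image_size_limit", -2, base)
    toi_write("do_hibernation", "disk", base)
```

Calling hibernate() with the default base needs root and a TuxOnIce-patched kernel; pointing base at a temporary directory only creates plain files, which is useful for checking the order of writes.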
37. What is the bottleneck?
The less RAM is consumed, the shorter the boot time.
But memory cannot be squeezed any further below a certain size.
In our case:
size: ~110 MB
~70% of boot time is spent on (compressed) image restoration.
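A rough back-of-the-envelope check of these figures, as a sketch only. It assumes that the ~20 second boot time quoted in the summary applies to this ~110 MB case and that ~110 MB is the amount of data restored; neither is stated explicitly on the slide. The function name restore_share is hypothetical.

```python
def restore_share(boot_s, restore_fraction, image_mb):
    """Return (seconds spent restoring, effective restore rate in MB/s)."""
    restore_s = boot_s * restore_fraction
    return restore_s, image_mb / restore_s


if __name__ == "__main__":
    # Assumed inputs: ~20 s boot, ~70% in restoration, ~110 MB restored.
    restore_s, mb_per_s = restore_share(20.0, 0.70, 110.0)
    print(f"restore ~{restore_s:.0f} s, effective ~{mb_per_s:.1f} MB/s")
```

Under those assumptions the restoration phase takes about 14 seconds, an effective rate of roughly 8 MB/s, which is why storage read speed and image size are the two levers discussed in the following slides.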
40. Why unevictable?
Recent SoC has smart coprocessors
GPU, DSP and H/W accelerators.
They may have IOMMU.
More memory could be shared with coprocessors
http://en.wikipedia.org/wiki/IOMMU
41. Why does IOMMU have an effect?
pages have to be DMA’able.
Shared pages have to be pinned.
They shouldn’t be swapped out.
Unevictable
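The kernel reports unevictable memory in /proc/meminfo (the Unevictable and Mlocked fields), which gives a quick way to watch this number before hibernating. The sketch below is illustrative; the function names are hypothetical, and memory pinned privately by a driver may not show up in these counters.

```python
def meminfo_kb(text, keys=("Unevictable", "Mlocked")):
    """Extract selected fields (in kB) from the text of /proc/meminfo."""
    out = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        if name in keys and rest.split():
            out[name] = int(rest.split()[0])
    return out


def read_meminfo(path="/proc/meminfo"):
    """Read the live counters on a Linux system."""
    with open(path) as f:
        return meminfo_kb(f.read())
```

Sampling read_meminfo() before and after starting a GPU- or DSP-heavy application shows how much that application adds to the part of RAM that cannot be swapped out.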
43. Linearity of hibernation method
The Linux VM tries to use as much RAM as possible (e.g. page cache).
RAM consumption can be squeezed only down to a certain point.
Boot time increases in proportion to the size of unevictable memory.
For further optimization, we need something more!
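The linear relationship on this slide can be written as a simple model: a fixed cost plus image size divided by read throughput. The sketch below is an illustration only; the function name and the sample values (a 5 second fixed cost and 20 MB/s reads) are assumptions, not measurements from this work.

```python
def boot_time_s(image_mb, read_mb_per_s, fixed_s):
    """Linear model: fixed start-up cost plus time to read the image."""
    return fixed_s + image_mb / read_mb_per_s


if __name__ == "__main__":
    # Hypothetical parameters chosen only to show the shape of the curve.
    for image_mb in (55, 110, 220):
        t = boot_time_s(image_mb, read_mb_per_s=20.0, fixed_s=5.0)
        print(f"{image_mb:4d} MB -> {t:.1f} s")
```

In this model only two things shorten the variable part: a smaller image or faster reads. Anything beyond that has to avoid loading the whole image up front.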
44. Proposals
1. To increase read performance of storage
Faster storage?
mtd gets shorter boot-up time than eMMC
faster mtd gets shorter boot-up time than slower mtd
non-volatile mtdswap driver
LVM swap to improve disk performance by raid-0
2. Still to decrease image size
Kill & restart bloated Apps if possible.
Maybe a bit brutal, but it certainly works.
Swap out unevictable pages
How can we ensure those pages exist when they are needed?
page coloring
memory cgroup: control which processes' pages can be swapped out
3. Lazy image/page loading
Aren't we forgetting system responsiveness?
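The idea behind lazy image/page loading can be illustrated with a toy model: restore only a small set of pages up front and fetch the rest on first access. This is a conceptual sketch, not how any kernel or product implements it; the class LazyImage and its methods are hypothetical.

```python
class LazyImage:
    """Toy model of a hibernation image whose pages load on first touch."""

    def __init__(self, pages):
        self._stored = dict(pages)  # stands in for the image on storage
        self._resident = {}         # pages already restored to RAM
        self.loads = 0              # number of storage reads performed

    def preload(self, page_numbers):
        """Eagerly restore the pages needed to resume at all."""
        for n in page_numbers:
            self.read(n)

    def read(self, n):
        """Return page n, loading it from storage only on first access."""
        if n not in self._resident:
            self._resident[n] = self._stored[n]
            self.loads += 1
        return self._resident[n]


if __name__ == "__main__":
    img = LazyImage({i: bytes([i]) for i in range(8)})
    img.preload([0, 1])      # minimal working set restored at boot
    img.read(5)              # first touch triggers a load
    img.read(5)              # second touch is served from RAM
    print(img.loads, "of 8 pages loaded")
```

The trade-off raised on this slide is visible even here: the first access to a page that was not preloaded pays the storage latency, so responsiveness right after resume depends on choosing the preload set well.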
45. Example: Ubiquitous QuickBoot
Can be considered as "Lazy image/page loading":
http://www.ubiquitous.co.jp/En/products/middleware/quickboot