LAS16-300K2: Overview of IoT Zephyr
Speakers: Geoff Thorpe
Date: September 28, 2016
★ Session Description ★
Title: Overview of IoT Zephyr
Bio:
Geoff Thorpe heads up security within the Microcontroller group of NXP, where the intersection of device security and network security gives him a headache commonly known as “IoT”. His early experience with security topics was very software-centric, as a long-standing member of the OpenSSL team and a contributor to related open source projects. After many years veering off into semiconductors and hardware architecture, his software-bias has been domesticated to some extent but not eradicated.
★ Resources ★
Etherpad: pad.linaro.org/p/las16-300k2
Presentations & Videos: http://connect.linaro.org/resource/las16/las16-300k2/
★ Event Details ★
Linaro Connect Las Vegas 2016 – #LAS16
September 26-30, 2016
http://www.linaro.org
http://connect.linaro.org
2. geoff.thorpe@nxp.com:/Microcontrollers/R&D/Security
Software
• Involvement in open source around security and networking (OpenSSL member)
• Interests in security scalability
• Member of Zephyr governance board
Hardware
• “Datapath” architecture for QorIQ and Layerscape SoCs (Networking)
• i.MX apps processors and Kinetis microcontrollers
Focused on new security problems (and solutions) brought on by the emergence of IoT
Based in Québec City, originally from Wellington, New Zealand. (Was not in LoTR)
3. Zephyr
• What, where and why
• Status
IoT security
• Terminology
• Disruption
• Observations
• Where does Zephyr fit into this?
Agenda
4. Zephyr
• What, where and why
• Status
IoT security
• Terminology
• Disruption
• Observations
• Where does Zephyr fit into this?
Agenda
See recording of Anas Nashif’s Zephyr talk from Monday
5. Why Zephyr?
• Strategic Investment
• Best-of-Breed RTOS for IoT
• True Open Source Development and Governance
• Established Code Base
• Permissively Licensed
• Modular
15. Usage
“Add security to the product”
“Secure the edge-node”
“Integrated security, because security is
important”
16. Abusage
And by security you mean … what exactly?
17. Does “security” mean…
• Tamper-proof?
• Resistant to side-channel attack?
• Able to perform cryptographic operations fast/efficiently?
• Key-protection and other logical separation?
• Supports secure network protocols?
• Protects content restrictions against misuse?
• Is kept up-to-date through patch updates?
• Reliable/robust in the face of adversarial RF?
• You did some code reviews (this time round)?
18. Security facets, a less incomplete list
Cryptography;
• Software optimization
• Hardware IP
• Protocol security, interoperability
• Privacy, authentication, non-repudiation
Secure non-volatile storage
Inline encryption (memory, flash, …)
Trusted execution (secure boot, …)
Key management and protection
Certification
Code quality and review
Vulnerability analysis
Best practice
Process and production security
Compartmentalization/isolation
Digital Rights Management
IP protection (anti-cloning, …)
Resistance to side-channel attacks
• Power
• Timing
• Electromagnetic emissions
Emergency response
Security maintenance
Attack detection
Reliability (quality-of-service, stability, …)
20. What is “security”?
• “Security” on its own can mean almost anything
“Security” on its own means almost nothing
• It’s almost always context-dependent, in terms of interpretation and importance of those different facets.
• “The minimization of insecurity (or ‘threats’)”?
22. What is “IoT security”?
The meeting (perfect storm) of two domains;
•Device security
•Network and logical security
23. What is “IoT security”?
Device Security
• Secure non-volatile storage
• Inline encryption (memory, flash, …)
• Trusted execution (secure boot, …)
• Key management and protection
• Certification
• Vulnerability analysis
• Process and production security
• DRM & IP protection (anti-cloning, …)
• Resistance to side-channel attacks
Network Security
• Cryptographic s/w and h/w
• Protocol security & interoperability
• Usability and clarity
• Code quality and review
• Best practice
• Emergency response
• Security maintenance
• Attack detection
• Reliability (quality-of-service, stability, …)
24. IoT Security – when assumptions collide
Device security
• Implementation + certification are static
• Threat model is physical
Network security
• Patched early and often, via network
• Threat model is “the network”
Risk multipliers
• Widely deployed
• Physical and network accessibility
Large attack surface
High attack incentive
Defense de-multipliers
• Commodity pricing
• Finding and fixing bugs will be hard
Minimization of engineering investment
Reactive security down, zombies up
25. Traditional MCU-based engineering
Oriented around device-security (if at all);
•Industrial, medical, automotive, …
•Non-networked
•Heavily engineered for a static state of optimal security
•Once that’s done, ship it!
(And then move on to something else…)
27. Conventional computing complexity
MPU-based and even MCU-based systems are more and
more complex, resembling server, network, and smartphone
systems.
Things will go wrong! Reactive security (vulnerability
handling, incident response) is needed in the
microcontroller/IoT ecosystem.
29. Reactive security for MCUs / IoT
Is Device Lifecycle Management (DLM) the answer?
Not really, that’s mostly limited to;
• Installing a vendor’s “Root of Trust” (RoT)
• Being locked-in to that vendor’s code/patch-signing services
• The mechanics of deploying updates “Over The Air” (OTA)
30. Reactive security for MCUs / IoT
Reactive security is well-understood in traditional networked
computing;
• Servers
• High-end networking
• Smart-phones
• Desktops
• […]
Can we adopt the same methods?
31. Reactive security for MCUs / IoT
There are some complications with conventional vulnerability-
handling (CVE, CPE, etc.)
• The MCU/MPU and its software is often “hardware” to a host
• SoC subsystems often contain firmware too
• One product’s host OS is another product’s subsystem firmware
• CPE isn’t flexible about this hierarchical view
• Multiple vendors involved, supply-chain complexities
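The "subsystems within subsystems" problem can be illustrated with a small sketch. It is not how CVE or CPE work; it only assumes that a product is modelled as a tree of components, and every name in it (Component, affected_paths, the thermostat example) is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    """A product, subsystem, or firmware image; parts are nested components."""
    name: str
    version: str
    parts: list = field(default_factory=list)


def affected_paths(root, vuln_name, vuln_version, path=()):
    """Yield every path from the top-level product down to a vulnerable component."""
    here = path + (root.name,)
    if root.name == vuln_name and root.version == vuln_version:
        yield here
    for part in root.parts:
        yield from affected_paths(part, vuln_name, vuln_version, here)


# Hypothetical product: the radio firmware is "hardware" from the host's view.
radio_fw = Component("radio-firmware", "1.2")
soc = Component("soc", "rev-b", [radio_fw])
thermostat = Component("thermostat", "3.0", [Component("host-os", "5.1"), soc])

for hit in affected_paths(thermostat, "radio-firmware", "1.2"):
    print(" > ".join(hit))
```

A flat identifier for "radio-firmware 1.2" says nothing about which shipped products embed it; the nested view at least makes that question answerable, provided each vendor in the supply chain records what it embeds.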
32. Certification for IoT?
Various things have been proposed, but;
• Limit themselves to evaluating the implementation
• Don’t account for the (post-production) process
• Works against responsible code maintenance
• Collapse the supply-chain
33. Certification for IoT?
And if we certified the software process?
35. Certified/certifiable (audited/auditable, …)
[Diagram: Upstream versions A and B; Downstream versions A+ and B+ (users, OEMs, certified products); a merge step leads to B+]
Merge is usually hard and expensive;
• Upstream doesn’t minimize diff(A,A+)
• delta(A,B) doesn’t account for re-certification difficulty
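As a rough illustration of the diff(A,A+) and delta(A,B) notation, the sketch below counts changed lines between two versions of a source text. Line count is only a crude stand-in chosen for this example: it says nothing about re-certification effort, which is exactly the point of the second bullet. The function name and the sample snippets are hypothetical.

```python
import difflib


def changed_lines(old, new):
    """Count lines removed from `old` plus lines added in `new`."""
    a, b = old.splitlines(), new.splitlines()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    total = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            total += (i2 - i1) + (j2 - j1)
    return total


# Hypothetical versions: A is upstream, A_PLUS the hardened downstream copy,
# B the next upstream release.
A = "init()\nsend()\n"
A_PLUS = "init()\ncheck_key()\nsend()\n"
B = "init2()\nsend()\nlog()\n"

print(changed_lines(A, A_PLUS))  # size of the downstream change
print(changed_lines(A, B))       # size of the upstream change
```

Two changes with the same line count can differ enormously in how much audited code they invalidate, so a real process would need a metric that weights the touched areas.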
36. Certified/certifiable (audited/auditable, …)
[Diagram: Upstream (mainline devel, stable/LTS, hardened tree) and Downstream, with versions A, A+, B, B+]
Hardened “downstream” is coupled to mainline work
• Feedback for security impact of mainline changes
• Creates incentive for a better mainline
• Minimize throttling of mainline development
…
37. Where does this happen?
Governance
Security TSC Marketing
Contributors
38. Summary
• RTOS upstream to be maintained as production-worthy and
current, i.e. reactive security in “real time”.
• Vulnerability handling needs a refresh for “LITE”-type tech.
• Security quality (certifiability, auditability, safety, …) integrated
into the project, without bogging down the mainline.
• Drive best-practice for IoT security, practicing what is preached.
Thanks George, David, Ebba and others for the organisation and invitation. Honoured to be here.
Friend said that by Wednesday people would be hungover, stoned or broke, and speaking before 9am would be nothing but crickets. Take a photo.
I work at NXP, heading up Security inside R&D inside MICR at NXP. Parallel to S/W and H/W R&D, though more h/w centric.
Background in OSS, OpenSSL, Apache and some other things, but not much coding in recent years.
Here because I’ve helped champion NXP participation in Zephyr and am on the governance board. (Will go into why…)
H/W architecture work at DN, and now dealing with MICR (i.MX and Kinetis)
This is the agenda I came to Connect with;
Zephyr (from a governance board view),
share my thoughts on IoT Security,
and how I see their relationship.
But there has been a lot of discussion of Zephyr already
Anas’ talk was video’d and should be available, highly recommend (re)watching that for detail on Zephyr.
I’ll just say a few words on that.
Focus on this “IoT Security” stuff, which could easily fill a day, necessarily cursory.
PARTNER DECK
Small Footprint RTOS for IoT
As small as 8 KB
Enable application code to scale
Truly open source project
Apache* 2.0 License
Hosted by Linux Foundation
Transparent development
Cross Architecture
ARM, Intel, Synopsys
KATE
So, what are the key characteristics of Zephyr
Small Footprint RTOS for IoT
Built in support for real time as part of nanokernel, rather than an afterthought.
Architected to enable applications to scale in size - build in what you need
Leverages the Kconfig infrastructure that has been demonstrated to work in the linux kernel, so you only include the specific parts you need.
Can create application images that fit in as little as 8 KB
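The "build in what you need" idea can be sketched with a toy model. This is not Kconfig syntax and not Zephyr's build system; the option names, the sizes, and the "selects" relation (loosely analogous to Kconfig's select) are all made up for illustration.

```python
# Hypothetical options: each may pull in others and adds to the image size.
FEATURES = {
    "KERNEL": {"selects": [], "size_kb": 4},
    "NET_IP": {"selects": ["KERNEL"], "size_kb": 20},
    "BLUETOOTH": {"selects": ["KERNEL"], "size_kb": 30},
    "TLS": {"selects": ["NET_IP"], "size_kb": 40},
}


def resolve(enabled):
    """Return the enabled options plus everything they pull in."""
    selected = set()
    stack = list(enabled)
    while stack:
        name = stack.pop()
        if name in selected:
            continue
        selected.add(name)
        stack.extend(FEATURES[name]["selects"])
    return selected


def image_size_kb(enabled):
    """Sum the sizes of only those options that end up in the build."""
    return sum(FEATURES[name]["size_kb"] for name in resolve(enabled))


print(sorted(resolve(["TLS"])))
print(image_size_kb(["KERNEL"]), image_size_kb(["TLS"]))
```

Anything not enabled, directly or through a selection, contributes nothing to the image, which is what lets a minimal configuration stay small.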
Truly open source project
Transparent development
All code is available for download & inspection - but that doesn’t make it a true open source project… we’re building a community here that respects the rights of contributors
Contributors retain their Copyright, contribute to project under DCO
Public email lists and review via maintainers
Maintainers from multiple companies
Apache* 2.0 License was selected for the project
Explicit patent license: Code includes a royalty-free license from contributors for their licensable patents which are necessary for the covered code
Permissive: Apache 2.0 code may be incorporated into an open source or a proprietary product. There is no obligation to release improvements in source, but of course we hope you will, so we can all benefit.
Hosted by Linux Foundation
Neutral governance structure, leveraging lessons that have worked with Linux kernel development
Contributions are welcome from everyone, not just members.
TSC membership can be earned by merit from community contributions.
Cross Architecture
Today we have Synopsys ARC, ARM Cortex-M, and Intel x86 processors in the code base.
Talking to MIPS, DSP, and other architectures…
KATE
Going forward, the governing board and technical steering committee are committed to the Zephyr project supporting:
secure IoT applications to be developed and deployed, we’re going to want to make sure the code in this project can be security and safety certified, yet still incorporate community contributions.
the code to remain modular, you don’t need to include anything in your products that you don’t need.
the code to incorporate the latest connectivity stacks, so it can be used in a wide array of systems.
the code to support multiple hardware architectures, so there are multiple options for implementing functions.
evolving this to be a technical meritocratic project, where the best code solution is incorporated in the code base, and others can build downstream distributions using this code base…
Basically, we’re starting with the practices and processes that have been shown to work with Linux, but will evolve to suit the needs of the project, as long as the core values are preserved.
PARTNER DECK
1. Saturated RTOS/ Small OS market translates to fragmentation
2. Majority of device solutions currently used are “roll your own” or “no OS”
3. Open source adoption is growing as part of IoT development
4. With more connected devices there’s increased risk for compromised devices
KATE
So let’s start with Why?
Last year, market research that was done by our members showed us that
We’ve got a Saturated RTOS/ Small OS market, which has translated to fragmentation and frustration. This fragmentation means that there’s no real way to collaborate, and nowhere to share the cost of monitoring security and developing fixes.
Majority of device solutions currently used are “roll your own” or “no OS” - which is a security nightmare.
Open source adoption is growing as part of IoT development, this is where the software standards will emerge - “show me the code”, “more eyes, reduces risk”...
With more connected devices there’s increased risk for compromised devices, so we need practices and processes to minimize vulnerabilities during development, increase the range of expertise of reviewers, and automation.
So for the IoT space, our members wanted to come up with a solution to address these concerns.
Seeded by established multi-arch codebase that has existed for ~20 years.
Wind River has developed a commercial version of this and has deployed it via their Wind River Rocket product, which remains a downstream product.
In 2015, the Linux Foundation was approached to build a properly governed open source project and community, to leverage the combined creativity of all interested parties. And NXP was interested.
Like Linux on apps processors, there’s value in having a reliable, canonical base system that is arch-neutral (non-lock-in, portable), and “commodity” in a good sense. Connectivity, sensors, and embedded dev costs should leverage an economy of scale over time.
All parties ultimately cede control to the OSS development model for the resulting collective benefits.
There are a few open source RTOS options, but some have licensing problems, some are focused on a single architecture, some constrain the feature-set in order to monetize essential functionality, some don’t complement the community model with a critical mass of backing, governance, budget, etc.
If there is likely to be a “linux for MCUs”, it seems likely to be Zephyr.
Current platinum members
We’re tickled pink to have Linaro on-board and the LITE group ramping up.
We have high hopes for LITE, and dare I say, high expectations!
We follow a tried and tested governance model through LF, codified in a charter.
Technical work is managed by the TSC, low barrier to entry, no need to be platinum, meritocratic process.
Governance board handles what you’d expect;
Legal
Budget
Sign-off on important escalations from TSC (requests for infrastructure funding, modifications to the charter, …)
Through sub-committees;
Marketing (conferences, recruitment, branding/trademarks, …)
Security-sensitive stuff, handling commercial-value artifacts from funded security activities (the code remains open!)
TSC lead is on the governance board and acts as interface
In this way, separation between coding and community vs Zephyr the branded and funded entity.
Early participation means:
Influence over the direction of the project
Impact the SW architecture and HW architecture support
Guide the direction of the security, and marketing activities
Decision making on all aspects of the project
Join the project to impact the IoT small device market!
Extraordinary phase of creativity, innovation, disruption.
Some of the best developments are likely not to be planned, funded, scheduled work, but instead surprising things coming from people with itches to scratch.
Important to note that nasty rumours abound about open source types being shabby and hairy, with questionable personal hygiene. Not my intention to lend any support to that!!
If a picture is worth a thousand words, there are some cases where the converse is true.
What does this mean?
Let’s start with the first one.
So much hype and nonsense that IoT has been classified in all sorts of ways, and ridiculed in others.
A biz/marketing view tends to dominate, that classifies it in terms of certain new markets and form-factors. From a technical standpoint though, that says almost nothing (as the same technologies and challenges apply both inside and outside categories that are defined in this way).
Best to think of it not as a market or set of markets, or as a technology, but in terms of what the fundamental engineering changes and consequences are.
Need it be any more complex than that? Indeed, as we’ll see, this may be a banal definition, but the consequences are not banal.
Another vexing term. Let’s look at some examples …
Have you heard or said things like this?
Do any of these sentences make any sense on their own?
The word is used almost to provide emotional or psychological effect.
It’s an axiomatically reassuring word, whatever it means.
A grammatical and emotional wild-card.
But this kind of grammatical and emotional wild-card can lead to trouble when it’s being used as though it has any precision at all.
I suggested at one point that saying this sort of inane thing should be a firing offense. It was pointed out that I would have been fired multiple times by now if that were the case. Mea culpa. I think we’re all guilty to some extent.
Try catching this word as it gets used, and play the mental exercise of asking whether its usage is smothering over any need to say what is really meant!
“Security” is often used to mean one of these things, where your only clue as to which is the context, if you’re lucky. Otherwise the speaker is assuming one, and it’s not certain that the listener is assuming the same.
Piling on.
There are others. Insurance, risk-management, emergency planning, …
So we’ve highlighted pitfalls, but still not answered the question.
Key takeaway is that “it depends”.
If there isn’t some context that makes it implicit, then it doesn’t really mean anything.
You can invert it into another question, “what is insecurity”? This helps to perceive how wide things are, because “threats” is a potentially big set.
So, coming back to the original question.
I’ll give you my take, again in terms of what the fundamental engineering changes and consequences are
Traditional online systems (servers, networking, …) are focussed on networking and logical security. Exploitable implementation bugs, eavesdropping, MITM, …
Traditional offline systems (medical, automotive, …) are focussed on device security.
This is a very rough rearrangement of the “security facets” into two categories. Obvious limitations with this partitioning (“cryptographic sw and hw” also applies to device, “side-channel” also applies to network, e.g. timing).
These domains make an odd couple, they have fundamental differences to sort out.
Risk multipliers: proliferation combined with both device and network/logical threats is a big challenge.
Another point worth noting: IoT brings us lots of startups/makers/etc;
Defense de-multipliers: high volumes coming from companies that don’t necessarily have armies of security experts, NOCs and devops.
Read the slide.
This is usually what you find in the embedded/MCU/RTOS world, lots of upstream code being thrown over the wall.
At least as far as central O/S functions and networking (connectivity, protocols, …), the complexity of AP and even MCU is growing.
That complexity means we cannot live under the delusion that the code is ever “final”. The path from the upstream code down the supply chain to the product (tweaks, added layers, embedded configs, keys, …) must be managed in a way that reacts properly to discovery of stability and security problems. Throwing something over the wall compromises that link in the chain, w.r.t. reactive security.
A year or two ago, at the big RSA security conference in SF, you couldn’t shake a tree without 2 DLM solutions falling out of it.
It’s part of the answer.
But it’s dangerous in that it sounds/looks like the solution (“so you can update your software”), while in reality it’s the last mile.
So it’s justifiable to be a bit cynical …
(Personal hunch here: as real networking starts to reach edge-nodes, e.g. IPv6-over-mesh, it’s reasonable to assume they’ll “pull upon notification” rather than consistently being slaves to “push” models. Like a regular O/S doing its updates. How many regular O/S’s have an update mechanism that is provided from a professional service that is completely decoupled from the owner/maintainer of the code packages? I.e. that the transport itself is a notable differentiation or value-add?)
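A minimal sketch of the "pull upon notification" idea, under loud assumptions: the notice format, the field names, and the fetch callback are all hypothetical, and the SHA-256 comparison is only a stand-in for the signature verification a real update path would need.

```python
import hashlib


def parse_version(text):
    """Turn a dotted version string such as '1.10' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def should_pull(installed, advertised):
    """Pull only when the notice advertises something newer than what runs."""
    return parse_version(advertised) > parse_version(installed)


def accept_image(image, expected_sha256):
    """Integrity check only; a real device would verify a signature instead."""
    return hashlib.sha256(image).hexdigest() == expected_sha256


def handle_notification(installed, notice, fetch):
    """Return the version running after the device has processed the notice."""
    if not should_pull(installed, notice["version"]):
        return installed
    image = fetch(notice["url"])
    if not accept_image(image, notice["sha256"]):
        return installed
    return notice["version"]


# Hypothetical notice and transport; no network access is involved here.
firmware = b"new firmware image"
notice = {
    "version": "1.1",
    "url": "https://updates.example/fw-1.1.bin",
    "sha256": hashlib.sha256(firmware).hexdigest(),
}
print(handle_notification("1.0", notice, lambda url: firmware))
```

The device stays in control of when it fetches, which is the contrast with a pure push model; the transport itself carries no special trust.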
Go through the slide.
Supply-chain == not just adding of layers and packages, but modification and transformation. So whether a CVE applies or not is muddied.
“Systems of systems”.
(Or “subsystems within subsystems”)
As baby monitors continue to relay Metallica songs at terrified children, and cars get run off the road remotely, the need for metrics and certification (and regulation) will be inevitable.
Problem: certification is a computer science that tends to cease being applicable when you leave the static world. We at NXP deal with chips going into bank cards, passports, etc. It’s stuff that doesn’t change.
So perhaps we should certify the correct behaviour and process for keeping software secure?
After my numerous musings about IoT Security, what’s the relationship to Zephyr?
One challenge of certifiable/auditable code is that it’s slow, painful, and expensive. And it depends on which standards, criteria, use-case, …
So a successful Zephyr upstream can’t bog down on this.
Managing a downstream that *does* handle this then creates numerous challenges.
(Read the white stuff)
The security group in Zephyr wants to maintain a “hardened” codebase within the project itself, and couple the activities together for mutual benefit.
In a kind of feedback loop.
(Read the white stuff)
That coupling needs to be a collaboration between the security group and TSC/general-dev-process.
This is what we’re trying to define right now, e.g. face-to-face coming up in Berlin where this will be a hot topic.
Summarize these security points w.r.t. Zephyr.
We need to handle security issues and the downstream like complex network application projects do (disclosures, embargoes, CERT/first.org, etc.).
The methods for describing, categorizing, and tracking such problems and fixes need some attention. “LITE”-type tech is this subsystems-within-subsystems stuff.
The “hardened stuff” (base for certification, …) has to be an activity within the project itself, that feeds it, and is kept manageable because of it.
For these things, plus other forward-leaning security work in the space (e.g. next-gen certification, covering process), we want to see Zephyr walk the walk.
Thanks again for the opportunity to share these thoughts with you today.
If you have itches to scratch, please give Zephyr a look, and if it doesn’t already fit your needs, it might with your help.