The document discusses considerations for automotive cybersecurity. It opens with two quotes about trust, then covers technological advances, architecture goals, security goals, and advanced design concepts, and closes with an agenda. Topics include hardware security, software security, safety and reliability, cryptography, and system architecture.
2. Automotive Cybersecurity, Detroit, March 2019
Love all, trust a few, do wrong to none.
– William Shakespeare, All's Well That Ends Well
If you want to trust people you must trust no one!
– J. Edgar Hoover
Progress in Technology has been Astonishing
Every generation of technology has enabled remarkable outcomes
Apollo 11 guidance computer
- 2,048 words of RAM (16-bit words, ~4 KB)
- 36,864 words of ROM

Average smartphone (45 years later: roughly 62 million times more RAM)
- 256 MB to 512 MB cache
- 2 GB to 64 GB RAM

Next 10 to 20 years: cognitive cyber-physical systems, and capabilities we cannot yet predict (???)
I always talk about this to folks at Microsoft, especially to developers. What’s the most important operating system you’ll write
applications for? Ain’t Windows, or the Macintosh, or Linux. It’s Homo Sapiens Version 1.0. It shipped about a hundred thousand
years ago. There’s no upgrade in sight. But it’s the one that runs everything.
– Bill Buxton from Microsoft Research
Economic Utility
There is an axiom in economics called economic utility: it says that the value of a feature tends to zero over time. As soon as you put a feature (product) on a shelf, it starts to depreciate.
The goal of any well-defined process, including the SDL, is 'continuous improvement'.
Architecture Goals
1. The most obvious approach might be to imagine the future you want and build it. Unfortunately, that does not work well, because technology co-evolves with people. It is a two-step dance: technology pushes people forward, and then people move past the technology, which has to catch up. The way we see the future is constantly evolving, and the path you take to get there matters. In technical terms we can call this 'continuous improvement.'
2. Establish modular and composable design making it possible to (1) use your system
in different (standardized) configurations and applications and (2) evolve it as the
requirements and technologies change.
3. Control (or manage) and reduce complexity!
Civilization advances by extending the number of important operations we can perform without thinking about them.
– Alfred North Whitehead
As the complexity of a system increases, the accuracy of any single agent's own model of that
system decreases rapidly.
Technical debt is runaway complexity. For example, if it takes enormous effort and money to upgrade your system, you have accumulated a huge technical debt. Remember that the value of your system is inversely proportional to the cost of maintaining it.
Dark debt is a form of technical debt that is invisible until it causes failures.
Dark debt is found in complex systems and the anomalies it generates are complex system
failures. Dark debt is not recognizable at the time of creation. … It arises from the unforeseen
interactions of hardware or software with other parts of the framework. …
Unlike technical debt, which can be detected and, in principle at least, corrected by refactoring,
dark debt surfaces through anomalies.
Technical & Dark Debt
Perfection is achieved, not when there is nothing more to add, but when there is nothing left to take away.
– Antoine de Saint-Exupéry
New challenges brought by AI
A single bit-flip error can lead to the misclassification of an image by a DNN
From research by Karthik Pattabiraman
University of British Columbia
Information Security Goals
1. Secure boot
2. Secure auditing and logging
3. Authentication and authorization
4. Session Management
5. Input validation and output encoding
6. Exception management
7. Key management, cryptography, integrity, and availability
8. Security of data at rest
9. Security of data in motion
10. Configuration management
11. Incident response and patching
Together, these form the end-to-end security architecture for the product and thus should be considered alongside one another, not in isolation. Each category also has many sub-topics within it. For example, under authentication and authorization there are aspects of discretionary access controls and mandatory access controls to consider. Security policies for the product are an outcome of the implementation decisions made during development across these categories.
We already know that a “control” strategy fails
worse than a “resilience” strategy.
Cyberattacks to CPS Control Layers
Control layer: regulatory control vs. supervisory control
- Deception attacks: spoofing, replay, and measurement substitution (regulatory control); set-point change and controller substitution (supervisory control)
- DoS attacks: physical jamming and increased latency (regulatory control); network flooding and operational disruption (supervisory control)
Estimation of CPS risks by naively aggregating risks due to reliability failures and security failures does not capture the externalities, and can lead to grossly suboptimal responses to CPS risks.
To thwart the outcomes that follow sentient opponent actions,
diversity of mechanism is required.
The Honeymoon Effect
Design specifications miss important security details that appear only in code.
For most programmers it's hard enough to get the code into a state where the compiler reads it
and correctly interprets it; worrying about making human-readable code is a luxury.
The software industry needs to change its outlook from trying to achieve code perfection to
recognizing that code will always have security bugs.
[Figure: vulnerability discovery rate (vulnerabilities per month, roughly 0.09 declining toward 0) plotted against months since release (1 to 11). The current software engineering literature supports the Brooks life-cycle model. Image taken from "Post-release reliability growth in software products", ACM Trans. Softw. Eng. Methodol., 2008.]
Cryptography ≠ Security
Whoever thinks his problem can be solved using cryptography, doesn’t understand his problem and doesn’t understand cryptography.
– Attributed by Roger Needham and Butler Lampson to each other
Cryptography rots, just like food. Every key and every algorithm has shelf time. Some have very short shelf time.
• How long do you need your cryptographic keys or algorithms to remain secure? This is the cryptographic shelf life (x years).
• How long will it take an attacker to extract secrets out of your system? This is the end of the honeymoon (z years).
• How long does it take you to reduce the attack surface and to update keys or algorithms? This is ξ (pronounced 'xi').
If z < x + ξ, improve your architecture and infrastructure!
Cryptographic Agility
Anti-Virus and other security SW
On a recent software vulnerability watch list, about one-third of the
reported software vulnerabilities were in the security software itself.
The average time it takes to identify a cybersecurity incident is 197 days.
From DARPA High-Assurance Cyber Military Systems (HACMS) Proposer’s Day Brief.
1. Restrict all code to very simple control flow constructs, do not use goto statements, setjmp or longjmp constructs, direct or indirect recursion.
2. Give all loops a fixed upper bound. It must be trivially possible for a checking tool to prove statically that the loop cannot exceed a preset upper bound on
the number of iterations. If a tool cannot prove the loop bound statically, the rule is considered violated.
3. Do not use dynamic memory allocation after initialization.
4. No function should be longer than what can be printed on a single sheet of paper in a standard format with one line per statement and one line per
declaration. Typically, this means no more than about 60 lines of code per function.
5. The code’s assertion density should average to minimally two assertions per function. Assertions must be used to check for anomalous conditions that
should never happen in real-life executions. Assertions must be side effect-free and should be defined as Boolean tests. When an assertion fails, an
explicit recovery action must be taken, such as returning an error condition to the caller of the function that executes the failing assertion. Any assertion
for which a static checking tool can prove that it can never fail or never hold violates this rule.
6. Declare all data objects at the smallest possible level of scope.
7. Each calling function must check the return value of non-void functions, and each called function must check the validity of all parameters provided by the
caller.
8. The use of the preprocessor must be limited to the inclusion of header files and simple macro definitions. Token pasting, variable argument lists (ellipses),
and recursive macro calls are not allowed. All macros must expand into complete syntactic units. The use of conditional compilation directives must be
kept to a minimum.
9. The use of pointers must be restricted. Specifically, no more than one level of dereferencing should be used. Pointer dereference operations may not be
hidden in macro definitions or inside typedef declarations. Function pointers are not permitted.
10. All code must be compiled, from the first day of development, with all compiler warnings enabled at the most pedantic setting available. All code must compile without warnings. All code must also be checked daily with at least one, but preferably more than one, strong static source code analyzer and should pass all analyses with zero warnings.
NASA’s Ten Principles of Safety-Critical Code
Gerard J Holzmann. The power of 10: rules for developing safety-critical code. Computer, 39(6):95–99, 2006.
No single point of failure—this means that no component should be exclusively dependent on the
operation of another component. Service-oriented architectures and middleware architectures often
do not have a single point of failure.
Diagnosing the problems—the diagnostics of the system should be able to detect malfunctioning of
the components, so mechanisms like heartbeat synchronization should be implemented. The layered
architectures support the diagnostics functionality as they allow us to build two separate hierarchies—
one for handling functionality and one for monitoring it.
Timeouts instead of deadlocks—when waiting for data from another component, the component
under operation should be able to abort its operation after a period of time (timeout) and signal to the
diagnostics that there was a problem in the communication. Service-oriented architectures have built-
in mechanisms for monitoring timeouts.
Reliability and Fault Tolerance
6 Security Categories

Software and services:
- Over-the-air updates
- IDPS / anomaly detection
- Network enforcement
- Certificate management services
- Antimalware and remote monitoring
- Biometrics

Hardware security services that can be used by applications:
- Fast cryptographic performance
- Device identification
- Isolated execution
- (Message) authentication
- Virtualization

Hardware security building blocks:
- Platform boot integrity and chain of trust
- Secure storage (keys and data)
- Secure communication
- Secure debug
- Tamper detection and protection from side-channel attacks

Security features in the silicon, for example memory scrambling, execution prevention, etc.

Analog security monitoring under the CPU

Together these layers provide defense in depth, anchored in a Hardware Root of Trust.

Vehicle domains: components associated with physical control of the vehicle; components associated with safety; components associated with entertainment and convenience.
Effective security must be end-to-end
1. Hardware Root of Trust (boot, TEE, crypto, storage)
2. Root of Trust building blocks (anti-tamper, counters, etc.)
3. Software Services to utilize Hardware (encrypt/decrypt, etc.)
4. Cybersecurity Process compliance (SAE J3061)
5. Higher-level services (IDS, Firewalls, Data Analytics, etc.)
6. Next-generation security (AI, self-*)
Image credit: Mercedes-Benz Museum
(as cited in Computer History Museum, 2011)
7 Classes of Hardware Vulnerabilities
1. permissions and privileges,
2. buffer errors,
3. resource management (shared resources),
4. information leakage,
5. numeric errors,
6. crypto errors, and
7. code injection.
Probability of software failure equals 1
Functional Safety & Security Architecture
[Diagram: product heterogeneous architecture built on two islands:
- Safety Island: FuSa (ISO 26262), functional safety, self-test and recovery (STAR)
- Security Island: SDL (ISO 21434), PKCS #11, FIPS 140-2 Level 2/3, the principle of least privilege (POLP)
Supporting elements: an ASIL-aligned security process, a joint safety/security architecture, platform hardening for safety and security, and a device reliability and trustworthiness process, all on a common platform.]
There are 3 sides of security:
– Automotive SDL (aligned with FuSa)
– System hardening (similar to FuSa, the goal is to ensure there are
no single points of failure)
– Security features (encryption, signatures, etc. – working with FuSa
is very important to prevent false positives)
All of these have to be considered during the system lifecycle, from conception through design to maintenance while the system is in the wild (for at least 15 years!).
Security should not be viewed in isolation from the system design and other inputs, including safety, privacy, and survivability.
Security Mechanism to Protect Graphics
1. Monitor parses configuration file for checking criteria (Was
the file tampered with? Is the monitor trusted?)
2. Cluster app requests Screen to display a symbol (Does the
application run in a trusted sandbox? Is the application
trusted?)
3. Cluster app requests Monitor to check the rendered
symbol
4. Monitor retrieves the framebuffer from Screen
5. Monitor performs checking according to criteria from (1)
6. Monitor notifies the cluster app of the checking results
7. Cluster app decides the course of action
Trust questions at each step: Does the application trust the message? Was the configuration file tampered with? Is the application trusted? Is the monitor trusted?
Three Pillars of Autonomous Systems
Autonomous vehicles are a key example where
designers are challenged with the simultaneous
integration of three critical areas:
1. supercomputing complexity,
2. hard real-time embedded performance, and
3. functional safety.
The Four Pillars of CPS
The four key pillars driving cyber-physical systems are:
1. Connectivity,
2. Monitoring,
3. Prediction, and
4. Self-Optimization.
While the first two have experienced recent technological enablement, prediction and
optimization are expected to radically change every aspect of our society.
Ultra-Reliable Systems
NASA photo EC 88203-6 shows an Air Force F-15 flying despite the absence of one of its wings. The image demonstrates why self-repairing flight control systems play a vital role in aircraft control.
From "The Story of Self-Repairing Flight Control Systems" by James E. Tomayko.
3-Dimensional Structure of Digital Security
Defense in Depth
Defense in Diversity
The 4 i's: Isolation, Inoperability, Incompatibility, Independence.
But eventually everything fails. You have to make it fail in a predictable way.
Redundancy mechanisms: temporal redundancy, information redundancy, majority voting, and self-healing.
Defense-in-depth layers: software and services; hardware security services; hardware security building blocks; security features in the silicon; analog security monitoring under the CPU; all anchored in a Hardware Root of Trust.
A two-tier architecture is required!
Self-* and High Dependability
Self-healing is the ability of the system to autonomously change its structure so that its
behavior stays the same.
Self-adaptation is increasingly used in safety-critical systems, as it allows us to change the operation of a component in the presence of errors and failures.
[Diagram: self-healing cycle: anomalous event, self-monitoring, self-diagnosis, fault identification, candidate fix generation, self-testing, self-adaptation, deployment.]
Evolution of Technology and Security Solutions
1960s
Emerging concerns: interactive computing; time sharing; user authentication; file sharing via hierarchical file systems; prototypes of 'computer utilities'.
Security technologies: access controls; passwords; supervisor state.

1970s
Emerging concerns: packet networks (ARPANET); local networks (LANs); communication secrecy and authentication; object-oriented design; multilevel security; mathematical models of security; provably secure systems.
Security technologies: public key cryptography; cryptographic protocols; cryptographic hashes; security verification.

1980s
Emerging concerns: adoption of TCP/IP protocols for the Internet; exponential growth of the Internet; proliferation of PCs and workstations; client-server model for network services; viruses, worms, Trojans, and other forms of malware; buffer overflow attacks.
Security technologies: malware detection (antivirus); intrusion detection; firewalls.

1990s
Emerging concerns: World Wide Web; browsers; commercial transactions; data repositories and breaches; portable apps and scripts; Internet fraud; web-based attacks; social engineering and phishing attacks; peer-to-peer (P2P) networks.
Security technologies: virtual private networks (VPNs); public-key infrastructure (PKI); secure web connections (SSL/TLS); biometrics; 2-factor authentication; confinement (virtual machines, sandboxes).

2000s
Emerging concerns: botnets; denial-of-service attacks; wireless networks; cloud platforms; massive data breaches; ransomware; malicious adware; Internet of Things; surveillance; cyber warfare.
Security technologies: secure coding and development processes; threat intelligence and sharing; adware blocking; denial-of-service mitigation; WiFi security.

Future
Emerging concerns: attacks against cyber-physical systems (CPS): autonomous vehicles; smart communities; aviation and transportation; robots; drones; infrastructure.
Security technologies: self-adaptive systems, which can evaluate and modify their own behavior to improve efficiency and can self-heal; multi-agent systems, loosely coupled networks of software agents that interact to solve problems and are resilient and partition tolerant; cognitive technologies.
Summary
1. Absolutely secure systems are impossible; with enough money and commitment any system can be broken
2. Assume your system is compromised and build it so that it can recover
3. Strive for continuous incremental improvement, not perfection
4. We do not know how to build 100% reliable systems; we only know how to manage risk. Your system will fail, and your design has to ensure that it fails in a predictable way.
Every 30 years there is a new wave of things that computers do. Around 1950 they began to model events in the world (simulation), and around 1980 to connect people (communication). Since 2010 they have begun to engage with the physical world in a non-trivial way (embodiment – giving them bodies).
- Defining next-gen security architectures in automotive products for both hardware and software
- Ensuring OEM and Tier 1 software and hardware needs are met
- Autonomous architecture ECU security trends
https://tatourian.blog/2014/03/06/interview-with-bill-buxton-from-microsoft-research/
Your system architecture has to be adaptable and evolvable. Requirements and technologies change. You have to design your system for that change!
If you have to kiss a lot of frogs to find a prince, find more frogs and kiss them faster and faster.
Resilience and Security in Cyber-Physical Systems: Self-Driving Cars and Smart Devices
Karthik Pattabiraman
University of British Columbia
2017
https://youtu.be/O6NKY2oE99M
This is joint Microsoft/Nvidia research. The first half of the talk is entirely about the functional safety and resilience of DNNs; the second describes an invariant-based intrusion detection system.
The future will be defined by autonomous computer systems that are tightly integrated with the environment, also known as Cyber-Physical systems (CPS). Resilience and security become extremely important in these systems, as a single error or security attack can have catastrophic consequences. In this talk, I will consider the resilience and security challenges of CPS, and how to protect them at low costs. I will give examples of two recent projects from my group, one on improving the resilience of Deep Neural Network (DNN) accelerators deployed in self-driving cars, and the other on deploying host-based intrusion detection systems on smart embedded devices such as smart meters and smart medical devices. Finally, I will discuss some of our ongoing work in this area, and the challenges and opportunities. This is joint work with my students and industry collaborators.
In cybernetics and control theory, a setpoint (also set point, set-point) is the desired or target value for an essential variable, or process value, of a system. Departure of such a variable from its setpoint is one basis for error-controlled regulation using negative feedback for automatic control. The setpoint is usually abbreviated SP, and the process value PV.
https://en.wikipedia.org/wiki/Setpoint_(control_system)
Familiarity breeds contempt: the honeymoon effect and the role of legacy code in zero-day vulnerabilities
https://www.semanticscholar.org/paper/Familiarity-breeds-contempt%3A-the-honeymoon-effect-Clark-Frei/1148f37a8ca0a5ca0a26178c7d85a063bd539725
The average time it takes to identify a cybersecurity incident is 197 days, according to the 2018 Cost of a Data Breach Study from the Ponemon Institute, sponsored by IBM. Companies that contain a breach within 30 days have an advantage over their less-responsive peers, saving an average of $1 million in containment costs.
From Physically Unclonable Functions - Constructions, Properties and Applications
System Security Integrated Through Hardware and Firmware (SSITH)
SSITH specifically seeks to address the seven classes of hardware vulnerabilities listed in the Common Weakness Enumeration (cwe.mitre.org), a crowd-sourced compendium of security issues that is familiar to the information technology security community.
In cyberjargon, these classes are:
permissions and privileges,
buffer errors,
resource management,
information leakage,
numeric errors,
crypto errors, and
code injection.
Researchers have documented some 2,800 software breaches that have taken advantage of one or more of these hardware vulnerabilities, all seven of which are variously present in the integrated microcircuitry of electronic systems around the world. Remove those hardware weaknesses, Salmon said, and you would effectively close down more than 40% of the software doors intruders now have available to them.
Developers need to efficiently produce systems that meet safety and other key system-level requirements.
This approach facilitates flexible and efficient integration of internal, 3rd party, and/or customer IP subsystems to support late design changes and potentially customer-specific technology/IP requirements.
I sum up this model as design for security, ship, analyze, self-heal or quarantine, and treat (if required).
Hackers too can generally pivot faster than product-makers so our approach must be anticipatory, flexible and resilient.
I can see a world where we will have put hackers out of business.
– Simon Segars, CEO, Arm
From The Story of Self-Repairing Flight Control Systems
But eventually everything fails. You have to make it fail in a predictable way. Here, there are two strong links and one weak link. In case of failure, the weak link will disintegrate before the two strong links fail and detonate the warhead. Two strong links are made using different architecture (incompatibility).
Temporal Redundancy: Read commands multiple times, Use median voting
Information Redundancy: Process values multiple times, Store several copies in memory
Use majority voting to schedule control commands
Independence – design of subsystems to prevent common-mode and common-cause failures, such that the failure of one subsystem does not cause the failure of another subsystem
Incompatibility – the use of energy or information that will not be duplicated inadvertently
Isolation – the predictable separation of weapon elements from compatible energy
Inoperability – the predictable inability of weapon elements to function
Despite considerable work in fault tolerance and reliability, software remains notoriously buggy and crash-prone. The current approach to ensuring the security and availability of software consists of a mix of different techniques:
Proactive techniques seek to make the code as dependable as possible, through a combination of safe languages (e.g., Java [5]), libraries [6] and compilers [7, 8], code analysis tools and formal methods [9,10,11], and development methodologies.
Debugging techniques aim to make post-fault analysis and recovery as easy as possible for the programmer that is responsible for producing a fix.
Runtime protection techniques try to detect the fault using some type of fault isolation such as StackGuard [12] and FormatGuard [13], which address specific types of faults or security vulnerabilities.
Containment techniques seek to minimize the scope of a successful exploit by isolating the process from the rest of the system, e.g., through use of virtual machine monitors such as VMWare or Xen, system call sandboxes such as Systrace [14], or operating system constructs such as Unix chroot(), FreeBSD’s jail facility, and others [15, 16].
Byzantine fault-tolerance and quorum techniques rely on redundancy and diversity to create reliable systems out of unreliable components [17, 1, 18].
These approaches offer a poor tradeoff between assurance, reliability in the face of faults, and performance impact of protection mechanisms. In particular, software availability has emerged as a concern of equal importance as integrity.