Ever since Charlie Miller and Chris Valasek rocked the world of automotive embedded software with their paper “Remote Exploitation of an Unaltered Passenger Vehicle”, domain separation has been a hot topic. Tesla took an alternative approach. Here is how it works.
There is a plethora of technologies available to prevent, as far as is reasonable, access to safety critical domains from those that are more benign. Tesla took an alternative route, preferring to implement that separation in hardware, and unlike the technology in the compromised Jeep, their approach was considered state-of-the-art at that time. But as subsequent attacks on Tesla vehicles suggested, although separation technologies offer an admirable line of defence, they are no “silver bullet”. The Keen Laboratories hacker team created a malicious Wi-Fi hotspot called ‘Tesla Guest’ to emulate the Wi-Fi at Tesla’s service centres. When a Tesla connected to the hotspot, the browser would push an infected website created by the hacker team. That provided a portal to access relatively trivial functions, but the more safety critical systems such as braking also fell under their control once they had replaced the gateway software with their own.
Tesla’s quick response was admirable, and the principle of separation is undoubtedly sound. But as their experience shows, it is only one part of the story. What is required, then, is a multi-faceted approach that will minimize the vulnerable attack surface, maximize the separation of the outward facing attack vector from the safety applications it serves, and ensure that application code and any operating systems it runs on are developed with security as a priority.
The “Swiss Cheese” Model
What Tesla and Jeep’s experiences tell us is that connectivity has fundamentally changed how we must view car safety. For example, consider a car built in 1978, and lovingly maintained such that its systems are kept in the same pristine order as the day it entered service. There is every reason to suppose that it will remain exactly as safe as it was then.
Buy a new, mid-range car today and it will clearly be far safer than a 40-year-old classic car could ever be. Much of that is the result of its sophisticated electronic systems, and its connectivity providing sophisticated facilities such as automatic crash response. But for the newer model to retain its safety advantage, it must also retain a consistent level of security. In short, unlike the classic car, the current model cannot automatically be assumed to retain its admirable safety features in 40 years’ time unless its software remains sufficiently impervious to attack.
To understand the scope of the challenge in keeping it so, it is useful to borrow an analogy from the world of medical systems. In March 2000, Professor James Reason proposed an analogy to explain how system accidents occur in medical environments. Many aspects of medical endeavour require human input, and the inevitable human error that goes with it. But generally, there are so many levels of defence that for a catastrophe to happen, an entire sequence of failures is required such that none of the defences prevents it.
Professor Reason likened this principle to a sequence of slices of “Swiss Cheese” (Figure 1).
Figure 1: The “Swiss Cheese” Model, illustrating how a sequence of imperfect defensive layers will only fail when those imperfections coincide.
No connected automotive system is ever going to be both useful and absolutely impenetrable, and no single defence of that system can guarantee impenetrability. It therefore makes sense to protect it proportionately to the level of risk involved if it were to be compromised, and that means applying multiple levels of security – or “slices” of Swiss cheese – so that if one level fails, others are standing guard.
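The value of layering can be put in rough numerical terms. As a simplifying assumption that is not part of Professor Reason's model, suppose an attacker must defeat every one of n layers, and that layer i fails independently with probability p_i. The chance of a complete breach is then the product of the individual probabilities:

```latex
P(\text{breach}) = \prod_{i=1}^{n} p_i ,
\qquad \text{e.g. } n = 3,\; p_1 = p_2 = p_3 = 0.1
\;\Rightarrow\; P(\text{breach}) = 0.1^{3} = 0.001
```

The figures are purely illustrative. Real failures are rarely independent, since a single flaw can expose several layers at once, so the product is best read as an optimistic estimate rather than a guarantee.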
Examples of such “slices” might include:
- Secure boot to make sure that the “correct” image is loaded
- Domain separation to defend critical parts of the system
- MILS (Least Privilege) design principles to minimize vulnerability
- Minimization of attack surfaces
- Secure coding techniques
- Security focused testing
But should every one of these precautions be implemented on every occasion? And if not, how should the decisions be made as to what applies, and when? To address that question, it is useful to focus on the relationship between two key “slices of Swiss cheese” – domain separation and secure coding.
Developing Secure Application Code
Identifying high risk areas
It is easy to suggest the application of every possible security precaution to every possible software architecture and application, but clearly that makes little commercial sense, especially when (for example) a head unit’s Linux-based OS is involved, complete with massive footprint and unknown software provenance. Where, then, should attention be focused?
According to Peterson, Hope and Lavenhar, “Through the process of architectural risk assessment, flaws are found that expose information assets to risk, risks are prioritized based on their impact to the business, mitigations for those risks are developed and implemented, and the software is reassessed to determine the efficacy of the mitigations.” Although this and similar studies are generally focused on enterprise computing, the basic premise of identifying and focusing attention on the components of the system at most risk makes a great deal of sense, as reflected by the “Threat Analysis and Risk Assessment” process described in SAE J3061. Examples of high-risk areas are likely to include:
- Files from outside of the network
- Backwards compatible interfaces with other systems – old protocols, sometimes old code and libraries, multiple versions that are hard to maintain and test
- Custom APIs – protocols etc. – likely to involve errors in design and implementation
- Security code, such as anything to do with cryptography, authentication, authorization (access control) and session management.
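A risk assessment of this kind ultimately produces a ranked list of components. The sketch below shows one very simple way to produce such a ranking in C. The 1 to 5 scales, the product scoring and all names (`component_risk`, `risk_score`, `rank_components`) are assumptions made for illustration; they are not taken from SAE J3061 or from any particular TARA method.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical component record; the 1..5 scales are assumed for illustration. */
typedef struct {
    const char *name;
    int impact;      /* 1 (negligible) .. 5 (safety critical) */
    int likelihood;  /* 1 (remote) .. 5 (directly exposed to untrusted input) */
} component_risk;

/* Risk score as a simple product of impact and likelihood. */
int risk_score(const component_risk *c)
{
    return c->impact * c->likelihood;
}

/* qsort comparator: the highest score sorts first. */
static int by_descending_risk(const void *a, const void *b)
{
    int sa = risk_score((const component_risk *)a);
    int sb = risk_score((const component_risk *)b);
    return (sa < sb) - (sa > sb);
}

/* Reorders the array in place so that the riskiest components come first. */
void rank_components(component_risk *items, size_t count)
{
    if (items != NULL && count > 1U) {
        qsort(items, count, sizeof items[0], by_descending_risk);
    }
}
```

After ranking, the entries at the head of the array are the ones whose code deserves the closest scrutiny.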
Consider that principle in relation to a system deploying domain separation technology – in this case, a separation kernel or hypervisor (Figure 2).
Figure 2: Deploying separation technology to help optimize security.
It is easy to find examples of high-risk areas specific to this scenario. For instance, consider the gateway virtual machine. How secure are its encryption algorithms? How well does it validate incoming data from the cloud? How well does it validate outgoing data to the different domains?
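To make the question of incoming data validation concrete, here is a minimal sketch of a gateway check written in C. The frame layout, the message identifier and the temperature limits are all hypothetical; the point is only the pattern of rejecting anything that is not explicitly expected.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wire format: [id][payload length][payload bytes...]. */
#define MSG_HEADER_LEN   2U
#define MSG_MAX_PAYLOAD  8U
#define MSG_ID_SET_TEMP  0x10U   /* payload: one byte, degrees Celsius */
#define TEMP_MIN_C       16U
#define TEMP_MAX_C       30U

/* Returns true only if the frame is well formed and its value is in range.
 * Anything unexpected is rejected rather than "repaired". */
bool gateway_accept_frame(const uint8_t *frame, size_t frame_len)
{
    if (frame == NULL || frame_len < MSG_HEADER_LEN) {
        return false;
    }

    uint8_t id = frame[0];
    size_t payload_len = frame[1];

    /* The declared length must match what was actually received. */
    if (payload_len > MSG_MAX_PAYLOAD ||
        frame_len != MSG_HEADER_LEN + payload_len) {
        return false;
    }

    switch (id) {
    case MSG_ID_SET_TEMP:
        return payload_len == 1U
            && frame[2] >= TEMP_MIN_C
            && frame[2] <= TEMP_MAX_C;
    default:
        return false;   /* unknown message types are dropped */
    }
}
```

A frame such as `{ 0x10, 1, 21 }` would be accepted, whereas a frame that claims a longer payload than it carries, or that requests 99 degrees, would be dropped before it reaches any other domain.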
Then there are data endpoints (Figure 3). Is it feasible to inject rogue data? How is the application code configured to ensure that doesn’t happen?
Figure 3: Automotive Attack Surfaces and Untrusted Data Sources
Another potential vulnerability arises because many systems need to communicate across domains. For example, central locking generally belongs to a fairly benign domain, but in an emergency situation after an accident it becomes imperative that doors are unlocked, implying communication with a more critical domain. However such communications between virtual machines are implemented, their very nature demands that their implementation should be secure.
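One common way to keep such cross-domain traffic under control is an explicit allow-list, so that only the routes the design calls for can ever be used. The C sketch below assumes three hypothetical domains and three hypothetical commands; real systems would take these from their own architecture, and the separation kernel or hypervisor would normally enforce the policy as well.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical domains and commands, for illustration only. */
typedef enum { DOMAIN_INFOTAINMENT, DOMAIN_BODY, DOMAIN_SAFETY } domain_id;
typedef enum { CMD_UNLOCK_DOORS, CMD_LOCK_DOORS, CMD_APPLY_BRAKES } command_id;

typedef struct {
    domain_id  source;
    domain_id  destination;
    command_id command;
} route;

/* Explicit allow-list: any route not listed here is denied. */
static const route allowed_routes[] = {
    { DOMAIN_SAFETY, DOMAIN_BODY, CMD_UNLOCK_DOORS },  /* post-crash unlock */
};

/* Returns true only for an exact match against the allow-list. */
bool route_is_allowed(domain_id source, domain_id destination, command_id command)
{
    size_t count = sizeof allowed_routes / sizeof allowed_routes[0];

    for (size_t i = 0U; i < count; i++) {
        const route *r = &allowed_routes[i];
        if (r->source == source &&
            r->destination == destination &&
            r->command == command) {
            return true;
        }
    }
    return false;
}
```

With this table, the safety domain may ask the body domain to unlock the doors, but the same request arriving from the infotainment domain is refused, as is any attempt to send a braking command.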
With these high-risk software components identified, attention can be focused on the code associated with them. The result is a system in which secure code does not just provide an additional line of defence, but actively contributes to the effectiveness of the underlying architecture by “reinforcing” its weak points.
Optimizing the security of this application code involves the combined contributions of a number of factors, mirroring the multi-faceted approach to the security of the system as a whole.
Secure Coding Practices
The CERT (Computer Emergency Response Team) division of the Software Engineering Institute (SEI) has nominated a total of 12 key secure coding practices, all of which have a part to play. Here, we consider how five examples relate to the code for the automotive system outlined in Figure 3.
Heed compiler warnings
“Compile code using the highest warning level available for your compiler … use static and dynamic analysis tools”
Many developers have a tendency to attend only to compiler errors during development, and ignore the warnings. CERT’s recommendation is to set the warnings at the highest level available and ensure that all of them are attended to. Static analysis tools are designed to identify additional and more subtle concerns.
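As an illustration, GCC and Clang both accept `-Wall -Wextra` to enable a broad set of warnings and `-Werror` to treat them as errors. The C function below is a sketch of code after such a warning has been attended to; the function name and the file name in the comment are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Compile with, for example:
 *     gcc -std=c11 -Wall -Wextra -Werror -c checksum.c
 *
 * An earlier version of this loop declared the index as `int i`, so the
 * comparison `i < len` mixed signed and unsigned operands and GCC reported
 * it under -Wsign-compare (enabled by -Wextra for C code). Using size_t for
 * the index removes both the warning and the implicit conversion behind it. */
uint8_t checksum8(const uint8_t *data, size_t len)
{
    uint8_t sum = 0U;

    if (data == NULL) {
        return sum;
    }
    for (size_t i = 0U; i < len; i++) {
        sum = (uint8_t)(sum + data[i]);   /* wraps modulo 256 by design */
    }
    return sum;
}
```

The warning itself looked harmless, but it pointed at a real hazard: with a signed index, a sufficiently large length could never be reached without overflowing the index first.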
Architect and design for security policies
“Create a software architecture and design your software to implement and enforce security policies”
Practitioners familiar with the development processes promoted by ISO 26262 will be familiar with the notion that requirements are to be established, and that bi-directional traceability between those requirements, software design artefacts, source code and tests is required. Designing for security implies extending those principles to include requirements for security alongside requirements for safety, and tools can help ease the administrative headache associated with traceability (Figure 4).
Figure 4: Automating requirements traceability with the TBmanager component of the LDRA tool suite
Keep it simple
“Keep the design as simple and small as possible.”
There are many complexity metrics to help developers evaluate their code, and automated static analysis tools help by automatically evaluating those metrics (Figure 5).
Figure 5: Using the TBvision component of the LDRA tool suite to assess code complexity
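Cyclomatic complexity, which counts the decision points in a function and adds one, is one widely used metric of this kind. The C sketch below shows a table-driven lookup, a common way to keep that figure low. The fault codes and severity levels are invented for the example.

```c
#include <stddef.h>

/* Hypothetical fault codes and severities. */
typedef enum { SEV_INFO, SEV_WARNING, SEV_CRITICAL } severity;

typedef struct {
    int      fault_code;
    severity level;
} fault_entry;

static const fault_entry fault_table[] = {
    { 100, SEV_INFO },
    { 200, SEV_WARNING },
    { 300, SEV_CRITICAL },
    { 310, SEV_CRITICAL },
};

/* One loop and one comparison give a cyclomatic complexity of 3, and that
 * figure stays the same however many entries the table holds. An equivalent
 * if / else-if chain would gain one decision point per fault code.
 * Unknown codes are treated as critical so that the default is the safe one. */
severity fault_severity(int fault_code)
{
    size_t count = sizeof fault_table / sizeof fault_table[0];

    for (size_t i = 0U; i < count; i++) {
        if (fault_table[i].fault_code == fault_code) {
            return fault_table[i].level;
        }
    }
    return SEV_CRITICAL;
}
```

Adding a new fault code now means adding one line of data rather than one more branch to review and test.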
Use effective quality assurance techniques
“Good quality assurance techniques can be effective in identifying and eliminating vulnerabilities.”
The traditional approach to testing in the security market is largely reactive – so that the code is developed in accordance with relatively loose guidelines, and then it is tested by means of performance, penetration, load and functional testing to spot any vulnerabilities and to deal with them. Although it is clearly preferable to ensure that the code is secure “by design” by using the processes championed by ISO 26262, the tools used in the traditional reactive model such as penetration tests still have a place, simply to confirm that the system is secure.
Unit test tools provide a targeted “robustness test” capability by automatically generating test cases that subject the application code to conditions such as null pointers and upper and lower boundary values (Figure 6). Static analysis tools clearly lend themselves to secure code auditing.
Figure 6: Using the TBeXtreme component of the LDRA tool suite to test code robustness
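The following C sketch is a hand-written counterpart to the kind of cases such tools generate. It is not the output of any particular tool; the function under test, its limits and the test routine are all hypothetical.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define SPEED_MIN_KPH 30
#define SPEED_MAX_KPH 180

/* Hypothetical function under test: copies the request to *accepted_kph
 * only if both pointers are valid and the value is in range. */
bool set_cruise_speed(const int *requested_kph, int *accepted_kph)
{
    if (requested_kph == NULL || accepted_kph == NULL) {
        return false;
    }
    if (*requested_kph < SPEED_MIN_KPH || *requested_kph > SPEED_MAX_KPH) {
        return false;
    }
    *accepted_kph = *requested_kph;
    return true;
}

/* Robustness cases: null pointers, each boundary, one step beyond each
 * boundary, and the extremes of the type. Returns the number of cases
 * whose outcome differed from the expectation. */
int robustness_failures(void)
{
    static const struct { int value; bool expect_ok; } cases[] = {
        { INT_MIN, false },
        { SPEED_MIN_KPH - 1, false },
        { SPEED_MIN_KPH, true },
        { SPEED_MAX_KPH, true },
        { SPEED_MAX_KPH + 1, false },
        { INT_MAX, false },
    };
    int failures = 0;
    int out = 0;

    for (size_t i = 0U; i < sizeof cases / sizeof cases[0]; i++) {
        if (set_cruise_speed(&cases[i].value, &out) != cases[i].expect_ok) {
            failures++;
        }
    }
    if (set_cruise_speed(NULL, &out)) {
        failures++;
    }
    if (set_cruise_speed(&cases[2].value, NULL)) {
        failures++;
    }
    return failures;
}
```

A return value of zero from `robustness_failures()` indicates that every boundary and null-pointer case behaved as expected.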
Adopt a secure coding standard
“Develop and/or apply a secure coding standard”
There are a number of potential sources of secure coding standards. CERT C is a coding standard designed for the development of safe, reliable, and secure systems that adopts an application centric approach to the detection of issues.
MISRA C:2012 offers another option, despite a common misconception that it is designed just for safety-related, not security-related, projects. Its suitability as a secure coding standard was further enhanced by the introduction of MISRA C:2012 Amendment 1 and its 14 additional guidelines.
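To give a flavour of what such standards ask for, CERT C rule INT32-C requires that operations on signed integers do not overflow. The C sketch below follows the precondition-test style that rule describes; the function name `checked_add_int` is an assumption made for this example.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Adds a and b without ever evaluating an overflowing signed expression,
 * in the spirit of CERT C rule INT32-C. Returns false, leaving *result
 * untouched, if the sum would not fit in an int or result is NULL. */
bool checked_add_int(int a, int b, int *result)
{
    if (result == NULL) {
        return false;
    }
    /* Test the operands before adding; the addition itself must not overflow. */
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b)) {
        return false;
    }
    *result = a + b;
    return true;
}
```

The caller is forced to handle the failure case explicitly, instead of relying on whatever the hardware happens to do when a signed addition overflows.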
Static analysis tools vary in terms of their ability to identify the more subtle nuances of standard violations, but the more sophisticated implementations can seem slower because of the additional processing required to achieve that. A sensible approach is to choose tools with the option to run in “lightweight” mode initially, and to apply more complete analysis as development progresses.
The key to automotive system security is to make every component as secure as it can reasonably be. Even then, security is a continuously evolving game. Hackers don’t stand still just because life has become more difficult – they simply have new challenges to overcome.
No connected system is ever going to be both useful and absolutely impenetrable. It makes sense to protect it proportionately to the level of risk involved if it were to be compromised, and that means applying multiple levels of security so that if one level fails, others are standing guard.
Domain separation and secure application code provide two examples of these levels – or slices of “Swiss cheese”. The effort required to create a system that is sufficiently secure can be optimized by identifying high-risk elements of the architecture, and applying best-practice secure coding techniques to the application code associated with those elements.
It would be very expensive to apply state-of-the-art security techniques to every element of every embedded system. It is, however, important to specify security requirements and then architect and design them to be appropriate to each element of a system – perhaps the most important lesson to take from CERT’s “Secure Coding Practices”. In terms of the coding itself, risk assessment will create important pointers regarding where the system as a whole will most benefit from the application of static and dynamic analysis techniques.
Happily, as MISRA’s analysis has shown, many of the most appropriate quality assurance techniques for secure coding are already well proven in the field of functional safety. These techniques include static analysis to ensure the appropriate application of coding standards, dynamic code coverage analysis to check for any excess “rogue code”, and the tracing of requirements throughout the development process.
Given the dynamic nature of the endless battle between hackers and solution providers, optimizing security is not merely a good idea. Should the unthinkable happen and with it a need to defend a connected system in court, there are very real advantages to being able to provide evidence of the application of the best available practice.