
I spent yesterday in a workshop with a bunch of very smart folks who spend their days trying to break things. You can debate at some length the ethics of this behavior, but that’s not really the point as far as I’m concerned. What really struck me in the workshop was humanity’s ability to create astonishingly complex systems, and our corresponding inability to understand them fully. Complexity carries a cost, as it turns out, and the full bill is rarely apparent until we are well down the road.
Our keynote speaker at the workshop was Daniel Gruss, a professor from the Graz University of Technology in Austria. I’ve come to know Daniel a bit since he and a separate team led by Daniel Genkin at the University of Michigan simultaneously found the Meltdown and Spectre exploits which were published back in January 2018. Daniel’s sly smile when he presents on these topics — he appears to find it all highly amusing — belies his real concern over these vulnerabilities and the systems and institutions that keep producing them.
Here is the link to Daniel’s talk at the workshop. It’s worth watching.
The key thing to understand about micro-architecture vulnerabilities, and in fact many IT security issues, is that they come about in response to customer demand for features and performance. Spectre, in particular, is only possible because processor makers long ago introduced a feature called “speculative execution,” which to be honest is an amazing feat of technology in and of itself. What speculative execution actually does is use free processor time to guess what a program is going to do next, before it actually does it. This works because the time required to retrieve data from system main memory is so much larger than the time required to execute a few instructions that may not be needed. When speculative execution wins, it is because the processor has already executed some code that turns out to have been the right code once the data from slow memory comes back. When it loses, it is because the data points down the other code path, in which case the processor throws away the results of its gamble and continues as if nothing had happened. It turns out the processor guesses right often enough that this feature results in major performance gains, with the result that it is present in all modern CPUs.
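To put rough numbers on that gamble, here is a minimal back-of-the-envelope sketch in Python. The function names and the cycle counts (a 200-cycle memory wait, 20 cycles of guessed work, a 15-cycle cleanup penalty, a 95% hit rate) are illustrative assumptions of mine, not measurements from any real processor.

```python
def cycles_without_speculation(memory_latency, path_cycles):
    """Wait for the data from main memory, then run the code that depends on it."""
    return memory_latency + path_cycles


def cycles_with_speculation(hit_rate, memory_latency, path_cycles, flush_penalty):
    """Average cost when the processor runs a guessed path during the memory wait.

    Right guess: the work overlaps with the wait, so we pay only the longer of the two.
    Wrong guess: we pay the wait, a cleanup penalty, and then run the correct path.
    """
    right = max(memory_latency, path_cycles)
    wrong = memory_latency + flush_penalty + path_cycles
    return hit_rate * right + (1.0 - hit_rate) * wrong


# Hypothetical numbers, chosen only to show the shape of the tradeoff.
baseline = cycles_without_speculation(memory_latency=200, path_cycles=20)
speculative = cycles_with_speculation(
    hit_rate=0.95, memory_latency=200, path_cycles=20, flush_penalty=15
)
print(f"no speculation: {baseline} cycles, with speculation: {speculative:.2f} cycles")
```

In this toy model a wrong guess costs only slightly more than not guessing at all, while a right guess hides the dependent work entirely inside the memory wait, which is why even a modest hit rate pays off.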
The downside of speculative execution is that when the processor guesses wrong, it can leave orphaned data in the processor cache until it is cleaned up later. A determined attacker can use this fact to read data she shouldn’t be able to read, including things like your encryption keys and your password. This is called a “side-channel attack” because it relies on a side effect of normal processing — the abandoned data in the cache — rather than a direct vulnerability in the program that is running.
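The idea can be shown with a toy simulation. This is a sketch under heavy assumptions, not an exploit: the `ToyCPU` class, its made-up timings, and its "cache" (a Python set) are all hypothetical stand-ins, and a real attack has to train the branch predictor and measure genuine hardware timing. What the sketch does capture is the core trick: the out-of-bounds read is architecturally rejected, yet the attacker recovers the value from which cache line became fast.

```python
FAST, SLOW = 10, 200  # made-up access times for cached vs. uncached lines


class ToyCPU:
    """Hypothetical model: public data followed directly by secret data in memory."""

    def __init__(self, public, secret):
        self.memory = bytes(public) + bytes(secret)
        self.bound = len(public)   # reads at or past this index are not allowed
        self.cache = set()         # which of the 256 probe lines are cached

    def flush(self):
        self.cache.clear()

    def victim_read(self, index):
        # Speculation: the read happens before the bounds check is resolved,
        # and it pulls the probe line selected by the value into the cache.
        speculative_value = self.memory[index]
        self.cache.add(speculative_value)
        if index < self.bound:
            return speculative_value
        return None  # result thrown away, but the cache footprint remains

    def probe_time(self, line):
        return FAST if line in self.cache else SLOW


def leak_byte(cpu, index):
    """Flush, trigger the victim, then find the one probe line that is fast."""
    cpu.flush()
    cpu.victim_read(index)
    timings = [cpu.probe_time(line) for line in range(256)]
    return timings.index(min(timings))


def leak_secret(cpu, length):
    return bytes(leak_byte(cpu, cpu.bound + i) for i in range(length))


cpu = ToyCPU(public=b"public!!", secret=b"hunter2")
print(cpu.victim_read(cpu.bound))   # the direct read is refused
print(leak_secret(cpu, 7))          # the side channel recovers it anyway
```

The victim never returns a single secret byte, and the program itself has no bug in the usual sense. Everything the attacker learns comes from the side effect.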
It is important to note that this kind of attack is significantly harder to carry out than a simple phishing expedition that fools users into typing their username and password somewhere they shouldn’t. The fact that it is possible to learn private facts without first breaking into a system should be scary for large institutions — governments, banks, the large cloud providers — but less so for an average person with a laptop and a phone. Nonetheless, it is a real vulnerability, and although operating system vendors like Red Hat have done heavy work making it harder to exploit, the vulnerabilities are on the chip and there is only so much that can be done to mitigate them. What’s worse is that chipmakers don’t seem to be in any rush to produce real fixes to the problem, and they justify this by saying — accurately — that customers don’t care. Even large institutions with lots to lose are more interested in chasing performance than in protecting their data (or, more importantly, that of their citizens or their customers).
There is another important piece of this puzzle that Daniel (like other security researchers) is passionate about. With one notable exception that I will touch on in a moment, all processors are based on proprietary designs which are guarded very closely. This might not be a problem if processors were simply engines designed to churn through programming instructions in a linear way, but they are far more complex than this. The speculative execution and branch prediction features I mention above are only two of hundreds or even thousands of complex subsystems on the chip, all there to improve performance, none well documented. Daniel and the rest of the security community do their work by sniffing in the dark, trying to guess where a vulnerability might lie. They do this because if they find a vulnerability and tell everyone about it, we can hopefully prevent it being used against us by adversaries with bad intent. There is no doubt that these adversaries exist in government, organized crime, and elsewhere, that the researchers they employ are every bit as skilled as Daniel, and that when they find a way to break into a system they are not going to be telling the press about it. So the processor makers, by keeping their designs and the security vulnerabilities inherent in them private, expose all of us to an unknown, unquantifiable risk.
I mentioned an exception. It turns out that there is a growing open-source community around a freely available Instruction Set Architecture called RISC-V. An Instruction Set Architecture is simpler than it sounds: it is essentially the set of instructions that a processor is guaranteed to implement. (The Intel ISA we have been using since the IBM PC came out is called the x86 architecture.) Anyone is free to take the RISC-V ISA and build a processor design that implements it, and in fact several folks in academia and industry are using the RISC-V ISA to design processors. Some processor designs are themselves open sourced, like BlackParrot (BU/UW), Rocket (UC Berkeley), SweRV (Western Digital) and Ariane (ETH Zurich). Ideally these groups would open source not just the design but also the toolchain and the physical tools that they use to build the actual chips, but one step at a time.
Now, RISC-V, in order to be relevant, has to have optimizations that can compete with the proprietary chipmakers’ designs, which means it too will certainly have vulnerabilities. (Daniel Gruss is fond of saying that every optimization introduces an opportunity for a security flaw.) The difference is that with RISC-V everyone can see them. So, even though these are incredibly complex systems and it is unlikely that even a very skilled engineer can understand the entire thing all at once, it is possible that a community of people working together can do real analysis that will mitigate all kinds of security flaws… and that, in turn, will allow customers to understand exactly what security they are giving away in order to get better performance.
Daniel Genkin, the Spectre researcher at U of M, spends a lot of time talking about how we need to stop talking about security as an all-or-nothing thing, and instead start talking about it in terms of contracts. Something like “This chip has these performance characteristics, and also limits speculative execution to these kinds of safe operations or this kind of non-private data.” You can imagine that this would at last let customers start making intelligent decisions about how to spend their purchasing dollars. I suspect that as the world becomes home to more processor types and architectures, processor makers will start offering these kinds of tradeoffs, especially if their chips are based on an open design where they can actually show buyers what they’re getting. Operating in this way won’t eliminate vulnerabilities, by any means, but it would at least let us say we’re trying.
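To make the contract idea a little more concrete, here is a purely hypothetical sketch of what such a declaration might look like as data a buyer could filter on. Nothing here reflects a real vendor format or a proposal from Daniel Genkin; the `ChipContract` fields, the chip names, and the performance figures are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChipContract:
    """Hypothetical published contract: performance plus a speculation guarantee."""
    name: str
    relative_performance: float       # invented score, higher is faster
    speculates_on_private_data: bool  # False means speculation is limited to safe data


def choose_chip(chips, handles_secrets):
    """Pick the fastest chip whose contract is acceptable for the workload."""
    acceptable = [
        chip for chip in chips
        if not (handles_secrets and chip.speculates_on_private_data)
    ]
    return max(acceptable, key=lambda chip: chip.relative_performance, default=None)


# Invented catalog, for illustration only.
CATALOG = [
    ChipContract("fast", relative_performance=1.00, speculates_on_private_data=True),
    ChipContract("careful", relative_performance=0.85, speculates_on_private_data=False),
]

print(choose_chip(CATALOG, handles_secrets=False).name)  # performance wins
print(choose_chip(CATALOG, handles_secrets=True).name)   # the guarantee wins
```

The point of the sketch is only that once the tradeoff is stated explicitly, choosing between speed and safety becomes an ordinary purchasing decision rather than a guess.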
UPDATE: I missed an important part of what RISC-V is — it is an ISA, not a processor design. There are a number of open-source processor designs that implement the RISC-V ISA.
Interesting… Hadn’t thought of security vulnerabilities at this level…
I am a bit concerned whether RISC-V is going the same route as the rest of the crew albeit in a more subtle way. If adopters follow the open spec to implement the core RTL but decide to keep the really *meaningful* optimizations and enhancements to themselves to gain an advantage over the competitor, I am afraid we could be on the track back to square one. Hopefully, this will change though!
Nice article!
Here is the link to the BlackParrot design – https://github.com/black-parrot
Did you guys know/discuss the OpenPiton project/framework? See https://parallel.princeton.edu/openpiton/
Here is the project description (and here is a link to the original ASPLOS’16 paper https://parallel.princeton.edu/openpiton/paper.html)
OpenPiton is the world’s first open source, general-purpose, multithreaded, manycore processor and framework. It is based on the Princeton Piton processor, which was designed starting in December 2013 and taped out in March 2015 by the Princeton Parallel Group. OpenPiton is open source across the entire computing stack, from the hardware to the firmware and software. Researchers and industry experts from many fields can utilize OpenPiton to modify any part of the stack and evaluate their ideas at scale. The hardware can be easily synthesized to FPGA and run an OS and applications at reasonable speeds for realistic evaluations. OpenPiton is designed to be highly configurable, including core count, cache sizes, and NoC topology, enabling it to adapt to different use cases. OpenPiton has an active community of users and is supported by the Princeton Parallel Group. Some of the features of OpenPiton include:
Open source (BSD uncore) manycore
Written in Verilog HDL
Scalable up to 1/2 Billion Cores
Configurable core and uncore
Includes synthesis and back-end flows for ASIC and FPGA
Support for multiple target FPGA boards
Runs full stack multi-user Debian Linux
Multiple I/O device options, including Ethernet