The CxO’s Guide to Spectre and Meltdown - Part 1, A Lesson in History

Spectre and Meltdown.  Spectre and Meltdown.  Sigh.  It’s only a few days over a week since the public disclosure.  In that time, I’ve written and re-written this blog post at least six times!  Even for those of us who live and breathe this stuff all day every day, it’s hard to keep up with the rate at which new information is emerging.  It’s a moving landscape - there is a lot of information to wade through and the volume is growing day by day.  Some of it is true, some of it is patently false, and some of it is sort of true…if you look from just the right angle.

meltdown.png

This is the first of a two-part series covering Spectre and Meltdown.  I’m not going to write a detailed technical treatise.  If you want that, you should go and read the original research papers (available here and here).  Don’t rely on my - or anyone else's - interpretation of them.  Rather, I want to look at the background that has led to the current predicament, what the actual risks are, and what you can actually do about it (because just keeping up to date with security patches is not enough this time).

To understand the basis of Spectre and Meltdown, we need to go back in time.  About two decades back in fact, to the 486 processor.  Back then, the operating system presented the processor with a list of instructions, which the processor then dutifully executed.  From the first, to the last.  In order.  If a decision needed to be made - a fork in the road if you will - the processor would wait until the operating system told it whether to turn left or right.  At some point, some very clever folk realised that a processor sitting and waiting wasn’t doing anything, and that this was extremely wasteful.  What if we could avoid this time sitting around and doing nothing?

Enter speculative execution.  Instead of sitting and waiting, the processor uses its time more wisely.  To take the road analogy, let’s presume that the operating system wants to know what is at the end of one of the forks in our imaginary road.  It only cares about one side.  It’s just not sure which one yet - perhaps it’s waiting for some user input (“you have reached a fork in the road, do you go left and battle the troll, or right and outrun the giant”).  Instead of lazing around waiting, the processor sprints down both roads by itself.

The operating system doesn’t need to know that it’s doing this, but when it’s finally made a decision - take the left fork, go to the end, and tell me what’s there - the processor can answer immediately.  It doesn’t need to take time to wander down the road and make the operating system wait.  It already knows what is at the end of both forks from its own meanderings when it would have otherwise been sitting around doing nothing.  It just has to throw away the entire side it doesn’t need.  Brilliant!

Now, that is a vast oversimplification of course.  In practice the processor is dealing with millions of forks continuously.  It can’t head down all of them, so it tries to predict which might be the most likely - essentially trying to guess the statistically most promising branches (or forks in our road) that the operating system might be about to head down.  This guessing is called branch prediction, it is what steers speculative execution, and it really is very clever.  It’s so clever in fact that it has appeared in almost all processors in the last 20 years.  Trawling through old datasheets, the earliest mention I can find is in the Cyrix 6x86 (thanks Cyrix).  Nowadays speculative execution is found in Intel processors, AMD processors, ARM processors, and POWER processors.  There are even suggestions that MIPS processors utilise speculative execution.  There are a few exceptions such as some ARM processors, Intel Atom processors, and the Intel Itanium range, but otherwise take a look around you.  If it has a processor, it almost certainly utilises speculative execution.  That’s how good an idea this is!
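For the curious, here is roughly what "guessing the most likely fork" can look like.  This is a minimal sketch of one textbook approach (a two-bit saturating counter), not a description of how any particular Intel, AMD, or ARM processor does it.  The class name, the "left"/"right" labels, and the sample history are all made up for illustration.

```python
class TwoBitPredictor:
    """Toy branch predictor (hypothetical): a counter from 0 to 3.

    0 and 1 mean "guess left", 2 and 3 mean "guess right".
    """

    def __init__(self):
        self.state = 1  # start out weakly guessing "left"

    def predict(self):
        return "right" if self.state >= 2 else "left"

    def update(self, actual):
        # Nudge the counter towards whatever really happened,
        # without ever leaving the 0..3 range.
        if actual == "right":
            self.state = min(3, self.state + 1)
        else:
            self.state = max(0, self.state - 1)


def count_correct_guesses(history):
    """Replay a list of real outcomes and count how often the guess was right."""
    predictor = TwoBitPredictor()
    correct = 0
    for actual in history:
        if predictor.predict() == actual:
            correct += 1
        predictor.update(actual)
    return correct


# An invented history: the road almost always goes right, with one surprise.
history = ["right"] * 8 + ["left"] + ["right"] * 8
hits = count_correct_guesses(history)
print(f"{hits} of {len(history)} guesses were correct")
```

With this invented history the predictor misses only twice: once at the very start, before it has learned anything, and once on the single surprise "left".  The point for our purposes is that the guess is driven entirely by past behaviour, which means anyone who can influence that past behaviour can influence the guess.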

Except that it’s really not.  It turns out that it’s actually a really, really bad idea.  You see, even though the result of a code branch might be discarded by the processor, it has still been executed.  It is possible to use this fact to compromise the security of a system.  Coming back to our road analogy, if our processor heads off along the right fork of its own accord, and happens upon a “hazard” (BOOM!), then the operating system won’t know to send help - it doesn’t even know that the processor is there.  It is possible for a malicious application to attempt to predict which branches a processor might speculatively execute and arrange a notional ambush on those paths, bypassing protections that would otherwise exist.

spectre.png

Imagine for a moment that each of the forks in our road is guarded by a troll.  Each troll demands a password to use their road.  The processor still wants to get to the end of each road.  Even though the operating system may eventually say turn left, the processor doesn’t yet know this - it needs to progress down both roads.  Therefore, it needs both passwords.  Herein lies a problem, because the processor now holds the password for both forks, without the knowledge of the operating system.  No problem you think, the operating system eventually says turn left - the processor is allowed the password to the left fork.  The processor simply discards the result of the right fork, and everything that got it there - including the password.  Sounds reasonable.  Except, as some very clever folk have figured out, there is a small window of opportunity to interrogate the state of the processor, determine the state of the speculative branch, and steal away the password to the right fork in our road before the processor discards it.  Now this is a pretty narrow window - well under a microsecond - but it does exist.  It also turns out that my use of the word “discards” above is overly generous and “ignores” could be more appropriate.  (Technically, the instruction rollback can leave remnants in the processor cache which can potentially be accessed using other variants of the exploit.)

Given all of that, a clever application may be able to predict what the processor is planning to speculatively execute, arrange for that code to be something specific, and/or use this to obtain information that the application would usually not be privy to.  The exploit bypasses the usual barriers and isolations between different areas of the system.
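To make the “remnants in the cache” idea a little more concrete, here is a deliberately simplified toy model.  It is an illustration of the principle only and not exploit code: the ToyProcessor class, its methods, and the sample data are all invented for this post, and a real attack has to infer what is in the cache by timing memory accesses rather than simply asking.

```python
class ToyProcessor:
    """Hypothetical model of a processor that checks permissions too late."""

    def __init__(self, memory, public_size):
        self.memory = memory            # public bytes followed by secret bytes
        self.public_size = public_size  # how many bytes the application may read
        self.cache = set()              # which "cache lines" are currently warm

    def flush(self):
        self.cache.clear()

    def read(self, index):
        # Speculative part: the value is fetched and used to touch the cache
        # before the permission check has finished.
        value = self.memory[index]
        self.cache.add(value)
        # The check finally resolves.  An out-of-bounds result is thrown away,
        # but the footprint left in the cache is not cleaned up.
        if index >= self.public_size:
            return None
        return value

    def is_cached(self, line):
        # Stand-in for the timing measurement a real attacker would need.
        return line in self.cache


def leak_byte(cpu, index):
    """Recover one byte the processor officially refused to hand over."""
    cpu.flush()
    cpu.read(index)  # the official answer is None, and we ignore it
    for guess in range(256):
        if cpu.is_cached(guess):
            return guess
    return None


memory = b"hello" + b"TROLL"               # first 5 bytes public, the rest secret
cpu = ToyProcessor(memory, public_size=5)
recovered = bytes(leak_byte(cpu, i) for i in range(5, 10)).decode()
print(recovered)
```

Note that read never officially returns a secret byte.  Every out-of-bounds request is answered with None, exactly as the rules demand.  The secret escapes purely through the side effect the processor forgot to tidy up, which is why this kind of attack is called a side channel.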

Both Spectre and Meltdown are examples of what is termed a “side-channel” attack.  In this case they both target the speculative execution mechanism.  This previously unseen family of exploits was discovered by Google researchers (Project Zero) and was disclosed in confidence to processor manufacturers in June 2017 under the doctrine of responsible disclosure.  The public disclosure date was set to be 9th January 2018, the idea being that fixes and protections against the exploit would be in place by the time it was publicly disclosed.  This has gone a wee bit wrong in a couple of ways though.  Firstly, some industrious journalists at The Register out of the UK put things together and disclosed this early.  And secondly, Spectre, well, it can’t really be fixed.  Now, this is tangential to our topic, but it places Intel in an interesting position.  You see, Intel, based in the most litigious nation on the face of our planet, released their 8th generation i7 processor during this time.  With the full, but secret, knowledge that it was irreparably flawed.  Let’s just say the phrase “class action” is certainly being bandied around within those United States at present.

The past two decades have been a race for performance.  And wow do we have that.  The processing power that fits in my pocket is so many times more powerful than what sat beside my desk 20 years ago that it’s difficult to even comprehend.  As with all things though, there is sacrifice - in this case security has suffered.

Now, I’m painting with a pretty broad brush here and everything I’ve said is quite the oversimplification.  There are currently three demonstrated variants of the exploit against speculative execution: CVE-2017-5753 (Spectre - Variant 1), CVE-2017-5715 (Spectre - Variant 2), and CVE-2017-5754 (Meltdown - Variant 3).

I hope that I’ve provided an accessible high-level explanation of the basis of Spectre and Meltdown.  In part two, we will take a look at the actual business risks, what your vendors are telling you, what your vendors are not telling you, and what YOU can do (and what you can’t do).

 - Patrick Brennan | patrick@sc.nz | Chief Digital Officer