The sad reality is that HTTPS does virtually nothing to protect you from the prying eyes of alphabet soup agencies - or anybody else with enough knowledge about how these supposedly "secure" connections actually work.
It's true that connecting to web sites with SSL will certainly prevent "script kiddies" and other more wimpy opponents from eavesdropping on your surfing or otherwise interfering in your affairs. But as for the Real Bad Guys, forget it...
We shall begin by taking a brief dive down the rabbit hole of SSL, hopefully in a way that will make sense to even the least technically inclined among us.
This issue is, after all, so extremely important that I think everyone needs to understand what is really going on, and how web security actually works, without needing a PhD in cryptography, computer science, or engineering!
Our story begins with a little e-mail I received the other day. The basic message can be found here:
Microsoft Security Advisory (2880823)
Of course, the idea that Microsoft of all companies is warning me about security is kind of laughable, so I didn't pay much attention. Nevertheless, there was this little voice in the back of my mind that kept pestering me, so I decided to dig in and see what all the hoopla was about... or indeed if any hoopla was even warranted.
Boy, is it ever warranted!
From the above link, we read:
Microsoft is announcing a policy change to the Microsoft Root Certificate Program. The new policy will no longer allow root certificate authorities to issue X.509 certificates using the SHA-1 hashing algorithm for the purposes of SSL and code signing after January 1, 2016. Using the SHA-1 hashing algorithm in digital certificates could allow an attacker to spoof content, perform phishing attacks, or perform man-in-the-middle attacks.
Microsoft recommends that certificate authorities no longer sign newly generated certificates using the SHA-1 hashing algorithm and begin migrating to SHA-2. Microsoft also recommends that customers replace their SHA-1 certificates with SHA-2 certificates at the earliest opportunity. Please see the Suggested Actions section of this advisory for more information.
Okay, so that's probably like trying to read a foreign language to most people. Even I didn't understand exactly how these hashing algorithms were used with SSL. So, I started digging. What I found nearly floored me:
MD5 considered harmful today: Creating a rogue CA certificate
Now, if you thought the M$ advisory was confusing, take a peek at the above link.
WOW! That's wild.
In summary, way back in 2008, some smart people figured out a way to make themselves a Fake SSL Certificate Authority, and they accomplished this feat by using a weakness in the MD5 hashing algorithm.
"Eureka! This must be the key to our mystery," I thought.
So, I began to read... and re-read... and think... and re-read. And then it clicked. To paraphrase Inspector Finch:
I suddenly had this feeling that everything was connected. It's like I could see the whole thing, one long chain of events that stretched all the way back before the MD5 hash advisory in 2008. I felt like I could see everything that happened, and everything that is going to happen. It was like a perfect pattern, laid out in front of me. And I realised we're all part of it, and all trapped by it.
"Well, that's stunningly dramatic," you think, "But just... What is going on?!"
First, let's define some terms - hopefully in Plain English:
SSL Web Site Certificate
This is a digital certificate, with a digital signature, that verifies that a website is who they say they are. When you connect to a web site using SSL (HTTPS), your browser says, "Papers, please!" The remote site then sends the SSL Web Site Certificate to your browser. Your browser then verifies the authenticity of this "passport". Once verified, encrypted communications ensue. The point of the SSL Web Site Certificate is that under no circumstances should anyone else be able to create a valid, signed certificate for a web site that they do not own and operate. In order to obtain an SSL Web Site Cert, you must verify by varied means that you are the owner and operator of the web site involved. So, using HTTPS is not only for encryption of communications, but also a way to verify that the site you are communicating with is the Real Thing, and not an imposter. And of course you must pay for the certificate!
Certificate Authority (CA) Root Certificate
This is also a digital certificate, with a digital signature... But in this case, this certificate can be used to create and digitally sign normal SSL Web Site Certificates. This is the kind of certificate that a CA (Certificate Authority) has. These certificates also get passed to browser makers, and are then included in your web browser. This is so that when your browser receives an SSL Web Site cert, it can use the CA Root Certificate to verify that the Web Site Cert is in fact valid.
Certificate Authority (CA)
A CA is the kind of web site from which you would buy a valid, secure SSL Web Site Certificate to use for HTTPS on your site. For example: Verisign.com, RapidSSL.com, Geotrust.com, etc. are Certificate Authorities. They have CA Root Certificates for generating and signing valid SSL Web Site Certificates.
It's helpful to understand that with all these certificates, there is a "chain of command". SSL Web Site Certificates are validated and authenticated using CA certificates. Those CA certificates may in turn be validated with yet higher-authority certificates, all the way up the pyramid to a CA Root Certificate, which signs itself and is trusted simply because it ships with your browser or operating system. There is no single God of Certificates: your browser trusts a whole pantheon of independent roots. Thus, each lower-ranking certificate is verified up the chain of command. This all happens behind the scenes, and you have no idea it's occurring.
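For the more code-minded, here is a minimal sketch of that chain of command. It is a toy model, not how a real browser is written: the `Cert` class, the CA names, and the keys in `TOY_KEYS` are all made up for illustration, and an HMAC stands in for the real public-key signatures (RSA and friends) that actual certificates carry.

```python
import hashlib
import hmac
from dataclasses import dataclass, replace

# Hypothetical signing keys. In real life these would be private keys that
# only each CA holds; a shared-secret HMAC is used here purely as a stand-in.
TOY_KEYS = {
    "Toy Root CA": b"root-key",
    "Toy Issuing CA": b"issuing-key",
    "Rogue CA": b"rogue-key",
}

@dataclass(frozen=True)
class Cert:
    subject: str
    issuer: str
    is_ca: bool
    signature: bytes = b""

    def body(self) -> bytes:
        # The part of the certificate that gets hashed and signed.
        return f"{self.subject}|{self.issuer}|{self.is_ca}".encode()

def _mac(issuer: str, body: bytes) -> bytes:
    # "Sign" the SHA-256 hash of the certificate body with the issuer's key.
    return hmac.new(TOY_KEYS[issuer], hashlib.sha256(body).digest(), hashlib.sha256).digest()

def issue(issuer: str, subject: str, is_ca: bool) -> Cert:
    unsigned = Cert(subject, issuer, is_ca)
    return replace(unsigned, signature=_mac(issuer, unsigned.body()))

def signature_ok(cert: Cert) -> bool:
    if cert.issuer not in TOY_KEYS:
        return False
    return hmac.compare_digest(_mac(cert.issuer, cert.body()), cert.signature)

def chain_is_trusted(chain: list, trusted_roots: set) -> bool:
    """chain[0] is the web site cert; each later cert issued the one before it."""
    if not chain:
        return False
    for cert, issuer_cert in zip(chain, chain[1:]):
        if cert.issuer != issuer_cert.subject or not issuer_cert.is_ca:
            return False
        if not signature_ok(cert):
            return False
    top = chain[-1]
    # The top of the chain must be one of the roots shipped with the browser.
    return top.subject in trusted_roots and signature_ok(top)

TRUSTED_ROOTS = {"Toy Root CA"}  # what the browser maker shipped

root = issue("Toy Root CA", "Toy Root CA", is_ca=True)  # self-signed
issuing = issue("Toy Root CA", "Toy Issuing CA", is_ca=True)
site = issue("Toy Issuing CA", "www.mysite.com", is_ca=False)

rogue_root = issue("Rogue CA", "Rogue CA", is_ca=True)  # self-signed, not shipped
fake_site = issue("Rogue CA", "www.mysite.com", is_ca=False)
tampered = replace(site, subject="www.evil.test")  # body changed, old signature kept

print(chain_is_trusted([site, issuing, root], TRUSTED_ROOTS))
print(chain_is_trusted([fake_site, rogue_root], TRUSTED_ROOTS))
print(chain_is_trusted([tampered, issuing, root], TRUSTED_ROOTS))
```

Only the first chain is accepted. The second ends in a root the browser never shipped, and the third carries a signature that no longer matches the altered certificate. Notice what the browser does not check: which trusted CA did the issuing. Any of them will do.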
Piece of cake, right?
Now, where do these hash algorithms like MD5, SHA-1, and SHA-2 come into play?
All certificates contain information, like:
- Web site domain (www.mysite.com)
- Site location (country, state, etc.)
- Site owner info (company name)
- Period of validity
All of this information is run through a hash algorithm, and it is the resulting hash that gets digitally signed. A hash algorithm works like this:
- Data of any length (30 characters, 3000 characters, 40MB, whatever) is passed into the hash algorithm
- The hash algorithm chops up the data and mathematically processes it, thereby spitting out a signature - or digital fingerprint - of the data
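- No two different chunks of data should ever turn up with the same hash - just as the fingerprints of no two people should ever be the same (in practice, this means it must be infeasible for anyone to find or construct two such chunks)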
- The hash output is always the same size, regardless of the size of the input data (just like a fingerprint - no matter the size of the person)
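If you'd like to see these properties for yourself, here is a small sketch using Python's standard `hashlib` module. The `fingerprint` helper and the sample inputs are just illustrative names chosen for this example.

```python
import hashlib

def fingerprint(data: bytes, algorithm: str = "sha256") -> str:
    """Return the hex digest (the "fingerprint") of data under the named algorithm."""
    return hashlib.new(algorithm, data).hexdigest()

short_input = b"www.mysite.com"
long_input = b"x" * 3_000_000  # roughly 3 MB of data

for name in ("md5", "sha1", "sha256"):
    short_digest = fingerprint(short_input, name)
    long_digest = fingerprint(long_input, name)
    # Each hex character is 4 bits; the size depends only on the algorithm,
    # never on how much data went in.
    print(name, len(short_digest) * 4, "bits:", short_digest[:16], long_digest[:16])

# Changing a single character produces a completely different fingerprint.
print(fingerprint(b"www.mysite.com"))
print(fingerprint(b"www.mysite.con"))
```

MD5 always produces 128 bits, SHA-1 160 bits, and SHA-256 (a member of the SHA-2 family) 256 bits. Since the output is a fixed size and the inputs are unlimited, identical fingerprints for different data must exist somewhere; the whole game is whether anyone can actually find them.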
Now, think about that for a minute... If the police were using these hashes, or thumbprints, to verify your identity, they might mistake you for your neighbor, or your neighbor for you, if you "had the same thumbprint". If they did no other checking, and just relied on that thumbprint, they might very well "authenticate" your identities completely incorrectly. BIG OOPS!
This is exactly what happened with the MD5 SSL attack outlined at the above link.
These smarty-pants people were able to carefully buy a valid SSL Web Site Certificate from RapidSSL in 2008. Before they did that, they crafted their own CA Root Certificate (strictly speaking, an intermediate CA certificate sitting one rung below the root that RapidSSL signs with, but with the same power to sign Web Site Certs) in such a way that the MD5 hash (fingerprint) of their valid, just-purchased Web Site Cert was identical to the hash of the FAKE CA Root Certificate that they created out of thin air.
Since RapidSSL's digital signature had just said, "Dudes, this Web Site Certificate fingerprint is valid!", and since this was the same fingerprint as on the fake CA Root Cert, that signature could simply be copied over, and the forged CA Root Certificate becomes valid.
Now, recall that a CA Root Certificate - as long as it has a valid hash/fingerprint that will validate up the "chain of authority" - can be used to generate a valid SSL Web Site Certificate for any web site in the world... And neither you, nor RapidSSL, nor your browser will ever know that anything is amiss.
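The trick works because the CA's signature covers only the fingerprint, not the certificate itself. Here is a toy sketch of that idea, with loud caveats: the real attack used a sophisticated collision technique against full MD5, whereas this example uses a deliberately feeble 16-bit hash (`weak_hash`, invented here) so that plain brute force can find a match in a moment. The certificate strings, `CA_KEY`, and the HMAC standing in for a real public-key signature are all assumptions made for illustration.

```python
import hashlib
import hmac
from itertools import count

CA_KEY = b"toy-ca-key"  # hypothetical stand-in for the CA's private key

def weak_hash(data: bytes) -> bytes:
    """A deliberately weak 16-bit hash: the first 2 bytes of SHA-256."""
    return hashlib.sha256(data).digest()[:2]

def ca_sign(cert: bytes) -> bytes:
    """The CA signs the hash of the certificate, not the certificate itself."""
    return hmac.new(CA_KEY, weak_hash(cert), hashlib.sha256).digest()

def browser_accepts(cert: bytes, signature: bytes) -> bool:
    """Toy verification: re-hash the certificate and check the signature over the hash."""
    return hmac.compare_digest(ca_sign(cert), signature)

def forge_rogue_cert(target_hash: bytes) -> bytes:
    """Vary a junk field until the rogue CA cert hashes to the target value."""
    for n in count():
        candidate = f"subject=Totally Legit CA;is_ca=true;padding={n}".encode()
        if weak_hash(candidate) == target_hash:
            return candidate

# Step 1: buy a perfectly ordinary web site certificate from the CA.
legit_cert = b"subject=www.mysite.com;is_ca=false"
signature = ca_sign(legit_cert)

# Step 2: build a CA certificate with the same fingerprint.
rogue_cert = forge_rogue_cert(weak_hash(legit_cert))

# Step 3: staple the CA's signature onto the forgery.
print(rogue_cert != legit_cert)
print(browser_accepts(legit_cert, signature))
print(browser_accepts(rogue_cert, signature))
```

All three lines print `True`: the two certificates are different, yet the one signature vouches for both. The CA never saw the rogue certificate and never agreed to sign it.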
Why is this a problem? For starters, consider a man-in-the-middle attack.
You want to go to https://www.gmail.com. But some "hackers" have used another type of hack to insert their server between you and Gmail. Normally, this would not be possible, because you're using HTTPS! You're SAFE!
WRONG!
As far as anyone knows, you are connected to gmail.com over HTTPS. But in reality, what's happening is this:
- You try to connect to https://www.gmail.com
- The attacker diverts your request (perhaps using DNS cache poisoning or some other such attack) to a fake server
- Since Attacker's Server contains a falsely generated, perfectly valid SSL Web Site Certificate using the tricks outlined above, your browser doesn't know any better. Everything appears to be legit.
- You begin doing e-mail, but all your data is actually going encrypted to Attacker's Server, where it is decrypted and recorded/modified. Attacker's Server then passes the data on to the real https://www.gmail.com (using Gmail's actual, valid SSL cert).
- You have absolutely no clue that your "secure" communications are not secure in the least!
Now, isn't that a daisy?
"But wait!" you say. "Isn't it therefore good for Microsoft to recommend changing the hash function to SHA-256 if SHA-1 has the same potential problem as MD5 did back in 2008?"
An excellent question! Unfortunately, yes and no. Even if you, as a web site owner, change your SSL Web Site Certificate from one that is signed using SHA-1 to a new cert that is signed using SHA-2, you are still unsafe.
Why?
Because all it takes is for ONE Certificate Authority to use a "weak" hash algorithm, and someone who is up to no good can generate a forged CA Root Certificate. Once they have that, they can generate as many SSL Web Site Certs as they want - using any hashing algorithm they please - including a fake-yet-valid cert that they can use to impersonate your "secure" site!
In other words, the weakness in the hashing algorithm is just the tip of the iceberg. Due to the hierarchical "chain of authority" in the whole certificate system, if anyone manages to create a false CA Root Cert, they are more or less god in terms of creating false SSL Web Site Certs.
Thus, in order for Microsoft's words to have an effect, there must not be ANY Certificate Authority (Web Site Cert issuer) in the whole world that still uses SHA-1. In order for the "security" to actually be more secure, everyone must upgrade right now. But this isn't going to happen.
Now, if that isn't bad enough, think about all the NSA spying. Think about how many people said, "Naw, man, I just surf using HTTPS, and I'm totally safe!"
You think so?
I don't. You know why? Well, you should, by now... But there's more!
Guess who invented the SHA-1 hash algorithm in 1995?
The NSA.
Guess who invented SHA-2 in 2001?
The NSA.
So, why should all the Certificate Authorities switch from the NSA's SHA-1 to the NSA's SHA-2? Why, because the NSA created it the way they did for a reason!
SHA-1 has already been theoretically breached (researchers have published collision attacks that are faster than brute force), and there are a few indications that SHA-2 isn't quite as super-duper-safe as everyone thinks.
Imagine you are the NSA. You want to spy on everyone, everyone's grandmother, the grandmothers' cats, and the mice that are currently being digested inside the cats. SSL is kind of a problem... It can use pretty annoying encryption. Well, hell! No problem. Just compromise the "certificate authority chain" by forging one little CA Root Certificate, and blammo! You can eavesdrop and man-in-the-middle anybody you darn well please, SSL or not!
Web sites over SSL? No problem.
E-mail over SSL? No problem.
I have said it before, and I'll say it again: There never was security or privacy on the internet, there is no security or privacy on the internet now, and most likely there never will be. Not unless some very big changes are made...
And do you know why all this (and much, much more) is possible?
Because just like you, I had no knowledge of the gaping holes in SSL. Awareness of this and many other issues - technological, political, psychological, social, etc. - is absolutely essential.
Otherwise, frankly, we're screwed.
As a cryptogeek myself, I'm happy to see you bringing this issue to light. I have to caution you against assuming that the NSA knows some voodoo magic to break SHA-1 (or 2) just because they made it. Nevertheless, with people aware that the NSA's word doesn't hold water, and paranoid about potential collisions being found, NIST held an open competition to find the SHA-3 standard, though many question whether industry will actually adopt it. Here's where we should focus our efforts! If we are going to incur the cost of forcing everyone to upgrade, why not go all the way?