This is a story about something that could have gone wrong on the internet this week but instead turned out mostly OK. How often can you say that?
Around 9 o’clock on the East Coast on Friday, February 28, bad news arrived on the doorstep of Let’s Encrypt. An arm of the nonprofit Internet Security Research Group, Let’s Encrypt is a so-called certificate authority that lets websites implement encrypted connections at no cost. A CA parcels out digital certificates that essentially vouch that a website isn’t an imposter. That cryptographic guarantee is the backbone of HTTPS, the encrypted connections that keep anyone from intercepting or spying on your interactions with websites.
Those certificates expire after a set amount of time; Let’s Encrypt caps its certificates at 90 days, at which point a site operator has to renew. It’s a largely automated process, but if a site doesn’t have an active certificate, your browser will notice and may not load the page you’re trying to visit at all.
Think of it kind of like updating the registration on your vehicle every year. If your tags expire, you’ll get pulled over.
Let’s Encrypt’s work is technical and happens in the background. But in a few short years it has helped make the internet much more secure on a fundamental level. Plenty of companies offer security certificates; Let’s Encrypt just took the audacious step of making them free. A week ago, it issued its billionth certificate.
But that ubiquity also means that when a pebble falls in the middle of Let’s Encrypt’s pond, the ripples can travel a long way. On February 28, the pebble was a bug that threatened to effectively render 3 million sites nonfunctional in a matter of days.
The flaw itself? Relatively minor in the grand scheme of the internet. Let’s Encrypt uses software called Boulder to make sure that it’s allowed to issue a certificate to a site. (Some high-value targets, like banks, specify that they’ll only accept certificates from a particular CA. Let’s Encrypt has solid security, but some paid certificate authorities offer warranties in the event anything goes wrong, as well as other upgrades. It’s the difference between, say, having a strong deadbolt and adding renter’s insurance.) Boulder confirms that Let’s Encrypt is honoring those preferences when it first issues a certificate and again 30 days later. Or at least, it’s supposed to; the bug meant it was skipping the second check. And that’s a big no-no.
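For readers curious what "honoring those preferences" can look like in practice, here is a minimal sketch of the kind of check involved. It assumes the preferences are published as DNS CAA records, where an `issue` entry names the CAs a domain will accept. The function name and the record format (a list of tag and value pairs) are hypothetical and simplified for illustration; this is not Boulder's actual code, and it performs no DNS lookups.

```python
from typing import Iterable, Tuple


def caa_permits_issuance(
    records: Iterable[Tuple[str, str]], ca_identifier: str
) -> bool:
    """Return True if the given CAA records allow this CA to issue.

    `records` is a hypothetical, already-fetched list of (tag, value)
    pairs, e.g. ("issue", "letsencrypt.org"). Simplified rules:
      - no "issue" records at all: any CA may issue
      - otherwise: only CAs named in an "issue" record may issue
      - a bare ";" value names no CA, so it forbids issuance
    """
    # Keep only the CA name, dropping any parameters after ";".
    allowed = [
        value.split(";")[0].strip().lower()
        for tag, value in records
        if tag.lower() == "issue"
    ]
    if not allowed:
        return True
    return ca_identifier.lower() in allowed


# A bank-style policy that names a single other CA (made-up identifier).
bank_policy = [("issue", "example-paid-ca.com")]
print(caa_permits_issuance(bank_policy, "letsencrypt.org"))
print(caa_permits_issuance([], "letsencrypt.org"))
```

The point of repeating a check like this is that a domain's policy can change between the first look and a later one, which is why skipping the second check counts as noncompliant even when nothing bad actually results.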
The actual security implications of that backend hiccup were minimal, says ISRG executive director Josh Aas. At the same time, Let’s Encrypt couldn’t let a bug that affected 2.6 percent of its active certificates (3,048,289 in all when it confirmed the issue) linger indefinitely. “The severity of the bug here is not very high,” says Aas. “But these 3 million certificates were issued in a noncompliant way. We have an obligation to revoke them.”
That obligation stems from the Certification Authority Browser Forum, or CA/B, an industry group that sets strict standards about the use of certificates. In this case, those standards gave Let’s Encrypt a five-day window to come back into compliance, which would entail revoking every certificate that was affected by the bug. The alternative for Let’s Encrypt was ignoring the CA/B and letting it slide, but that was really no option at all.
“They did the right thing. The CA/B defines these rules and has reasonably strict requirements, which you want. When a person or computer talks to another computer, you want to make sure they’ve satisfied some identity criteria,” says Kenneth White, security principal at MongoDB, a massive database provider that uses Let’s Encrypt. “You can’t be mostly correct. You’ve got to follow the guidelines for how to enforce these things.”
The impact of pulling those certificates would be swift and serious. Once browsers like Chrome and Firefox noticed them missing, they would flash warnings to any visitors that the sites weren’t safe. Some browsers would block access altogether. A not insignificant chunk of the internet would effectively be taken out of commission. All because of this one small flaw in one niche corner of the Let’s Encrypt operation.
Within two minutes of confirming the bug, the Let’s Encrypt team stopped issuing any new certificates in a bid to stanch the bleeding. A little over two hours after that, they fixed the bug itself. And then they let everyone know what was coming.
“We can’t contact everybody, so we started contacting the largest subscribers, telling them about the situation, get them as informed as possible,” says Aas. “And then we worked with them to get them to replace their certificates as quickly as possible.”
Once a site operator renewed a certificate, Let’s Encrypt could safely revoke the old one. No harm would befall the site. Which sounds like a simple enough solution, but nothing’s simple at this kind of scale.
Bigger organizations had an easier time fixing the problem, since they generally have the resources to monitor any signs of trouble that surface and the tools to automate the renewal process. “If you’ve got a dozen or two dozen servers or something, that’s some poor sleepy-eyed soul in the middle of the night at a keyboard,” says MongoDB’s White. “We reissued a little over 15,000 certificates [for clients], and we did it in a few hours. There was some work involved, but it wasn’t catastrophic. We had measures in place to be able to rotate quickly.”
Smaller sites got a big assist from the Electronic Frontier Foundation, which operates Certbot, a free software tool that automatically adds Let’s Encrypt certificates to sites and renews them every 60 days. In the last two months alone, Certbot has generated certificates for 19.2 million unique sites. “Fortunately we had anticipated the need to check revoked certificates for renewal in 2015,” says EFF engineering director Max Hunter. “Because Let’s Encrypt communicated the issue early, and the code path for the query was already in place, our work was relatively straightforward.” By Tuesday a team from EFF, together with volunteers in Paris and Finland, had updated Certbot to renew any revoked certificates.
Meanwhile, Let’s Encrypt sent an email to every address it had on file. It created a searchable database of every affected domain so that hosting companies could see if they needed to act. “We marked those certificates as expired in our internal system, and then our normal automated processes kicked in to generate and deploy new certificates,” says Justin Samuel, CEO of Less Bits, a startup that operates hosting company ServerPilot.
On Tuesday night, 30 minutes before the deadline, Let’s Encrypt made another announcement. Of the 3 million potentially impacted sites, 1.7 million had managed to renew their certificates, an astonishing number given the short window of time. “No other CA comes close to making large-scale cert reissuing not only feasible but also fast,” says Samuel.
That success also emboldened Aas to make a difficult call. Let’s Encrypt would let the remaining certificates slide. “We made the decision that instead of breaking more than a million websites, potentially, we just aren’t going to revoke them by the deadline,” says Aas. “We think it’s the right decision for the health of the internet.”
It was the internet equivalent of a call from the governor minutes before midnight. Let’s Encrypt will continue to revoke certificates if it can confirm that the sites have renewed them, but otherwise it is content to leave them be in their somewhat noncompliant state. The security risk is small, Aas says, and since Let’s Encrypt certificates are only valid for 90 days to begin with, any stragglers will have washed out of the ecosystem by summer at the latest.
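The "by summer" estimate follows from simple date arithmetic, sketched below. Two assumptions are baked in that the story does not state outright: that the year is 2020 (a year in which February 28 fell on a Friday), and that the last affected certificate was issued on the day the bug was found. The function name is made up for illustration.

```python
from datetime import date, timedelta

# Let's Encrypt certificates are valid for 90 days, per the article.
CERT_LIFETIME = timedelta(days=90)


def latest_expiry(last_issued: date) -> date:
    """Return the last day a certificate issued on `last_issued` stays valid."""
    return last_issued + CERT_LIFETIME


# Assumption: the final noncompliant certificate went out on the day
# the bug was discovered, in 2020.
print(latest_expiry(date(2020, 2, 28)))
```

Under those assumptions the last straggler lapses in late May, which squares with the "summer at the latest" timeline.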
“If anything, this just reinforces that they are one of the most transparent, modern certificate authorities in the world,” says MongoDB’s White, who points to previous certificate snafus that for-profit companies like Symantec have badly mishandled. “It’s easy to armchair quarterback. But I think if people are overly critical that’s misplaced.”
The intricacies of internet infrastructure are generally ignored until something goes terribly wrong. This time, though, it’s useful to reflect on what went right. For once, the story is that nothing broke.