Once Cracked Twice Shy - A Blacklist Too Far?

Maria Holler
Feb 26, 2018
Updated: Aug 28, 2019

February 26, 2018 by Jeremy Spilman

What makes a good secret password? How can we choose better passwords? What makes some choices so bad that they should never be used?

Much has been made of the National Institute of Standards and Technology (NIST) guidance in its Special Publication 800-63B (“Digital Identity Guidelines”), and in particular Section 5.1.1 on Memorized Secrets. Unlike randomly generated keys from a password manager, a secret that we are expected to memorize and spontaneously recall in order to access our account is a very special kind of authenticator. Authenticating with a mere utterance is the most universal, most inexpensive, most portable, most accessible, and most legally protected form of authentication. Our mission at BlindHash is to make authenticating with your mind the easiest, fastest, and most secure access method possible.

The inability of websites to protect their verifiers for memorized secrets is due to the widespread but fundamentally flawed method for protecting passwords called iterative hashing. The security provided by iteratively hashing passwords depends on the strength of the password and the latency of the hashing process. This approach fails on both counts: memorized passwords are not complex enough, and applications are too latency-sensitive, for the result to be a secure password verifier (hash). The sad state of password security is such that we’re lucky if a site even uses iterative hashing in the first place.

Password theft is extremely damaging and costly, both for the users who are directly impacted, and for the companies that suffer operational disruption, loss of trust, and civil liability as the result of a breach. As breaches have accumulated, we’ve witnessed the rising blood-sport of trying to crack the greatest percentage of the hashes in a freshly leaked database dump. The so-called “offline dictionary attack” is the process of running millions or billions of potential passwords through the same hashing algorithm used by the breached site and comparing the results until a match is found that reveals the cleartext password for a user.
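To make the mechanics concrete, here is a minimal sketch of the loop described above. It assumes the breached site used salted PBKDF2-HMAC-SHA256 with a known iteration count; the function names, salt, wordlist, and "leaked" hash are all made up for illustration and are not taken from any real breach or from the original post.

```python
import hashlib
import hmac


def hash_password(password: str, salt: bytes, iterations: int = 1_000) -> bytes:
    """Iteratively hash a password (PBKDF2 stands in for the site's algorithm)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)


def dictionary_attack(stolen_hash: bytes, salt: bytes, candidates, iterations: int = 1_000):
    """Hash each candidate the way the site would; return the first match or None."""
    for guess in candidates:
        if hmac.compare_digest(hash_password(guess, salt, iterations), stolen_hash):
            return guess
    return None


# Toy data only: a hypothetical verifier and a four-word "dictionary".
salt = b"example-salt"
leaked_hash = hash_password("sunshine1", salt)
wordlist = ["123456", "password", "sunshine1", "qwerty"]
recovered = dictionary_attack(leaked_hash, salt, wordlist)
```

The only cost to the attacker is one hash computation per guess, which is why the argument that follows turns on how expensive each hash is and how guessable the password is.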

Troy Hunt’s service, “Have I Been Pwned?”, has helped raise the visibility of password breaches and demonstrated how incredibly widespread these breaches have become. Over the last several years, Hunt has accumulated 500 million cracked passwords. Researchers study these cracked passwords to learn about the way we choose our passwords and how to encourage better password selection. We can rest assured that attackers have access to this same data, which they use to tune their dictionaries and algorithms, becoming ever better and more efficient at churning through candidate passwords in search of a match to our hashed secret.

What is a site to do? Modern iterative hashing is designed specifically to maximize the computational burden on the backend CPU in an effort to thwart attackers who run that same algorithm on specialized hardware. Increasing the computational cost and latency of logging in a user makes offline attacks correspondingly slow and costly. However, iterative hashing imposes a burden on real-time CPU capacity: it is critical computation that cannot be anticipated, scheduled, or deferred; it blocks the user until it completes; and it presents an exploitable denial-of-service vector. If a strong iterative hash takes 500 milliseconds to complete on a single core, and you need to log in a peak of 1,000 users per second, simple math shows that 500 CPU cores must always be available for nothing more than carrying the iterative hashing load (1,000 logins per second × 0.5 core-seconds per login = 500 cores). That’s before you can even begin the real work of actually servicing requests.
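The capacity arithmetic can be written down as a short sketch. The function name and parameters are hypothetical; the only inputs taken from the text are the 500 ms hash time and the 1,000 logins per second peak.

```python
import math


def cores_needed(hash_ms: int, peak_logins_per_second: int) -> int:
    """Cores that must be kept busy hashing to sustain the peak login rate.

    Each login occupies one core for hash_ms milliseconds, so the steady-state
    load is (logins per second) x (core-seconds per login).
    """
    return math.ceil(hash_ms * peak_logins_per_second / 1000)


# The figures from the paragraph above.
cores = cores_needed(hash_ms=500, peak_logins_per_second=1000)
```

Halving the hash time halves the core count, but it also doubles the number of guesses an attacker gets per second on the same hardware, which is the tradeoff in question.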

To avoid this tradeoff between resources and security, sites are becoming ever more hostile to users in their restrictions on simple passwords. They hope that the onus (and dare we say, liability) of preventing password theft can be pushed onto users. Policies that require a password to contain some minimum quantity of upper and lower case letters, numbers, and symbols, are perhaps the most common and surely the most universally reviled technique. NIST has gone so far as to cease recommending such methods in their latest guidelines. Worse still, password expiration rules that compel users to choose new passwords continually, and forbid reuse of previously chosen passwords, were not only banished from the latest NIST recommendations, but also resulted in a spectacular front-page mea culpa late last year from the designer of the guidelines, which had been in place since 2003.

Research suggests that the most effective way to improve password quality is a client-side password strength meter, the often colorful diagram that gives increasingly pleasant feedback as your password gets longer and contains a greater mix of letters, numbers, and symbols. This is now the NIST preferred method. Despite the somewhat unsolved problem of actually measuring the “guessability” of a given secret, these meters do successfully encourage users to tack on a bit more entropy. Studies have shown that providing immediate feedback, and favoring the carrot over the stick, is more effective at persuading users to take extra care in their choice of secrets.

But now, the latest attempt to pummel users on the sign-up page, enshrined in recently published NIST guidelines, is the idea of blacklists. Behold the passwords deemed so dangerous that they can never be used responsibly. Sites are advised that they must now prohibit passwords that are any of the following: 1) too repetitive, 2) too contextual (e.g. the same as your username, or the name of the site itself), 3) straight from the dictionary, or 4) obtained from previous breaches. Give engineers a dial, and they will find a way to turn it past 10. Here we see Hunt has delivered in spades. With his corpus of cracked passwords, half a billion strong, and a sweet sweet API (https://haveibeenpwned.com/API/v2#PwnedPasswords) to query his 31-GB dataset, HaveIBeenPwned has created no less than the ultimate blacklist.
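As a rough illustration of what those four checks look like in code, here is a minimal local sketch. Everything in it is an assumption made for the example: the function names, the definition of "repetitive" (one repeated character or a simple ascending run), and the tiny in-memory word and breach sets, which stand in for a real dictionary and a real breach corpus.

```python
from typing import Optional, Set

# Placeholder data for illustration only.
DICTIONARY_WORDS: Set[str] = {"sunshine", "dragon", "monkey"}
BREACHED_PASSWORDS: Set[str] = {"P@ssw0rd", "letmein1"}


def is_repetitive(text: str) -> bool:
    """True for one repeated character ("aaaa") or an ascending run ("1234")."""
    if len(text) < 2:
        return False
    same_char = len(set(text)) == 1
    ascending = all(ord(b) - ord(a) == 1 for a, b in zip(text, text[1:]))
    return same_char or ascending


def blacklist_reason(password: str, username: str, site_name: str) -> Optional[str]:
    """Return the first rule the password violates, or None if it passes."""
    lowered = password.lower()
    if is_repetitive(lowered):
        return "repetitive"
    context = [value.lower() for value in (username, site_name) if value]
    if any(value in lowered for value in context):
        return "contextual"
    if lowered in DICTIONARY_WORDS:
        return "dictionary"
    if password in BREACHED_PASSWORDS:
        return "breached"
    return None
```

Note how easily a rejected choice can be nudged past rules like these: appending a digit to a dictionary word clears the third check without making the password meaningfully harder to guess.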

The open source community responded resoundingly with code snippets, plugins, pull requests, and integrations to happily query this new API and allow it veto authority for any attempt by a user to choose a poorly recycled password. But what is the actual cost of blacklisting half a billion previously cracked passwords? How does it impact the percentage of users who complete the registration process? Does it affect users’ ability to remember their passwords, or leave them perpetually resetting forgotten secrets? How many more users abandon a service because of the increased frustration of trying to log in? How much does it increase the support capital required to recover access to locked accounts? And most of all, how does it actually impact the complexity of the password ultimately chosen by the user, and the likelihood that they go on to reuse that password?

The answers to all these questions seem especially grim. In a 2017 study by Habib et al., which is referenced by the NIST guidelines themselves, researchers attempted to quantify the impact of blacklists on usability, sentiment, and the complexity of the ultimately chosen passwords. They found that users whose initial password choices were blacklisted went on to choose passwords that were significantly weaker. Furthermore, more than half of those who chose a blacklisted password went on to simply tweak their chosen secret until the blacklist let them through. Remarkably, even though the study used a client-side blacklist which provided detailed and instant feedback after each keystroke, and with a blacklist consisting of only 100,000 of the most commonly chosen passwords, users who were thwarted at least once by the blacklist were twice as likely to Agree or Strongly Agree that the login task was “Annoying” or “Difficult.” The study did not assess the likelihood of a user abandoning the registration process over their feelings of frustration, nor did it assess the likelihood of users ultimately being able to remember their chosen secrets after the fact, or whether the blacklist resulted in a user memorizing a password they were more likely to reuse elsewhere.

Let’s return to the ultimate purpose of implementing a blacklist in the first place. Account compromise can happen in one of three ways. First, an attacker can use phishing or malware to steal a password, in which case the password complexity is irrelevant. Second, an attacker can try to log in as the user through the site’s own login page. This online attack allows only a limited number of guesses before the site should require a second factor (such as clicking an email link). Third, an attacker can steal the site’s password verifier database--namely, the hashes--and use an offline attack to guess passwords. This offline attack allows unlimited guesses. In practice, offline attacks give a motivated attacker billions of attempts at guessing a given password, often using either custom-built hardware stacked with dozens of GPUs (which are particularly adept at running most hashing algorithms) or illegal botnets across which the highly parallel task is distributed.
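The limit on online guesses is something the site itself enforces. A minimal sketch of that idea, assuming an in-memory counter keyed by username, might look like the following. The class name, method names, and the threshold of 10 failures are hypothetical, and a real deployment would also need persistence, expiry, and per-IP limits, none of which are shown here.

```python
from typing import Dict


class LoginThrottle:
    """Counts consecutive failed logins and flags when a second factor is due."""

    def __init__(self, max_failures: int = 10) -> None:
        self.max_failures = max_failures
        self._failures: Dict[str, int] = {}

    def record_attempt(self, username: str, success: bool) -> None:
        """Reset the counter on success, otherwise count one more failure."""
        if success:
            self._failures.pop(username, None)
        else:
            self._failures[username] = self._failures.get(username, 0) + 1

    def requires_second_factor(self, username: str) -> bool:
        """True once the account has used up its password-only guess budget."""
        return self._failures.get(username, 0) >= self.max_failures


throttle = LoginThrottle(max_failures=3)
for _ in range(3):
    throttle.record_attempt("maria", success=False)
locked = throttle.requires_second_factor("maria")
```

No equivalent control exists in the offline case: once the hashes are stolen, the attacker runs the guessing loop on their own hardware and no counter on the site can slow it down.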

Passwords must be strong enough that they cannot be guessed outright in an online attack. This means they should survive on the order of 100 to 1,000 guesses, and we must not use any password on more than one website. This is because we don’t want an attacker to take a username and cracked cleartext password from an old breach and be able to use the same username and password on another site. A different person can use the same password (on one website) with no immediate reduction in security, particularly when the site’s passwords are secured by BlindHash and thereby protected from offline attack.

This is a crucial distinction. Absent two-factor authentication, a breach of one website gives the attacker usernames and matching passwords that will access any site using those same credentials in one try. But to say that no user may choose any password that has ever been known to be cracked is nonsensical. To implement such a policy, as we have seen with past overzealous interpretation of the NIST guidelines, risks a myriad of direct costs in terms of conversion rate, bounce rate, user satisfaction, re-engagement, and support costs. Such a policy could even ultimately be counter-productive toward the stated goal of increasing password complexity and decreasing password reuse, just as we ultimately found that the prior NIST guidance on password expiry led users to make trivial iterations on simpler passwords than they would otherwise have chosen without an expiration policy.

A more sensible approach is to blacklist only the worst offenders -- the top 10,000 passwords -- in a client-side module that can provide instant feedback, and that will only intercede when the user is trying to select a password which is truly at high risk for an online attack. While the offline attack vector may be somewhat deterred, it can never be secured through the use of blacklists or password strength meters. Attempting to prevent an offline attack with a blacklist is like trying to mow the lawn with a machete: It might seem easy at first, but it’s likely to end in blood, sweat and tears.

BlindHash is the only proven, perfectly-secure, and commercially available method to provide high performance, low latency, and scalable password hashing. BlindHash protects against offline attack, even if a site’s password verifier database is stolen, and it does so even with simple passwords. By eliminating any possibility of an offline attack, BlindHash allows sites to adopt user-friendly password policies that maximize conversion rate, satisfaction, and engagement, while minimizing support costs and account lockouts. We encourage you to download our whitepaper to learn more.
