The implications of simple passwords and having their hashes sent to haveibeenpwned.com (Watchtower)

PreCTANT
Community Member

Hey folks

I have a question about the toggle switch for haveibeenpwned.com in my settings and the warning that comes along with it.

On the page "About Watchtower privacy in 1Password" you state:

"However, if you have similar weak passwords like MySekret1 and MySekret1!, there’s a risk that Have I Been Pwned could learn your passwords if they acted maliciously."

I have looked into previous explanations on the forum, but have not understood the attack vector just yet.

If 1Password only sends the first five characters of each password's hash to haveibeenpwned.com, and assuming that the hashes for "MySekret1" and "MySekret1!" are vastly different from each other, I don't understand the increased risk that comes with simple passwords.

Are you implying that simpler passwords are more likely to be part of common rainbow tables, and that they can therefore be guessed from the 5-character snippet more easily than would usually be the case?

Also, wouldn't a simpler password be used by more people, thereby making it harder for a malicious actor on haveibeenpwned.com's side to correlate the password with an individual's user name?

Maybe @brenty or somebody else could provide more insight into the issue at hand, helping the user to make an informed decision.

That would be greatly appreciated.



Comments

  • Hi @PreCTANT

    I've asked our security team to weigh in on this for you. We'll be in touch soon. :)

    Ben

  • Lars
    1Password Alumni

    @PreCTANT - that's an excellent question. First and foremost, the obvious thing to reiterate up front is that haveibeenpwned is decidedly NOT malicious as it exists today, and never has been. But it's always worth being aware of what the possible issues are or could be with what you send to any service, no matter how trustworthy they are.

    You are of course correct that the hashes of MySekret1 and MySekret1! are nothing alike:

    • MySekret1 - SHA1 hash - 9F6880C0E1616EF55F5498B633F52FDB40A1DBE5
    • MySekret1! - SHA1 hash - DB038D087A4876D4110CEBE0C8BABCDC1DC0DF6E
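
As a minimal sketch of where such hashes and the five-character prefixes come from, the Python below computes a SHA-1 digest with the standard library and splits it into the part that would be sent and the part that stays local. The function name `split_sha1` is hypothetical, and UTF-8 encoding of the password before hashing is an assumption of this sketch.

```python
import hashlib

def split_sha1(password: str, prefix_len: int = 5) -> tuple[str, str]:
    """Return (prefix, suffix) of the uppercase hex SHA-1 of a password."""
    # Assumption: the password is encoded as UTF-8 before hashing.
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:prefix_len], digest[prefix_len:]

for pw in ("MySekret1", "MySekret1!"):
    prefix, suffix = split_sha1(pw)
    # Only `prefix` would be sent in a lookup; `suffix` never leaves the client.
    print(f"{pw!r}: prefix={prefix} suffix={suffix}")
```

Changing a single character of the input changes the whole digest, which is why the two prefixes share nothing even though the passwords are nearly identical.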

    As you correctly point out, it's not that an attacker can learn much about what one password might be solely from what a completely different-looking hash of a very similar password is. If a password has been part of a previous breach and the hash is known, that's a problem, obviously. This problem can be compounded by passwords which are altered only superficially such as the two examples above. The hashes are different, the passwords are not. Users frequently will alter a password that they have to memorize only slightly across services, most frequently adding only a digit - and that digit is very often a 1 at the end, hence our example.

    In a case where one password is known, much more effective and likely accurate guesses could be made about your other passwords if they are similar to it, presuming haveibeenpwned wished to do such a thing. The solution to this is: use a unique password for every Login. That's why the very next, bolded sentence after the one you quoted says exactly that: Strong, unique passwords created with the password generator in 1Password are not at risk.

    Hope that helps! :)

  • PreCTANT
    Community Member

    Thanks @Lars for your detailed explanation on this. Please allow me to rephrase to see whether I understand now:

    Assuming
    1. I use "MySekret1" and "MySekret1!" as my passwords on two different services,
    2. Watchtower and the haveibeenpwned.com option are enabled,
    3. Watchtower would be acting maliciously, which they have never done,
    4. one of the services above has been compromised and password hashes have been leaked and
    5. the cleartext passwords for those leaked hashes are known or have been brute forced.

    Then
    1. haveibeenpwned.com would receive "9F688" and "DB038" from my installation of 1Password,
    2. they would guess from the leak that "9F688" could be "9F6880C0E1616EF55F5498B633F52FDB40A1DBE5",
    3. they would know the cleartext password for that hash and the corresponding user identity (username, e-mail, …) and
    4. they would try different small variations of that cleartext password, looking for one resulting in a hash beginning with "DB038".

    That would mean
    1. that looking for the hash beginning with "DB038" would be much easier than it would have been without the additional information that they received by knowing my cleartext password for "9F688" (because they can start calculating hashes for passwords starting with "MySekret" and add numbers and characters, which are very few combinations to try out compared to trying out every possible password) and
    2. they would then know another password that I often use, but not which service it is connected to.
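
The variation search described in the steps above can be sketched roughly as follows. This is only an illustration of the idea under stated assumptions: the set of "small variations" (appending or swapping one trailing digit or punctuation character) and the helper names `variations` and `candidates_for_prefix` are hypothetical, not anything HIBP or 1Password is known to do.

```python
import hashlib
import string

def sha1_hex(password: str) -> str:
    """Uppercase hex SHA-1 of a UTF-8 encoded password."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest().upper()

def variations(known: str):
    """Yield small edits of a known password (hypothetical, deliberately tiny rule set)."""
    tail = string.digits + string.punctuation
    for ch in tail:
        yield known + ch              # append one character
    if len(known) > 1:
        yield known[:-1]              # drop the last character
        for ch in tail:
            yield known[:-1] + ch     # swap the last character

def candidates_for_prefix(known: str, target_prefix: str) -> list[str]:
    """Return the variations whose SHA-1 starts with the observed prefix."""
    target_prefix = target_prefix.upper()
    return [v for v in variations(known) if sha1_hex(v).startswith(target_prefix)]

# The second observed prefix is derived here rather than hard-coded.
target = sha1_hex("MySekret1!")[:5]
print(candidates_for_prefix("MySekret1", target))
```

The search space is under a hundred candidates, compared with the astronomically larger space of all possible passwords, which is the asymmetry the steps above point at.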

    Did I understand you correctly?

    Now, comparing this to a random person out there that has access to the leaked passwords, a compromised haveibeenpwned.com installation would be able to confirm that I use "MySekret1!" as my password. But calculating similar passwords' hashes based on my leaked has and user identity could be done by anyone who uses the leaked data.

    Ultimately, haveibeenpwned.com would be able to identify which passwords I generally use, so if they tried to guess them against a rate-limited service, they would stand a much greater chance of not being detected (as opposed to the random person who might still need to try hundreds of passwords).

    However, if I use a lot of similar passwords, haveibeenpwned.com would face a similar challenge, because they would then need to try all of my discovered passwords for all of the services they are trying to get access to (because they don't know which hash is linked to which service).

    Is that the gist of it? :-)

  • PreCTANT
    Community Member

    Correction (as I am unable to edit):
    "3. haveibeenpwned.com would be acting maliciously, which they have never done" as well as
    "based on my leaked hash and user identity"

  • jpgoldberg
    1Password Alumni

    Almost a correct understanding @PreCTANT, but not quite.

    Your first steps are spot on. So suppose that you make a query for 9F688, and you make a query for DB038, and the server knows that these both come from you. Also note that anyone with access to the hashes can reverse each individual one with high probability.

    When you submit a prefix like 9F688, it might end up matching, say, 500 hashes. HIBP doesn't know which, if any, of those 500 corresponds to your password. The "if any" in the previous sentence is important: lots of passwords that aren't in the database will have a prefix of 9F688, including extremely strong and unique passwords. Note that checking whether any of the hashes returned to 1Password is actually a full match for the password you queried is done entirely inside 1Password, so HIBP doesn't learn whether any of those 500 is a full match (and certainly not which one, if any). All of this is factored into the k of k-anonymity. The system is tuned to provide an upper bound on the information that HIBP learns. We set a desired k and work out how big the prefix should be given the size of the database.
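
To make the division of labor concrete, here is a small self-contained simulation of a prefix-only lookup. It is a sketch, not the actual HIBP service or 1Password client: `ToyRangeServer`, `is_breached`, and the three-password "breach list" are all made up for illustration, and no network call is made.

```python
import hashlib

def sha1_hex(password: str) -> str:
    """Uppercase hex SHA-1 of a UTF-8 encoded password."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest().upper()

class ToyRangeServer:
    """Hypothetical stand-in for a breach database that only ever sees prefixes."""

    def __init__(self, breached_passwords):
        self._hashes = {sha1_hex(p) for p in breached_passwords}
        self.seen_prefixes = []  # everything the server learns from clients

    def range_query(self, prefix: str) -> list[str]:
        self.seen_prefixes.append(prefix)
        # Return the suffixes of every stored hash that shares the prefix.
        return sorted(h[len(prefix):] for h in self._hashes if h.startswith(prefix))

def is_breached(password: str, server: ToyRangeServer, prefix_len: int = 5) -> bool:
    full = sha1_hex(password)
    prefix, suffix = full[:prefix_len], full[prefix_len:]
    suffixes = server.range_query(prefix)  # only the prefix is sent
    return suffix in suffixes              # the full comparison happens client-side

server = ToyRangeServer(["password", "123456", "MySekret1"])
print(is_breached("MySekret1", server))
print(is_breached("correct horse battery staple", server))
print(server.seen_prefixes)
```

Note that `seen_prefixes` looks the same to the server whether or not the client found a full match, which is the "if any" point made above.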

    Connecting the queries

    But now consider the situation where you also submit DB038. In terms of an individual query, HIBP learns very little, as designed. The prefix will match another set of 450 or so hashes. The password you are checking might be among those 450 or it might not be, just as described above. But now assume that HIBP is acting maliciously and it knows who is making these queries. It can look through the 500 passwords matching 9F688 (because it can reverse the hashes), and it can look through the 450 passwords matching DB038. It can then look for similar passwords between those sets. It will find the pair MySekret1 and MySekret1!. Knowing that lots of people use closely related passwords, it can take a good guess that those are the passwords that correspond to your two different queries.
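
The cross-bucket comparison can be sketched in a few lines. The two lists below are invented stand-ins for the cracked passwords in each bucket (apart from the thread's example pair, they do not actually hash to those prefixes), and using `difflib.SequenceMatcher` with a 0.85 threshold is just one assumed way to score "similar"; a real attacker could use any similarity measure.

```python
from difflib import SequenceMatcher
from itertools import product

def similar_pairs(bucket_a, bucket_b, threshold=0.85):
    """Return (a, b, score) for cross-bucket password pairs that look related."""
    pairs = []
    for a, b in product(bucket_a, bucket_b):
        score = SequenceMatcher(None, a, b).ratio()
        if a != b and score >= threshold:
            pairs.append((a, b, round(score, 3)))
    # Most similar pairs first.
    return sorted(pairs, key=lambda t: -t[2])

# Made-up bucket contents, for illustration only.
first_bucket = ["dragon2012", "MySekret1", "qwerty!", "sunshine77"]
second_bucket = ["letmein", "MySekret1!", "football9", "Tr0ub4dor"]

print(similar_pairs(first_bucket, second_bucket))
```

With real buckets of roughly 500 and 450 entries this is only about 225,000 comparisons, so the correlation is cheap once the two queries can be tied to the same person.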

    It isn't a certain guess, but it is a guess it can make with much higher probability than what we were wanting from k-anonymity.

    Now, to make use of this information, a malicious HIBP would, as noted, need to know that the queries come from the same individual, but it would also need to have some idea of who you really are, or at least what username or email address you typically use. Knowing that there is a good chance some Internet user uses both MySekret1 and MySekret1! as passwords isn't really going to tell it much that is new unless it can get a sense of which user. This might be possible in the case of an individual user of a malicious HIBP, but will be harder for it to learn about a 1Password user. The protocols for checking an email address in HIBP are very different from the password breach lookups, and in the case of doing this through 1Password the requests will not come from the same sources.

    And, as we state at every opportunity, if your passwords are unrelated to each other (as strong, unique passwords will be) then this poses no threat at all. And even in the worst case, it is still going to require some highly restricted circumstances that are very unlikely to affect 1Password users.

    Re-opt-in

    When we first integrated with HIBP, we did so with the understanding of the original (uncorrelated queries) security analysis of k-anonymity. When we became aware that under certain circumstances that original analysis doesn't hold, we felt that we needed to ask people to opt in to HIBP integration again. We wouldn't be using the service if we didn't think it is safe, but because our initial statements about its safety (during first opt-in when we introduced it) changed, we felt we needed to ask people to opt in again given the corrected analysis.

    More abstractly

    At a more abstract level, the initial security analysis of HIBP's k-anonymity was done looking at single queries, and that analysis still stands for single queries. There are tight bounds on what a malicious server could learn from single queries.
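
The single-query bound can be illustrated with some back-of-the-envelope arithmetic. The sketch below assumes hashes are spread uniformly over hex prefixes; the database size of 500 million and the target k of 400 are illustrative figures chosen for this example, not numbers from this thread, and the function names are hypothetical.

```python
def expected_bucket_size(db_size: int, prefix_len: int) -> float:
    """Average number of stored hashes sharing one hex prefix (uniform assumption)."""
    return db_size / 16 ** prefix_len

def longest_prefix_for_k(db_size: int, k: int) -> int:
    """Longest hex prefix whose average bucket still holds at least k hashes."""
    length = 0
    while db_size / 16 ** (length + 1) >= k:
        length += 1
    return length

DB_SIZE = 500_000_000  # assumption: illustrative database size
K = 400                # assumption: illustrative desired anonymity-set size

print(16 ** 5)                               # number of distinct 5-character prefixes
print(round(expected_bucket_size(DB_SIZE, 5)))  # average hashes per 5-character prefix
print(longest_prefix_for_k(DB_SIZE, K))      # prefix length that still satisfies K
```

Under these assumed figures a five-character prefix leaves a few hundred candidates per query, which is the same order of magnitude as the 500 and 450 used in the example above.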

    But the information that can be learned from multiple queries from the same individual, taken together, is greater than the sum of what is learned from each query on its own. The good news is that this potential information leak is limited to some very specific circumstances. And our advice remains unchanged: Use unique and strong passwords for each service.

This discussion has been closed.