Suggestions for passphrase generator
I've been clamoring for a diceware-style passphrase generator for years, and I just got my first glimpse of one in 1P4/Win running on my Mac under CrossOver 12.5.1 (Wine 1.6). It’s a great first cut, but I can't say that I am in love with the implementation.
Here is my 2¢:
- You call it diceware, but it's not. You aren't using dice, and you aren't using a diceware word list. Call it a passphrase generator.
- I see no documentation. Because I don't know anything about your word list, I don't know how many bits of entropy are present per word, so I cannot evaluate the strength of a generated 1Password passphrase or relate it to the strength of an n-word diceware passphrase.
- I don't understand the logic behind the separator options (space, period, comma, underscore). None of these options would yield a valid password on sites that only allow alphanumeric characters, or on sites that require at least one digit or capital letter. I have suggested elsewhere that the generator should have digit, symbol, and caps sliders, with all caps being initial caps and all digits and symbols appended to generated words. This may not be the best solution or the best UI, but please put in enough options (checkboxes, radio buttons, sliders, or what have you) so that the generator will produce passphrases that we can actually use without the need to manually edit whatever comes out of the generator.
- I really like this generator for generating responses to security questions, but for actual passphrases I think that I would prefer the short words on the traditional diceware word list. For localization and other reasons, you might want to offer the user a choice of word lists. Cf. Zyzzyva.
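To make the caps/digit/symbol request in the separator bullet above concrete, here is a minimal sketch. The function name `decorate`, its parameters, and the symbol set are all hypothetical, and nothing here reflects how 1Password actually works. It applies initial caps to some words and appends digits and symbols to words, with no separator, so the result stays alphanumeric when no symbols are requested.

```python
import secrets
import string

_rng = secrets.SystemRandom()  # OS-backed randomness, used here only for sampling

def decorate(words, caps=0, digits=0, symbols=0, symbol_set="!#$%&*"):
    """Join words with no separator, adding initial caps and appended digits/symbols.

    All parameter names are illustrative assumptions, not a real 1Password API.
    """
    out = list(words)
    if not out:
        raise ValueError("need at least one word")
    # Capitalize the first letter of `caps` randomly chosen words.
    for i in _rng.sample(range(len(out)), min(caps, len(out))):
        out[i] = out[i].capitalize()
    # Append each requested digit to the end of a randomly chosen word.
    for _ in range(digits):
        out[secrets.randbelow(len(out))] += secrets.choice(string.digits)
    # Append each requested symbol to the end of a randomly chosen word.
    for _ in range(symbols):
        out[secrets.randbelow(len(out))] += secrets.choice(symbol_set)
    return "".join(out)

print(decorate(["correct", "horse", "battery"], caps=3, digits=1))
```

Because digits and symbols only ever land at the end of a word, the result should stay reasonably easy to type on a phone keyboard.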
Still, a great first cut, and I will definitely be using it to generate answers to security questions. Thank you, and keep up the good work!
Comments
-
You aren't using dice
Passphrases are generated using a variant of the “Diceware” passphrase scheme. Each word is chosen at random using an SHA-256–based random number generator.
you aren't using a diceware word list
We're using a list of 17,679 words between 4 and 8 letters long.
I don't know how many bits of entropy are present per word.
Every word is between 4 and 8 letters. The entropy in the passphrases comes directly from the entropy of the random numbers used to select each word.
0 -
Each word is chosen at random using an SHA-256–based random number generator.
You’re making me a wee bit nervous here. Jeff and I went back and forth on this subject a while ago. It is not easy to pick at random, with equal probability, from a list of words whose length is not a power of two. However, I say a wee bit because the impact of any bias is unlikely to be substantial.
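For anyone wondering what unbiased selection looks like, one common approach is rejection sampling: draw just enough random bits to cover the list, and throw away any draw that falls outside it. This is only a sketch of that general technique. It uses Python's `secrets` module rather than the SHA-256-based generator mentioned above, and `uniform_index` is a made-up name.

```python
import secrets

def uniform_index(n: int) -> int:
    """Return an index in range(n), each value equally likely (rejection sampling)."""
    if n <= 0:
        raise ValueError("n must be positive")
    # Smallest bit count whose range covers 0..n-1 (at least 1 bit).
    k = max(1, (n - 1).bit_length())
    while True:
        candidate = secrets.randbits(k)
        if candidate < n:       # accept only draws that land inside the list
            return candidate    # out-of-range draws are discarded and retried

LIST_SIZE = 17_679  # word count quoted in this thread
print(uniform_index(LIST_SIZE))
```

For a list of 17,679 words this draws 15 bits at a time, so a bit under half of the draws are discarded. That is the price of avoiding the modulo bias.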
The entropy in the passphrases comes directly from the entropy of the random numbers used to select each word.
Yes, but one cannot calculate that without knowing the size of the word list. If words are being selected at random with equal probability from a list of 17,679 words, then the entropy per word is log₂(17,679), or about 14.1 bits, compared with 12.9 bits per word—log₂(7,776)—for the standard diceware word list. The Diceware word list yields slightly less entropy per word, but considerably more entropy per character given the average word lengths for the two lists. This confirms my belief that I would prefer the standard Diceware list (or perhaps the Diceware 8K list) to the 1P4/Win word list for passphrase generation.
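The arithmetic is easy to check. In this sketch the helper names are my own, the 4.2-character Diceware average is a figure quoted elsewhere in this thread, and the average length of the 1P4/Win list is not known here, so I can only bracket it using the stated 4-to-8-letter range.

```python
import math

def bits_per_word(list_size: int) -> float:
    """Entropy of one word drawn uniformly at random from the list."""
    return math.log2(list_size)

def bits_per_char(list_size: int, avg_word_length: float) -> float:
    """Entropy per character, given an average word length for the list."""
    return bits_per_word(list_size) / avg_word_length

ONEPASSWORD_LIST = 17_679  # size quoted in this thread
DICEWARE_LIST = 7_776      # 6**5, the standard Diceware list
DICEWARE_AVG_LEN = 4.2     # assumption: figure quoted elsewhere in this thread

print(f"1P4/Win:  {bits_per_word(ONEPASSWORD_LIST):.1f} bits per word")
print(f"Diceware: {bits_per_word(DICEWARE_LIST):.1f} bits per word")
print(f"Diceware: {bits_per_char(DICEWARE_LIST, DICEWARE_AVG_LEN):.2f} bits per character")

# The 1P4/Win average length is unknown, so show only the two extremes.
low = bits_per_char(ONEPASSWORD_LIST, 8)   # if every word had 8 letters
high = bits_per_char(ONEPASSWORD_LIST, 4)  # if every word had 4 letters
print(f"1P4/Win:  between {low:.2f} and {high:.2f} bits per character")
```

Where the 1P4/Win list actually falls in that range depends on its real average word length, which would have to come from AgileBits.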
0 -
The Diceware word list yields slightly less entropy per word, but considerably more entropy per character given the average word lengths for the two lists
Are you sure? The average word length of the standard diceware word list is 4.2 characters and the biggest words are 6 characters long.
I would prefer the standard Diceware list
Understood. I have added this to #497
0 -
I like that Word list option!
My issue with the Separator settings remains. FYI, here is the full version of my Caps Slider request.
0 -
The Diceware word list yields slightly less entropy per word, but considerably more entropy per character given the average word lengths for the two lists
Excellent observation @benfdc! I haven't actually looked at the average and median lengths of words from these two lists. I also can't recall my reasons for picking 4 as the lower length limit for the list we constructed. I will have to rethink this.
One "problem" for us with the original Diceware list is that it includes digits. For example, "456" is a Diceware word. I wanted to avoid those in consideration of iPhone and similar keyboards. But off the top of my head, I don't see any reason not to include three-letter words in our list.
0 -
I also can't recall my reasons for picking 4 as the lower length limit for the list we constructed. I will have to rethink this.
Why need there be a lower length limit at all? Dr. Reinhold writes:
Because some words on the diceware list are two characters or less, you can get a very short passphrase. If your passphrase, including the spaces between the words, is less than 17 characters long, we recommend that you start over and create a new passphrase. You should also start over if your passphrase is a recognizable English sentence or phrase. (These situations are very rare.)
Programming the 1Password passphrase generator to comply with the second half of Dr. Reinhold’s advice would be no mean feat, but automatically “re-rolling” too-short passphrases so that the user never sees them ought to be easy. This tweak introduces a slight bias against the selection of short words, but slight bias seems to me to be preferable to massive bias, which is what a minimum length cut-off amounts to!
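A re-roll along these lines might look like the sketch below. The 17-character threshold is Dr. Reinhold's figure from the quote above. The function name and the tiny sample word list are placeholders of mine, not anything from 1Password or the real Diceware list.

```python
import secrets

MIN_LENGTH = 17  # Dr. Reinhold's threshold, counting the spaces between words

def generate_passphrase(words, count, min_length=MIN_LENGTH):
    """Draw `count` words uniformly; silently re-roll phrases that are too short."""
    longest = max(len(w) for w in words)
    # Guard against an endless loop when no phrase could ever be long enough.
    if count * longest + (count - 1) < min_length:
        raise ValueError("word list and count cannot reach the minimum length")
    while True:
        phrase = " ".join(secrets.choice(words) for _ in range(count))
        if len(phrase) >= min_length:
            return phrase  # the user never sees the rejected, too-short phrases

# Placeholder list for illustration only.
SAMPLE_WORDS = ["a", "bb", "cat", "dice", "horse", "staple", "battery"]
print(generate_passphrase(SAMPLE_WORDS, count=5))
```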
Here is another way to look at it. If you pre-determine the length of your word list, setting a minimum word length reduces the entropy per character, but has no effect on entropy per word. However, if you start out with a word list and then strip out all words of three or fewer letters, you are also reducing the entropy per word. How can that possibly be a good idea? Zyzzyva lists 994 three-letter words in the Words With Friends lexicon, and 1,015 three-letter words in the OWL2 lexicon used in North American Scrabble® tournaments. The Diceware list stretches this by including proper nouns (a Scrabble® no-no), familiar acronyms, and easy-to-remember non-words like bbb and bcd. The Diceware list also includes all 26 letters and all 676 two-letter combinations.
One "problem" for us with the original Diceware list is that it includes digits. For example, "456" is a Diceware word. I wanted to avoid those in consideration of iPhone and similar keyboards.
The Diceware list actually includes symbols as well as digits. I agree that this is an issue. As I explain in my Caps Slider proposal, IMO digits and symbols should only be generated at user request (to satisfy password complexity requirements at finicky websites), and even then should only be appended to words (for the sake of iPhone-friendliness).
0 -
It would be interesting to know whether the decreased entropy from excluding words of three or fewer letters is more or less than that resulting from re-rolling "short" phrases. My gut says that the latter is better, but I couldn't explain why.
The obvious concern with allowing shorter phrases is the risk of a non-diceware-based attack: simple brute forcing. A 6-word phrase consisting of 2-letter words in alpha only is really insecure.
How do you feel about a more generalised solution to brute force resistance? At the moment the diceware generator still calculates and displays the standard strength gauge.
How about a dropdown that specifies the required strength? That is, the dropdown would have the list items: Fantastic, Excellent, Good, Fair, Weak, Terrible.
The generator would then repeatedly regenerate the diceware password until the calculated strength meets the requirement.
If the word count is set low then it could be impossible to meet the strength requirement, so I'd suggest re-rolling for 1 second and if no password could be generated in that time, warn the user.
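Roughly, the re-roll loop with a time limit could be sketched like this. The strength estimate is a crude stand-in (phrase length times bits per character over lowercase letters plus space), not the gauge 1Password actually uses, and all names are hypothetical.

```python
import math
import secrets
import time

def estimate_strength_bits(phrase: str) -> float:
    """Placeholder strength estimate: NOT the real 1Password strength gauge."""
    alphabet_size = 27  # assumption: lowercase letters plus the space separator
    return len(phrase) * math.log2(alphabet_size)

def generate_with_min_strength(words, count, required_bits, timeout=1.0):
    """Re-roll until the estimate meets `required_bits`; give up after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        phrase = " ".join(secrets.choice(words) for _ in range(count))
        if estimate_strength_bits(phrase) >= required_bits:
            return phrase
    return None  # caller should warn the user that the requirement cannot be met

# Placeholder list for illustration only.
SAMPLE_WORDS = ["cat", "dice", "horse", "staple", "battery"]
result = generate_with_min_strength(SAMPLE_WORDS, count=4, required_bits=60)
print(result if result is not None else "Could not reach the requested strength")
```

Returning `None` on timeout leaves it to the UI to decide how to warn the user.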
Obviously the dropdown would default to Fantastic.
0 -
How about a dropdown that specifies the required strength? That is, the dropdown would have the list items: Fantastic, Excellent, Good, Fair, Weak, Terrible.
Too long for a dropdown, but I really like Dr. Reinhold’s analysis of passphrase strength:
- Five words are breakable with a thousand or so PCs equipped with high-end graphics processors. (Criminal gangs with botnets of infected PCs can marshal such resources.)
- Six words may be breakable by an organization with a very large budget, such as a large country's security agency.
…
Another way to think about passphrase length is to consider what security precautions you take to physically protect your computer and data. Here is a list of possible passphrase lengths and commensurate security precautions. …
5 words
- You would be content to keep paper copies of the encrypted documents you are protecting in an ordinary desk or filing cabinet in an un-secured office.
…
7 words
- Your computer is protected from unauthorized access at all times when not in your personal possession by being locked in a room or cabinet in a building where access is controlled 24 hours a day or that is protected by a high quality alarm service.
- Routine cleaning and building maintenance people do not have physical access to your computer when you are not present.
- You regularly use an up-to-date anti-virus program purchased off the floor at a computer store.
- You have verified the signatures on your copy of PGP or your installed Hushmail 2 client.
- You never run unverified downloaded software, e-mail attachments or unsolicited disks received through the mail on your computer.
0 -
Too long for a dropdown,
What makes you say that? The longest option is 9 characters; hardly enormous.
but I really like Dr. Reinhold’s analysis of passphrase strength:
That's all very nice, but my suggestion was aimed at evaluating diceware passwords in a more conventional way. The number of words is irrelevant if a standard brute force attack on the characters in the words is practical.
0 -
Just out of interest, surely the word length is irrelevant, as is the fact that they are dictionary words, because the cracker has no way of knowing in advance what structure the password has. S/he can't tell (by looking at the encrypted files in the vault) how long the password is, whether it is random words, random characters or anything else. Nor can s/he tell whether it comprises just [a..z, A..Z, 0..9] or includes all the other symbol characters.
So isn't it just a case of making absolutely sure it is immune to a dictionary attack so that the only way to crack it is to guess the encryption key?
0 -
@RichardPayne writes:
The number of words is irrelevant [if] a standard brute force attack on the characters in the words is practical.
One randomly-chosen lowercase letter has log₂(26) or 4.7 bits of entropy. Do the math and you will see that a four-word passphrase generated from the diceware word list has the same amount of entropy as 11 randomly-chosen lowercase letters (51.7 bits). This means that a brute-force character-based attack on a four word diceware passphrase would have no advantage over a brute-force word list-based attack unless all of the words were very short (say, two 3-letter words and two 2-letter words). If you use spaces between your words, even a diceware passphrase that happened to be made up of four 2-letter words would be no more vulnerable to a character-based brute-force attack than to one based on a known diceware word list.
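If you would rather check the numbers than take my word for it, here is the arithmetic as a short sketch. The function names are mine. It compares the work for a word-list attack against a character-by-character attack, both measured in bits.

```python
import math

DICEWARE_SIZE = 7_776  # standard Diceware list, 6**5 words

def word_attack_bits(n_words: int) -> float:
    """Search space, in bits, for guessing n words from the Diceware list."""
    return n_words * math.log2(DICEWARE_SIZE)

def char_attack_bits(n_chars: int, alphabet: int = 26) -> float:
    """Search space, in bits, for guessing n characters from the given alphabet."""
    return n_chars * math.log2(alphabet)

word_bits = word_attack_bits(4)                    # four Diceware words
eleven_letters = char_attack_bits(11)              # 11 random lowercase letters
short_no_spaces = char_attack_bits(3 + 3 + 2 + 2)  # two 3-letter and two 2-letter words
short_with_spaces = char_attack_bits(4 * 2 + 3, alphabet=27)  # four 2-letter words, 3 spaces

print(f"4 Diceware words:             {word_bits:.1f} bits")
print(f"11 lowercase letters:         {eleven_letters:.1f} bits")
print(f"10 letters, no spaces:        {short_no_spaces:.1f} bits")
print(f"11 characters, incl. spaces:  {short_with_spaces:.1f} bits")
```

The character attack only wins when its figure drops below the word-list figure, which takes a run of unusually short words with no separators.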
This is the math behind Dr. Reinhold’s observation that it is “very rare” for random dice rolls to result in a too-short passphrase. If you want to be conservative, just follow this simple rule of thumb: if your randomly-generated diceware passphrase has an average word length of less than three letters, toss it and start over. Easy enough with dice, and even easier for AgileBits to incorporate into its passphrase algorithm.
In other words, it’s a non-issue.
0 -
Maybe I'm just coming at this from my OCD attitude to the password strength bars in the login list. I want to see all "full green" bars and it drives me nuts having anything less (damn you HMRC!).
0