Words are the least secure way to generate a password of a given length because you are limiting your character set to 26, and character N gives you information about the character at position N+1
The most secure way to generate a password is to uniformly pick bytes from the entire character set using a suitable form of entropy
Edit: for the dozens of people still feeling the need to reply to me: RSA keys are fixed length, and you don’t need to memorize them. Using a dictionary of words to create your own RSA key is intentionally kneecapping the security of the key.
That’s only really true if you’re going to be storing the password in a secure vault after randomly generating it; otherwise, it’s terrible because 1) nobody will be able to remember it so they’ll be writing it down, and 2) it’ll be such a pain to type that people will find ways to circumvent it at every possible turn
Pass phrases, even when taken with the idea that it’s a limited character set that follows a semi predictable flow, if you look at it in terms of the number of words possible it actually is decently secure, especially if the words used are random and not meaningful to the user. Even limiting yourself to the 1000 most common words in the English language and using 4 words, that’s one trillion possible combinations without even accounting for modifying capitalisation, adding a symbol or three, including a short number at the end…
And even with that base set, even if a computer could theoretically try all trillion possibilities quickly, it’ll make a ton of noise, get throttled, and likely lock the account out long before it has a chance to try even the tiniest fraction of them
Your way is theoretically more secure, but practically only works for machines or with secure password storage. If it’s something a human needs to remember and type themselves, phrases of random words is much more viable and much more likely to be used in a secure fashion.
Thanks for that! I recommend anyone who wants to minimize risk to follow their instructions for self-hosting:
Is the source code available and can I run my own copy locally?
Yes! The source code is available on Github. Its a simple static HTML application and you can clone and run it by opening the index.html file in your browser. When run locally it should work when your computer is completely offline. The latest commits in the git repository are signed with my public code signing key.
We are talking about RSA though, so there is a fixed character length and it isn’t meant to be remembered because your private key is stored on disk.
Yes the word method is better than a random character password when length is unbounded, but creating secure and memorable passwords is a bit of an oxymoron in today’s date and age - if you are relying on remembering your passwords that likely means you are reusing at least some of them, which is arguably one of the worst things you can do.
Most of my passwords are based around strings of characters that are comfortable to type, then committing them to muscle memory. There’s a few downsides to this:
If I need to log in to something on mobile and don’t have a proper keyboard with me, it’s tough to remember which symbols I’ve used
I share some of my logins with friends and family for certain things, if they call and need to re-enter a password, it’s usually impossible to recite it to them over the phone (most of my shared logins have reverted back to proper words and numbers to make it easier for the others)
If I lose an arm, I’ll probably have to reset all of my passwords.
But yeah, words alone provide plenty of possibilities. There’s a reason cryptocurrency wallets use them for seed phrases.
And even with that base set, even if a computer could theoretically try all trillion possibilities quickly, it’ll make a ton of noise, get throttled, and likely lock the account out long before it has a chance to try even the tiniest fraction of them
One small correction - this just isn’t how the vast majority of password cracking happens. You’ll most likely get throttled before you try 5 password and banned before you get to try 50. And it’s extremely traceable what you’re trying to do. Most cracking happens after a data breach, where the cracker has unrestricted local access to (hopefully) encrypted and salted password hashes.
People just often re-use their password or even forget to change it after a breach. That’s where these leaked passwords get their value if you can decrypt them. So really, this is a non-factor. But the rest stands.
It’s still a rather large pool to crack through even without adding more than the 1000 most common words, extra digits, minimal character substitution, capitalization tweaks, etc
so you are saying 44 bits of entropy is not enough. the whole point of the comic is, that 4 words out of a list of 2000 is more secure then some shorter password with leetcode and a number and punctuation at the end. which feels rather intuitive given that 4 words are way easier to remember
If you know the key is composed of English language words you can skip strings of letters like “ZRZP” and “TQK” and focus on sequences that actually occur in a dictionary
we are talking about RSA keys - you don’t memorize your RSA keys
if you rely on memorizing all your passwords, I assume that means you have ample password reuse, which is a million times worse than using a different less-secure password on every site
While this comic is good for people that do the former or have very short passwords, it often misleads from the fact that humans simply shouldn’t try to remember more than one really good password (for a password manager) and apply proper supplementary techniques like 2FA. One fully random password of enough length will do better than both of these, and it’s not even close. It will take like a week or so of typing it to properly memorize it, but once you do, everything beyond that will all be fully random too, and will be remembered by the password manager.
The part where this falls flat is that using dictionary words is one of the first step in finding unsecured password. Starting with a character by character brute force might land you on a secure password eventually, but going by dictionary and common string is sure to land you on an unsecured password fast.
Even if an attacker knew that your password was exactly four words from a specific list of only 2048 common words, that password would still be more secure than something like Tr0ub4dor&3
That’s true but in practice it wouldn’t take 60^11 tries to break the password. Troubador is not a random string and all of the substitutions are common ( o -> 0, a ->4, etc. ). You could crack this password a lot easier with a basic dictionary + substitution brute force method.
I’m saying this because I had an assignment that showed this in an college cybersecurity class. Part of our lesson on password strength was doing a brute force attack on passwords like the one in the top of the xkcd meme to prove they aren’t secure. Any modern laptop with an i5 or higher can probably brute force this password using something like hashcat if you left it on overnight.
Granted, I probably wouldn’t use the xkcd one either. I’d either want another word or two or maybe a number/symbol in between each word with alternating caps or something like that. Either way it wouldn’t be much harder to remember.
except it is not troubador. it is troubador, ampersand, digit.
if you know there are exactly two additional characters and you know they are at the end of the string, the first number is really slightly bigger (like 11 times)
once the random appendix is 3 characters or more, the second number wins
and moral of the story is: don’t use xkcd comic, however funny it is, as your guidance to computer security. yes, the comic suggestions are better than having the password on a post-it on your monitor, but this is 21st century ffs, use password wallet.
if you know there are exactly two additional characters
this is pretty much irrelevant, as the amount of passwords with n+1 random characters is going to be exponentially higher than ones with n random characters. Any decent password cracker is going to try the 30x smaller set before doing the bigger set
and you know they are at the end of the string
that knowledge is worth like 2 bits at most, unless the characters are in the middle of a word which is probably even harder to remember
if you know there are exactly two additional characters and you know they are at the end of the string, the first number is really slightly bigger (like 11 times)
even if you assume the random characters are chosen from a large set, say 256 characters, you’d still get the 4-word one as over 50 times more. Far more likely is that it’s a regular human following one of those “you must have x numbers and y special characters” rules which would reduce it to something like 1234567890!?<^>@$%&±() which is going to be less than 30 characters
and even if they end up roughly equal in quessing difficulty, it is still far easier to remember the 4 random words
this assumes a dictionary is used. Otherwise the entropy would be 117 bits or more. The only problem is some people may fail to use actually uniformly random words drawn from a large enough set of words (okay, and you should also use a password manager for the most part)
see, you didn’t get the whole comic. 4 words out of a dicitionary with 2000 words has more combinations then a single uncommon non gibberish baseword with numeral and puction at the end. as long as the attacker knows your method.
a dicitonary attack will not lower the entropy of 44 bits, thats what the comic is trying to say
Yeah, except for the first few bytes. PKCS8 has some initial header information, but most of it is the OCTET_STRING of the private key itself.
The PEM (human “readable”) version is Base64, so you can craft up a string and make that your key. DER is that converted to binary again:
/**
* @see https://datatracker.ietf.org/doc/html/rfc5208#section-5
* @see https://datatracker.ietf.org/doc/html/rfc2313#section-11
* Unwraps PKCS8 Container for internal key (RSA or EC)
* @param {string|Uint8Array} pkcs8
* @param {string} [checkOID]
* @return {Uint8Array} DER
*/
export function privateKeyFromPrivateKeyInformation(pkcs8, checkOID) {
const der = derFromPrivateKeyInformation(pkcs8);
const [
[privateKeyInfoType, [
[versionType, version],
algorithmIdentifierTuple,
privateKeyTuple,
]],
] = decodeDER(der);
if (privateKeyInfoType !== 'SEQUENCE') throw new Error('Invalid PKCS8');
if (versionType !== 'INTEGER') throw new Error('Invalid PKCS8');
if (version !== 0) throw new Error('Unsupported PKCS8 Version');
const [algorithmIdentifierType, algorithmIdentifierValues] = algorithmIdentifierTuple;
if (algorithmIdentifierType !== 'SEQUENCE') throw new Error('Invalid PKCS8');
const [privateKeyType, privateKey] = privateKeyTuple;
if (privateKeyType !== 'OCTET_STRING') throw new Error('Invalid PKCS8');
if (checkOID) {
for (const [type, value] of algorithmIdentifierValues) {
if (type === 'OBJECT_IDENTIFIER' && value === checkOID) {
return privateKey;
}
}
return null; // Not an error, just doesn't match
}
return privateKey;
}
I wrote a “plain English” library in Javascript to demystify all the magic of Let’s Encrypt, ACME, and all those certificates. (Also to spin up my own certs in NodeJS/Chrome).
Edit: To be specific, PKCS8 is usually a PKCS1 (RSA) key with some wrapping to identify it (the OID). The integers (BigInts) you pick for RSA would have to line up in some way, but I would think it’s doable. At worst there is maybe a character or two of garbage at the breakpoints for the RSA integers. And if you account for which ones are absent in the public key, then anybody reading it could get a kick out of reading your public certificate.
It’s assymetric crypto. You’d need to find a matching public key. Or it’s just some useless characters. I suppose that’s impossible, or what we call that… Like take a few billion years to compute. But I’m not an expert on RSA.
I’m pretty sure the cryptographic parameters to generate a public key are included in the private key file. So while you can generate the other file from that file, it’s not only the private part in it but also some extra information and you can’t really change the characters in the private key part. Also not an expert here. I’m fairly certain that it can’t happen the other way round, or you could impersonate someone and do all kinds of MITM attacks… In this case I’ve tried it this way, changed characters and openssh-keygen complains and can’t generate anything anymore.
I wonder if you string together enough words can it be a valid key?
I would hope so, sentences and words are some of the most secure passwords/phrases you can use
Words are the least secure way to generate a password of a given length because you are limiting your character set to 26, and character N gives you information about the character at position N+1
The most secure way to generate a password is to uniformly pick bytes from the entire character set using a suitable form of entropy
Edit: for the dozens of people still feeling the need to reply to me: RSA keys are fixed length, and you don’t need to memorize them. Using a dictionary of words to create your own RSA key is intentionally kneecapping the security of the key.
That’s only really true if you’re going to be storing the password in a secure vault after randomly generating it; otherwise, it’s terrible because 1) nobody will be able to remember it so they’ll be writing it down, and 2) it’ll be such a pain to type that people will find ways to circumvent it at every possible turn
Pass phrases, even when taken with the idea that it’s a limited character set that follows a semi predictable flow, if you look at it in terms of the number of words possible it actually is decently secure, especially if the words used are random and not meaningful to the user. Even limiting yourself to the 1000 most common words in the English language and using 4 words, that’s one trillion possible combinations without even accounting for modifying capitalisation, adding a symbol or three, including a short number at the end…
And even with that base set, even if a computer could theoretically try all trillion possibilities quickly, it’ll make a ton of noise, get throttled, and likely lock the account out long before it has a chance to try even the tiniest fraction of them
Your way is theoretically more secure, but practically only works for machines or with secure password storage. If it’s something a human needs to remember and type themselves, phrases of random words is much more viable and much more likely to be used in a secure fashion.
Generally people don’t memorize private keys, but this is applicable when generating pass phrases to protect private keys that are stored locally.
Leaving this here in case anyone wants to use this method: https://www.eff.org/dice
And if you don’t feel like using physical dice:
https://diceware.rempe.us/#eff
Thanks for that! I recommend anyone who wants to minimize risk to follow their instructions for self-hosting:
We are talking about RSA though, so there is a fixed character length and it isn’t meant to be remembered because your private key is stored on disk.
Yes the word method is better than a random character password when length is unbounded, but creating secure and memorable passwords is a bit of an oxymoron in today’s date and age - if you are relying on remembering your passwords that likely means you are reusing at least some of them, which is arguably one of the worst things you can do.
You didn’t have to call me out like that.
Okay, that’s fair… Not sure how I missed that context but that’s totally on me
Most of my passwords are based around strings of characters that are comfortable to type, then committing them to muscle memory. There’s a few downsides to this:
If I need to log in to something on mobile and don’t have a proper keyboard with me, it’s tough to remember which symbols I’ve used
I share some of my logins with friends and family for certain things, if they call and need to re-enter a password, it’s usually impossible to recite it to them over the phone (most of my shared logins have reverted back to proper words and numbers to make it easier for the others)
If I lose an arm, I’ll probably have to reset all of my passwords.
But yeah, words alone provide plenty of possibilities. There’s a reason cryptocurrency wallets use them for seed phrases.
One small correction - this just isn’t how the vast majority of password cracking happens. You’ll most likely get throttled before you try 5 password and banned before you get to try 50. And it’s extremely traceable what you’re trying to do. Most cracking happens after a data breach, where the cracker has unrestricted local access to (hopefully) encrypted and salted password hashes.
People just often re-use their password or even forget to change it after a breach. That’s where these leaked passwords get their value if you can decrypt them. So really, this is a non-factor. But the rest stands.
That’s fair
It’s still a rather large pool to crack through even without adding more than the 1000 most common words, extra digits, minimal character substitution, capitalization tweaks, etc
Good luck remembering random bytes. That infographic is about memorable passwords.
You memorize your RSA keys?
you memorize the password required to decrypt whatever container your RSA key is in. Hopefully.
Sure but we aren’t talking about that
I think this specific chain of replies is talking about that actually… though it is a pretty big tangent from the original post
“can you string words to form a valid RSA key”
“Yes this is the most secure way to do it”
“No, it’s not when there is a fixed byte length”
-> where we are now
Sounds like a good point, but claiming that “Words are the least secure way to generate a password 84 characters long” would be pointless.
and some people will try to just hold a key down until it reaches the length limit… which is an even worse way to generate a password of that length
so you are saying 44 bits of entropy is not enough. the whole point of the comic is, that 4 words out of a list of 2000 is more secure then some shorter password with leetcode and a number and punctuation at the end. which feels rather intuitive given that 4 words are way easier to remember
No im saying if your password size is limited to a fixed number of characters, as is the case with RSA keys, words are substantially less secure
Not if you’re considering security gained versus difficulty of remembering.
You don’t memorize RSA keys
That’s why you need lots of words. (6) If you combine that with a large word list it gets very secure.
you are at the same time right, but … wooosh.
There is no point in a password cracking attempt during which the attacker knows the character at N but not the character at N+1
If you know the key is composed of English language words you can skip strings of letters like “ZRZP” and “TQK” and focus on sequences that actually occur in a dictionary
Edit: Oops forgot what the topic was.
we are talking about RSA keys - you don’t memorize your RSA keys
if you rely on memorizing all your passwords, I assume that means you have ample password reuse, which is a million times worse than using a different less-secure password on every site
Derp. Forgot where I was.
I find passphrases easy to remember and I have several. I appreciate the concern, but I understand basic password safety.
While this comic is good for people that do the former or have very short passwords, it often misleads from the fact that humans simply shouldn’t try to remember more than one really good password (for a password manager) and apply proper supplementary techniques like 2FA. One fully random password of enough length will do better than both of these, and it’s not even close. It will take like a week or so of typing it to properly memorize it, but once you do, everything beyond that will all be fully random too, and will be remembered by the password manager.
The part where this falls flat is that using dictionary words is one of the first step in finding unsecured password. Starting with a character by character brute force might land you on a secure password eventually, but going by dictionary and common string is sure to land you on an unsecured password fast.
That’d why words are from the eff long word list and there are 6 words
Even if an attacker knew that your password was exactly four words from a specific list of only 2048 common words, that password would still be more secure than something like
Tr0ub4dor&3
https://www.explainxkcd.com/wiki/index.php/936:_Password_Strength
If the attacker search for your password specifically then xkcd themself posted the reason why it wouldn’t really matter
https://www.explainxkcd.com/wiki/index.php/538:_Security
If you’re doing blind attemps on a large set of users you’ll aim for the least secured password first, dictionary words and known strings.
No, it would not. 2048 to the power of 4 is significantly less than 60 to the power of 11.
https://www.wolframalpha.com/input?i2d=true&i=Power[2048%2C4]—Power[60%2C11]
That’s true but in practice it wouldn’t take 60^11 tries to break the password. Troubador is not a random string and all of the substitutions are common ( o -> 0, a ->4, etc. ). You could crack this password a lot easier with a basic dictionary + substitution brute force method.
I’m saying this because I had an assignment that showed this in an college cybersecurity class. Part of our lesson on password strength was doing a brute force attack on passwords like the one in the top of the xkcd meme to prove they aren’t secure. Any modern laptop with an i5 or higher can probably brute force this password using something like hashcat if you left it on overnight.
Granted, I probably wouldn’t use the xkcd one either. I’d either want another word or two or maybe a number/symbol in between each word with alternating caps or something like that. Either way it wouldn’t be much harder to remember.
except it is not troubador. it is troubador, ampersand, digit.
if you know there are exactly two additional characters and you know they are at the end of the string, the first number is really slightly bigger (like 11 times)
once the random appendix is 3 characters or more, the second number wins
https://www.wolframalpha.com/input?i2d=true&i=Divide[Power[2048%2C4]%2CPower[256%2C3]*Power[2%2C4]*4*500000]
and moral of the story is: don’t use xkcd comic, however funny it is, as your guidance to computer security. yes, the comic suggestions are better than having the password on a post-it on your monitor, but this is 21st century ffs, use password wallet.
this is pretty much irrelevant, as the amount of passwords with n+1 random characters is going to be exponentially higher than ones with n random characters. Any decent password cracker is going to try the 30x smaller set before doing the bigger set
that knowledge is worth like 2 bits at most, unless the characters are in the middle of a word which is probably even harder to remember
even if you assume the random characters are chosen from a large set, say 256 characters, you’d still get the 4-word one as over 50 times more. Far more likely is that it’s a regular human following one of those “you must have x numbers and y special characters” rules which would reduce it to something like 1234567890!?<^>@$%&±() which is going to be less than 30 characters
and even if they end up roughly equal in quessing difficulty, it is still far easier to remember the 4 random words
then someone uses a dictionary attack and your password gets cracked within minutes
this assumes a dictionary is used. Otherwise the entropy would be 117 bits or more. The only problem is some people may fail to use actually uniformly random words drawn from a large enough set of words (okay, and you should also use a password manager for the most part)
see, you didn’t get the whole comic. 4 words out of a dicitionary with 2000 words has more combinations then a single uncommon non gibberish baseword with numeral and puction at the end. as long as the attacker knows your method.
a dicitonary attack will not lower the entropy of 44 bits, thats what the comic is trying to say
Yeah, except for the first few bytes. PKCS8 has some initial header information, but most of it is the OCTET_STRING of the private key itself.
The PEM (human “readable”) version is Base64, so you can craft up a string and make that your key. DER is that converted to binary again:
I wrote a “plain English” library in Javascript to demystify all the magic of Let’s Encrypt, ACME, and all those certificates. (Also to spin up my own certs in NodeJS/Chrome).
https://github.com/clshortfuse/acmejs/blob/96fcbe089f0f949f9eb6830ed2d7bc257ea8dc32/utils/certificate/privateKeyInformation.js#L40
Edit: To be specific, PKCS8 is usually a PKCS1 (RSA) key with some wrapping to identify it (the OID). The integers (BigInts) you pick for RSA would have to line up in some way, but I would think it’s doable. At worst there is maybe a character or two of garbage at the breakpoints for the RSA integers. And if you account for which ones are absent in the public key, then anybody reading it could get a kick out of reading your public certificate.
It’s assymetric crypto. You’d need to find a matching public key. Or it’s just some useless characters. I suppose that’s impossible, or what we call that… Like take a few billion years to compute. But I’m not an expert on RSA.
Public keys are derived from the private key. The asymmetric part is for communication not generation. Afaik
I’m pretty sure the cryptographic parameters to generate a public key are included in the private key file. So while you can generate the other file from that file, it’s not only the private part in it but also some extra information and you can’t really change the characters in the private key part. Also not an expert here. I’m fairly certain that it can’t happen the other way round, or you could impersonate someone and do all kinds of MITM attacks… In this case I’ve tried it this way, changed characters and openssh-keygen complains and can’t generate anything anymore.
The surprised man in the middle
Reddit did it in reverse for Tor
https://libraryofbabel.info/
It the length not the content for the most part. Some keys have syntax such as leading or trailing characters.