Also on Lobste.rs
This is a transcript of the question with the same title I've posted on Lobste.rs.
I republish it here mainly for archival (and discovery) purposes, but also to highlight any interesting ideas that emerged from that discussion thread.
If you have a different take than what was discussed (especially a non-mainstream one) email me as described in the contact section below.
This topic is also in relation with my latest open-source tool: z-tokens.
Question
So, just to focus this question,
I don't want to touch the subject of password patterns or password generation.
Let's focus only on "random passwords" (for whatever that means) of a given entropy.
(In other words, let's assume there is a fool-proof function genpwd(entropy_bits)
that generates a password with the given entropy and we'll just use whatever it spits out.)
Also, to focus even more, the intended use-case is for encrypted offline storage (which includes anything from GnuPG keys, Age encrypted files, full-disk-encryption, etc.).
Furthermore, the only threat we want to protect against is brute-force against the cryptographic primitives. Furthermore say the attacker has enough hardware power (say a few millions $) and enough patience (say a few years). And let's also assume we've reached the peak of the technology capabilities (both hardware and software).
(I am well aware that by now I have waved away so many potential attack vectors that we are currently in the theoretical space...)
Given all that, the obvious answer is "just use a password with at least as many entropy bits as the underlying cryptographic construct". (For example in the case of something based on AES-128, just use a 128 entropy bits password, passed through an at least 128 bit hash function; key derivation with iterations isn't needed.)
However, at the other end there is memorability which pushes the entropy down. As a user I would prefer to use a short and memorable password (regardless of what that means), which means having fewer entropy bits.
Another complication is password derivation functions (like Argon, scrypt, bcrypt, the newer bscrypt, or in the worst case some variant of PBKDF) which helps us gain some entropy bits through time spent.
So my question is: given adequate defaults for a password derivation function, how many bits of entropy do I require for my password to be practically unbreakable in my lifetime (~100 years)?
Is 64 bits enough? Is 72 bits enough? Should I go up to 128 bits?
Answers
There have been more answers, but the following are the ones I considered the most interesting or insightful ones.
[See this Lobste.rs comment thread started by github.com/davidchisnall.]
The number of bits depends a lot on the password KDF. Argon2 has tuneable parameters for CPU and RAM usage. You can pick settings that take, say, 1GiB of RAM and 30s of CPU time and then calculate the cost in cloud resources to be able to crack it for any amount of entropy. That then gives you a cost for someone to break and you can compare that to the value of whatever it's protecting. Like many other security questions, it's far easier to reason about as an economic question.
[See this Lobste.rs comment thread started by github.com/santiagotorres.]
I always liked this paper to talk a bit about this [1]. I think of particular interest is Figure 2. I stopped doing research on passwords circa 2016, so I wouldn't be surprised if there is newer work helping characterize password strength vs offline password cracking.
[1] Florencio, Dinei, Cormac Herley, and Paul C. Van Oorschot. An {Administrator's} Guide to Internet Password Research
[See this Lobste.rs comment thread started by
mjec
.]You can make an estimate of cost now. For example (and this isn't necessarily the best attack setup), you can use AES-NI on an Intel i7-908X to perform AES encryption at 1.3 cycles / byte. AES has a 16 byte block size. To make things round, let's say you're getting 16 cycles per block (and you're decrypting one block per guess), running at 3.6GHz. That gives you 225 mega guesses / second / core.
That means a 64 bit keyspace gives you 162 years with a single 16 core processor. A dedicated 16 core host on AWS is $0.449 / hour on demand. Renting 162 of them for a year, with public 1 year reservation discount, paid upfront, will set you back $375k.
So, in some sense, 64 bits of entropy costs $375k to break; and you can just do some multiplication to find new results. 65 bits? That's twice the space, so $750k. 64 bits but in three years when costs have come down by 75%? $94k.
But the number of variables is enormous, and the uncertainty over a long time is even larger. What if we get a usable quantum computer and Grover's algorithm is in play? What do you think the cost of computer time will be in 2097? Do you think clock speeds will increase dramatically in the next 100 years, or have we reached a limit? Will the cost of ASIC manufacturing come down enough to result in economical specialized attack hardware? How will population growth affect the cost of energy?
(Also, I know you asked about time and I talked about cost, but the two are very roughly interchangeable in this analysis; a $1e100 cost is just as unachievable as a 1e100s time to recover the key.)
[This was one of my replies.]
I took a similar approach, but got the numbers out of thin air:
- choose a number of years you want your data to remain secret;
- assume the derivation function is
PBKDF2-HMAC-MD5
with 999 iterations; (on Nvidia RTX 4090 it can generate these at a rate of ~46 millions per second;)- assume the attacker starts cracking from day 0 and won't give up until those number of years elapse;
- I've started with a cluster of 1000 of Nvidia RTX 4090 (at the moment it costs $2K brand new each);
- I've assumed that performance would double each 5 years;
- I've assumed that the cluster size would double each 5 years; (cost goes down;)
- I've also assumed that you are unlucky and your password is guessed within the first 10% tries;
- I've also assumed that all of the above is wrong, and the actual effort is 10x less;
If you multiply all those (with a bit of rounding as otherwise would overflow float64) you get the minimum number of bits of entropy.
But all all of the above is laughable because every line is just an assumption out of thin air...