Learning about passwords, and what makes one strong, is a fairly important part of modern computing.

What is Entropy?

Simply put, entropy is a measure of how many passwords are in the pool that you selected from, assuming that you selected from the pool randomly. It is measured in bits: the base-2 logarithm of the number of possible passwords.

It can be expressed in the form:

log2(sizeOf(pool)) * length
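
As a minimal sketch, the formula can be written as a small Python function. The name entropy_bits and its parameters are illustrative choices, not part of any library, and the result only holds if every symbol is picked uniformly at random.

```python
import math


def entropy_bits(pool_size: int, length: int) -> float:
    """Entropy, in bits, of `length` symbols each drawn uniformly at random
    from a pool of `pool_size` symbols (characters or words)."""
    if pool_size < 1:
        raise ValueError("pool_size must be at least 1")
    if length < 0:
        raise ValueError("length must not be negative")
    return math.log2(pool_size) * length


# Eight random lowercase letters: pool of 26, length of 8.
print(f"{entropy_bits(26, 8):.1f} bits")
```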

For a dictionary-based password, the pool is the set of words in the dictionary you chose from. If that’s an actual dictionary, the pool is about 170,000 words; if it’s your own vocabulary, for most people it’s only about 3000.

For a lowercase password, it would be the 26 letters a-z, for mixed case, the 52 letters a-zA-Z, and for an alphanumeric password, it would be the 52 letters a-zA-Z, plus the numbers 0-9, for a total of 62. For a random printable ASCII password, there are also 33 non-alphanumeric characters including the space character, giving a total available pool size of 95.
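
The pool sizes above can be checked against Python's standard string module. This is only a sketch: the dictionary keys are labels chosen here for illustration, and the printable ASCII pool is assembled as letters, digits, string.punctuation (32 symbols) and the space character.

```python
import math
import string

# Character pools named in the text, built from the standard `string` module.
pools = {
    "lowercase": string.ascii_lowercase,
    "mixed case": string.ascii_letters,
    "alphanumeric": string.ascii_letters + string.digits,
    "printable ASCII": string.ascii_letters + string.digits + string.punctuation + " ",
}


def bits_per_character(pool: str) -> float:
    """Entropy contributed by one character drawn uniformly from `pool`."""
    return math.log2(len(set(pool)))


for name, pool in pools.items():
    print(f"{name:16} size={len(pool):3}  bits/char={bits_per_character(pool):.2f}")
```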

An Example

Let’s take an example password of: 8NgK03PzZL1.

This password uses uppercase letters (pool of 26), lowercase letters (pool of 26), and numbers (pool of 10). This gives a total pool of 62 characters. It’s 11 characters long, which means it has an entropy of log2(26+26+10)*11, or ~65 bits. The total number of possible passwords is therefore about 2^65, that is, 2 x 2 x 2 x 2 … and so on 65 times. For most purposes, that is sufficient.
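
The same arithmetic can be sketched in Python by inferring the pool from the character classes a password contains. The helper names are hypothetical, and the figure is only an estimate: it assumes the password was generated randomly from the whole pool, which the code cannot verify.

```python
import math
import string

# Character classes and their sizes: 26 + 26 + 10 + 33.
CHARACTER_CLASSES = [
    string.ascii_lowercase,
    string.ascii_uppercase,
    string.digits,
    string.punctuation + " ",
]


def estimated_pool_size(password: str) -> int:
    """Sum the sizes of the character classes that appear in `password`."""
    return sum(
        len(chars)
        for chars in CHARACTER_CLASSES
        if any(ch in chars for ch in password)
    )


def estimated_entropy_bits(password: str) -> float:
    """log2(pool) * length, assuming uniformly random generation."""
    pool = estimated_pool_size(password)
    if pool == 0:
        return 0.0
    return math.log2(pool) * len(password)


example = "8NgK03PzZL1"
print(estimated_pool_size(example), f"{estimated_entropy_bits(example):.1f}")
```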

Increasing Entropy

You can increase password entropy by increasing length, or by increasing the character pool.

If we were to change the pool without changing the length, like so: 8<gK03PzZL1, we would have a pool of 26 + 26 + 10 + 33 = 95 characters (there are 33 non-alphanumeric printable ASCII characters including space). Our entropy is then log2(26+26+10+33)*11, which is larger again at ~72 bits.

Because these are powers of 2, every extra bit of entropy doubles the total number of possible passwords.

If we were to change the length, without changing the character pool, we would also see an increase in entropy, for example: 8NgK03PzZL1w is log2(26+26+10)*12, or ~71 bits of entropy.

If we were to do both, the benefits compound. For example: 8<gK03PzZL1w is log2(26+26+10+33)*12, or ~79 bits of entropy.
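
To compare the four variants side by side, a short Python sketch can print the entropy of each and how many times larger its password pool is than the original. The function names are illustrative, and the pool sizes and lengths are entered by hand from the examples above.

```python
import math


def entropy_bits(pool_size: int, length: int) -> float:
    """log2(pool_size) * length for a uniformly random password."""
    return math.log2(pool_size) * length


def times_more_passwords(bits: float, baseline_bits: float) -> float:
    """Each extra bit doubles the pool, so the ratio is 2^(difference)."""
    return 2 ** (bits - baseline_bits)


# (example password, pool size, length)
variants = [
    ("8NgK03PzZL1", 62, 11),   # original
    ("8<gK03PzZL1", 95, 11),   # larger pool
    ("8NgK03PzZL1w", 62, 12),  # longer
    ("8<gK03PzZL1w", 95, 12),  # both
]

baseline = entropy_bits(62, 11)
for password, pool, length in variants:
    bits = entropy_bits(pool, length)
    ratio = times_more_passwords(bits, baseline)
    print(f"{password:13} {bits:5.1f} bits  x{ratio:,.0f} passwords")
```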

Randomness is Important

Passwords must be generated randomly from the total available pool in order for these calculations to be valid.
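
One way to meet this requirement, sketched here with Python's standard secrets module, is to let a cryptographically secure random source pick every character. The function name, the default length of 11 and the alphanumeric alphabet are assumptions chosen to match the earlier example.

```python
import secrets
import string

# 62-character alphanumeric pool, as in the example password.
ALPHANUMERIC = string.ascii_letters + string.digits


def generate_password(length: int = 11, alphabet: str = ALPHANUMERIC) -> str:
    """Pick each character independently and uniformly from `alphabet`."""
    return "".join(secrets.choice(alphabet) for _ in range(length))


print(generate_password())
```

Because every character is drawn independently from the full pool, the log2(pool) * length calculation applies to passwords produced this way.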

For example, “horse battery staple” is long, but the character-pool calculation is misleading if the phrase was invented by a person or copied from a familiar source. Humans choose patterns, and attackers try human-looking patterns early.

If the three words were chosen uniformly at random from a 3000-word list, it would be calculated as log2(3000)*3, or ~35 bits of entropy. If they were chosen from a 7776-word Diceware-style list, it would be log2(7776)*3, or ~39 bits. The way the password was chosen matters as much as the characters it contains.

Random passphrases can be very strong, but they usually need more than three words.
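
The passphrase figures can be reproduced with the same formula, treating each word as one symbol. The sketch below also estimates how many random words are needed to reach a target; the function names and the 65-bit target are illustrative assumptions.

```python
import math


def passphrase_entropy_bits(wordlist_size: int, words: int) -> float:
    """Entropy of `words` words drawn uniformly at random from the list."""
    return math.log2(wordlist_size) * words


def words_needed(wordlist_size: int, target_bits: float) -> int:
    """Smallest number of random words that reaches `target_bits`."""
    return math.ceil(target_bits / math.log2(wordlist_size))


for size in (3000, 7776):
    bits = passphrase_entropy_bits(size, 3)
    print(f"{size}-word list: 3 words = {bits:.1f} bits, "
          f"{words_needed(size, 65)} words for 65 bits")
```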

Why is this Important?

On average, an attacker will only need to attempt half of the available passwords, which means a password with 65 bits of entropy has an effective defense of only 64 bits: halving a power of 2 lowers its exponent by one.
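
As a small worked check, assuming the attacker tries each candidate once and the password is equally likely to be any of them, the expected number of guesses for a pool of N passwords is (N + 1) / 2. The function names below are illustrative.

```python
import math


def expected_guesses(bits: float) -> float:
    """Average guesses to find one uniformly chosen password among 2^bits."""
    return (2 ** bits + 1) / 2


def effective_bits(bits: float) -> float:
    """The expected number of guesses, expressed as a power of 2."""
    return math.log2(expected_guesses(bits))


print(expected_guesses(3))              # tiny pool of 8 passwords
print(f"{effective_bits(65):.2f} bits")  # roughly one bit less than 65
```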

The amount of entropy required depends on the purpose of the password, how it is stored, and whether an attacker can guess online or offline.

Deciding how much entropy you need is a matter of what you’re trying to protect, and how motivated your adversary is.

In this example, the 65 bits of entropy is probably enough for many ordinary uses if the password is unique and the service rate-limits online guesses. The total password pool is 2^65, and an attacker on average needs to check half of that, or 2^64 passwords, which is why we subtract one bit from the entropy.

Offline attacks are different. If an attacker has a password hash, their speed depends on hardware and on the password hashing algorithm. Fast hashes make guessing cheap; slow, salted password hashes such as Argon2id, bcrypt, or scrypt make each guess more expensive.1
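
To get a feel for how much guessing speed matters, the sketch below converts entropy into an average time to guess. The guess rates are made-up round numbers for illustration only, not measurements of any real service, hash function, or hardware.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def average_years_to_guess(bits: float, guesses_per_second: float) -> float:
    """Years to try half of 2^bits passwords at a constant guess rate."""
    return 2 ** (bits - 1) / guesses_per_second / SECONDS_PER_YEAR


# ASSUMPTION: hypothetical rates chosen only to show the spread.
ASSUMED_RATES = {
    "rate-limited online login": 10,
    "offline, slow salted hash": 1e4,
    "offline, fast hash": 1e10,
}

for scenario, rate in ASSUMED_RATES.items():
    years = average_years_to_guess(65, rate)
    print(f"{scenario:28} {years:.3g} years on average")
```

Whatever the real rates are, the comparison shows the same thing: every factor of ten in guessing speed removes a factor of ten from the attacker's expected time.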

Modern password guidance generally favours long passwords or randomly generated passphrases, checking new passwords against known-compromised lists, allowing password managers to generate unique random passwords, and avoiding arbitrary composition rules or forced rotation unless there is evidence of compromise.2