Password Security

Patent 162,952

Basic Rules to Follow

  1. Use 3-4 words in a way that is novel and memorable to you
  2. Be creative in using uncommon characters between the words
  3. Substitute characters and/or use nonstandard spelling to reduce guessability

The security of a password is a complex relationship between length, number of symbols available, novelty and ability to recall, current and future computational power, and the current and future state of the art for computational algorithms.

Limit Memorization

There is a very narrow set of places where you should have a good, secure password that you have memorized: a phone, a desktop computer, a laptop, other personal computing devices and, most importantly, a password manager.

By keeping the focus on a limited group of devices, you can keep a small number of secure passwords memorized. Then you can use password management software (the type that provides strong encryption) to generate long, unique passwords for the dozens of other services and accounts you use.

By using a password manager you can also store uniquely randomized answers to security questions, ones that cannot be guessed by someone who has access to a biography of you or to other information prone to social engineering. Why divulge your place of birth, first car, and other personally identifying information that could be used to build a dossier for hacking your other accounts?

When following this pattern, each unique password becomes more like a security token that you have created for one service. Because it is unique to that service, only that particular service is affected if it is compromised. Also, since it is a truly randomized password, it leaks no information about any patterns you might use for passwords.
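As a rough illustration of what "truly randomized" means here, the sketch below uses Python's standard `secrets` module to draw each symbol independently from a cryptographically secure source. The function names, the default length of 20, and the choice of alphabet are assumptions made for this example; a real password manager will have its own generator and settings.

```python
import secrets
import string

# Assumed alphabet for this sketch: ASCII letters, digits and punctuation (94 symbols).
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def generate_password(length: int = 20, alphabet: str = ALPHABET) -> str:
    """Return `length` symbols, each drawn uniformly at random from `alphabet`."""
    if length < 1:
        raise ValueError("length must be at least 1")
    # secrets.choice uses the operating system's secure random source.
    return "".join(secrets.choice(alphabet) for _ in range(length))


def generate_security_answer(n_bytes: int = 16) -> str:
    """Return a random URL-safe string to store as a security-question answer."""
    return secrets.token_urlsafe(n_bytes)


if __name__ == "__main__":
    print(generate_password())
    print(generate_security_answer())
```

Because every symbol is chosen independently, one generated password reveals nothing about any other, which is the property the paragraph above relies on.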

A good choice for a memorized password is 3-4 words forming a unique phrase that is easy to remember, with varied upper and lower case and some substitution of numbers and punctuation symbols. For example: BoromirHatesShimpCocktail, Gellert/Loves\Albus, correcthorsebatterystaple.

A deeper analysis of why follows. 

Combinations & Sequences

Let's start with the mathematics behind the length and number of symbols in a sequence. At its most basic, a password can be thought of as a sequence of symbols or characters. If a password were composed of one symbol from a list of 10 (0-9), it would be trivial to guess. If it were chosen from a list of 1,112,064 characters (every code point that UTF-8 can encode), it would be harder to guess.

The number of possible sequences for 0-9 can be counted using the simple base 10 number system. One place has 10 possible symbols: 0-9. Each place you add multiplies the number of possibilities by 10. Counting the number of possibilities is easy; with five places, there are 100,000 possible combinations (0-99,999) or ${\sf 10^5}$.

Generalizing, the count can be written as $b^n$, with $b$ the base size (or cardinality) of the set of symbols to choose from (0-9 in this example) and $n$ the total length of the symbols sequenced together (5 in the case of a number like 99,999). Assuming symbols are allowed to repeat, any password that is composed of lower case letters from the English language (a-z, 26 characters), upper case letters (A-Z, 26 characters), and the numbers (0-9, 10 characters) can be represented as having ${\sf (26+26+10)^n }$ possible sequences or ${\sf 62^n}$.

If a password is chosen from these 62 symbols and is 10 characters long, then the total number of possible passwords is ${\sf 62^{10}}$ or 839,299,365,868,340,224 different possible combinations. If more symbols from standard US keyboards are included in the list, such as +=!@#\$%^&*()_-\"|'?/{}[]`~,.<> an additional 30 symbols are available for a total of 92, and a password with a length of 10, or ${\sf 92^{10}}$, has 43,438,845,422,363,213,824 possible combinations (almost 52 times more than ${\sf 62^{10}}$). As more characters are added to the length of the password, the possibilities increase faster for the set of 92 symbols than for the set of 62 symbols. For a password of length 15, ${\sf 92^{15}}$ has 372 times more possibilities than ${\sf 62^{15}}$.
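The figures above can be checked with a few lines of Python, since its integers are exact at any size. This is only a sketch of the $b^n$ counting rule; the helper name `keyspace` is an assumption introduced for the example.

```python
def keyspace(symbols: int, length: int) -> int:
    """Count sequences of `length` symbols drawn, with repetition, from `symbols` choices."""
    return symbols ** length


alnum_10 = keyspace(62, 10)  # a-z, A-Z, 0-9
full_10 = keyspace(92, 10)   # the same set plus keyboard punctuation

print(f"{alnum_10:,}")  # 839,299,365,868,340,224
print(f"{full_10:,}")   # 43,438,845,422,363,213,824

# How much larger the 92-symbol space is at lengths 10 and 15.
ratio_10 = full_10 / alnum_10
ratio_15 = keyspace(92, 15) / keyspace(62, 15)
print(f"{ratio_10:.1f}x at length 10, {ratio_15:.0f}x at length 15")
```

The ratio between the two spaces is $(92/62)^n$, so it keeps growing as the password gets longer.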

This is why some software and services require you to choose passwords that include symbols drawn from punctuation; every extra possible symbol helps.

Novelty & Ability to Recall

A password is no good if you cannot remember it or do not have it stored securely. It is also not very good if the password has a higher order pattern that is easy to guess, for example a pattern across a keypad or keyboard that draws out a shape.

Instead of considering the number of individual symbols at each character position in a password, consider the larger context. What if we pull a word randomly from an English dictionary? If you consider each word as a separate symbol unto itself, the total symbol set to choose from is the set of words in that dictionary. If using the Oxford English Dictionary (2nd edition), there are approximately 230,000 words or symbols to choose from.

Let's choose the word disappear; there is a 1 in 230,000 chance for a single guess to land on that password (in practice this depends on whether the word is in common or uncommon use). If we counted the number of characters instead, and assumed only lowercase characters, there are ${\sf 26^9}$ possible combinations or a 1 in 5,429,503,678,976 chance to guess the password on the first try.

By simply guessing words from the English language, an attacker shrinks the search space by a factor of 23,606,537, so the probability of guessing correctly on the first try is that many times higher. If the password is a common word, one of the 20,000-40,000 words that individuals actually use, the probability of a correct guess is higher still.

This is why a very simple password composed of a single word is a terrible choice.

Treating each word as a symbol, a phrase of four words has an upper bound on its complexity of ${\sf \approx 230,000^4}$ or ${\sf 2.79841 \times 10^{21}}$ possible combinations.
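The comparison between counting characters and counting words can be reproduced as a short sketch. The dictionary size is the approximate 230,000 figure used above, and the function names are assumptions made for illustration.

```python
DICTIONARY_WORDS = 230_000  # approximate dictionary size used in the text


def char_keyspace(word: str, alphabet_size: int = 26) -> int:
    """Possibilities when each character is a symbol (lowercase letters by default)."""
    return alphabet_size ** len(word)


def phrase_keyspace(n_words: int, dictionary_size: int = DICTIONARY_WORDS) -> int:
    """Possibilities when each whole word is a symbol."""
    return dictionary_size ** n_words


by_chars = char_keyspace("disappear")
print(f"{by_chars:,}")                        # 5,429,503,678,976
print(f"{by_chars // phrase_keyspace(1):,}")  # 23,606,537
print(f"{phrase_keyspace(4):.5e}")            # 2.79841e+21
```

A single dictionary word collapses a nine-character search space by a factor of millions, while four words chosen from the same dictionary rebuild a space of about $10^{21}$.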

Choosing words that have some meaning to you and then stringing them together in a novel way increases your ability to memorize them, while maintaining the novelty of the resulting password or pass phrase.

Social & Computational Complexity

Three attack vectors against passwords:

  1. limited guessing
  2. unlimited guessing
  3. sniffing

Limited guessing applies to services and devices that keep track of the number of attempts that have been made to access them. Good services will generally not silently allow more than a dozen guesses before locking the account down and/or notifying the owner. The primary threat in this situation arises when you have reused the same password across many services. This provides a limitation via social complexity.
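To make the idea of limited guessing concrete, here is a minimal sketch of how a service might count failed attempts per account. The class name, its methods, and the limit of 12 (taken from "a dozen" above) are assumptions for this example; real services typically add time-based resets, notifications and persistent storage.

```python
MAX_ATTEMPTS = 12  # assumption: "a dozen" guesses, as mentioned in the text


class AttemptLimiter:
    """Hypothetical in-memory counter of failed logins per account."""

    def __init__(self, max_attempts: int = MAX_ATTEMPTS) -> None:
        self.max_attempts = max_attempts
        self._failures: dict[str, int] = {}

    def is_locked(self, account: str) -> bool:
        """True once the account has used up its allowed failed attempts."""
        return self._failures.get(account, 0) >= self.max_attempts

    def record_failure(self, account: str) -> bool:
        """Count one failed attempt and report whether the account is now locked."""
        self._failures[account] = self._failures.get(account, 0) + 1
        return self.is_locked(account)

    def record_success(self, account: str) -> None:
        """Clear the failure count after a successful login."""
        self._failures.pop(account, None)


limiter = AttemptLimiter()
for _ in range(MAX_ATTEMPTS):
    locked = limiter.record_failure("alice")
print(locked)  # True
```

With a cap like this an attacker gets only a handful of tries per account, which is why reused passwords, rather than raw guessing, are the main risk in this scenario.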

Unlimited guessing applies to data and software that can be isolated and allow iterative guesses without end. This is the most important threat to password security and is where the mathematics described for sequences comes into play.

Even when passwords are encrypted on end services, that data is stored somewhere and can be stolen and copied onto many computers, which can then try potentially billions of character combinations per second. The primary limitations here are the complexity of the password and the complexity of the algorithms used for encrypting the password. These sorts of attacks can be left running for months and are embarrassingly parallelizable.

It is this attack vector, unlimited guessing, that is key for choosing a randomized password.

Suppose you have picked a randomized, unique password using a password manager, with 92 symbol choices and 8 characters, or ${\sf 92^{8}}$ possibilities. With GPU computing hardware in 2012 making 350 billion guesses per second across 25 boards, the time to exhaust all possibilities would be: $$t = \frac{92^{8}}{350\times10^{9}} \approx 4.1\text{ hours}$$

However, it is important to keep in mind that after only half of that time, or about 2 hours, there is a 50% chance that one of the guesses has been correct. Push the password up to 15 characters and the time to reach that 50% chance becomes: $$t = \left(\frac{1}{2}\right)\left(\frac{92^{15}}{350\times10^{9}}\right) \approx 1.3 \times 10^{10} \text{ years}$$
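The same arithmetic can be wrapped in a small helper so that other lengths and guess rates are easy to try. This is a sketch under stated assumptions: the rate is the 350 billion guesses per second quoted above, a year is taken as 365.25 days, and the function names are made up for the example.

```python
GUESSES_PER_SECOND = 350e9              # rate quoted in the text for 2012 GPU hardware
SECONDS_PER_HOUR = 3600
SECONDS_PER_YEAR = 365.25 * 24 * 3600   # assumption: 365.25-day year


def seconds_to_exhaust(symbols: int, length: int,
                       rate: float = GUESSES_PER_SECOND) -> float:
    """Seconds needed to try every sequence of `length` symbols at `rate` guesses/s."""
    return symbols ** length / rate


def seconds_to_even_odds(symbols: int, length: int,
                         rate: float = GUESSES_PER_SECOND) -> float:
    """Seconds until half the space is covered, i.e. a 50% chance of a correct guess."""
    return seconds_to_exhaust(symbols, length, rate) / 2


# 8 random characters from 92 symbols, expressed in hours.
print(seconds_to_exhaust(92, 8) / SECONDS_PER_HOUR)
print(seconds_to_even_odds(92, 8) / SECONDS_PER_HOUR)

# 15 random characters from 92 symbols, expressed in years.
print(seconds_to_even_odds(92, 15) / SECONDS_PER_YEAR)
```

Changing `rate` shows how the estimate moves as hardware improves: each factor-of-92 speedup costs the defender exactly one character of length.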

This also assumes that current encryption techniques do not fall prey to some other attack on how the encryption works, one that shortens the time required to determine what the password is.

Password sniffing can occur whenever you type your password or copy-paste it, because a device or piece of software is recording your keystrokes or clipboard. That is another and equally depressing topic.
