Patrick Elsen

Posted on Aug 14, 2022

Passgen: A password generator that uses a regex-like syntax to create secure passwords of any shape.

#password #secure #regex #security

I have a bit of a love-hate relationship with passwords. On the one hand, they are necessary for authenticating with services, and for decrypting drives. On the other hand, losing them means losing data or access to accounts. Choosing simpler passwords or reusing them across accounts makes dealing with them easier, but compromises their security.

Ideally, passwords should be as long as possible¹. Previously, the advice was to enforce password policies (such as requiring special characters); current best practice is to avoid that² and to encourage long passwords instead. The popular webcomic XKCD #936 makes the point that long word-based passwords are both stronger and more memorable than passwords built from special characters.

Based on this, I set out to build a tool that generates random, strong passwords. I wanted it to be flexible enough that you can easily create passwords of different shapes (say, 16 alphanumeric characters, or four English words).

Regular Expressions

I'm also a very happy user of regular expressions. This is a language that is often used for filtering or searching text-based content. For example, the regular expression [a-z]{8} will match any input that consists of exactly eight lowercase Latin characters. If you are not familiar with regular expressions, here is a table that showcases their syntax:

Description	Regex
The literal string `abc`	`abc`
Either an `a`, a `b`, or a `c`	`[abc]`
Any lowercase Latin character	`[a-z]`
16 lowercase Latin characters	`[a-z]{16}`

Normally, regular expressions are used in code to validate input. For example, you might use a regular expression like [a-z0-9.]{3,16}@yourcompany\.com to check whether something is a valid company email address.
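To make the "validation" direction concrete, here is a minimal Python sketch of that email check. The function name is made up for illustration. The dot in the domain is written as `\.` so that it only matches a literal dot (an unescaped `.` matches any character).

```python
import re

# The company email pattern from the text. Inside the brackets the dot is
# already literal; in the domain it is escaped as "\." for the same effect.
COMPANY_EMAIL = re.compile(r"[a-z0-9.]{3,16}@yourcompany\.com")

def is_company_email(candidate: str) -> bool:
    """Return True only if the whole string matches the pattern."""
    return COMPANY_EMAIL.fullmatch(candidate) is not None

print(is_company_email("jane.doe@yourcompany.com"))  # True
print(is_company_email("jd@yourcompany.com"))        # False: local part is shorter than 3
print(is_company_email("jane.doe@yourcompanyXcom"))  # False: "X" is not a literal dot
```

Using `fullmatch` rather than `search` matters here: the question is whether the entire input has the right shape, not whether it contains something of that shape.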

What if I could build a tool that takes a regular expression and evaluates it in reverse, spitting out a string that matches it rather than determining whether a given string matches?

Out of that idea, passgen was born. It started out as a small single-file C project, and evolved into something I consider genuinely useful. It supports a small subset of the regular expression syntax and produces a random string that matches the given pattern. For example:



# generate a password with eight lower-case latin characters
$ passgen "[a-z]{8}"
nsaaemni
# generate a password with twelve alphanumeric latin characters
$ passgen "[a-zA-Z0-9]{12}"
Lp7FBOldLhJC
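Conceptually, a pattern like [a-z]{8} boils down to "pick one character from a set, eight times". The following Python sketch shows that idea. It is not how passgen itself is implemented (passgen is written in C); the function name is an assumption made for illustration.

```python
import secrets
import string

def generate(charset: str, length: int) -> str:
    """Pick `length` characters independently and uniformly from `charset`."""
    # secrets.choice draws from the operating system's secure random source.
    return "".join(secrets.choice(charset) for _ in range(length))

# Equivalent in spirit to the pattern [a-z]{8}
print(generate(string.ascii_lowercase, 8))

# Equivalent in spirit to the pattern [a-zA-Z0-9]{12}
print(generate(string.ascii_letters + string.digits, 12))
```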

Passgen is fully Unicode-aware, meaning that you can generate passwords for other languages, too. Here are some examples of that in action:



# password with german characters
$ passgen "[a-zßäüöA-ZÄÜÖ0-9]{8}"
6Mu6cüYI
# password with sixteen japanese characters
$ passgen "[\u{3040}-\u{309F}]{16}"
しがぢゐめまゔょぎっきす゚だぃぷ
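A range such as [\u{3040}-\u{309F}] can be thought of as a range of Unicode code points. Here is a small Python sketch of sampling from such a range. The helper name is hypothetical, and this only illustrates the idea rather than passgen's own code.

```python
import secrets

def from_range(first: int, last: int, length: int) -> str:
    """Pick `length` code points uniformly from the inclusive range first..last."""
    span = last - first + 1
    return "".join(chr(first + secrets.randbelow(span)) for _ in range(length))

# Sixteen code points from the Hiragana block, U+3040 to U+309F
print(from_range(0x3040, 0x309F, 16))
```

Note that this counts code points, not visible glyphs. A block like this one can contain combining marks or unassigned code points, so the result may not always render as sixteen separate characters.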

Wordlists

One of the reasons I built passgen is that I wanted to create more memorable passwords for myself. Using completely random letters makes for passwords that are short, but not memorable, as the XKCD comic mentioned earlier points out. So passgen has the ability to read wordlists and pick random words from them.

Example	Pattern
Random word from wordlist	`\w{english}`
Six random words from wordlist, separated by hyphens	`(\w{english}-){5}\w{english}`
Six random words from wordlist, separated by hyphens, followed by a random number	`(\w{english}-){5}\w{english}[0-9]{1,2}`

For example, if you are on Debian or Ubuntu and have the wamerican package installed, you will have a wordlist available at /usr/share/dict/american-english (the examples below use /usr/share/dict/words, which on these systems typically points to the default installed wordlist). If you are on another platform, there are likely similar preinstalled wordlists you can use. The --wordlist command-line flag, given as name:path, loads a wordlist under a name that you can then reference in the pattern.



# pick 6 random words
$ passgen --wordlist english:/usr/share/dict/words "(\w{english}-){5}\w{english}"
vichyssoise-outstretching-requisite-weekended-homogenizes-calypsos
# pick 6 random words, and a one or two digit number
$ passgen --wordlist english:/usr/share/dict/words "(\w{english}-){5}\w{english}[0-9]{1,2}"
Karamazov-tasselling-pianos-shirr-devoutly-bullrings0
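The idea behind \w{english} can be sketched in a few lines of Python: load a list of words, then pick words from it uniformly at random. The function names and the tiny inline wordlist are assumptions for illustration only, so that the example runs without any dictionary file installed.

```python
import secrets

def load_wordlist(path: str) -> list[str]:
    """Read one word per line, skipping blank lines."""
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]

def passphrase(words: list[str], count: int, separator: str = "-") -> str:
    """Join `count` uniformly chosen words with `separator`."""
    return separator.join(secrets.choice(words) for _ in range(count))

# Tiny stand-in list. For real use, load a full dictionary instead, e.g.
# words = load_wordlist("/usr/share/dict/words")
words = ["correct", "horse", "battery", "staple", "requisite", "calypsos"]
print(passphrase(words, 6))
```

A six-entry list like this is far too small to be secure. The strength of such a passphrase comes from the size of the wordlist, which is why a full dictionary is used in the examples above.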

Markov Chain

While I think it is a good idea to use random words from a wordlist, I experimented with ways to come up with passwords that are similarly memorable (because they are pronounceable), but not necessarily straight from a wordlist.

The idea here is to use a Markov chain to "learn" from a dictionary file which letters frequently occur together, and then to generate new, random words from that. The result is words that are pronounceable but ideally carry more entropy, and are thus harder to crack.

To use this feature, write \m{wordlist} rather than \w{wordlist} in the pattern. It's really as simple as that!



# generate five words from english markov chain
$ passgen --wordlist english:/usr/share/dict/words "(\m{english}-){4}\m{english}"
dam-bloodling-tempouthed-distnuminest-muchee
# generate four words from english markov chain and a number
$ passgen --wordlist english:/usr/share/dict/words "(\m{english}-){3}\m{english}[0-9]{1,2}"
lariticitude's-lieu's-gadorayons-flers44
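To give a feel for the technique, here is a minimal letter-level Markov chain in Python. It is a sketch under stated assumptions, not passgen's implementation: the order of the chain (two letters of context), the maximum word length, and all names are choices made for this example.

```python
import secrets
from collections import defaultdict

START, END = "^", "$"  # marker characters, assumed not to occur in the words

def train(words, order=2):
    """Map each `order`-letter context to the list of letters seen after it."""
    model = defaultdict(list)
    for word in words:
        padded = START * order + word.lower() + END
        for i in range(len(padded) - order):
            # Duplicates are kept on purpose, so frequent letters are picked more often.
            model[padded[i:i + order]].append(padded[i + order])
    return model

def generate_word(model, order=2, max_length=12):
    """Walk the chain from the start context until END or max_length letters."""
    context, letters = START * order, []
    while len(letters) < max_length:
        nxt = secrets.choice(model[context])
        if nxt == END:
            break
        letters.append(nxt)
        context = (context + nxt)[-order:]
    return "".join(letters)

sample = ["password", "passage", "passenger", "message", "messenger",
          "generator", "general", "random", "ransom", "tandem"]
model = train(sample)
print("-".join(generate_word(model) for _ in range(4)))
```

With such a small training list the output stays close to the input words. A full dictionary gives the chain far more transitions to choose from, which is where made-up words like the ones above come from.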

Complexity

Whenever passgen has a choice between multiple options, it knows how many options there are, so it can keep track of how "random" your password is. This lets it tell you how many bits of entropy your password contains, which indicates how difficult it would be for an attacker to crack the password even if the attacker knew the pattern you used to generate it. For example, if your password has 80 bits of entropy, an attacker would need up to $2^{80}=1208925819614629174706176$ attempts to guess it, and about half that many on average.

Use the --complexity flag to tell passgen to output its estimate of the complexity. Here are some examples:



# eight lowercase alphabetic characters
$ passgen --complexity "[a-z]{8}"
entropy: 37.603518 bits
qhziaohr
# twelve alphanumeric characters
$ passgen --complexity "[a-zA-Z0-9]{12}"
entropy: 71.450356 bits
IoSNTUNU3M1Q
# six random dictionary words
$ passgen --complexity --wordlist english:/usr/share/dict/words "(\w{english}-){5}\w{english}"
entropy: 100.025099 bits
smirked-Ila-ounces-circumstance's-bedpan-process
# six random markov chain words
$ passgen --complexity --wordlist english:/usr/share/dict/words "(\m{english}-){5}\m{english}"
entropy: 125.801760 bits
confabriquing-mas-Onei-scrite-elaw-inast
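For patterns that repeat a uniform choice, the entropy is simply the number of repetitions times the base-2 logarithm of the number of options. The following Python sketch (the function name is an assumption) reproduces the first two figures above:

```python
import math

def uniform_entropy_bits(options: int, repetitions: int) -> float:
    """Entropy of `repetitions` independent, uniform picks among `options`."""
    return repetitions * math.log2(options)

print(f"{uniform_entropy_bits(26, 8):.6f}")   # [a-z]{8}         -> 37.603518
print(f"{uniform_entropy_bits(62, 12):.6f}")  # [a-zA-Z0-9]{12}  -> 71.450356
```

The same formula applies to wordlist patterns, with the number of words in the list as the number of options.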

It is important to keep in mind that this feature is not perfect. If you give passgen a pattern like (a|a){5}, which technically has zero bits of entropy (it always produces aaaaa), passgen still calculates a nonzero complexity because passgen makes choices (between the first a and the second a).
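The gap between "choices made" and "distinct outcomes" can be shown with a short Python sketch. It enumerates every path through (a|a){5} and compares the entropy you get by counting choices with the entropy of the actual outputs. That a choice-counting estimate works exactly like this is an assumption; the variable names are made up for illustration.

```python
import itertools
import math

alternatives = ["a", "a"]  # the two branches of (a|a)
repetitions = 5            # the {5}

# Every sequence of branch choices the generator could make.
paths = list(itertools.product(alternatives, repeat=repetitions))

# Counting choices: 2 options, 5 times.
counted_bits = math.log2(len(paths))

# Counting distinct results: every path yields the same string.
distinct_outputs = {"".join(path) for path in paths}
actual_bits = math.log2(len(distinct_outputs))

print(counted_bits)       # 5.0
print(distinct_outputs)   # {'aaaaa'}
print(actual_bits)        # 0.0
```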

Randomness

Passgen preferentially uses the operating system's secure random number generator as its source of randomness: getrandom() on Linux and arc4random_buf() on macOS, falling back to /dev/urandom. The source of random data is configurable, however. Passgen can also use a seeded pseudorandom number generator, but at this time only the insecure xorshift algorithm is implemented for that.
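To illustrate the difference between the two kinds of sources, here is a Python sketch. It uses Marsaglia's 64-bit xorshift with the shift constants 13, 7 and 17 as a stand-in; which xorshift variant passgen actually uses is not stated here, so treat the constants and all names as assumptions.

```python
import os

MASK64 = (1 << 64) - 1

class Xorshift64:
    """A 64-bit xorshift generator: reproducible, but NOT cryptographically secure."""

    def __init__(self, seed: int):
        if seed & MASK64 == 0:
            raise ValueError("seed must be non-zero")
        self.state = seed & MASK64

    def next_u64(self) -> int:
        x = self.state
        x ^= (x << 13) & MASK64  # keep the value within 64 bits
        x ^= x >> 7
        x ^= (x << 17) & MASK64
        self.state = x
        return x

def secure_u64() -> int:
    """Eight bytes from the operating system's secure random source."""
    return int.from_bytes(os.urandom(8), "little")

# The same seed always gives the same sequence.
a, b = Xorshift64(42), Xorshift64(42)
print([a.next_u64() for _ in range(3)] == [b.next_u64() for _ in range(3)])  # True

# The secure source cannot be seeded or replayed.
print(secure_u64())
```

A seeded generator is handy for reproducible tests, but anyone who learns the seed can regenerate every password, so it should not be used for real passwords.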

Trivia

Passgen uses libseccomp³ on amd64 platforms to sandbox itself, restricting it to only a small number of allowed syscalls.

It can also output as JSON by passing the --json option.

Try it out

If you want to try it out, the source code is available here, and the releases page has signed releases for a bunch of different architectures and platforms. Please be so kind as to file an issue if something does not work, as I don't have a way of testing all of the builds currently.

If you are on Debian or Ubuntu, the .deb releases will be useful for you, because they let you install (and remove) passgen with APT easily.

Summary

The website has an overview of the syntax that is available to use, but here is a quick summary.

Example	Pattern
Generate the literal string `abc`	`passgen "abc"`
Pick a character from `a`, `b`, `c`	`passgen "[abc]"`
Pick a character from the range `a-z`	`passgen "[a-z]"`
Repeat `a` four times	`passgen "a{4}"`
Pick four lowercase characters	`passgen "[a-z]{4}"`

To generate passwords, you have a number of constructs available to use:

Verbatim characters. The pattern abc generates the "password" abc. They are simply passed through.
Character sets, enclosed in brackets. You can either list all allowed characters or use ranges of characters, which are inclusive. Examples: [abc], [a-zA-Z0-9], [0123456789abcdef].
Groups, enclosed in parentheses. Groups let you list one or more possible patterns. Examples: (word) generates word, (this|that) generates this or that, ([a-z]|[0-9]) generates a lowercase Latin character ([a-z]) or a digit ([0-9]).
Repeat. Using braces lets you repeat whatever came before. You can either specify a fixed number or a lower and an upper bound, separated by a comma. Examples: [abc]{3}, [a-z]{9,12}, (this|that){2}.
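The four constructs above compose naturally as a small tree that is expanded recursively. The Python sketch below shows this with patterns written directly as nested tuples. The tuple representation and the function name are hypothetical; there is no parser here, and this is not how passgen represents patterns internally.

```python
import secrets

# Pattern nodes as plain tuples (a made-up representation for this example):
#   ("lit", "abc")            verbatim characters
#   ("set", "abc")            one character out of a set
#   ("group", [alt, ...])     one alternative; each alt is a list of nodes
#   ("repeat", node, lo, hi)  node repeated between lo and hi times, inclusive

def expand(node) -> str:
    kind = node[0]
    if kind == "lit":
        return node[1]
    if kind == "set":
        return secrets.choice(node[1])
    if kind == "group":
        alternative = secrets.choice(node[1])
        return "".join(expand(child) for child in alternative)
    if kind == "repeat":
        _, child, lo, hi = node
        count = lo + secrets.randbelow(hi - lo + 1)
        return "".join(expand(child) for _ in range(count))
    raise ValueError(f"unknown node kind: {kind!r}")

# (this|that){2}
this_or_that = ("group", [[("lit", "this")], [("lit", "that")]])
print(expand(("repeat", this_or_that, 2, 2)))

# [a-z]{9,12}
lowercase = "abcdefghijklmnopqrstuvwxyz"
print(expand(("repeat", ("set", lowercase), 9, 12)))
```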

If this syntax is too complicated for you, the passgen binary also comes with some pattern presets that you can use with the -p flag. These are named after the software that generates passwords matching these patterns.



# generated by older versions of safari
$ passgen -p apple1
asG-mIQ-7jz-LAT
# generated by current safari
$ passgen -p apple2
lHODYM-MgdmfG-llo5kD
# generated by firefox
$ passgen -p firefox
iBpDBlTg8bX051H

Development

If you are a real nerd, you may want to compile it for yourself. You need some dependencies:

CMake
C compiler (gcc or clang)
Git

With those dependencies installed, you should be able to build it by cloning the repository, fetching the submodules (needed for libutf8proc⁴), and running CMake and make in a build directory.



git clone https://gitlab.com/xfbs/passgen
cd passgen
git submodule update --init
mkdir build
cd build
cmake ..
make -j8

If you want to understand how pieces of it work, there is also some documentation available that is built off the master branch.

Tests

Passgen has a number of built-in unit tests, which you can run with make test. If you have valgrind installed, you can also run the tests under it to make sure that there are no memory leaks.



$ valgrind ./src/test/passgen-test -v

Support

Your support is appreciated. If you run passgen, and something doesn't work, please let me know. You can file an issue in the repository and I will try to fix it.

If you enjoy passgen and find bugs, have suggestions, or want to help get it into package repositories, please don't hesitate to reach out to me. I appreciate any help, and I hope that it can be useful to other people.