Regex cheatsheet¶

Quick reference for regular expressions, with grep -P and sed examples.

Character classes¶

Pattern	Matches
`[abc]`	a, b, or c
`[^abc]`	anything except a, b, c
`[a-z]`	any lowercase letter
`[0-9]`	any digit (same as `\d` in PCRE)
`\w`	word character `[a-zA-Z0-9_]`
`\s`	whitespace (space, tab, newline, carriage return, form feed)
`\d`	digit `[0-9]`
`.`	any character except newline
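
A quick way to try the classes above is to filter a few sample lines. This is a minimal sketch assuming GNU grep (needed for `-P`); the sample input and variable names are made up for illustration.

```shell
# Made-up sample input, one token per line.
sample=$(printf '%s\n' 'abc' '123' 'a_1' 'x y')

# [0-9]: lines made only of digits.
digits=$(printf '%s\n' "$sample" | grep -E '^[0-9]+$')

# \w: lines made only of word characters (PCRE, hence -P).
words=$(printf '%s\n' "$sample" | grep -P '^\w+$')

# [^abc]: lines containing at least one character other than a, b, or c.
not_abc=$(printf '%s\n' "$sample" | grep -E '[^abc]')

printf 'digits:\n%s\n' "$digits"     # 123
printf 'words:\n%s\n' "$words"       # abc, 123, a_1
printf 'not_abc:\n%s\n' "$not_abc"   # 123, a_1, x y
```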

Anchors¶

Pattern	Meaning
`^`	Start of line
`$`	End of line
`\b`	Word boundary
`\A`	Start of string (PCRE)
`\Z`	End of string, or just before a final newline (PCRE); `\z` matches only at the very end
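
The difference between the anchors is easiest to see on a handful of lines. A small sketch, assuming GNU grep; the sample words are invented for the example.

```shell
# Made-up sample lines.
sample=$(printf '%s\n' 'cat' 'concat' 'cat food' 'the cat')

# ^cat$ : the whole line is exactly "cat".
exact=$(printf '%s\n' "$sample" | grep -E '^cat$')

# \bcat\b : "cat" as a whole word, so "concat" is skipped.
whole_word=$(printf '%s\n' "$sample" | grep -P '\bcat\b')

# cat$ : any line ending in "cat", including "concat".
ends=$(printf '%s\n' "$sample" | grep -E 'cat$')

printf 'exact:\n%s\n' "$exact"             # cat
printf 'whole_word:\n%s\n' "$whole_word"   # cat, cat food, the cat
printf 'ends:\n%s\n' "$ends"               # cat, concat, the cat
```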

Quantifiers¶

Pattern	Meaning
`*`	0 or more (greedy)
`+`	1 or more (greedy)
`?`	0 or 1
`{n}`	exactly n
`{n,m}`	between n and m times (inclusive)
`*?` `+?`	lazy (shortest match)
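
The counted and optional quantifiers can be checked against a few invented strings. This is only a sketch; `grep -E` is enough here because none of these quantifiers need PCRE.

```shell
# Made-up sample: runs of "a" of length 1 to 4.
runs=$(printf '%s\n' 'a' 'aa' 'aaa' 'aaaa')

# {2,3}: two or three repetitions, anchored so longer runs are rejected.
two_to_three=$(printf '%s\n' "$runs" | grep -E '^a{2,3}$')

# {2}: exactly two repetitions.
exactly_two=$(printf '%s\n' "$runs" | grep -E '^a{2}$')

# ?: the "u" is optional, but at most one is allowed.
optional=$(printf '%s\n' 'color' 'colour' 'colouur' | grep -E '^colou?r$')

printf '%s\n' "$two_to_three"   # aa, aaa
printf '%s\n' "$exactly_two"    # aa
printf '%s\n' "$optional"       # color, colour
```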

Groups and backreferences¶

Capturing group — referenced as \1, \2, etc.:

echo "2024-01-15" | sed -E 's/([0-9]{4})-([0-9]{2})-([0-9]{2})/\3.\2.\1/'
# → 15.01.2024

Non-capturing group (groups without creating a backreference; PCRE syntax, not POSIX ERE, so use `grep -P` rather than `sed -E`):

(?:foo|bar)+

Named group (PCRE):

(?P<year>[0-9]{4})-(?P<month>[0-9]{2})
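
Both group forms can be tried with `grep -P`. The sketch below assumes GNU grep built with PCRE support; the sample text and the group name `word` are illustrative only. The second command uses a named backreference, `(?P=word)`, to find a word that is immediately repeated.

```shell
# Non-capturing group: lines made entirely of "foo" and "bar" repetitions.
alternation=$(printf '%s\n' 'foobar' 'baz' 'barfoo' | grep -P '^(?:foo|bar)+$')

# Named group plus named backreference: the same word twice in a row.
duplicate=$(echo 'this is is a test' | grep -Po '\b(?P<word>\w+) (?P=word)\b')

printf '%s\n' "$alternation"   # foobar, barfoo
printf '%s\n' "$duplicate"     # is is
```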

Lookahead and lookbehind¶

Pattern	Meaning
`foo(?=bar)`	`foo` followed by `bar` (positive lookahead)
`foo(?!bar)`	`foo` NOT followed by `bar` (negative lookahead)
`(?<=@)\w+`	word after `@` (positive lookbehind)
`(?<!foo)bar`	`bar` NOT preceded by `foo` (negative lookbehind)
`\b(?!un)\w+ed\b`	whole words ending in `-ed` that do not start with `un`

# extract the domain from an email address: everything after @ up to whitespace or >
echo "user@example.com" | grep -Po '(?<=@)[^\s>]+'
# → example.com
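
Two more lookaround cases, sketched on made-up input and assuming GNU grep with `-P`. Lookarounds only assert context, so the text they check is not part of what `-o` prints.

```shell
# Positive lookbehind: digits preceded by "$"; the "$" itself is not printed.
price=$(echo 'total: $42 for 3 items' | grep -Po '(?<=\$)\d+')

# Negative lookahead: count lines where "foo" is not followed by "bar".
count=$(printf '%s\n' 'foobar' 'foobaz' 'foo' | grep -Pc 'foo(?!bar)')

echo "$price"   # 42
echo "$count"   # 2
```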

Greedy vs lazy¶

Greedy — takes as much as possible:

echo '<b>bold</b> and <i>italic</i>' | grep -Po '<.*>'
# → <b>bold</b> and <i>italic</i>   (whole line)

Lazy (?) — takes as little as possible:

echo '<b>bold</b> and <i>italic</i>' | grep -Po '<.*?>'
# → <b>
# → </b>
# → <i>
# → </i>

grep -P examples¶

# find lines containing something shaped like an IPv4 address (octets are not range-checked, so 999.1.1.1 also matches)
grep -P '\b(\d{1,3}\.){3}\d{1,3}\b' access.log

# lines where "error" appears (case-insensitive) and "404" appears nowhere on the line
grep -Pi '^(?!.*404).*error' app.log

# count HTTP status codes in an nginx access log (assumes the status is a 3-digit field followed by a numeric byte count)
grep -Po '(?<= )\d{3}(?= \d)' access.log | sort | uniq -c | sort -rn
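
To exclude a string anywhere on the line, not just after the match, the negative lookahead can be anchored at the start of the line. A small sketch with invented log lines, assuming GNU grep:

```shell
# Made-up log lines for illustration.
log=$(printf '%s\n' \
  'GET /a 404 error: not found' \
  'error: disk full' \
  'ERROR after 500' \
  'ok 200')

# ^(?!.*404) rejects the line if "404" occurs anywhere; .*error then
# requires "error" somewhere on the line (-i makes it case-insensitive).
matches=$(printf '%s\n' "$log" | grep -Pi '^(?!.*404).*error')

printf '%s\n' "$matches"
# → error: disk full
# → ERROR after 500
```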

sed examples¶

Replace first occurrence per line:

sed 's/foo/bar/' file.txt

Replace ALL occurrences per line (g flag):

sed 's/foo/bar/g' file.txt

Delete lines matching a pattern:

sed '/^#/d' config.ini       # strip comment lines

In-place edit (GNU sed; BSD/macOS sed needs an explicit suffix argument, e.g. `-i ''`):

sed -i 's/localhost/127.0.0.1/g' config.txt
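
Because an in-place edit overwrites the file, it can help to keep a backup by attaching a suffix to `-i`. The sketch below works on a throwaway file in a temporary directory; the `.bak` suffix and the file contents are arbitrary choices for the example.

```shell
# Work on a scratch copy so nothing real is modified.
tmp=$(mktemp -d)
printf 'host=localhost\n' > "$tmp/config.txt"

# -i.bak edits config.txt in place and saves the original as config.txt.bak.
sed -i.bak 's/localhost/127.0.0.1/g' "$tmp/config.txt"

edited=$(cat "$tmp/config.txt")
backup=$(cat "$tmp/config.txt.bak")
rm -r "$tmp"

echo "$edited"   # host=127.0.0.1
echo "$backup"   # host=localhost
```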

Extract lines between two markers (marker lines included):

sed -n '/BEGIN/,/END/p' file.txt
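
The range address prints the marker lines as well. If only the lines in between are wanted, one possible approach is to delete the markers inside the range before printing. A sketch on made-up input:

```shell
# Made-up input with a marked block in the middle.
text=$(printf '%s\n' 'header' 'BEGIN' 'line 1' 'line 2' 'END' 'footer')

# Range including the BEGIN and END lines.
block=$(printf '%s\n' "$text" | sed -n '/BEGIN/,/END/p')

# Same range, but drop the marker lines and print the rest.
inner=$(printf '%s\n' "$text" | sed -n '/BEGIN/,/END/{/BEGIN/d;/END/d;p;}')

printf '%s\n' "$block"   # BEGIN, line 1, line 2, END
printf '%s\n' "$inner"   # line 1, line 2
```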

See also: Linux CLI recipes for find … | xargs grep to search across multiple files.