Regex cheatsheet

Quick reference for regular expressions, with grep -P and sed examples.

Character classes

Pattern   Matches
[abc]     a, b, or c
[^abc]    any character except a, b, or c
[a-z]     any lowercase letter
[0-9]     any digit (same as \d in PCRE)
\w        word character [a-zA-Z0-9_]
\s        whitespace (space, tab, newline, carriage return, form feed)
\d        digit [0-9]
.         any character except newline
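
A minimal sketch of the classes above in use, assuming GNU grep built with PCRE support for -P. The sample strings are invented for illustration.

```shell
# -o prints each match on its own line; -E is enough for bracket classes
echo "order 66, aisle 7" | grep -oE '[0-9]+'
# → 66
# → 7

# with -P (PCRE), \w+ pulls out every run of word characters
echo "foo_bar baz-42" | grep -oP '\w+'
# → foo_bar
# → baz
# → 42
```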

Anchors

Pattern   Meaning
^         Start of line
$         End of line
\b        Word boundary
\A        Start of string (PCRE)
\Z        End of string, or just before a trailing newline (PCRE)
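
A short sketch of how anchors narrow a match, assuming GNU grep. The three input lines are made up for the example.

```shell
# ^ and $ pin the pattern to the whole line, so only the exact line "cat" survives
printf 'cat\nconcat\ncat food\n' | grep -E '^cat$'
# → cat

# \b keeps "cat" from matching inside "concat"
printf 'cat\nconcat\ncat food\n' | grep -P '\bcat\b'
# → cat
# → cat food
```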

Quantifiers

Pattern   Meaning
*         0 or more (greedy)
+         1 or more (greedy)
?         0 or 1 (greedy)
{n}       exactly n
{n,m}     between n and m, inclusive
*? +?     lazy versions of * and + (shortest match; PCRE only)
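
A quick illustration of bounded quantifiers, assuming GNU grep. The input numbers are arbitrary.

```shell
# {2,3} accepts two or three digits; greedy, so it takes three when it can
echo "7 42 1234" | grep -oE '[0-9]{2,3}'
# → 42
# → 123

# adding ? makes it lazy (needs -P): it stops at the two-digit minimum
echo "1234" | grep -oP '\d{2,3}?'
# → 12
# → 34
```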

Groups and backreferences

Capturing group — referenced as \1, \2, etc.:

echo "2024-01-15" | sed -E 's/([0-9]{4})-([0-9]{2})-([0-9]{2})/\3.\2.\1/'
# → 15.01.2024

Non-capturing group (group without a backreference):

(?:foo|bar)+
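
A small sketch of the pattern above in action, assuming GNU grep with -P. The input string is invented.

```shell
# the group repeats as a unit, so consecutive foo/bar runs come out as one match
echo "foobarfoo baz" | grep -oP '(?:foo|bar)+'
# → foobarfoo
```

sed -E uses POSIX extended syntax, which has no (?:...) form; plain parentheses there always capture.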

Named group (PCRE):

(?P<year>[0-9]{4})-(?P<month>[0-9]{2})
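
One place a named group pays off is a named backreference. The sketch below assumes GNU grep with -P; the group name w and the sample sentence are made up for illustration. PCRE also accepts the (?<name>...) spelling for the same thing.

```shell
# (?P=w) must repeat whatever the group named w captured: finds doubled words
echo "this is is a test" | grep -oP '\b(?P<w>\w+) (?P=w)\b'
# → is is
```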

Lookahead and lookbehind

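Pattern        Meaning
foo(?=bar)     foo followed by bar (positive lookahead)
foo(?!bar)     foo NOT followed by bar (negative lookahead)
(?<=@)\w+      word after @ (positive lookbehind)
(?<!un)\w+ed   -ed match that does not start right after "un" (negative lookbehind)

Note that (?<!un)\w+ed still matches "unused" in full, because nothing precedes the u. To skip whole words that begin with un-, put a lookahead at the word start instead: \b(?!un)\w+ed\b
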
# extract domain from email: everything after @, up to whitespace or a closing >
echo "user@example.com" | grep -Po '(?<=@)[^\s>]+'
# → example.com
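
Two more lookahead sketches, assuming GNU grep with -P. The currency string and the foobar/foobaz lines are invented samples.

```shell
# positive lookahead: print the amount only when " USD" follows;
# the unit is checked but is not part of the match
echo "10 USD, 20 EUR, 30 USD" | grep -oP '\d+(?= USD)'
# → 10
# → 30

# negative lookahead: "foo" only when "bar" does not come next
printf 'foobar\nfoobaz\n' | grep -P 'foo(?!bar)'
# → foobaz
```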

Greedy vs lazy

Greedy — takes as much as possible:

echo '<b>bold</b> and <i>italic</i>' | grep -Po '<.*>'
# → <b>bold</b> and <i>italic</i>   (whole line)

Lazy (?) — takes as little as possible:

echo '<b>bold</b> and <i>italic</i>' | grep -Po '<.*?>'
# → <b>
# → </b>
# → <i>
# → </i>
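
Where lazy quantifiers are not available (grep -E, sed), a negated character class usually gives the same result. A minimal sketch, assuming tags contain no literal > inside them:

```shell
# [^>]* cannot run past the first >, so each tag is matched on its own
echo '<b>bold</b> and <i>italic</i>' | grep -oE '<[^>]*>'
# → <b>
# → </b>
# → <i>
# → </i>

# the same class in sed strips the tags
echo '<b>bold</b> and <i>italic</i>' | sed -E 's/<[^>]*>//g'
# → bold and italic
```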

grep -P examples

# find lines with something shaped like an IPv4 address (octets are not range-checked, so 999.1.1.1 also matches)
grep -P '\b(\d{1,3}\.){3}\d{1,3}\b' access.log

# lines where "error" appears but "404" does not, anywhere on the line (case-insensitive)
grep -P '(?i)^(?!.*404).*error' app.log

# extract HTTP status codes from an nginx log (assumes the status is followed by the byte count, as in the default combined format)
grep -Po '(?<= )\d{3}(?= \d)' access.log | sort | uniq -c | sort -rn
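
If octets need to stay within 0-255, the IPv4 pattern can be tightened. This is one possible sketch, assuming GNU grep with -P; the three input lines are made-up test data, not real log entries.

```shell
# each octet is limited to 0-255: 250-255, 200-249, or 0-199
octet='(?:25[0-5]|2[0-4]\d|1?\d?\d)'

printf '10.0.0.1 ok\n999.1.1.1 bad\n256.0.0.1 bad\n' |
  grep -P "\b(?:${octet}\.){3}${octet}\b"
# → 10.0.0.1 ok
```

The variable octet is only there to keep the pattern readable; the double quotes let the shell expand it before grep sees the pattern.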

sed examples

Replace first occurrence per line:

sed 's/foo/bar/' file.txt

Replace ALL occurrences per line (g flag):

sed 's/foo/bar/g' file.txt

Delete lines matching a pattern:

sed '/^#/d' config.ini       # strip comment lines

In-place edit (GNU sed; BSD/macOS sed needs an explicit suffix argument, e.g. -i ''):

sed -i 's/localhost/127.0.0.1/g' config.txt
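
Attaching a suffix directly to -i keeps a backup of the original, and that form is accepted by both GNU and BSD sed. A minimal sketch on a throwaway file; the variable cfg and the file contents are invented for the example.

```shell
# throwaway file so the example is safe to run
cfg=$(mktemp)
echo "host=localhost" > "$cfg"

# -i.bak edits in place and keeps the original as "$cfg.bak"
sed -i.bak 's/localhost/127.0.0.1/g' "$cfg"

cat "$cfg"        # → host=127.0.0.1
cat "$cfg.bak"    # → host=localhost

rm -f "$cfg" "$cfg.bak"
```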

Print the lines between two markers, markers included:

sed -n '/BEGIN/,/END/p' file.txt
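
A runnable sketch of the range address on invented sample input, plus one way to leave the marker lines out. The trailing ; before } keeps the second command portable to BSD sed.

```shell
# the range is inclusive: both marker lines are printed
printf 'a\nBEGIN\nb\nEND\nc\n' | sed -n '/BEGIN/,/END/p'
# → BEGIN
# → b
# → END

# same range, but delete the marker lines before printing
printf 'a\nBEGIN\nb\nEND\nc\n' | sed -n '/BEGIN/,/END/{/BEGIN/d;/END/d;p;}'
# → b
```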

See also: Linux CLI recipes for find … | xargs grep to search across multiple files.