What is a Regular Expression?
Regex explained: character classes, quantifiers, anchors, groups, and practical patterns for everyday use.
A regular expression (regex) is a sequence of characters that defines a search pattern. You use regex to find, validate, or replace text: check that an email looks valid, extract all URLs from a page, or reformat a date string. Most mainstream programming languages support regex, either as built-in syntax or through a standard library module.
The syntax looks cryptic at first, but it follows a handful of rules. Once you know them, you can read most regex patterns in seconds.
Anatomy of a regex
A regex literal is written between two forward slashes, followed by optional flags:
/^\d{3}-\d{4}$/
This pattern matches three digits, a hyphen, and four digits (a 7-digit number in the format 123-4567), anchored from start to end of string.
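As a minimal sketch in JavaScript, the 123-4567 format described above can be tested like this. It assumes the pattern is /^\d{3}-\d{4}$/, which fits that description; the variable names are illustrative.

```javascript
// Assumed pattern: three digits, a hyphen, four digits, anchored at both ends.
const phonePattern = /^\d{3}-\d{4}$/;

// The same pattern built from a string; backslashes must be doubled.
const samePattern = new RegExp("^\\d{3}-\\d{4}$");

console.log(phonePattern.test("123-4567"));   // true
console.log(phonePattern.test("1234567"));    // false: the hyphen is missing
console.log(phonePattern.test("x123-4567"));  // false: ^ pins the match to the start
console.log(samePattern.source === phonePattern.source); // true
```

The literal form is usually easier to read; the constructor form is useful when the pattern is assembled at runtime.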
Regex cheat sheet
Character classes
| Symbol | Matches | Example | Example matches |
|---|---|---|---|
| . | Any character (except newline) | a.c | "abc", "aXc", "a1c" |
| \d | Digit 0–9 | \d+ | "0", "42", "2024" |
| \D | Non-digit | \D | " ", "a", "!" |
| \w | Word char (letter, digit, _) | \w+ | "hello", "a1_b" |
| \W | Non-word char | \W | " ", "@", "-" |
| \s | Whitespace (space, tab, newline) | \s+ | " ", "\t", "\n" |
| [abc] | Any char in set | [aeiou] | "a", "e", "i", "o", "u" |
| [^abc] | Any char not in set | [^0-9] | any non-digit |
| [a-z] | Char range | [a-z]{3} | "abc", "xyz" |
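A few of the character classes above can be tried directly in JavaScript. This is a small sketch; the sample string and variable names are made up for illustration.

```javascript
const text = "Room 42, floor 7";

// \d+ matches each run of digits.
const numbers = text.match(/\d+/g);        // ["42", "7"]

// \w+ matches each run of letters, digits, or underscores.
const words = text.match(/\w+/g);          // ["Room", "42", "floor", "7"]

// \s matches each whitespace character.
const spaces = text.match(/\s/g).length;   // 3

// [aeiou] matches one character from the set.
const vowels = text.match(/[aeiou]/g);     // ["o", "o", "o", "o"]

// . matches any character except a newline.
const dotMatchesLetter = /a.c/.test("abc");    // true
const dotMatchesNewline = /a.c/.test("a\nc");  // false

console.log(numbers, words, spaces, vowels, dotMatchesLetter, dotMatchesNewline);
```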
Quantifiers
| Symbol | Meaning | Example | Matches |
|---|---|---|---|
| * | 0 or more | bo* | "b", "bo", "boo" |
| + | 1 or more | ho+ | "ho", "hoo" (not "h") |
| ? | 0 or 1 | colou?r | "color", "colour" |
| {n} | Exactly n | \d{4} | "2024", "0000" |
| {n,} | n or more | \d{2,} | "12", "123", "9999" |
| {n,m} | Between n and m (inclusive) | \w{2,5} | 2–5 word characters |
| *? +? ?? | Lazy (shortest match) | <.*?> | each tag individually |
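The quantifiers above behave as follows in JavaScript. This is a short sketch with made-up sample strings; the ^, $, and \b anchors are added so that each test covers a whole word.

```javascript
// ? makes the preceding token optional.
const optionalU = ["color", "colour", "colouur"].map((w) => /^colou?r$/.test(w));
// [true, true, false]

// + needs at least one "o"; * accepts none.
const plus = /^ho+$/.test("h");   // false
const star = /^bo*$/.test("b");   // true

// {n} : exactly four digits.
const fourDigits = /^\d{4}$/.test("2024");  // true

// {n,} : two or more digits in a row.
const longNumbers = "12 123 9 2024".match(/\d{2,}/g);  // ["12", "123", "2024"]

// {n,m} : whole words of 2 to 5 word characters.
const shortWords = "a bb ccc dddddd".match(/\b\w{2,5}\b/g);  // ["bb", "ccc"]

console.log(optionalU, plus, star, fourDigits, longNumbers, shortWords);
```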
Anchors & boundaries
| Symbol | Meaning | Example | Matches |
|---|---|---|---|
| ^ | Start of string (or of each line with the m flag) | ^Hello | "Hello world" (not "Say Hello") |
| $ | End of string (or of each line with the m flag) | end$ | "the end" (not "endless") |
| \b | Word boundary | \bcat\b | "cat" but not "catch" |
| \B | Non-word boundary | \Bcat | "cat" inside "bobcat", but not "cat" or "catch" |
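Anchors and boundaries match positions rather than characters. The sketch below checks them in JavaScript; "bobcat" is an extra sample word chosen here to show \B, which needs a word character immediately before the match.

```javascript
// ^ pins the match to the start of the string.
const startsWithHello = /^Hello/.test("Hello world");  // true
const helloLater = /^Hello/.test("Say Hello");         // false

// $ pins the match to the end of the string.
const endsWithEnd = /end$/.test("the end");            // true
const endInside = /end$/.test("endless");              // false

// \b requires a word boundary on that side of the match.
const wholeWordCat = /\bcat\b/.test("a cat sat");      // true
const catInCatch = /\bcat\b/.test("catch");            // false

// \B requires that there is NO word boundary at that position.
const catInBobcat = /\Bcat/.test("bobcat");            // true
const catAlone = /\Bcat/.test("cat");                  // false

console.log(startsWithHello, helloLater, endsWithEnd, endInside);
console.log(wholeWordCat, catInCatch, catInBobcat, catAlone);
```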
Groups & alternation
| Symbol | Meaning | Example | Matches |
|---|---|---|---|
| (...) | Capturing group | (\d{4})-(\d{2}) | saves year and month |
| (?:...) | Non-capturing group | (?:ab)+ | groups without saving |
| (?=...) | Positive lookahead | \d(?= px) | "2" in "2 px" |
| (?!...) | Negative lookahead | \d(?! px) | digit not followed by " px" |
| a\|b | Alternation (or) | cat\|dog | "cat" or "dog" |
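The difference between capturing, non-capturing, and lookahead groups is easiest to see in what a match returns. This is a minimal JavaScript sketch with illustrative sample strings and variable names.

```javascript
// Capturing groups save the year and month separately.
const dateMatch = "2024-05".match(/(\d{4})-(\d{2})/);
const year = dateMatch[1];   // "2024"
const month = dateMatch[2];  // "05"

// A non-capturing group repeats "ab" without saving it.
const repeated = "ababab".match(/(?:ab)+/);
// repeated[0] is "ababab"; repeated.length is 1 because nothing was captured

// Lookaheads check what follows without including it in the match.
const beforePx = "12 px".match(/\d(?= px)/)[0];     // "2"
const notBeforePx = "12 px".match(/\d(?! px)/)[0];  // "1"

// Alternation matches either side.
const pet = "I have a dog".match(/cat|dog/)[0];     // "dog"

console.log(year, month, repeated[0], beforePx, notBeforePx, pet);
```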
Flags
Flags follow the closing slash and modify how the pattern is applied.
| Flag | Name | Effect |
|---|---|---|
| g | Global | Find all matches, not just the first |
| i | Case-insensitive | Treat uppercase and lowercase as equal |
| m | Multiline | ^ and $ match start/end of each line, not just the whole string |
| s | Dot-all | . also matches newlines (\n) |
| u | Unicode | Enable full Unicode matching (required for \p{} properties) |
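Each flag can be checked by running the same pattern with and without it. The JavaScript sketch below does that; the sample strings are illustrative, and the s and u examples assume a reasonably recent engine (ES2018 or later).

```javascript
// g: all matches instead of only the first.
const firstOnly = "a A a".match(/a/)[0];       // "a"
const allLower = "a A a".match(/a/g);          // ["a", "a"]

// i: ignore case (combined with g here).
const anyCase = "a A a".match(/a/gi);          // ["a", "A", "a"]

// m: ^ and $ match at every line break.
const perLine = "one\ntwo".match(/^\w+$/gm);   // ["one", "two"]
const wholeOnly = "one\ntwo".match(/^\w+$/g);  // null

// s: the dot also matches a newline.
const dotAll = /a.c/s.test("a\nc");            // true
const dotDefault = /a.c/.test("a\nc");         // false

// u: enables Unicode property escapes such as \p{Lu} (uppercase letter).
const upperE = /\p{Lu}/u.test("\u00C9");       // true, U+00C9 is an uppercase E with acute accent

console.log(firstOnly, allLower, anyCase, perLine, wholeOnly, dotAll, dotDefault, upperE);
```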
Common patterns
Ready-to-use regex for the most frequent validation tasks.
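The sketch below shows one possible set of patterns for five common validation tasks: email, URL, ISO-style date, hex color, and number. These specific expressions are assumptions chosen for illustration and kept deliberately simple; stricter or looser variants are common, and the isValid helper is hypothetical.

```javascript
// Illustrative patterns only; each is an assumption, not a canonical validator.
const patterns = {
  // Something, an @, something, a dot, something; no spaces or extra @.
  email: /^[^\s@]+@[^\s@]+\.[^\s@]+$/,
  // http or https scheme followed by non-whitespace characters.
  url: /^https?:\/\/\S+$/,
  // YYYY-MM-DD with month 01-12 and day 01-31; does not check days per month.
  date: /^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$/,
  // # followed by exactly 3 or exactly 6 hex digits.
  hexColor: /^#(?:[0-9a-fA-F]{3}|[0-9a-fA-F]{6})$/,
  // Optional minus sign, digits, optional fractional part.
  number: /^-?\d+(?:\.\d+)?$/,
};

// Hypothetical helper: true if the input matches the named pattern.
function isValid(kind, input) {
  return patterns[kind].test(input);
}

console.log(isValid("email", "user@example.com"));     // true
console.log(isValid("email", "user@"));                // false
console.log(isValid("url", "https://example.com"));    // true
console.log(isValid("url", "ftp://example.com"));      // false
console.log(isValid("date", "2024-01-31"));            // true
console.log(isValid("date", "2024-13-01"));            // false
console.log(isValid("hexColor", "#F97316"));           // true
console.log(isValid("hexColor", "#12345"));            // false
console.log(isValid("number", "-0.5"));                // true
console.log(isValid("number", "1,000"));               // false
```

Patterns like these check the shape of the input only. Whether an address exists or a date is a real calendar day needs a separate check.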
Test strings by task:
Email: user@example.com, first.last@sub.domain.org, @example.com, user@, user @example.com
URL: https://example.com, http://api.site.org/path, ftp://example.com, example.com
Date: 2024-01-31, 2000-12-01, 2024-13-01, 24-01-31, 2024/01/31
Hex color: #fff, #F97316, #000000, #gg0000, F97316, #12345
Number: 42, -7, 3.14, -0.5, 3., .5, 1,000, abc
Greedy vs. lazy matching
By default, quantifiers are greedy — they match as much text as possible. Adding ? after a quantifier makes it lazy — it matches as little as possible.
Greedy: .*
Input: <b>bold</b> and <i>italic</i>
Pattern: <.*> → one match running from the first < to the last >, here the entire string
Lazy: .*?
Same input.
Pattern: <.*?> → four matches, one per tag: <b>, </b>, <i>, </i>
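The greedy and lazy results above can be reproduced with a short JavaScript sketch. The g flag is added so that match returns every match rather than only the first; the variable names are illustrative.

```javascript
const input = "<b>bold</b> and <i>italic</i>";

// Greedy: .* runs to the end, then backs up to the last ">".
const greedy = input.match(/<.*>/g);
// ["<b>bold</b> and <i>italic</i>"]

// Lazy: .*? stops at the first ">" it can reach.
const lazy = input.match(/<.*?>/g);
// ["<b>", "</b>", "<i>", "</i>"]

console.log(greedy.length, lazy.length);  // 1 4
```

A negated class such as <[^>]*> gives the same four matches here without relying on lazy matching.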