Patterns & Character Classes

A regular expression is more than a string of literal characters — its power comes from a small vocabulary of special symbols that describe shapes of text rather than exact words. Once you know how character classes, anchors, and quantifiers fit together, you can express ideas like “a five-digit number at the start of a line” or “one or more letters followed by an optional dash” in a few terse characters. This page walks through the building blocks of a pattern and how to escape them when you mean the literal symbol.

Metacharacters

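Most characters in a regex match themselves: cat matches the three letters c, a, t. Metacharacters are the exceptions: symbols that carry special meaning. The ones you will meet constantly are . ^ $ * + ? ( ) [ ] { } | \. The forward slash / is not a metacharacter, but it delimits a regex literal in JavaScript, so inside a literal you write it as \/. The dot . is the most common metacharacter: it matches any single character except a line terminator such as \n (unless the s flag is set).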

console.log(/c.t/.test("cat")); // true
console.log(/c.t/.test("cut")); // true
console.log(/c.t/.test("ct"));  // false (dot needs one char)

Output:

true
true
false
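As a small sketch of the line-terminator rule, the snippet below runs the same pattern with and without the s (dotAll) flag. The sample string and variable names are made up for illustration.

```javascript
// Hypothetical sample text: "a" and "b" separated by a newline.
const twoLines = "a\nb";

// Without the s flag, the dot refuses to match the newline.
const withoutDotAll = /a.b/.test(twoLines);

// With the s flag, the dot matches any character, newline included.
const withDotAll = /a.b/s.test(twoLines);

console.log(withoutDotAll); // false
console.log(withDotAll);    // true
```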

Character classes

A character class, written with square brackets [], matches exactly one character out of a set you define. Ranges with a hyphen keep the set compact, and a leading caret ^ negates the class so it matches anything not listed.

console.log(/[aeiou]/.test("sky"));    // false (no vowels)
console.log(/[a-z]/.test("Hello"));    // true (contains lowercase letters)
console.log(/gr[ae]y/.test("grey"));   // true (grey or gray)
console.log(/[^0-9]/.test("123a"));    // true (the non-digit 'a')

Output:

false
true
true
true

JavaScript also ships shorthand classes for the sets you reach for most. Each has an uppercase form that negates it.

Shorthand	Matches	Equivalent
`\d`	A digit	`[0-9]`
`\D`	A non-digit	`[^0-9]`
`\w`	A word character	`[A-Za-z0-9_]`
`\W`	A non-word character	`[^A-Za-z0-9_]`
`\s`	Whitespace (space, tab, newline, and other Unicode spaces)	Roughly `[ \t\r\n\f\v]`, plus Unicode space characters
`\S`	Non-whitespace	Roughly `[^ \t\r\n\f\v]`
`.`	Any character except a line terminator	N/A

const id = "user_42";
console.log(/\w+/.test(id));         // true
console.log("a b\tc".match(/\s/g));  // [ ' ', '\t' ]

Output:

true
[ ' ', '\t' ]
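Because \w is limited to ASCII letters, digits, and underscore, it can surprise you on accented text. The sketch below compares it with a Unicode-aware class built from property escapes, which need the u flag. The pattern names and sample words are illustrative assumptions, and the accented letter is written as \u00e9 so the example does not depend on file encoding.

```javascript
// \w covers only [A-Za-z0-9_].
const asciiWord = /^\w+$/;

// One possible Unicode-aware alternative: any letter, any number, or underscore.
const unicodeWord = /^[\p{L}\p{N}_]+$/u;

const plain = "cafe";
const accented = "caf\u00e9"; // "café" with a precomposed é

console.log(asciiWord.test(plain));      // true
console.log(asciiWord.test(accented));   // false (é is outside A-Za-z0-9_)
console.log(unicodeWord.test(accented)); // true
```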

Inside a character class, most metacharacters lose their special meaning. [.+*] matches a literal dot, plus, or asterisk — no escaping needed. The exceptions are ], \, ^ (at the start), and - (between two characters).
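The rules above can be checked with a short sketch. The variable names are only for illustration.

```javascript
// Dot, plus, and asterisk are literal inside the brackets.
const literalSymbols = /[.+*]/;
console.log(literalSymbols.test("a+b")); // true  (literal plus found)
console.log(literalSymbols.test("abc")); // false (the dot is not a wildcard here)

// A hyphen between two characters forms a range...
const range = /[a-c]/;
// ...unless it is escaped, in which case it is a literal hyphen.
const withHyphen = /[a\-c]/;

console.log(range.test("b"));      // true  (b is within a-c)
console.log(withHyphen.test("b")); // false (only a, -, or c match)
console.log(withHyphen.test("-")); // true
```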

Anchors and boundaries

Anchors do not match characters; they match positions. ^ asserts the start of the string (or line, with the m flag) and $ asserts the end. The word boundary \b matches the edge between a word character and a non-word character, which is perfect for matching whole words.

console.log(/^https/.test("https://x")); // true (starts with https)
console.log(/\.com$/.test("a.com"));     // true (ends with .com)
console.log(/\bcat\b/.test("the cat"));  // true (whole word)
console.log(/\bcat\b/.test("category")); // false (cat is part of a word)

Output:

true
true
true
false
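To illustrate how the m flag changes ^, here is a minimal sketch that collects the first word of each line. The sample text is invented for the example.

```javascript
// Hypothetical three-line input.
const lines = "alpha\nbeta\ngamma";

// Without m, ^ only matches at the very start of the string.
const firstOnly = lines.match(/^\w+/g);

// With m, ^ also matches right after each line break.
const everyLine = lines.match(/^\w+/gm);

console.log(firstOnly); // [ 'alpha' ]
console.log(everyLine); // [ 'alpha', 'beta', 'gamma' ]
```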

Quantifiers

A quantifier says how many times the preceding element may repeat. They turn single-character matches into flexible-length ones.

Quantifier	Meaning
`*`	Zero or more
`+`	One or more
`?`	Zero or one (optional)
`{n}`	Exactly n
`{n,}`	n or more
`{n,m}`	Between n and m (inclusive)

console.log(/colou?r/.test("color"));      // true (u optional)
console.log(/\d{3}-\d{4}/.test("555-1234")); // true (3 then 4 digits)
console.log(/a+/.test("aaa"));             // true (one or more 'a')
console.log(/go*gle/.test("ggle"));        // true (zero 'o' allowed)

Output:

true
true
true
true
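The counted forms are easier to compare when you look at what they actually capture. A small sketch, using a made-up run of five letters:

```javascript
// Hypothetical input: five consecutive 'a' characters.
const run = "aaaaa";

const exactlyTwo = run.match(/a{2}/)[0];    // takes exactly two
const twoOrMore = run.match(/a{2,}/)[0];    // greedy, takes all five
const twoToThree = run.match(/a{2,3}/)[0];  // greedy, capped at three

console.log(exactlyTwo); // aa
console.log(twoOrMore);  // aaaaa
console.log(twoToThree); // aaa

// Anchored, the upper bound becomes a hard limit on the whole value.
const fitsWholeString = /^a{2,3}$/.test(run);
console.log(fitsWholeString); // false (five is more than three)
```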

Greedy vs lazy matching

By default quantifiers are greedy: they grab as much text as possible while still allowing the overall match to succeed. Adding a ? after a quantifier makes it lazy, matching as little as possible. This difference matters most when scanning markup or delimited text.

const html = "<b>one</b><b>two</b>";

console.log(html.match(/<b>.*<\/b>/)[0]);  // greedy: spans both tags
console.log(html.match(/<b>.*?<\/b>/)[0]); // lazy: stops at first </b>

Output:

<b>one</b><b>two</b>
<b>one</b>

A greedy .* that backtracks over a long string can be slow and can match far more than you intended. When you want the smallest match, reach for the lazy .*? or a more specific class like [^<]*.
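As a sketch of the negated-class alternative, the pattern below uses [^<]* so the match cannot run past the next tag. It assumes the text between the tags contains no < character; the variable names are illustrative.

```javascript
// Same markup as the greedy/lazy example.
const markup = "<b>one</b><b>two</b>";

// [^<]* stops at the next "<", so each element is matched on its own.
const items = markup.match(/<b>[^<]*<\/b>/g);

console.log(items); // [ '<b>one</b>', '<b>two</b>' ]
```

Unlike .*?, the negated class never needs to try longer and longer matches, because it simply cannot consume the delimiter.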

Escaping special characters

To match a metacharacter literally, prefix it with a backslash. To match a real dot, write \.; for a literal ?, write \?. When you build a pattern from untrusted input, escape it programmatically so user text cannot inject regex syntax.

const escapeRegex = (s) => s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");

const price = "$9.99";
const safe = new RegExp(escapeRegex(price));
console.log(safe.test("It costs $9.99")); // true
console.log(/\$\d+\.\d{2}/.test("$9.99")); // true (escaped manually)
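To see why escaping matters, the sketch below builds a pattern from the same input twice, once raw and once escaped. The helper repeats the escaping function from this section under a different, hypothetical name so the block is self-contained; userInput is invented sample data.

```javascript
// Same escaping approach as above, renamed for this standalone sketch.
const escapeRegexSketch = (s) => s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");

// Hypothetical user-supplied text containing a metacharacter.
const userInput = "a.c";

const raw = new RegExp(userInput);                        // dot acts as a wildcard
const escaped = new RegExp(escapeRegexSketch(userInput)); // dot is literal

console.log(raw.test("abc"));     // true  (unintended match)
console.log(escaped.test("abc")); // false
console.log(escaped.test("a.c")); // true
console.log(escaped.source);      // a\.c
```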

Best Practices

Reach for shorthand classes (\d, \w, \s) over manual ranges — they are shorter and convey intent.
Anchor patterns with ^ and $ when validating an entire value, not just a substring.
Use \b to match whole words and avoid accidental matches inside longer words.
Prefer specific quantifiers like {2,4} over an open-ended + when you know the expected length.
Default to lazy quantifiers (*?, +?) when matching between delimiters to prevent runaway greedy matches.
Always escape user-supplied text before placing it in a pattern to avoid bugs and injection.
Remember that most metacharacters are inert inside [], so character classes rarely need escaping.
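
The anchoring advice is the easiest to get wrong, so here is a minimal sketch using the five-digit example from the introduction. The pattern names and sample values are assumptions for illustration.

```javascript
// Unanchored: succeeds if five digits appear anywhere in the input.
const looseFiveDigits = /\d{5}/;

// Anchored: the whole value must be exactly five digits.
const strictFiveDigits = /^\d{5}$/;

console.log(looseFiveDigits.test("abc12345xyz"));  // true  (substring match)
console.log(strictFiveDigits.test("abc12345xyz")); // false
console.log(strictFiveDigits.test("12345"));       // true
```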
