JavaScript Regular Expressions

Regular expressions (regex) are patterns used to match character combinations in strings. They support searching, validating, and transforming text in ways that would take far more code with ordinary string methods such as indexOf() or split().

RegExp Basics

A regular expression describes a sequence of characters to look for in text. Typical uses are validation (email addresses, phone numbers), searching (finding all URLs in text), replacing (changing formats), and extracting (pulling data from strings). Regex is a mini-language within JavaScript designed specifically for pattern matching.

Regular expressions are created using literal notation with slashes, /pattern/modifiers, or with the RegExp constructor: new RegExp("pattern", "modifiers"). Literal notation (/hello/) is more common and concise. The constructor is useful when the pattern is dynamic or built from variables; because the pattern is then an ordinary string, every backslash must be doubled ("\\d" rather than "\d"). Both forms create RegExp objects with methods for pattern matching.
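As a minimal sketch of the dynamic case, the example below builds a pattern from a variable. The escapeRegExp helper and the userInput value are hypothetical names introduced for illustration; the helper escapes characters that would otherwise be read as regex syntax.

```javascript
// Hypothetical helper: escape characters that have special meaning in a regex
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

const userInput = "1+1"; // assumed dynamic value, e.g. typed by a user
const dynamicPattern = new RegExp(escapeRegExp(userInput), "g");
const found = "1+1=2 and 1+1=2".match(dynamicPattern);
console.log(found); // ["1+1", "1+1"]

// Backslashes must be doubled inside a string passed to the constructor
const digitsLiteral = /\d+/;
const digitsConstructed = new RegExp("\\d+");
console.log(digitsConstructed.source === digitsLiteral.source); // true
```

Without the escaping step, the + in "1+1" would be treated as a quantifier and the pattern would match "11" instead.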

The test() method tests whether a pattern occurs in a string, returning true or false. pattern.test(string) is the simplest way to check for a match. For example, /hello/.test("hello world") returns true. Note that test() succeeds if the pattern matches anywhere in the string, so for validation (checking that input has a required format) the pattern should be anchored with ^ and $ so that the whole input must match.
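The difference between an unanchored and an anchored pattern can be sketched as follows. The five-digit format and the names zipPattern and isFiveDigits are assumptions made for this example only.

```javascript
// Without anchors, test() succeeds if the pattern appears anywhere
const unanchored = /\d{5}/.test("abc12345xyz");
console.log(unanchored); // true

// With ^ and $, the whole string must match
const zipPattern = /^\d{5}$/; // hypothetical five-digit code format
const isFiveDigits = (input) => zipPattern.test(input);

console.log(isFiveDigits("12345"));       // true
console.log(isFiveDigits("abc12345xyz")); // false
console.log(isFiveDigits("1234"));        // false
```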

The exec() method executes a search for a match and returns detailed information about it. pattern.exec(string) returns an array whose first element is the matched text and whose following elements are the capturing groups; the array also carries an index property (where the match starts) and an input property (the original string). If no match is found, it returns null. For simple true/false checks, test() is simpler; for extracting matched content, use exec().
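A short sketch of what exec() returns, using a made-up year-month pattern (datePattern is a name chosen for illustration):

```javascript
const datePattern = /(\d{4})-(\d{2})/; // hypothetical year-month format
const execResult = datePattern.exec("Released 2024-05");

console.log(execResult[0]);    // "2024-05" (the full match)
console.log(execResult[1]);    // "2024"    (first capturing group)
console.log(execResult[2]);    // "05"      (second capturing group)
console.log(execResult.index); // 9         (position where the match starts)

// No match: exec() returns null, so check before indexing into the result
const noMatch = datePattern.exec("no date here");
console.log(noMatch); // null
```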

Strings have several methods that work with regex: match() returns the matches as an array (or null if there are none), search() returns the index of the first match (or -1 if there is none), and replace() returns a new string with matched text replaced. These string methods make regex practical for real-world text processing tasks like formatting, cleaning, or transforming data.

// Creating RegExp
const pattern1 = /hello/;
const pattern2 = new RegExp("hello");

// test() method
console.log(pattern1.test("hello world")); // true
console.log(pattern1.test("goodbye"));     // false

// exec() method
const result = /world/.exec("hello world");
console.log(result[0]); // "world"

// String search()
console.log("hello world".search(/world/)); // 6

// String match()
const matches = "hello world".match(/o/g);
console.log(matches); // ["o", "o"]

// String replace()
const newStr = "hello world".replace(/world/, "JavaScript");
console.log(newStr); // "hello JavaScript"
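
The string methods above behave differently when nothing matches, which is easy to overlook. A brief sketch of those cases, with variable names chosen for illustration:

```javascript
const sentence = "hello world";

// No match: match() returns null and search() returns -1
const missingMatch = sentence.match(/xyz/);
const missingIndex = sentence.search(/xyz/);
console.log(missingMatch); // null
console.log(missingIndex); // -1

// Without g, match() returns only the first match, plus its index
const firstO = sentence.match(/o/);
console.log(firstO[0], firstO.index); // "o" 4

// replace() returns a new string; the original is unchanged
const replaced = sentence.replace(/o/, "0");
console.log(replaced); // "hell0 world"
console.log(sentence); // "hello world"
```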

RegExp Modifiers and Patterns

The i modifier performs case-insensitive matching, treating uppercase and lowercase as equivalent. /hello/i matches "hello", "HELLO", "Hello", etc. Without i, regex is case-sensitive by default. Use i when case doesn't matter, like matching user input where capitalization varies.

The g modifier performs global matching, finding all matches rather than stopping at the first. /a/g in "banana" finds all three "a"s. Without g, regex stops at the first match. Use g with match() to get all matches, or with replace() to replace all occurrences instead of just the first.
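
The effect of g on match() and replace() can be sketched with the "banana" example. The last part illustrates a related detail: a regex object with the g flag remembers where its last match ended (its lastIndex property), so repeated test() calls on the same object can give different answers.

```javascript
const word = "banana";

const firstOnly = word.match(/a/);
const allMatches = word.match(/a/g);
console.log(firstOnly[0]); // "a" (first match only)
console.log(allMatches);   // ["a", "a", "a"]

const replaceFirst = word.replace(/a/, "A");
const replaceAll = word.replace(/a/g, "A");
console.log(replaceFirst); // "bAnana"
console.log(replaceAll);   // "bAnAnA"

// A reused global regex continues from lastIndex on the next call
const globalA = /a/g;
const firstTest = globalA.test("a");
const secondTest = globalA.test("a");
console.log(firstTest);  // true
console.log(secondTest); // false (the search resumed after the first match)
```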

The m modifier enables multiline mode, in which ^ and $ match at the beginning and end of each line in addition to the beginning and end of the whole string. It is useful when processing multi-line text where you want to match patterns at the start or end of each line. It is less commonly used than i and g.
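
A small sketch of the difference, using a two-line string made up for the example:

```javascript
const lines = "first line\nsecond line";

// Without m, ^ matches only at the very start of the string
const withoutM = /^second/.test(lines);
console.log(withoutM); // false

// With m, ^ also matches right after each line break
const withM = /^second/m.test(lines);
console.log(withM); // true

// $ behaves the same way at line endings
const endOfString = lines.match(/line$/g);
const endOfEachLine = lines.match(/line$/gm);
console.log(endOfString);   // ["line"]
console.log(endOfEachLine); // ["line", "line"]
```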

Character classes like [abc] match any single character inside the brackets. [aeiou] matches any vowel, [0-9] matches any digit, [A-Z] matches uppercase letters. You can combine ranges: [a-zA-Z0-9] matches letters and digits. Character classes are fundamental building blocks of regex patterns.

A range is written as a hyphen between a starting and an ending character: [0-9] matches any digit from 0 to 9, [a-z] matches lowercase letters, and [A-Z] matches uppercase letters. You can negate a character class by placing ^ as the first character inside the brackets: [^0-9] matches any character that is NOT a digit.
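
Negated classes are handy for stripping unwanted characters. The sketch below uses a made-up phone string (phone and digitsOnly are illustrative names):

```javascript
// [^0-9] needs at least one non-digit character to match
const allDigits = /[^0-9]/.test("12345");
const hasNonDigit = /[^0-9]/.test("123a5");
console.log(allDigits);   // false
console.log(hasNonDigit); // true

// Remove everything that is not a digit
const phone = "(555) 123-4567"; // hypothetical input
const digitsOnly = phone.replace(/[^0-9]/g, "");
console.log(digitsOnly); // "5551234567"

// Combined ranges match only what they list: underscore is not included
const underscore = /[a-zA-Z0-9]/.test("_");
console.log(underscore); // false
```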

Shorthand character classes provide concise patterns: \d matches any digit (same as [0-9]), \w matches word characters (ASCII letters, digits, and underscore, same as [A-Za-z0-9_]), \s matches whitespace (spaces, tabs, newlines). Their uppercase counterparts are negations: \D matches non-digits, \W matches non-word characters, \S matches non-whitespace. These shorthands make patterns more readable.
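
The negated shorthands can be sketched in the same way as the positive ones. The sample strings are made up for illustration; the last check shows that \w covers only ASCII letters, so an accented letter is not a word character.

```javascript
const nonDigits = "a1 b2".match(/\D/g);
console.log(nonDigits); // ["a", " ", "b"]

const nonWord = "hello, world!".match(/\W/g);
console.log(nonWord); // [",", " ", "!"]

// \s matches spaces and tabs alike, so it works well with split()
const parts = "a b\tc".split(/\s/);
console.log(parts); // ["a", "b", "c"]

// \u00e9 is the letter e with an acute accent; \w does not match it
const accented = /\w/.test("\u00e9");
console.log(accented); // false
```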

// Modifiers
console.log(/hello/i.test("HELLO")); // true (case-insensitive)

const text = "cat bat rat";
console.log(text.match(/at/g)); // ["at", "at", "at"] (global)

// Character classes
console.log(/[aeiou]/.test("hello")); // true
console.log(/[0-9]/.test("abc123")); // true

// Shorthand classes
console.log(/\d/.test("123"));   // true (digit)
console.log(/\w/.test("word"));  // true (word char)
console.log(/\s/.test("a b"));   // true (whitespace)

// Quantifiers
console.log(/a+/.test("aaaa"));   // true (one or more)
console.log(/a*/.test(""));       // true (zero or more, so an empty string matches)
console.log(/a?/.test(""));       // true (zero or one, so an empty string matches)
console.log(/a{3}/.test("aaa"));  // true (exactly 3)