You need to validate an email address, extract all URLs from a block of text, and match ISO 8601 dates — three distinct tasks, three regex patterns. Here are ten battle-tested patterns that cover the most common developer needs, with explanations and the gotchas that burn people who copy-paste without reading.
1. Email Validation (RFC 5321 Practical Subset)
^[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}$
Matches: user@example.com, user.name+tag@sub.domain.co.uk
Rejects: user@, @example.com, user@.com
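Gotcha: RFC 5321 technically allows quoted local parts and IP address literals, and the message-format grammar in RFC 5322 adds comments; you almost never want to accept any of these. This pattern covers the vast majority of real-world addresses, though it is still permissive in places (it accepts consecutive dots such as user..name@example.com). The widely circulated "fully RFC-compliant" regex runs to thousands of characters and still fails on some edge cases. Use this pattern, not that one.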
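Performance note: Always add ^ and $ anchors when validating. Without them, the engine tries the pattern at every starting position in the input, so a validation check silently becomes a slower "contains" check. How much slower depends on the engine and the input.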
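As a rough illustration, the pattern can be wrapped in a small validation gate. This is a minimal sketch; the helper name isPlausibleEmail is an assumption made for this example.

```javascript
// Pattern 1 from above, anchored so the whole string must match.
const EMAIL_RE = /^[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}$/;

// Hypothetical helper: true when the input looks like an email address.
function isPlausibleEmail(input) {
  return typeof input === "string" && EMAIL_RE.test(input);
}

console.log(isPlausibleEmail("user.name+tag@sub.domain.co.uk")); // true
console.log(isPlausibleEmail("user@.com"));                      // false
```

A passing result only means the string has the right shape; confirming that the mailbox exists still takes a verification email.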
2. URL Extraction from Text
https?:\/\/(?:www\.)?[-a-zA-Z0-9@:%._+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b(?:[-a-zA-Z0-9()@:%_+.~#?&/=]*)
Matches: https://www.example.com/path?query=1#anchor
Rejects: ftp:// (not HTTP/HTTPS), bare domains without protocol
Gotcha: URLs at the end of sentences get a trailing period captured. Post-process matches: url.replace(/[.,;!?]$/, ''). Also, this doesn't match mailto: or protocol-relative URLs (//example.com) — extend the protocol part if needed.
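The extraction and clean-up steps described above can be sketched together as follows. The function name extractUrls is an assumption for this example; the pattern is the one from this section with the g flag added so that matchAll() can iterate over every match.

```javascript
// Pattern 2 from above; the g flag is required by String.prototype.matchAll().
const URL_RE = /https?:\/\/(?:www\.)?[-a-zA-Z0-9@:%._+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b(?:[-a-zA-Z0-9()@:%_+.~#?&\/=]*)/g;

// Hypothetical helper: returns every URL found, minus one trailing punctuation mark.
function extractUrls(text) {
  return Array.from(text.matchAll(URL_RE), (m) => m[0].replace(/[.,;!?]$/, ""));
}

const sample = "See https://www.example.com/path?query=1#anchor. Also http://example.org/docs.";
console.log(extractUrls(sample));
// [ 'https://www.example.com/path?query=1#anchor', 'http://example.org/docs' ]
```

Note that the clean-up strips a single trailing character, so a URL followed by "..." would still keep two of the dots.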
3. ISO 8601 Date (YYYY-MM-DD)
^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$
Matches: 2026-03-23, 2026-12-31
Rejects: 2026-13-01, 2026-03-32, 26-03-23
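Gotcha: This validates format and range but not calendar validity: 2026-02-30 passes. For real date validation, parse the value after the format check and confirm that the parsed year, month, and day round-trip to the input, or use a date library. Do not rely on new Date() alone, because some JavaScript engines roll 2026-02-30 over to March 2 instead of rejecting it. The regex is a format gate, not a calendar.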
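One way to pair the format gate with a calendar check is to build the date from its parts and confirm that nothing rolled over. This is a sketch under assumptions: the helper name isRealIsoDate is hypothetical, and the pattern gains a capture group around the year so all three parts can be read back.

```javascript
// Pattern 3 from above, with an extra capture group around the year.
const ISO_DATE_RE = /^(\d{4})-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$/;

// Hypothetical helper: format gate first, then a round-trip calendar check.
function isRealIsoDate(input) {
  const m = ISO_DATE_RE.exec(input);
  if (!m) return false;
  const year = Number(m[1]);
  const month = Number(m[2]);
  const day = Number(m[3]);
  // setUTCFullYear avoids the two-digit-year remapping done by Date.UTC().
  const d = new Date(0);
  d.setUTCFullYear(year, month - 1, day);
  // An impossible day (Feb 30) rolls into the next month, so the parts differ.
  return d.getUTCFullYear() === year &&
         d.getUTCMonth() === month - 1 &&
         d.getUTCDate() === day;
}

console.log(isRealIsoDate("2026-03-23")); // true
console.log(isRealIsoDate("2026-02-30")); // false
console.log(isRealIsoDate("2024-02-29")); // true (leap year)
```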
4. Phone Numbers (E.164 International Format)
^\+[1-9]\d{6,14}$
Matches: +14155552671, +447911123456
Rejects: 4155552671 (missing +), +1, +1234567890123456 (16 digits, too long)
Gotcha: This validates E.164 format, the standard format for storing and transmitting phone numbers. If users enter (415) 555-2671 or 415-555-2671, strip everything except digits and the plus sign first (str.replace(/[^\d+]/g, '')), prepend + and the country code if they are missing, then validate. Never store formatted phone numbers; store E.164 and format on display.
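The strip-then-validate flow might look like the sketch below. The helper name toE164 and the default country code "1" are assumptions for illustration; a real application would take the country from user context.

```javascript
// Pattern 4 from above.
const E164_RE = /^\+[1-9]\d{6,14}$/;

// Hypothetical helper. defaultCountryCode is an assumption ("1" here).
function toE164(raw, defaultCountryCode = "1") {
  // Keep only digits and plus signs, as described in the text.
  const cleaned = raw.replace(/[^\d+]/g, "");
  // Numbers typed without a + are assumed to be national numbers.
  const candidate = cleaned.startsWith("+")
    ? cleaned
    : `+${defaultCountryCode}${cleaned}`;
  return E164_RE.test(candidate) ? candidate : null;
}

console.log(toE164("(415) 555-2671"));  // +14155552671
console.log(toE164("+44 7911 123456")); // +447911123456
console.log(toE164("+1"));              // null
```

This sketch does not handle national trunk prefixes (for example the leading 0 in UK numbers); a dedicated phone-number library is the safer choice for international input.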
5. IPv4 Address
^(?:(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(?:25[0-5]|2[0-4]\d|[01]?\d\d?)$
Matches: 192.168.1.1, 10.0.0.0, 255.255.255.255
Rejects: 256.0.0.1, 192.168.1, 192.168.1.1.1
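Gotcha: The breakdown: 25[0-5] matches 250-255, 2[0-4]\d matches 200-249, [01]?\d\d? matches 0-199. With the ^ and $ anchors in place, a backtracking engine finds the right alternative in any order, but keep the specific cases first: they fail fast, and if you drop the anchors to extract addresses from text, leftmost-first alternation would otherwise let the general case grab a partial octet such as the 25 in 255. Also note that [01]?\d\d? accepts leading zeros (192.168.01.1 passes), which some network libraries interpret as octal.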
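Because [01]?\d\d? also accepts octets written with leading zeros, a stricter variant can be useful when the address is passed on to other tools. The sketch below compares the two; the OCTET fragment and the name STRICT_IPV4_RE are assumptions introduced for this example, not part of the pattern above.

```javascript
// Pattern 5 from above.
const IPV4_RE = /^(?:(?:25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(?:25[0-5]|2[0-4]\d|[01]?\d\d?)$/;

// Assumed stricter octet: 250-255, 200-249, 100-199, then 0-99 without leading zeros.
const OCTET = "(?:25[0-5]|2[0-4]\\d|1\\d\\d|[1-9]?\\d)";
const STRICT_IPV4_RE = new RegExp(`^(?:${OCTET}\\.){3}${OCTET}$`);

console.log(IPV4_RE.test("192.168.1.1"));         // true
console.log(IPV4_RE.test("256.0.0.1"));           // false
console.log(IPV4_RE.test("192.168.01.1"));        // true  (leading zero accepted)
console.log(STRICT_IPV4_RE.test("192.168.01.1")); // false (leading zero rejected)
```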
6. Semantic Version (semver)
^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$
Matches: 1.0.0, 2.1.3-alpha.1, 3.0.0+build.123
Rejects: 1.0, 01.0.0 (leading zeros), 1.0.0.0
Gotcha: This is the official semver.org regex. It prohibits leading zeros in numeric identifiers (i.e., 01.0.0 is invalid per the spec, even though many tools accept it). Test against the semver specification before deploying.
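The pattern's five capture groups (major, minor, patch, pre-release, build metadata) can be read back into an object. A minimal sketch, assuming a hypothetical helper named parseSemver:

```javascript
// Pattern 6 from above.
const SEMVER_RE = /^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)(?:-((?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?(?:\+([0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$/;

// Hypothetical helper: returns the parsed parts, or null for an invalid version.
function parseSemver(version) {
  const m = SEMVER_RE.exec(version);
  if (!m) return null;
  return {
    major: Number(m[1]),
    minor: Number(m[2]),
    patch: Number(m[3]),
    prerelease: m[4] ?? null, // group is undefined when the part is absent
    build: m[5] ?? null,
  };
}

console.log(parseSemver("2.1.3-alpha.1"));
// { major: 2, minor: 1, patch: 3, prerelease: 'alpha.1', build: null }
console.log(parseSemver("01.0.0")); // null
```

Parsing is only half of semver; comparing two versions also needs the spec's precedence rules for pre-release identifiers, which this sketch does not implement.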
7. Hex Color Code
^#([A-Fa-f0-9]{6}|[A-Fa-f0-9]{3})$
Matches: #FF0000, #f00, #1a2b3c
Rejects: FF0000 (missing #), #GG0000, #FF00 (4 digits)
Gotcha: This doesn't match CSS rgba(), hsl(), or 8-digit hex with alpha (#FF0000FF). If you need to validate any CSS color value, you'll need a more complex pattern or a dedicated CSS parser.
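If hex colors with an alpha channel need to pass, the pattern can be widened to the 4-digit and 8-digit forms as well. The variant below, named HEX_ALPHA_RE here, is an illustrative extension and not part of the original pattern.

```javascript
// Pattern 7 from above: 3 or 6 hex digits.
const HEX_RE = /^#([A-Fa-f0-9]{6}|[A-Fa-f0-9]{3})$/;

// Assumed extension: also accepts 4-digit (#RGBA) and 8-digit (#RRGGBBAA) forms.
const HEX_ALPHA_RE = /^#(?:[A-Fa-f0-9]{3,4}|[A-Fa-f0-9]{6}|[A-Fa-f0-9]{8})$/;

console.log(HEX_RE.test("#f00"));            // true
console.log(HEX_RE.test("#FF0000FF"));       // false
console.log(HEX_ALPHA_RE.test("#FF0000FF")); // true
console.log(HEX_ALPHA_RE.test("#FF000"));    // false (5 digits)
```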
8. Slug (URL-safe Lowercase String)
^[a-z0-9]+(?:-[a-z0-9]+)*$
Matches: my-blog-post, product-123, hello
Rejects: My-Post (uppercase), -starts-with-dash, ends-with-dash-, double--dash
Gotcha: This enforces the most common slug convention: lowercase alphanumeric with single hyphens, no leading/trailing hyphens. Some systems allow underscores; adjust the pattern to ^[a-z0-9_]+(?:-[a-z0-9_]+)*$ if needed.
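Validation usually sits next to a function that produces slugs in the first place. The slugify helper below is a hypothetical sketch, not a standard API; it is ASCII-only, so accented or non-Latin characters are simply dropped.

```javascript
// Pattern 8 from above.
const SLUG_RE = /^[a-z0-9]+(?:-[a-z0-9]+)*$/;

// Hypothetical helper: lowercases, collapses everything else into single
// hyphens, then trims hyphens from both ends.
function slugify(title) {
  return title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

console.log(slugify("My Blog Post!"));               // my-blog-post
console.log(SLUG_RE.test(slugify("My Blog Post!"))); // true
console.log(SLUG_RE.test("double--dash"));           // false
```

An input with no letters or digits produces an empty string, which the pattern rejects, so callers still need to validate the result.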
9. Credit Card Number (Luhn-checkable Format)
^(?:4[0-9]{12}(?:[0-9]{3})?|5[1-5][0-9]{14}|3[47][0-9]{13}|3(?:0[0-5]|[68][0-9])[0-9]{11}|6(?:011|5[0-9]{2})[0-9]{12})$
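Matches: Visa (starts with 4, 13 or 16 digits), Mastercard (51-55), Amex (34/37), Diners Club (300-305, 36, 38), Discover (6011/65)
Gotcha: Format validation only. Always run the Luhn algorithm after the regex check: the regex confirms structure; Luhn confirms the check digit. The pattern also predates Mastercard's 2221-2720 range, so extend it if you need to accept those cards. Never log, store, or transmit raw card numbers in your app code.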
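The Luhn step can be sketched in a few lines. The helper names passesLuhn and isPlausibleCardNumber are assumptions for this example, and 4111 1111 1111 1111 is a commonly used test number, not a real card.

```javascript
// Pattern 9 from above.
const CARD_RE = /^(?:4[0-9]{12}(?:[0-9]{3})?|5[1-5][0-9]{14}|3[47][0-9]{13}|3(?:0[0-5]|[68][0-9])[0-9]{11}|6(?:011|5[0-9]{2})[0-9]{12})$/;

// Luhn check: from the right, double every second digit, subtract 9 from
// any result above 9, and require the total to be a multiple of 10.
function passesLuhn(digits) {
  let sum = 0;
  let doubleNext = false;
  for (let i = digits.length - 1; i >= 0; i--) {
    let d = digits.charCodeAt(i) - 48; // "0" is char code 48
    if (doubleNext) {
      d *= 2;
      if (d > 9) d -= 9;
    }
    sum += d;
    doubleNext = !doubleNext;
  }
  return sum % 10 === 0;
}

// Hypothetical helper: strip spaces and hyphens, then structure, then check digit.
function isPlausibleCardNumber(input) {
  const digits = input.replace(/[\s-]/g, "");
  return CARD_RE.test(digits) && passesLuhn(digits);
}

console.log(isPlausibleCardNumber("4111 1111 1111 1111")); // true
console.log(isPlausibleCardNumber("4111 1111 1111 1112")); // false
```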
10. Strong Password Validation
^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$
Requires: Minimum 8 characters, one lowercase, one uppercase, one digit, one special character (@$!%*?&)
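Gotcha: This is a common pattern but not actually a security best practice. Length is more important than character diversity: a random 20-character lowercase passphrase beats a random 8-character "complex" password (about 94 bits vs about 52 bits of entropy, counting 26 lowercase letters against all 94 printable ASCII characters). The character class at the end also rejects any password containing a symbol outside @$!%*?&, such as # or a space. Use this for legacy UI requirements; for new systems, enforce minimum length (12+) and check against breach databases instead.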
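The length-versus-complexity comparison is simple arithmetic: for a uniformly random string, entropy is the length multiplied by log2 of the alphabet size. The sketch below works through it; the function name entropyBits is an assumption, and the numbers only hold for randomly generated strings, not for passwords people choose themselves.

```javascript
// Entropy in bits of a uniformly random string of the given length.
function entropyBits(length, alphabetSize) {
  return length * Math.log2(alphabetSize);
}

// 20 random lowercase letters (26 symbols).
console.log(entropyBits(20, 26).toFixed(1)); // 94.0
// 8 random characters from all 94 printable ASCII symbols.
console.log(entropyBits(8, 94).toFixed(1));  // 52.4
// 8 random characters from the 69 symbols this pattern allows
// (26 lowercase + 26 uppercase + 10 digits + 7 specials).
console.log(entropyBits(8, 69).toFixed(1));  // 48.9
```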
Greedy vs. Lazy: The Performance Gotcha
The .* quantifier is greedy by default — it matches as much as possible. In patterns like <.*> on the string <div>hello</div>, greedy matching captures the entire <div>hello</div>, not just <div>. Add ? to make it lazy: <.*?> captures <div> then </div> as separate matches.
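The difference is easy to see side by side. The third pattern in this sketch, a negated character class, is an addition not mentioned above; it is a common alternative that gives the same result as the lazy version on this input.

```javascript
const html = "<div>hello</div>";

// Greedy: .* runs to the end, then backs up to the last ">".
console.log(html.match(/<.*>/g));    // [ '<div>hello</div>' ]

// Lazy: .*? stops at the first ">" it can.
console.log(html.match(/<.*?>/g));   // [ '<div>', '</div>' ]

// Negated class: "anything but >" can never run past the closing bracket.
console.log(html.match(/<[^>]*>/g)); // [ '<div>', '</div>' ]
```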
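Nested quantifiers, rather than greediness alone, are what cause catastrophic backtracking. The pattern (a+)+$ on the string aaaaaaaaaaaaaaab triggers exponential backtracking in most backtracking engines (for example JavaScript and Python's re), because the run of a's can be split between the inner and outer + in exponentially many ways before the engine gives up. If you're running user-supplied regex patterns or regex on large untrusted inputs, add a timeout or use a linear-time engine (RE2, Rust's regex crate).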
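Often the safest fix is to remove the nesting. For a yes/no test, (a+)+$ accepts exactly the same strings as a+$, which has only one quantifier to backtrack through. The sketch below checks that equivalence on short inputs only; the variable names are illustrative.

```javascript
// Nested quantifier from the text: exponential on backtracking engines.
const risky = /(a+)+$/;
// Same accepted strings with a single quantifier and no nesting.
const safe = /a+$/;

// Short inputs only, so the risky pattern stays cheap to run here.
for (const s of ["aaa", "baaa", "aaab", ""]) {
  console.log(JSON.stringify(s), risky.test(s), safe.test(s));
}
// "aaa" true true
// "baaa" true true
// "aaab" false false
// "" false false
```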
Test all patterns with the Regex Tester before shipping — paste your pattern, add edge cases as test strings, and verify matching behavior before it hits production.
Anchors: The Single Most Important Concept
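Every validation pattern above uses ^ (start of string) and $ (end of string) anchors; only the URL extractor in section 2 is deliberately unanchored, because it searches inside text. Without anchors, a regex engine searches for the pattern anywhere within the string, so a "match" check becomes a "contains" check.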
For validation, you almost always want anchored patterns:
- \d+ on "abc123def" matches "123": the string contains digits
- ^\d+$ on "abc123def" doesn't match: the entire string is not all digits
Without anchors, your email validation accepts "user@example.com; DROP TABLE users" because the pattern finds user@example.com as a match somewhere in the string. Anchors prevent this class of bug entirely.
In Python with re.fullmatch(), anchors are implicit: the function requires the entire string to match. In JavaScript, RegExp.prototype.test() requires explicit anchors for whole-string matching.
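The contains-versus-matches distinction can be demonstrated directly. The fullMatch helper below is a hypothetical JavaScript stand-in for Python's re.fullmatch(), built by wrapping the pattern source in ^(?:...)$; it assumes the pattern does not use the m flag.

```javascript
console.log(/\d+/.test("abc123def"));   // true  (contains digits)
console.log(/^\d+$/.test("abc123def")); // false (not all digits)

// Hypothetical helper: whole-string match for any unanchored pattern.
function fullMatch(pattern, input) {
  const anchored = new RegExp(`^(?:${pattern.source})$`, pattern.flags);
  return anchored.test(input);
}

// The email pattern from section 1 without its anchors.
const looseEmail = /[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}/;
const suspicious = "user@example.com; DROP TABLE users";

console.log(looseEmail.test(suspicious));            // true  (substring match)
console.log(fullMatch(looseEmail, suspicious));      // false
console.log(fullMatch(looseEmail, "user@example.com")); // true
```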
Flags That Change Behavior
Most regex implementations support modifier flags that change how the engine processes patterns:
/i — case insensitive: [a-z]+ with the i flag matches Hello, HELLO, and hello. Equivalent to writing [a-zA-Z]+ without the flag. Prefer the flag — it's more readable.
/g — global: Finds all matches in a string instead of stopping at the first. Essential for extraction tasks (finding all URLs in a document). With JavaScript's String.prototype.match(), the g flag returns all matches as an array; without it, only the first match is returned.
/m — multiline: Changes ^ and $ to match start and end of each line rather than start and end of the entire string. Critical when processing log files or multi-line text where you want to match line by line.
/s — dotall (single-line): Makes . match newline characters. Without this flag, . matches any character except \n. With it, . matches everything including newlines — useful for patterns that span multiple lines.
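Each of the four flags can be seen in a one-line JavaScript experiment. This is a small sketch with made-up sample strings.

```javascript
// i: case-insensitive
console.log(/[a-z]+/.test("HELLO"));  // false
console.log(/[a-z]+/i.test("HELLO")); // true

// g: all matches instead of only the first
console.log("a1b22c333".match(/\d+/)[0]); // 1
console.log("a1b22c333".match(/\d+/g));   // [ '1', '22', '333' ]

// m: ^ and $ match at every line boundary
console.log("first\nsecond".match(/^\w+$/g));  // null
console.log("first\nsecond".match(/^\w+$/gm)); // [ 'first', 'second' ]

// s: dot also matches newlines
console.log(/a.b/.test("a\nb"));  // false
console.log(/a.b/s.test("a\nb")); // true
```

One related detail in JavaScript: with the g flag, match() returns only the matched strings, so use matchAll() when the capture groups of every match are needed.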
When Regex Is the Wrong Tool
Regex is fast for pattern matching but gets unmaintainable quickly for recursive or deeply nested structures. Don't use regex to:
- Parse HTML or XML — use an HTML/XML parser. HTML is not a regular language; regex-based HTML parsing breaks on nested tags, CDATA sections, and attribute variations.
- Parse URLs for routing logic — use a proper URL parser that handles encoding, ports, and path normalization correctly.
- Validate complex structured data like JSON, CSV with quoted fields, or YAML — use a dedicated parser for each format.
The test: if you're spending more than 30 minutes writing a regex and it's still not handling all your edge cases, switch to a parser or split the problem into multiple simpler patterns with application-level logic between them.
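For the URL case in the list above, JavaScript already ships a parser: the standard URL class, available as a global in browsers and in Node.js. A brief sketch with a made-up address shows it handling the port, path normalization, and query parameters that a routing regex would have to reimplement.

```javascript
// The address is invented for illustration.
const parsed = new URL("https://example.com:8443/a/./b/../c%20d?x=1&y=two#frag");

console.log(parsed.hostname);                     // example.com
console.log(parsed.port);                         // 8443
console.log(parsed.pathname);                     // /a/c%20d  ("." and ".." resolved)
console.log(decodeURIComponent(parsed.pathname)); // /a/c d
console.log(parsed.searchParams.get("y"));        // two
console.log(parsed.hash);                         // #frag
```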