Flags
d- Generate indices for substring matches.
g- Global search.
i- Case-insensitive search.
m- Multiline; allows ^ and $ to match next to newline characters, that is, at the start and end of each line.
s- Allows . to match newline characters.
u- "Unicode"; treat a pattern as a sequence of Unicode code points.
y- Perform a "sticky" search that matches starting at the current position in the target string.
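The flags above can be illustrated with a short sketch. The sample string and all variable names are illustrative assumptions, and the d flag needs a reasonably recent JavaScript engine.

```javascript
// Multi-line sample text (illustrative).
const text = "Alpha\nbeta\nALPHA";

// g + i: find every match, ignoring case.
const all = text.match(/alpha/gi);            // ["Alpha", "ALPHA"]

// m: ^ also matches at the start of each line.
const lineStarts = text.match(/^\w/gm);       // ["A", "b", "A"]

// s: the dot also matches line terminators.
const withoutS = /Alpha.beta/.test(text);     // false
const withS = /Alpha.beta/s.test(text);       // true

// u: the pattern works on code points, so one astral character is one ".".
const astral = "\u{1F600}";
const unitMode = /^.$/.test(astral);          // false (two UTF-16 code units)
const codePointMode = /^.$/u.test(astral);    // true

// y: a sticky match must start exactly at lastIndex; it does not scan ahead.
const sticky = /beta/y;
sticky.lastIndex = 6;                         // index where "beta" begins
const stickyHit = sticky.test(text);          // true
sticky.lastIndex = 0;
const stickyMiss = sticky.test(text);         // false

// d: the result carries [start, end] indices for the match and each group.
const indexed = /b(et)a/d.exec(text);
const matchSpan = indexed.indices[0];         // [6, 10]
const groupSpan = indexed.indices[1];         // [7, 9]
```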
Groups and backreferences
(x)- Capturing group
(?<Name>x)- Named capturing group
(?:x)- Non-capturing group
\N- Where "N" is a positive integer. A back reference to the last substring matching the Nth parenthetical in the regular expression (counting left parentheses).
\k<Name>- A back reference to the last substring matching the named capture group specified by <Name>.
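As a minimal sketch of the group syntax above, the following uses made-up input strings and variable names to show how captures and back references behave.

```javascript
// Capturing groups are numbered by their opening parenthesis, left to right.
const date = /(\d{4})-(\d{2})-(\d{2})/.exec("2024-03-15");
// date[1] is "2024", date[2] is "03", date[3] is "15"

// Named groups are exposed on the result's groups property.
const named = /(?<year>\d{4})-(?<month>\d{2})/.exec("2024-03-15");
// named.groups.year is "2024", named.groups.month is "03"

// A non-capturing group groups a sub-pattern without creating a capture,
// so (c) is still group 1 here.
const nonCapturing = /(?:ab)+(c)/.exec("ababc");
// nonCapturing[0] is "ababc", nonCapturing[1] is "c"

// \1 must repeat exactly what group 1 matched.
const doubled = /\b(\w+) \1\b/.exec("it was was late");
// doubled[0] is "was was"

// \k<quote> must repeat what the group named "quote" matched.
const quoted = /(?<quote>['"]).*?\k<quote>/.exec(`say "hi" now`);
// quoted[0] is "\"hi\""
```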
Character classes
[xyz], [a-c]- A character class.
[^xyz], [^a-c]- A negated or complemented character class.
.- Matches any single character except line terminators: \n, \r, \u2028 or \u2029.
Inside a character class, the dot loses its special meaning and matches a literal dot.
Note that the m multiline flag doesn't change the dot behavior. So to match a pattern across multiple lines, the character class [^] can be used; it will match any character including newlines.
The s "dotAll" flag allows the dot to also match line terminators.
\d- Matches any digit. Equivalent to [0-9].
\D- Matches any character that is not a digit. Equivalent to [^0-9].
\w- Matches any alphanumeric character from the basic Latin alphabet, including the underscore. Equivalent to [A-Za-z0-9_].
\W- Matches any character that is not a word character from the basic Latin alphabet. Equivalent to [^A-Za-z0-9_].
\s- Matches a single white space character, including space, tab, form feed, line feed, and other Unicode spaces.
\S- Matches a single character other than white space.
\t- Matches a horizontal tab.
\r- Matches a carriage return.
\n- Matches a linefeed.
\v- Matches a vertical tab.
\f- Matches a form-feed.
[\b]- Matches a backspace.
\0- Matches a NUL character. Do not follow this with another digit.
\cX- Matches a control character using caret notation, where "X" is a letter from A–Z (corresponding to code points U+0001–U+001A).
\xhh- Matches the character with the code hh (two hexadecimal digits).
\uhhhh- Matches a UTF-16 code-unit with the value hhhh (four hexadecimal digits).
\u{hhhh} or \u{hhhhh}- (Only when the u flag is set.) Matches the character with the Unicode value U+hhhh or U+hhhhh (hexadecimal digits).
\p{UnicodeProperty}, \P{UnicodeProperty}- Matches a character based on its Unicode character properties.
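A few of the character classes above, sketched against made-up inputs. The variable names are illustrative, and the Unicode property escapes assume the u flag is set.

```javascript
// Inside a class the dot is a literal dot.
const dotInClass = /[.]/.test("a");                    // false
const dotLiteral = /[.]/.test("a.b");                  // true

// A negated class matches anything not listed.
const negated = "a1b2".replace(/[^a-c]/g, "");         // "ab"

// The dot stops at line terminators; [^] does not.
const dotNoNewline = /a.b/.test("a\nb");               // false
const crossLines = /a[^]b/.test("a\nb");               // true

// Shorthand classes.
const digits = "tel: 555-0100".match(/\d+/g);          // ["555", "0100"]
const words = "snake_case, kebab-case".match(/\w+/g);  // ["snake_case", "kebab", "case"]
const parts = "a \t b\nc".split(/\s+/);                // ["a", "b", "c"]

// Escapes for specific characters.
const backspace = /[\b]/.test("\b");                   // true
const ctrl = /\cI/.test("\t");                         // true: Ctrl+I is U+0009
const hex = /\x41/.test("A");                          // true
const unit = /\u0041/.test("A");                       // true
const codePoint = /^\u{1F600}$/u.test("\u{1F600}");    // true

// Unicode property escapes.
const greek = "a\u03B2c".match(/\p{Script=Greek}/u)[0]; // "\u03B2" (Greek beta)
const upper = "aBc".match(/\p{Lu}/u)[0];                // "B"
```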
Assertions
^- Matches the beginning of input. If the multiline flag is set to true, also matches immediately after a line break character.
$- Matches the end of input. If the multiline flag is set to true, also matches immediately before a line break character.
\b- Matches a word boundary. This is the position where a word character is not followed or preceded by another word character, such as between a letter and a space. Note that a matched word boundary is not included in the match.
\B- Matches a non-word boundary.
x(?=y)- Lookahead assertion: Matches "x" only if "x" is followed by "y".
x(?!y)- Negative lookahead assertion: Matches "x" only if "x" is not followed by "y".
(?<=y)x- Lookbehind assertion: Matches "x" only if "x" is preceded by "y".
(?<!y)x- Negative lookbehind assertion: Matches "x" only if "x" is not preceded by "y".
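The assertions above match positions rather than characters. A minimal sketch, with illustrative inputs and variable names:

```javascript
// ^ and $ anchor to the whole input unless the m flag is set.
const multi = "one\ntwo";
const firstOnly = multi.match(/^\w+/g);               // ["one"]
const eachLine = multi.match(/^\w+$/gm);              // ["one", "two"]

// \b matches only at word edges, so "concat" is skipped.
const wholeWords = "cat concat cat".match(/\bcat\b/g).length; // 2

// \B matches only where there is no word edge.
const inner = "concat".search(/\Bcat/);               // 3

// Lookahead: the digits are matched, the "px" is only checked.
const pixels = "10px 20em 30px".match(/\d+(?=px)/g);  // ["10", "30"]

// Negative lookahead: "foo" not followed by "bar".
const notBar = "foobar foobaz".search(/foo(?!bar)/);  // 7

// Lookbehind: digits preceded by a dollar sign.
const dollars = "$42 and 17".match(/(?<=\$)\d+/g);    // ["42"]

// Negative lookbehind: whole numbers not preceded by a dollar sign.
const plain = "$42 and 17".match(/(?<!\$)\b\d+/g);    // ["17"]
```

None of the lookaround text appears in the results, which is why pixels holds "10" and "30" rather than "10px" and "30px".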
Quantifiers
x*- Matches the preceding item "x" 0 or more times.
x+- Matches the preceding item "x" 1 or more times.
x?- Matches the preceding item "x" 0 or 1 times.
x{n}- Where "n" is a positive integer, matches exactly "n" occurrences of the preceding item "x".
x{n,}- Where "n" is a positive integer, matches at least "n" occurrences of the preceding item "x".
x{n,m}- Where "n" is 0 or a positive integer, "m" is a positive integer, and m >= n, matches at least "n" and at most "m" occurrences of the preceding item "x".
x*?, x+?, x??, x{n}?, x{n,}?, x{n,m}?- The ? character after the quantifier makes the quantifier "non-greedy": it matches as few occurrences as possible while still allowing the overall match to succeed.
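The difference between the quantifiers, and between greedy and non-greedy matching, can be sketched as follows. The inputs and variable names are illustrative.

```javascript
// * allows zero occurrences, + requires at least one.
const star = "baaad".match(/ba*/)[0];           // "baaa"
const starZero = "bd".match(/ba*/)[0];          // "b"
const plus = /ba+/.test("bd");                  // false

// ? makes the preceding item optional.
const optional = ["color", "colour"].every((w) => /^colou?r$/.test(w)); // true

// Counted repetition, greedy by default.
const exact = "aaaaa".match(/a{3}/)[0];         // "aaa"
const atLeast = "aaaaa".match(/a{2,}/)[0];      // "aaaaa"
const range = "aaaaa".match(/a{2,4}/)[0];       // "aaaa"

// Greedy versus non-greedy on the same input.
const html = "<b>bold</b>";
const greedy = html.match(/<.+>/)[0];           // "<b>bold</b>"
const lazy = html.match(/<.+?>/)[0];            // "<b>"
const lazyRange = "aaaaa".match(/a{2,4}?/)[0];  // "aa"
```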
Source: MDN Web Docs