Characters, metacharacters, and metasequences

Flash Player 9 and later, Adobe AIR 1.0 and later

The simplest regular expression is one that matches a sequence of characters, as in the following example:

var pattern:RegExp = /hello/;

However, the following characters, known as metacharacters, have special meanings in regular expressions:

^ $ \ . * + ? ( ) [ ] { } |

For example, the following regular expression matches the letter A followed by zero or more instances of the letter B (the asterisk metacharacter indicates this repetition), followed by the letter C:

/AB*C/
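
As a minimal sketch of how this pattern behaves, the following lines test it against a few sample strings. The variable name and sample strings are illustrative, and the sketch assumes the code runs where trace() output is visible, such as a debug session:

```actionscript
// Illustrative check of /AB*C/: the asterisk applies only to the letter B.
var zeroOrMoreB:RegExp = /AB*C/;

trace(zeroOrMoreB.test("AC"));     // true: zero instances of B
trace(zeroOrMoreB.test("ABC"));    // true: one instance of B
trace(zeroOrMoreB.test("ABBBC"));  // true: three instances of B
trace(zeroOrMoreB.test("AXC"));    // false: X is not B
```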

To include a metacharacter without its special meaning in a regular expression pattern, you must use the backslash (\) escape character. For example, the following regular expression matches the letter A followed by the letter B, followed by an asterisk, followed by the letter C:

var pattern:RegExp = /AB\*C/;
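
The effect of the escape can be sketched as follows. The variable names and sample strings are illustrative. The second pattern is built with the RegExp constructor, where the backslash is doubled because it appears inside a string literal:

```actionscript
// With the backslash, the asterisk is a literal character, not a quantifier.
var literalStar:RegExp = /AB\*C/;

trace(literalStar.test("AB*C"));  // true: the string contains a literal asterisk
trace(literalStar.test("ABC"));   // false: no asterisk between B and C
trace(literalStar.test("ABBC"));  // false: the asterisk no longer repeats B

// Same pattern from a string: "\\" in a string literal yields one backslash.
var fromString:RegExp = new RegExp("AB\\*C");
trace(fromString.test("AB*C"));   // true
```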

A metasequence, like a metacharacter, has special meaning in a regular expression. A metasequence is made up of more than one character. The following sections provide details on using metacharacters and metasequences.

About metacharacters

The following table summarizes the metacharacters that you can use in regular expressions:

Metacharacter

Description

^ (caret)

Matches at the start of the string. With the m (multiline) flag set, the caret matches the start of a line as well (see Flags and properties). Note that when used at the start of a character class, the caret indicates negation, not the start of a string. For more information, see Character classes.

$ (dollar sign)

Matches at the end of the string. With the m (multiline) flag set, $ matches the position before a newline (\n) character as well. For more information, see Flags and properties.
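
The caret and dollar sign behavior described above can be sketched with a two-line string. The sample text and variable names are illustrative:

```actionscript
// A two-line sample string; \n separates the lines.
var sample:String = "first line\nsecond line";

var startOfString:RegExp = /^second/;
var startOfLine:RegExp = /^second/m;
trace(startOfString.test(sample));  // false: "second" is not at the start of the string
trace(startOfLine.test(sample));    // true: with m, ^ also matches at the start of a line

var endOfString:RegExp = /first line$/;
var endOfLine:RegExp = /first line$/m;
trace(endOfString.test(sample));    // false: "first line" is not at the end of the string
trace(endOfLine.test(sample));      // true: with m, $ also matches before \n
```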

\ (backslash)

Escapes a metacharacter so that it is treated as a literal character rather than with its special meaning.

Also, use the backslash character if you want to use a forward slash character in a regular expression literal, as in /1\/2/ (to match the character 1, followed by the forward slash character, followed by the character 2).
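
A brief sketch of the escaped forward slash, using illustrative sample strings:

```actionscript
// The forward slash is escaped so that it does not end the regular expression literal.
var half:RegExp = /1\/2/;

trace(half.test("1/2 cup"));  // true: contains 1, then /, then 2
trace(half.test("12 cups"));  // false: no slash between 1 and 2
```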

. (dot)

Matches any single character.

A dot matches a newline character (\n) only if the s (dotall) flag is set. For more information, see Flags and properties.
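
The difference the s flag makes can be sketched as follows. The sample strings and variable names are illustrative:

```actionscript
// The same pattern with and without the s (dotall) flag.
var plainDot:RegExp = /A.B/;
var dotAll:RegExp = /A.B/s;

trace(plainDot.test("A-B"));   // true: the dot matches the hyphen
trace(plainDot.test("A\nB"));  // false: without s, the dot does not match \n
trace(dotAll.test("A\nB"));    // true: with s, the dot also matches \n
```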

* (star)

Matches the previous item repeated zero or more times.

For more information, see Quantifiers.

+ (plus)

Matches the previous item repeated one or more times.

For more information, see Quantifiers.

? (question mark)

Matches the previous item repeated zero times or one time.

For more information, see Quantifiers.
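
The star, plus, and question mark quantifiers can be compared side by side. In this sketch the patterns and sample strings are illustrative, and ^ and $ are added so that each pattern must match the whole string:

```actionscript
var zeroOrMore:RegExp = /^go*gle$/;   // o repeated zero or more times
var oneOrMore:RegExp = /^go+gle$/;    // o repeated one or more times
var zeroOrOne:RegExp = /^colou?r$/;   // u repeated zero times or one time

trace(zeroOrMore.test("ggle"));    // true: zero instances of o
trace(oneOrMore.test("ggle"));     // false: at least one o is required
trace(oneOrMore.test("google"));   // true: two instances of o

trace(zeroOrOne.test("color"));    // true: no u
trace(zeroOrOne.test("colour"));   // true: one u
trace(zeroOrOne.test("colouur"));  // false: two instances of u
```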

( and )

Defines groups within the regular expression. Use groups for the following:

  • To confine the scope of the | alternator: /(a|b|c)d/

  • To define the scope of a quantifier: /(walla.){1,2}/

  • In backreferences. For example, the \1 in the following regular expression matches whatever matched the first parenthetical group of the pattern:

    /(\w*) is repeated: \1/

For more information, see Groups.
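
The three uses of groups listed above can be sketched as follows. The sample strings and variable names are illustrative. The last pattern is a hypothetical stricter variant, not taken from the text above: it uses \w+ with ^ and $ because \w* can also match an empty string, which lets the original backreference pattern match in more places than might be expected:

```actionscript
// Scope of the | alternator: one of a, b, or c, followed by d.
var alternatives:RegExp = /(a|b|c)d/;
trace(alternatives.test("bd"));  // true
trace(alternatives.test("xd"));  // false

// Scope of a quantifier: the whole group repeats one or two times.
var scoped:RegExp = /(walla.){1,2}/;
var scopedResult:Object = scoped.exec("walla walla!");
trace(scopedResult[0]);          // walla walla!

// Backreference: \1 must repeat the text captured by the first group.
var repeated:RegExp = /(\w*) is repeated: \1/;
var repeatedResult:Object = repeated.exec("hello is repeated: hello");
trace(repeatedResult[1]);        // hello

// Hypothetical stricter variant: a non-empty word and a whole-string match.
var strict:RegExp = /^(\w+) is repeated: \1$/;
trace(strict.test("hello is repeated: hello"));  // true
trace(strict.test("hello is repeated: world"));  // false
```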

[ and ]

Defines a character class, which defines possible matches for a single character:

/[aeiou]/ matches any one of the specified characters.

Within character classes, use the hyphen (-) to designate a range of characters:

/[A-Z0-9]/ matches uppercase A through Z or 0 through 9.

Within character classes, insert a backslash to escape the ] and - characters:

/[+\-]\d+/ matches either + or - before one or more digits.

Within character classes, other characters that are normally metacharacters are treated as normal characters, without the need for a backslash:

/[$£]/ matches either $ or £.

For more information, see Character classes.
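
The character class forms described above can be sketched as follows, using the patterns from the text with illustrative sample strings and variable names:

```actionscript
// Any one of the listed characters.
var vowel:RegExp = /[aeiou]/;
trace(vowel.test("sea"));  // true
trace(vowel.test("sky"));  // false: none of a, e, i, o, u

// Ranges: uppercase A through Z or 0 through 9.
var upperOrDigit:RegExp = /[A-Z0-9]/;
trace(upperOrDigit.test("ab7"));  // true
trace(upperOrDigit.test("abc"));  // false

// Escaped hyphen: a literal + or - before one or more digits.
var signed:RegExp = /[+\-]\d+/;
trace(signed.test("-42"));  // true
trace(signed.test("42"));   // false: no sign before the digits

// $ needs no backslash inside a character class.
var currency:RegExp = /[$£]/;
trace(currency.test("$5"));  // true
trace(currency.test("5"));   // false
```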

| (pipe)

Used for alternation, to match either the part on the left side or the part on the right side:

/abc|xyz/ matches either abc or xyz.
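
A short sketch of alternation, with illustrative sample strings. The last line adds the g (global) flag so that every match is replaced:

```actionscript
var either:RegExp = /abc|xyz/;

trace(either.test("xyz"));  // true: matches the right side
trace(either.test("abc"));  // true: matches the left side
trace(either.test("abz"));  // false: matches neither side

// Replace every occurrence of either alternative.
trace("abc and xyz".replace(/abc|xyz/g, "#"));  // # and #
```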

About metasequences

Metasequences are sequences of characters that have special meaning in a regular expression pattern. The following table describes these metasequences:

Metasequence

Description

{n}, {n,}, and {n,n}

Specifies a numeric quantifier or quantifier range for the previous item:

/A{27}/ matches the character A repeated 27 times.

/A{3,}/ matches the character A repeated 3 or more times.

/A{3,5}/ matches the character A repeated 3 to 5 times.

For more information, see Quantifiers.
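
The three numeric quantifier forms can be sketched as follows. The counts are smaller than in the examples above to keep the sample strings short, and ^ and $ are added so that each pattern must match the whole string:

```actionscript
var exactlyThree:RegExp = /^A{3}$/;
var threeOrMore:RegExp = /^A{3,}$/;
var threeToFive:RegExp = /^A{3,5}$/;

trace(exactlyThree.test("AAA"));     // true
trace(exactlyThree.test("AAAA"));    // false: four is not exactly three

trace(threeOrMore.test("AAAAAAA"));  // true: seven is at least three
trace(threeOrMore.test("AA"));       // false: fewer than three

trace(threeToFive.test("AAAAA"));    // true: five is within the range
trace(threeToFive.test("AAAAAA"));   // false: six exceeds the range
```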

\b

Matches at the position between a word character and a nonword character. If the first or last character in the string is a word character, also matches the start or end of the string.

\B

Matches at the position between two word characters. Also matches the position between two nonword characters.
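
The \b and \B metasequences can be contrasted with a short sketch. The word "cat" and the sample strings are illustrative:

```actionscript
// \b requires a word boundary on each side of "cat".
var wholeWord:RegExp = /\bcat\b/;
trace(wholeWord.test("the cat sat"));  // true: spaces surround the word
trace(wholeWord.test("cat"));          // true: start and end of string count as boundaries
trace(wholeWord.test("concatenate"));  // false: word characters on both sides

// \B requires that there is no word boundary on each side of "cat".
var insideWord:RegExp = /\Bcat\B/;
trace(insideWord.test("concatenate")); // true
trace(insideWord.test("the cat sat")); // false
```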

\d

Matches a decimal digit.

\D

Matches any character other than a digit.

\f

Matches a form feed character.

\n

Matches the newline character.

\r

Matches the return character.

\s

Matches any white-space character (a space, tab, newline, or return character).

\S

Matches any character other than a white-space character.
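
The digit and white-space metasequences described above can be sketched as follows. The sample strings and variable name are illustrative, and the g (global) flag is used so that replace() acts on every match:

```actionscript
// \d: the whole string must consist of decimal digits.
var digitsOnly:RegExp = /^\d+$/;
trace(digitsOnly.test("2024"));  // true
trace(digitsOnly.test("20x4"));  // false: x is not a digit

// \D: remove every character that is not a digit.
trace("a1b22".replace(/\D/g, ""));      // 122

// \s: remove spaces, tabs, and newlines.
trace("a b\tc\nd".replace(/\s/g, ""));  // abcd

// \S: replace every character that is not white space.
trace("a b".replace(/\S/g, "#"));       // # #
```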

\t

Matches the tab character.

\unnnn

Matches the Unicode character with the character code specified by the hexadecimal number nnnn. For example, \u263a is the smiley character.
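
A brief sketch of the Unicode metasequence. The sample strings are illustrative, and the same \u escape is used inside the string literal to produce the character being matched:

```actionscript
var smiley:RegExp = /\u263a/;

trace(smiley.test("Have a nice day \u263a"));  // true: the string contains U+263A
trace(smiley.test("Have a nice day :-)"));     // false
```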

\v

Matches a vertical tab (vertical feed) character.

\w

Matches a word character (A-Z, a-z, 0-9, or _). Note that \w does not match non-English characters, such as é, ñ, or ç.

\W

Matches any character other than a word character.
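
The word-character metasequences can be sketched as follows. The sample strings and variable name are illustrative. The result for the accented string follows the note above that \w does not match characters such as é:

```actionscript
// \w: the whole string must consist of word characters.
var wordOnly:RegExp = /^\w+$/;
trace(wordOnly.test("snake_case_42"));  // true: letters, underscores, and digits
trace(wordOnly.test("caf\u00E9"));      // false: é (U+00E9) is not matched by \w

// \W: remove every character that is not a word character.
trace("hello, world!".replace(/\W/g, ""));  // helloworld
```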

\xnn

Matches the character with the specified ASCII value, as defined by the hexadecimal number nn.
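
A brief sketch of the hexadecimal metasequence, using 41, the hexadecimal ASCII value of the uppercase letter A, as an illustrative example:

```actionscript
var letterA:RegExp = /\x41/;

trace(letterA.test("A"));  // true: A has the ASCII value 0x41
trace(letterA.test("a"));  // false: lowercase a has the ASCII value 0x61
```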
