Characters, metacharacters, and metasequences
Flash Player 9 and later, Adobe AIR 1.0 and later
The simplest regular expression is
one that matches a sequence of characters, as in the following example:
var pattern:RegExp = /hello/;
However, the following characters, known as metacharacters, have
special meanings in regular expressions:
^ $ \ . * + ? ( ) [ ] { } |
For example, the following regular expression matches the letter
A followed by zero or more instances of the letter B (the asterisk
metacharacter indicates this repetition), followed by the letter
C:
/AB*C/
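As a minimal sketch of how the asterisk behaves, the following lines test the pattern against a few sample strings. The variable name and the sample strings are illustrative assumptions, and the code is assumed to run as a frame script or inside a method, where trace() is available:

```actionscript
// Illustrative only: the sample strings are not from the original text.
var starPattern:RegExp = /AB*C/;
trace(starPattern.test("AC"));    // true: zero instances of B
trace(starPattern.test("ABBBC")); // true: three instances of B
trace(starPattern.test("ADC"));   // false: only B may appear between A and C
```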
To include a metacharacter without its special meaning in a regular
expression pattern, you must escape it with the backslash (\) character.
For example, the following regular expression matches the letter A
followed by the letter B, followed by an asterisk, followed by the
letter C:
var pattern:RegExp = /AB\*C/;
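The following sketch contrasts the escaped asterisk with the same pattern built from a string. The variable names and sample strings are assumptions made for illustration. When a pattern is passed to the RegExp constructor as a string, the backslash must itself be escaped in the string literal:

```actionscript
// Illustrative only: names and sample strings are assumptions.
var escapedStar:RegExp = /AB\*C/;
trace(escapedStar.test("AB*C")); // true: the asterisk is matched literally
trace(escapedStar.test("ABBC")); // false: no literal asterisk after AB

// Equivalent pattern built from a string; note the doubled backslash.
var fromString:RegExp = new RegExp("AB\\*C");
trace(fromString.test("AB*C"));  // true
```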
A metasequence, like a metacharacter, has special meaning
in a regular expression. A metasequence is made up of more than
one character. The following sections provide details on using metacharacters
and metasequences.
About metacharacters
The following table summarizes the metacharacters that you can use in
regular expressions:
Metacharacter
|
Description
|
^ (caret)
|
Matches at the start of the string. With the m (multiline) flag set, the caret also matches the start of each line (see Flags and properties). Note that when used at the start of a character class, the caret indicates negation, not the start of a string. For more information, see Character classes.
|
$ (dollar sign)
|
Matches at the end of the string. With the m (multiline) flag set, $ also matches the position before a newline (\n) character. For more information, see Flags and properties.
|
\ (backslash)
|
Escapes a metacharacter so that it loses its special meaning. Also use the backslash to include a forward slash character in a regular expression literal, as in /1\/2/ (which matches the character 1, followed by the forward slash character, followed by the character 2).
|
. (dot)
|
Matches any single character. A dot matches a newline character (\n) only if the s (dotall) flag is set. For more information, see Flags and properties.
|
* (star)
|
Matches the previous item repeated zero or more times. For more information, see Quantifiers.
|
+ (plus)
|
Matches the previous item repeated one or more times. For more information, see Quantifiers.
|
? (question mark)
|
Matches the previous item repeated zero times or one time. For more information, see Quantifiers.
|
( and )
|
Defines groups within the regular expression. Use groups for the following:
- To confine the scope of the | alternator: /(a|b|c)d/
- To define the scope of a quantifier: /(walla.){1,2}/
- In backreferences. For example, the \1 in the following regular expression matches whatever matched the first parenthetical group of the pattern: /(\w*) is repeated: \1/
For more information, see Groups.
|
[ and ]
|
Defines a character class, which defines possible matches for a single character: /[aeiou]/ matches any one of the specified characters.
Within character classes, use the hyphen (-) to designate a range of characters: /[A-Z0-9]/ matches uppercase A through Z or 0 through 9.
Within character classes, insert a backslash to escape the ] and - characters: /[+\-]\d+/ matches either + or - before one or more digits.
Within character classes, other characters that are normally metacharacters are treated as normal characters, without the need for a backslash: /[$£]/ matches either $ or £.
For more information, see Character classes.
|
| (pipe)
|
Used for alternation, to match either the part on the left side or the part on the right side: /abc|xyz/ matches either abc or xyz.
|
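The following sketch exercises several rows of the metacharacter table: the caret with and without the m flag, the dot with and without the s flag, a backreference, a character class, and alternation. All variable names and sample strings are assumptions made for illustration; the patterns themselves come from the table where possible.

```actionscript
// Illustrative only: names and sample strings are assumptions.
var twoLines:String = "one\ntwo";

// Caret: start of string by default, start of each line with the m flag.
var startOnly:RegExp = /^two/;
var startOfLine:RegExp = /^two/m;
trace(startOnly.test(twoLines));   // false
trace(startOfLine.test(twoLines)); // true

// Dot: matches a newline only when the s (dotall) flag is set.
var dotDefault:RegExp = /one.two/;
var dotAll:RegExp = /one.two/s;
trace(dotDefault.test(twoLines));  // false
trace(dotAll.test(twoLines));      // true

// Groups and backreferences: \1 repeats whatever group 1 matched.
var repeated:RegExp = /(\w*) is repeated: \1/;
var result:Object = repeated.exec("hello is repeated: hello");
trace(result[1]);                  // hello

// Character class with an escaped hyphen.
var signed:RegExp = /[+\-]\d+/;
trace("x = -42".match(signed)[0]); // -42

// Alternation confined by a group, and unconfined.
var grouped:RegExp = /(a|b|c)d/;
var either:RegExp = /abc|xyz/;
trace(grouped.test("bd"));         // true
trace(either.test("xyz"));         // true
```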
About metasequences
Metasequences
are sequences of characters that have special meaning in a regular
expression pattern. The following table describes these metasequences:
Metasequence
|
Description
|
{n}, {n,}, and {n,m}
|
Specifies a numeric quantifier or quantifier range for the previous item:
/A{27}/ matches the character A repeated 27 times.
/A{3,}/ matches the character A repeated 3 or more times.
/A{3,5}/ matches the character A repeated 3 to 5 times.
For more information, see Quantifiers.
|
\b
|
Matches at the position between a word character
and a nonword character. If the first or last character in the string
is a word character, also matches the start or end of the string.
|
\B
|
Matches at the position between two word
characters. Also matches the position between two nonword characters.
|
\d
|
Matches a decimal digit.
|
\D
|
Matches any character other than a digit.
|
\f
|
Matches a form feed character.
|
\n
|
Matches the newline character.
|
\r
|
Matches the return character.
|
\s
|
Matches any white-space character (a space,
tab, newline, or return character).
|
\S
|
Matches any character other than a white-space
character.
|
\t
|
Matches the tab character.
|
\unnnn
|
Matches the Unicode character with the character code specified by the hexadecimal number nnnn. For example, \u263a is the smiley character.
|
\v
|
Matches a vertical tab character.
|
\w
|
Matches a word character (A-Z, a-z, 0-9, or _). Note that \w does not match non-ASCII letters, such as é, ñ, or ç.
|
\W
|
Matches any character other than a word
character.
|
\xnn
|
Matches the character with the specified ASCII value, as defined by the hexadecimal number nn.
|
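As a rough illustration of a few metasequences from the table, the sketch below uses numeric quantifiers, word boundaries, and the \D, \w, \u, and \x sequences. The variable names and sample strings are assumptions; the expected results in the comments follow from the descriptions in the table.

```actionscript
// Illustrative only: names and sample strings are assumptions.

// Numeric quantifier range: at least 3 and at most 5 repetitions.
var threeToFive:RegExp = /A{3,5}/;
trace(threeToFive.test("AA"));                // false: only two As
trace("AAAAAAA".match(threeToFive)[0]);       // AAAAA: greedy, stops at 5

// \b: replace "cat" only where it stands as a whole word.
var wholeWord:RegExp = /\bcat\b/g;
trace("cat concat cat".replace(wholeWord, "dog")); // dog concat dog

// \D: strip every character that is not a decimal digit.
var nonDigits:RegExp = /\D/g;
trace("Order 66".replace(nonDigits, ""));     // 66

// \w: expected to reject the accented letter, as noted in the table.
var wordOnly:RegExp = /^\w+$/;
trace(wordOnly.test("cafe"));                 // true
trace(wordOnly.test("caf\u00e9"));            // false

// \unnnn and \xnn: match a character by its hexadecimal code.
var smiley:RegExp = /\u263a/;
var letterA:RegExp = /\x41/;
trace(smiley.test("\u263a"));                 // true
trace(letterA.test("A"));                     // true: 0x41 is "A"
```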