You can specify a group in a regular expression by using parentheses, as follows:
/class-(\d*)/
A group is a subsection of a pattern. You can use groups to do
the following things:
- Apply a quantifier to more than one character.
- Delineate subpatterns to be applied with alternation (by using the | character).
- Capture substring matches (for example, by using \1 in a regular expression to match a previously matched group, or by using $1 similarly in the replace() method of the String class).
The following sections provide details on these uses of groups.
Using groups with quantifiers
If you do not use a group, a quantifier
applies to the character or character class that precedes it, as
the following shows:
var pattern:RegExp = /ab*/ ;
// matches the character a followed by
// zero or more occurrences of the character b
pattern = /a\d+/;
// matches the character a followed by
// one or more digits
pattern = /a[123]{1,3}/;
// matches the character a followed by
// one to three occurrences of either 1, 2, or 3
However, you can use a group to apply a quantifier to more than one character or character class:
var pattern:RegExp = /(ab)*/;
// matches zero or more occurrences of the character a
// followed by the character b, such as ababab
pattern = /(a\d)+/;
// matches one or more occurrences of the character a followed by
// a digit, such as a1a5a8a3
pattern = /(spam ){1,3}/;
// matches 1 to 3 occurrences of the word spam followed by a space
For more information on quantifiers, see Quantifiers.
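The difference between quantifying a single character and quantifying a group is easiest to see when the whole string must match. The following is a minimal sketch; the ^ and $ anchors and the variable names are added here for illustration only, so that test() checks the complete string:

```actionscript
// Without a group, * applies only to the character b.
var single:RegExp = /^ab*$/;
// With a group, * applies to the two-character sequence ab.
var grouped:RegExp = /^(ab)*$/;

trace(single.test("abbb"));  // true
trace(single.test("abab"));  // false
trace(grouped.test("abab")); // true
trace(grouped.test("abbb")); // false
```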
Using groups with the alternator (|) character
You can use groups to define the group of characters to which you want to apply an alternator (|) character, as follows:
var pattern:RegExp = /cat|dog/;
// matches cat or dog
pattern = /ca(t|d)og/;
// matches catog or cadog
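Grouping matters most when alternation is combined with other elements, such as anchors. The following sketch (the patterns and variable names are illustrative, not taken from the text above) shows that, without a group, the alternator splits the entire pattern into two alternatives:

```actionscript
// Without a group: matches "cat" at the start OR "dog" at the end.
var loose:RegExp = /^cat|dog$/;
// With a group: the whole string must be either "cat" or "dog".
var strict:RegExp = /^(cat|dog)$/;

trace(loose.test("catalog"));  // true (the string starts with cat)
trace(strict.test("catalog")); // false
trace(strict.test("dog"));     // true
```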
Using groups to capture substring matches
When you define a standard parenthetical group in a pattern, you can later refer to it in the regular expression. This is known as a backreference, and these sorts of groups are known as capturing groups. For example, in the following regular expression, the sequence \1 matches whatever substring matched the capturing parenthetical group:
var pattern:RegExp = /(\d+)-by-\1/;
// matches the following: 48-by-48
You can specify up to 99 of these backreferences in a regular expression by typing \1, \2, ..., \99.
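A common use of a backreference is to find an accidentally repeated word. The following is a minimal sketch under assumed inputs; the pattern, the \b word-boundary anchors, and the sample sentences are illustrative choices:

```actionscript
// (\w+) captures a word; \1 requires the same word to follow a space.
// The \b anchors keep the match from starting or ending inside a word.
var doubled:RegExp = /\b(\w+) \1\b/;

trace(doubled.test("this is is a test")); // true
trace(doubled.test("this is a test"));    // false
```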
Similarly, in the replace() method of the String class, you can use $1 through $99 to insert captured group substring matches in the replacement string:
var pattern:RegExp = /Hi, (\w+)\./;
var str:String = "Hi, Bob.";
trace(str.replace(pattern, "$1, hello."));
// output: Bob, hello.
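A replacement string can use more than one captured group, and in any order. The following sketch, which assumes a hypothetical "last name, first name" input, swaps the two captured words:

```actionscript
// Group 1 captures the last name, group 2 the first name.
var namePattern:RegExp = /(\w+), (\w+)/;
var listed:String = "Smith, Bob";

trace(listed.replace(namePattern, "$2 $1"));
// output: Bob Smith
```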
Also, if you use capturing groups, the exec() method of the RegExp class and the match() method of the String class return substrings that match the capturing groups:
var pattern:RegExp = /(\w+)@(\w+)\.(\w+)/;
var str:String = "bob@example.com";
trace(pattern.exec(str));
// bob@example.com,bob,example,com
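The individual captures can also be read by index from the result of exec(). The following is a sketch rather than a complete validation routine; the variable names are illustrative, and the result is typed as Object so that a null result (no match) can be checked first:

```actionscript
var mailPattern:RegExp = /(\w+)@(\w+)\.(\w+)/;
var found:Object = mailPattern.exec("bob@example.com");

if (found != null) {
    trace(found[0]);    // bob@example.com (the complete match)
    trace(found[1]);    // bob
    trace(found[2]);    // example
    trace(found[3]);    // com
    trace(found.index); // 0 (position where the match starts)
}
```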
Using noncapturing groups and lookahead groups
A noncapturing group is one that is used for grouping only; it is not “collected,” and it does not match numbered backreferences. Use (?: and ) to define noncapturing groups, as follows:
var pattern:RegExp = /(?:com|org|net)/;
For example, note the difference between putting (com|org) in a capturing versus a noncapturing group (the exec() method lists capturing groups after the complete match):
var pattern:RegExp = /(\w+)@(\w+)\.(com|org)/;
var str:String = "bob@example.com";
trace(pattern.exec(str));
// bob@example.com,bob,example,com
// noncapturing:
var pattern:RegExp = /(\w+)@(\w+)\.(?:com|org)/;
var str:String = "bob@example.com";
trace(pattern.exec(str));
// bob@example.com,bob,example
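Because a noncapturing group is not numbered, it also changes which number refers to each remaining group in a replacement string. The following sketch uses hypothetical patterns and a sample host name to illustrate this:

```actionscript
var host:String = "www.example.com";

// Capturing: the optional prefix is group 1, the name is group 2.
var capturing:RegExp = /(www\.)?(\w+)\.com/;
// Noncapturing: the prefix is not numbered, so the name is group 1.
var noncapturing:RegExp = /(?:www\.)?(\w+)\.com/;

trace(host.replace(capturing, "$2"));    // example
trace(host.replace(noncapturing, "$1")); // example
```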
A special type of noncapturing group is the lookahead group, of which there are two types: the positive lookahead group and the negative lookahead group.
Use (?= and ) to define a positive lookahead group, which specifies that the subpattern in the group must match at the current position. However, the lookahead does not consume the characters it matches, so that portion of the string can still match the remaining patterns in the regular expression. For example, because (?=e) is a positive lookahead group in the following code, the character e that it matches can be matched by a subsequent part of the regular expression (in this case, the capturing group (\w*)):
var pattern:RegExp = /sh(?=e)(\w*)/i;
var str:String = "Shelly sells seashells by the seashore";
trace(pattern.exec(str));
// Shelly,elly
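Because the lookahead is not consumed, it is also not included in the matched substring. The following sketch, which assumes a hypothetical list of prices, compares a positive lookahead with an ordinary pattern; the g flag makes match() return every match:

```actionscript
var prices:String = "10 USD, 20 EUR, 30 USD";

// The lookahead requires " USD" to follow but leaves it out of the match.
trace(prices.match(/\d+(?= USD)/g)); // 10,30
// Without the lookahead, " USD" is part of each match.
trace(prices.match(/\d+ USD/g));     // 10 USD,30 USD
```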
Use (?! and ) to define a negative lookahead group that specifies that the subpattern in the group must not match at the position. For example:
var pattern:RegExp = /sh(?!e)(\w*)/i;
var str:String = "She sells seashells by the seashore";
trace(pattern.exec(str));
// shore,ore
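A negative lookahead interacts with backtracking in a way that can be surprising: if the full match is rejected, the quantifier before the lookahead can give up characters until the lookahead succeeds. The following sketch, with hypothetical price data and variable names, shows the effect and one way to prevent it by using \b word-boundary anchors:

```actionscript
var amounts:String = "10 USD, 20 EUR, 30 USD";

// Naive: "10" is rejected, but "1" is accepted because it is
// followed by "0", not by " USD".
var naive:RegExp = /\d+(?! USD)/g;
trace(amounts.match(naive));   // 1,20,3

// Bounded: \b forces the match to cover the whole number.
var bounded:RegExp = /\b\d+\b(?! USD)/g;
trace(amounts.match(bounded)); // 20
```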
Using named groups
A named group is a type of group in a regular expression that is given a named identifier. Use (?P<name> and ) to define the named group. For example, the following regular expression includes a named group with the identifier digits:
var pattern:RegExp = /[a-z]+(?P<digits>\d+)[a-z]+/;
When you use the exec() method, a matching named group is added as a property of the result array:
var myPattern:RegExp = /([a-z]+)(?P<digits>\d+)[a-z]+/;
var str:String = "a123bcd";
var result:Array = myPattern.exec(str);
trace(result.digits); // 123
Here is another example, which uses two named groups, with the identifiers name and dom:
var emailPattern:RegExp =
/(?P<name>(\w|[_.\-])+)@(?P<dom>((\w|-)+))+\.\w{2,4}+/;
var address:String = "bob@example.com";
var result:Array = emailPattern.exec(address);
trace(result.name); // bob
trace(result.dom); // example
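Named groups can make a pattern with several captures easier to read than numbered groups. The following is a minimal sketch that assumes a date in year-month-day form; the group names year, month, and day and the sample string are illustrative:

```actionscript
var datePattern:RegExp = /(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})/;
var dateResult:Object = datePattern.exec("Released on 2008-10-15.");

// exec() returns null when there is no match, so check before reading.
if (dateResult != null) {
    trace(dateResult.year);  // 2008
    trace(dateResult.month); // 10
    trace(dateResult.day);   // 15
}
```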
Note: Named groups are not part of the ECMAScript language specification. They are an added feature in ActionScript 3.0.