The WikiParser class includes methods that convert Wiki input
text into the equivalent HTML output. This is not a very robust
Wiki conversion application, but it does illustrate some good uses
of regular expressions for pattern matching and string conversion.
The constructor function, along with the setWikiData() method,
simply initializes a sample string of Wiki input text, as follows:
    public function WikiParser()
    {
        wikiData = setWikiData();
    }
When the user clicks the Test button in the sample application,
the application invokes the parseWikiString() method of the
WikiParser object. This method calls a number of other methods,
which in turn assemble the resulting HTML string:
    public function parseWikiString(wikiString:String):String
    {
        var result:String = parseBold(wikiString);
        result = parseItalic(result);
        result = linesToParagraphs(result);
        result = parseBullets(result);
        return result;
    }
Each of the methods called (parseBold(), parseItalic(),
linesToParagraphs(), and parseBullets()) uses the replace() method
of the string to replace matching patterns, defined by a regular
expression, in order to transform the input Wiki text into
HTML-formatted text.
Converting boldface and italic patterns
The parseBold() method looks for a Wiki boldface text pattern
(such as '''foo''') and transforms it into its HTML equivalent
(such as <b>foo</b>), as follows:
    private function parseBold(input:String):String
    {
        var pattern:RegExp = /'''(.*?)'''/g;
        return input.replace(pattern, "<b>$1</b>");
    }
Note that the (.*?) portion of the regular expression matches any
number of characters (*) between the two defining ''' patterns.
The ? quantifier makes the match nongreedy, so that for a string
such as '''aaa''' bbb '''ccc''', the first matched string will be
'''aaa''' and not the entire string (which starts and ends with
the ''' pattern).
The parentheses in the regular expression define a capturing group,
and the replace() method refers to this group by using the $1 code
in the replacement string. The g (global) flag in the regular
expression ensures that the replace() method replaces all matches
in the string (not simply the first one).
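
The effect of the ? quantifier can be seen by running the nongreedy
pattern and a greedy variant against the same string. The following
is a minimal sketch, not part of the sample application; the
NongreedyDemo class name and its local variables are hypothetical,
and the greedy pattern is included only for comparison:

```actionscript
package
{
    import flash.display.Sprite;

    // Hypothetical document class, used only for illustration.
    public class NongreedyDemo extends Sprite
    {
        public function NongreedyDemo()
        {
            var sample:String = "'''aaa''' bbb '''ccc'''";

            // Nongreedy: each pair of ''' delimiters is matched separately.
            var lazy:RegExp = /'''(.*?)'''/g;
            trace(sample.replace(lazy, "<b>$1</b>"));
            // <b>aaa</b> bbb <b>ccc</b>

            // Greedy: a single match runs from the first ''' to the last.
            var greedy:RegExp = /'''(.*)'''/g;
            trace(sample.replace(greedy, "<b>$1</b>"));
            // <b>aaa''' bbb '''ccc</b>
        }
    }
}
```

With the greedy form, the capturing group swallows the inner
delimiters, so only one <b> element is produced.
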
The parseItalic() method works similarly to the parseBold() method,
except that it checks for two apostrophes ('') as the delimiter
for italic text (not three):
    private function parseItalic(input:String):String
    {
        var pattern:RegExp = /''(.*?)''/g;
        return input.replace(pattern, "<i>$1</i>");
    }
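
Because the italic delimiter is a prefix of the boldface delimiter,
the order in which parseWikiString() applies the two conversions
matters. The sketch below applies copies of the two patterns in both
orders; the OrderDemo class name and its variables are hypothetical
and serve only to illustrate the point:

```actionscript
package
{
    import flash.display.Sprite;

    // Hypothetical document class, used only for illustration.
    public class OrderDemo extends Sprite
    {
        public function OrderDemo()
        {
            var input:String = "'''bold''' and ''italic''";
            var bold:RegExp = /'''(.*?)'''/g;
            var italic:RegExp = /''(.*?)''/g;

            // Boldface first, then italic (the order parseWikiString() uses).
            trace(input.replace(bold, "<b>$1</b>").replace(italic, "<i>$1</i>"));
            // <b>bold</b> and <i>italic</i>

            // Italic first: the '' pattern consumes part of each ''' delimiter,
            // so no ''' remains for the boldface pattern to match.
            trace(input.replace(italic, "<i>$1</i>").replace(bold, "<b>$1</b>"));
            // <i>'bold</i>' and <i>italic</i>
        }
    }
}
```

Running the boldface conversion first removes every ''' delimiter
before the two-apostrophe pattern gets a chance to match it.
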
Converting bullet patterns
As the following example shows, the parseBullets() method looks
for the Wiki bullet line pattern (such as * foo) and transforms it
into its HTML equivalent (such as <li>foo</li>):
    private function parseBullets(input:String):String
    {
        var pattern:RegExp = /^\*(.*)/gm;
        return input.replace(pattern, "<li>$1</li>");
    }
The ^ symbol at the beginning of the regular expression matches the
beginning of a line. The m (multiline) flag in the regular
expression causes the regular expression to match the ^ symbol
against the start of a line, not simply the start of the string.
The \* pattern matches an asterisk character (the backslash is used
to signal a literal asterisk instead of a * quantifier).
As in the parseBold() method, the parentheses in the regular
expression define a capturing group, and the replace() method
refers to this group by using the $1 code in the replacement
string. Because the group begins immediately after the asterisk,
the captured text includes any space that follows it. The g
(global) flag in the regular expression ensures that the replace()
method replaces all matches in the string (not simply the first
one).
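
To see what the m flag contributes, the same bullet pattern can be
run with and without it on a two-line string. This is a minimal
sketch under assumed names (the MultilineDemo class and its
variables are hypothetical); the pattern itself is the one used by
parseBullets():

```actionscript
package
{
    import flash.display.Sprite;

    // Hypothetical document class, used only for illustration.
    public class MultilineDemo extends Sprite
    {
        public function MultilineDemo()
        {
            var list:String = "* one\n* two";

            // With m, ^ matches at the start of every line.
            trace(list.replace(/^\*(.*)/gm, "<li>$1</li>"));
            // <li> one</li>
            // <li> two</li>

            // Without m, ^ matches only at the start of the string,
            // so the second bullet is left unconverted.
            trace(list.replace(/^\*(.*)/g, "<li>$1</li>"));
            // <li> one</li>
            // * two
        }
    }
}
```

The output also shows that the space after each asterisk ends up
inside the <li> element, because it is part of the captured group.
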
Converting paragraph Wiki patterns
The linesToParagraphs() method converts each line in the input Wiki
string to an HTML <p> paragraph tag. These lines in the method
strip out empty lines from the input Wiki string:
var pattern:RegExp = /^$/gm;
var result:String = input.replace(pattern, "");
The ^ and $ symbols in the regular expression match the beginning
and end of a line. The m (multiline) flag in the regular expression
causes the regular expression to match the ^ and $ symbols against
the start and end of each line, not simply the start and end of
the string.
The replace() method replaces all matching substrings (empty lines)
with an empty string (""). The g (global) flag in the regular
expression ensures that the replace() method replaces all matches
in the string (not simply the first one).
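
The way the m flag lets the ^$ pattern find an empty line can be
checked with the test() method of the RegExp class. The following
is a minimal sketch; the EmptyLineDemo class name and its variables
are hypothetical, and the patterns omit the g flag because only a
single yes-or-no check is needed:

```actionscript
package
{
    import flash.display.Sprite;

    // Hypothetical document class, used only for illustration.
    public class EmptyLineDemo extends Sprite
    {
        public function EmptyLineDemo()
        {
            // Two lines of text separated by one empty line.
            var text:String = "first\n\nsecond";

            // With m, ^ and $ can both match at the empty line.
            var perLine:RegExp = /^$/m;
            trace(perLine.test(text)); // true

            // Without m, ^$ could only match a completely empty string.
            var wholeString:RegExp = /^$/;
            trace(wholeString.test(text)); // false
        }
    }
}
```
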