In the latest version of Perl 5, the regex engine also got a big upgrade, with many changes that make it faster and more correct for certain regexes. This time I will explain one of the new features: named capture buffers.
Named capture buffers are similar to the numbered capture buffers, like $1 and $2. The named versions work the same way, except that you can give them a name, like name or value. This helps document the regexes that you use.
Here is a small example:
use 5.010;
# The regex with named capture buffers
my $regex = qr{(?<name>\w+)=(?<value>\d+)};
# For testing
my @lines = ('hello=1', 'test=2', 'perl=5010');
# The test program
for (@lines) {
    if (m/$regex/) {
        say 'Name: ', $+{name}, "\tValue: ", $+{value};
    }
}
The output of the program:
$ perl namevalue.pl
Name: hello Value: 1
Name: test Value: 2
Name: perl Value: 5010
The syntax for specifying the buffers in the regex is:
(?<name>pattern)
The name should match /^[_A-Za-z][_A-Za-z0-9]*\z/, and pattern can be any legal Perl regex.
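As a quick illustration of the naming rule, the sketch below checks a few candidate names against the pattern quoted above. The helper is_valid_group_name and the candidate names are made up for this example; they are not part of Perl itself.

```perl
use 5.010;
use strict;
use warnings;

# Hypothetical helper: check a candidate group name against the rule above.
sub is_valid_group_name {
    my ($name) = @_;
    return $name =~ /^[_A-Za-z][_A-Za-z0-9]*\z/ ? 1 : 0;
}

# A name may start with a letter or underscore, but not with a digit,
# and it may not contain characters such as '-'.
for my $candidate (qw(name _tmp value2 2value my-name)) {
    say "$candidate: ", is_valid_group_name($candidate) ? 'ok' : 'not allowed';
}
```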
After you successfully match the regex against a string, you can refer to the matched values through the %+ hash. In the example, $+{name} holds the text captured by the buffer called name.
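Because %+ always reflects the most recent successful match, it can be handy to copy it into an ordinary hash right away. The following is a minimal sketch of that idea; the helper parse_pair is a hypothetical name chosen for this example. It also shows that named buffers are still numbered, so $1 and $+{name} refer to the same text.

```perl
use 5.010;
use strict;
use warnings;

# Hypothetical helper: parse one "name=value" line into a plain hash.
sub parse_pair {
    my ($line) = @_;
    if ($line =~ /^(?<name>\w+)=(?<value>\d+)$/) {
        # %+ changes with the next successful match, so copy it now.
        my %pair = %+;
        return \%pair;
    }
    return;    # no match
}

my $pair = parse_pair('perl=5010');
say "$pair->{name} => $pair->{value}" if $pair;

# Named buffers are still numbered, so $1 and $+{name} agree.
if ('test=2' =~ /(?<name>\w+)=(?<value>\d+)/) {
    say 'same' if $1 eq $+{name} && $2 eq $+{value};
}
```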
To create a named backreference to a named capture buffer, you can use the \k<name> syntax. You could, for example, do the following:
(?<name>\w+) \k<name>
This would match hello hello, for example.
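A typical use for such a backreference is spotting a word that was accidentally typed twice. Here is a small sketch, assuming a hypothetical helper called doubled_word; the word boundaries (\b) are an addition for this example and are not part of the pattern shown above.

```perl
use 5.010;
use strict;
use warnings;

# Hypothetical helper: return the first word that appears twice in a row.
sub doubled_word {
    my ($text) = @_;
    # \b keeps "this is" from matching on the shared "is".
    return $text =~ /\b(?<word>\w+)\s+\k<word>\b/ ? $+{word} : undef;
}

say doubled_word('it was the the best of times') // 'none';
say doubled_word('hello world') // 'none';
```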
The best thing about this new feature is that it helps with documenting your program if you use meaningful names in your regexes.