The writings of Peter Stuifzand

Weblog: perl5

The feature I'll talk about here is the given/when construct, which was added in Perl 5.10. It works like switch/case in other programming languages, but is much more powerful. The matching is based on smart matching, which is another feature added in 5.10.

I will start with a simple example to give you an idea of the syntax that is used.

use 5.010;

my $x = <>;
chomp $x;

given ($x) {
    when ([0..99]) {
        say "Looking good";
    }
    when ([100..199]) {
        say "That's a bit much";
    }
    default {
        say "This could be a problem";
    }
}

This code compares the value of $x with the arrays in the when statements. If $x is between 0 and 99 (inclusive), it prints the text Looking good. If it's between 100 and 199, it prints That's a bit much. The default block runs when none of the when blocks match the value.
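To make the array match less magical: matching $x against [0..99] asks whether $x is equal to one of the listed elements. The sketch below spells that out with grep and a plain if/elsif chain, without given/when. The classify helper is a hypothetical name used only for illustration, and the sketch assumes numeric input.

```perl
use 5.010;

# Hypothetical helper: spells out the array matches from the example
# above as explicit membership tests. Assumes $x is numeric.
sub classify {
    my ($x) = @_;

    # grep in boolean context is true if $x equals any element of the list.
    if (grep { $x == $_ } 0 .. 99) {
        return "Looking good";
    }
    elsif (grep { $x == $_ } 100 .. 199) {
        return "That's a bit much";
    }
    return "This could be a problem";
}

say classify(42);      # an element of 0..99
say classify(150);     # an element of 100..199
say classify(42.5);    # not an element of either list
```

Because this is a membership test rather than a range comparison, a value such as 42.5 ends up in the fallback branch.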

The next example is a bit more useful, but not by much.

use 5.010;

my ($x, $y) = (0,0);

LINE: while (<>) {
    my @parts = split /\s+/;

    for (@parts) {
        when (/^x(\d+)/) {
            $x = $1;
        }
        when (/^y(\d+)/) {
            $y = $1;
        }
        when (/^p/) {
            say $x + $y;
        }
        when (/^q/) {
            last LINE;
        }
    }
}

This example reads lines of tokens from STDIN, matches each token, and executes code based on what it finds. In effect it's a small programming language. Notice that this code doesn't use the given statement. It's not needed here, because the for loop already assigns each element of @parts to $_.
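To see what the little language does with some input, here is a sketch of the same loop wrapped in a function that takes lines as arguments and collects the sums instead of printing them. The run_tokens name is hypothetical and exists only to make the behaviour easy to try out. The blanket no warnings line is an assumption about your perl: newer versions warn that given/when is experimental or deprecated.

```perl
use 5.010;
no warnings;    # newer perls warn about when; silenced for this sketch

# Hypothetical helper: runs the token language over a list of input
# lines and returns the sums that "p" would have printed.
sub run_tokens {
    my @lines = @_;
    my ($x, $y) = (0, 0);
    my @out;

    LINE: for my $line (@lines) {
        for (split ' ', $line) {
            when (/^x(\d+)/) { $x = $1 }              # set x
            when (/^y(\d+)/) { $y = $1 }              # set y
            when (/^p/)      { push @out, $x + $y }   # "print" the sum
            when (/^q/)      { last LINE }            # quit
        }
    }
    return @out;
}

# x and y keep their values between lines; everything after q is ignored.
say join ',', run_tokens('x3 y4 p', 'x10 p q p', 'p');    # prints 7,14
```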

It's also possible to use simple expressions like you would use in an if statement. For example:

use 5.010;

my $age = <>;
chomp $age;

given ($age) {
    when (!/^\d+$/) {
        say "Not a number";
    }
    when ($_ > 100) {
        say "That's quite old";
    }
    when (18) {
        say "Now your life begins...";
    }
    when (0) {
        say "Just born, and already using the computer.";
    }
    default {
        say "I have nothing useful to say about '$age'";
    }
}

As you can see, when is quite smart about what to do with different expressions. The first when clause contains a negated regular expression, which is matched using $age !~ m/REGEX/. The second one does what you expect: it is evaluated as an ordinary boolean test. The 18 and 0 clauses match using $age == 18 and $age == 0. Be careful when comparing to 0, because a numeric comparison also matches empty strings and other non-numeric strings. For example, if $age = 'hello', a when (0) on its own will match. In this example the first clause catches such input before it gets that far.
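The pitfall with 0 comes from numeric equality itself, so it can be shown without given/when. The sketch below compares a plain == 0 test with a guarded one. The guard uses looks_like_number from the core Scalar::Util module; the is_really_zero helper is a hypothetical name, and the guard is just one possible way to avoid the problem.

```perl
use 5.010;
use Scalar::Util qw(looks_like_number);

# Hypothetical helper: true only for input that is a number equal to 0.
sub is_really_zero {
    my ($value) = @_;
    return looks_like_number($value) && $value == 0 ? 1 : 0;
}

for my $age ('hello', '', '0', '18') {
    no warnings 'numeric';    # 'hello' == 0 would warn under "use warnings"
    my $loose  = ($age == 0)           ? 'matches' : 'does not match';
    my $strict = is_really_zero($age) ? 'matches' : 'does not match';
    say "'$age': plain == 0 $loose, guarded check $strict";
}
```

With the plain comparison, 'hello' and the empty string both count as 0; with the guard, only '0' does.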

Smart matching is really powerful. With given and when it's easy to use this power to decide what to do with the value that you've been given. You should take a look at the manual for more information about the possible smart matches and the things you can do with given and when.

In Perl 5.10 the regex engine also got a big upgrade. There are many changes that made it faster and more correct for certain regexes. This time I will explain the new feature called named capture buffers.

Named capture buffers are similar to the numbered capture buffers, like $1 and $2. They work the same way, except that you can give them a name such as name or value. This helps with documenting the regexes that you use.

Here is a small example:

use 5.010;

# The regex with named capture buffers
my $regex = qr{(?<name>\w+)=(?<value>\d+)};

# For testing
my @lines = ('hello=1', 'test=2', 'perl=5010');

# The test program
for (@lines) {
    if (m/$regex/) {
        say 'Name: ', $+{name}, "\tValue: ", $+{value};
    }   
}

The output of the program:

$ perl namevalue.pl
Name: hello Value: 1
Name: test  Value: 2
Name: perl  Value: 5010

The syntax for specifying the buffers in the regex is:

(?<name>pattern)

The name should match /^[_A-Za-z][_A-Za-z0-9]*\z/; pattern can be any legal Perl regex.

After you successfully match the regex against a string, you can refer to the matched values through the %+ hash. In the example, $+{name} holds the text matched by the name buffer and $+{value} the text matched by the value buffer.
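Since %+ always reflects the last successful match in the current scope, it can be handy to copy it into an ordinary hash when the values are needed later. The following is a minimal sketch of that idea; parse_pair is a hypothetical helper, and the ^ and $ anchors are an addition that the example regex above does not have.

```perl
use 5.010;

# Hypothetical helper: parses "name=number" and returns the named
# captures as a plain hash reference, or undef if the line doesn't match.
sub parse_pair {
    my ($line) = @_;
    return unless $line =~ m{^(?<name>\w+)=(?<value>\d+)$};

    # Copy before leaving the sub; %+ belongs to the match made here.
    my %captures = %+;
    return \%captures;
}

my $pair = parse_pair('perl=5010');
say "$pair->{name} => $pair->{value}";

say defined(parse_pair('no digits here')) ? 'matched' : 'no match';
```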

To create a named backreference to a named capture buffer, you can use the \k<name> syntax. You could for example do the following:

(?<name>\w+) \k<name>

This would match hello hello, for example.
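As a small illustration of \k<name>, the sketch below looks for a word that is immediately repeated. The repeated_word helper is a hypothetical name, and the \b anchors are an addition so that the match starts and ends on whole words (without them, "this is" would also match).

```perl
use 5.010;

# \b anchors added so that only whole repeated words match.
my $doubled = qr{\b(?<word>\w+) \k<word>\b};

# Hypothetical helper: returns the repeated word, or undef if none is found.
sub repeated_word {
    my ($text) = @_;
    return $text =~ $doubled ? $+{word} : undef;
}

for my $text ('hello hello', 'hello world', 'it is is fine') {
    my $word = repeated_word($text);
    say defined($word)
        ? "'$text' repeats '$word'"
        : "'$text' has no repeated word";
}
```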

The best thing about this new feature is that it helps with documenting your program if you use meaningful names in your regexes.
