Posted August 25, 2011
Monday I was creating a graph for my webshop software. I use the
Chart module, but the colors
and the graph of the default settings aren't really pretty. So I tried to
create a better looking chart. I think I succeeded at least a bit.

Here is the code that I used to get this result:
my $c = Chart::Composite->new(1024, 250);
$c->set(composite_info => [
[ 'Lines', [1] ],
[ 'Points', [2] ],
]);
$c->set(precision => 0);
$c->set(skip_x_ticks => 7);
$c->set(max_y_ticks => 11);
$c->set(min_y_ticks => 2);
$c->set(legend => 'none');
$c->set(grey_background => 'false');
$c->set(y_grid_lines => 'true');
$c->set(colors => {
'background' => [255,255,255],
'misc' => [200,200,200],
'x_grid_lines' => [220,220,220],
'y_grid_lines' => [220,220,220],
'dataset0'=>[150,180,250],
'dataset1'=>[120,140,240],
});
my $label = 'x';
my $val = 10;
$c->add_pt($label, $val, $val);
You should call add_pt for each point you want to show. I use $val twice
because there are two graphs that both need the same value.
I also noticed that setting min_y_ticks to a lower value (default=6) will
produce a nicer chart when you have few values.
Posted April 13, 2011
Now that we have two programs that parse log files, we can start to take a
look at how many lines the program parses per second. First we have to make
the two programs as similar as possible. In pseudocode it looks likes this.
- Load all modules
- Take the start time using Time::HiRes
- Put the code of the program here
- Set line_count = 0
- Using stdin: loop through all lines
- Parse the line
- Set line_count++
- Find the time difference
- Divide and line_count / time as
n lines/s
In Perl this looks like:
use Time::HiRes 'gettimeofday', 'tv_interval';
my $start = [gettimeofday];
# Your program
my $line_count = 0;
while (<>) {
# Parse one line using your software
$line_count++;
}
my $diff = tv_interval($start);
printf "%.2f lines/s\n", $line_count / $diff;
Now run the two programs a few times and look at the parsing speed. In my case
there was a big difference between the speed of the two programs. I expect a
difference in your run as well.
Posted April 12, 2011
Writing an apache access log parser isn't that hard. Below is a parser that
does just that. It creates Data::Dumper output of all the lines. No warranty.
use Data::Dumper;
use Parse::RecDescent;
$Parse::RecDescent::skip = '';
my $grammar = q{
line: ip ws '-' ws user ws datetime ws request ws status ws responsesize
ws referrer ws useragent "\n"
{ $return = {
ip => $item[1],
user => $item[5],
datetime => $item[7],
method => $item[9]->{method},
url => $item[9]->{url},
protocol => $item[9]->{protocol},
status => $item[11],
size => $item[13],
referrer => $item[15],
useragent => $item[17],
} }
user: '-' | /\w+/
request: '"' method ws url ws protocol '"'
{ $return = { method => $item[2], url => $item[4], protocol => $item[6] } }
datetime: '[' date ':' time ws timezone ']'
{ $return = $item[2] . ' ' . $item[4] . ' ' . $item[6] }
status: /\d{3}/
protocol: 'HTTP/' version
method: 'GET' | 'POST' | 'PUT' | 'DELETE'
ws: /[ ]+/
url: /\S+/
referrer: quotedstring2
responsesize: '-' | /\d+/
useragent: quotedstring2
date: day '/' month '/' year
{ $return = join('/', $item[1], $item[3], $item[5]) }
day: /\d+/
month: 'Jan' | 'Feb' | 'Mar' | 'Apr' | 'May' | 'Jun' |
'Jul' | 'Aug' | 'Sep' | 'Oct' | 'Nov' | 'Dec'
year: /\d{4}/
time: /\d{2}:\d{2}:\d{2}/
timezone: ('+'|'-') /\d{4}/ { $return = $item[1].$item[2] }
octet: /\d+/
ip: octet ('.' octet)(3) { $return = $item[1] . '.' . join('.', @{$item[2]}) }
version: /\d.\d/
quotedstring2: '"' /[^"]+/ '"' {$return = $item[2]}
};
my $parser = Parse::RecDescent->new($grammar) or die "Bad Grammer";
while (<>) {
my $ret = $parser->line($_) or print "Parse error\n";
print Dumper($ret);
}
Posted April 12, 2011
Today I tried to create a report of some basic statistics about
Abacus downloads. Normally I would use
grep, awk and a few other commandline tools to find a rough estimate of
these numbers. However this time I needed a bit more information than these
tools could give me. A problem in need of a solution.
My first question was: how many people have downloaded Abacus? The answer is
grep '/abacus/files/Abacus' | grep -v '<localip>' \
| grep -v 'somebots' | awk '{print $1}'
| sort | uniq | wc -l
The pattern here is the following. First find the lines you want. Then remove the
lines you don't want. Print the first field-the client-and makes this list
unique. I don't want to count multiple downloads from the same ip.
The next questions was: where do people who download Abacus come from?
For this I take the answer from the last question (without wc -l) and
write it to a file. Now I can use the file as extra argument for grep like
this:
grep -f abacus-downloads.txt -F logs/access.log
This makes grep use the lines in abacus-downloads.txt as the patterns that
it needs to find in logs/access.log. Now I need to find the first line
where a match appears, which should contain the referrer where the person
comes from. How to do that? I did the following:
- Download
Parse::AccessLogEntry from CPAN
- Write a little script
This script will only print a line if it's the first line containing a
client.
use Parse::AccessLogEntry;
my $p = Parse::AccessLogEntry->new();
my %hosts;
while (<>) {
my $line = $p->parse($_);
if (!$hosts{$line->{host}}) {
print;
$hosts{$line->{host}} = 1;
}
}
I pipe the output of the previous grep through this program and now I have
the lines with the referrers I'm looking for. A small improvement could be
to filter out favicons because in my case one browser downloaded the favicon
before it got the page itself.
Just add
next if $line->{file} =~ m{^/favicon};
at the appropriate spot. Now I need a list of the referrers from these
lines. I could change the print statement in this program to that for me.
That wouldn't be the unix way. So I wrote another small program that prints
the field from the log if it's specified in the arguments.
use Parse::AccessLogEntry;
my $p = Parse::AccessLogEntry->new();
my @args = @ARGV;
@ARGV=();
while (<>) {
my $line = $p->parse($_);
print join("\t", map { $line->{$_} } @args) . "\n";
}
This program can be called using one or more arguments. The argument should
be a key from the $line hashref, like host, user, date, time, diffgmt,
rtype, file, proto, code, bytes, refer or agent.
Using refer as an argument, the program ave me a list of the referrers
from the log file. Using sort | uniq -c | sort -rn on this gave me a top X
list of the referrer where the people who downloaded Abacus came from.
Posted April 7, 2011
Sometimes you need to create a large tree of subdirectories. But why? Two
examples that I think of are structured directories for weblogs, e.g.
/[year]/[month]/[day]/, or the automatic backing up of files e.g.
invoices/[company]/[year].
Before you begin, you know that using a split and chdir, or some other combination of
built-ins, will just make big mess. A call to mkdir -p could also work,
but let's use the available modules this time.
If you use Perl there is always the CPAN that can help
you. So, also this time. Enter
File::Path.
use File::Path 'make_path';
make_path('posts/2011/04/04');
This will create this structure below the current directory. Simple. There
are many of these modules hidden (or less hidden) in the CPAN. THey like the
light.
Posted September 10, 2010
I've been using Plack for a few days now, and I have to say, I really
like it. I rewrote the base of my webshop platform to use it. The change wasn't
really difficult and the code is a lot cleaner now.
Especially with the Middleware structure I was able to remove a lot of code
from the main application. Middleware classes are classes that are called
between the webserver and the main application. These classes have the
opportunity to rewrite the request and the response classes for each request.
Because I use the same code for each webshop, but different databases for each
customer, so I need a way to select the right database for a request. This is a
great use for a Middleware class. The software selects and connects to the
database based on the hostname in the request. The connection is then added to
the environment hash. And the way back the database connection gets closed.
Furthermore there are already a few nice Middleware classes for logging and
debugging. That is code you don't have to write. I really like it.
Posted April 13, 2010
The latest version of Perl, version 5.12 has been released. There are many
changes and bugfixes and I would recommend you check the release notes for
the whole list.
- Implicit strictures when you use an version number greater than 5.12.0
- The yada, yada operator
- Y2038 compliance
- New package naming for including the version
- Updated to the latest Unicode standard
Posted March 8, 2010
The feature I'll talk about here is the given/when construct, which was
added in perl 5.10. It works like switch/case in other programming
languages, but is much more powerful. The matching is based on smart matching,
which is another feature added in 5.10;
I will start with a simple example to give you an idea of the syntax that is
used.
use 5.010;
my $x = <>;
chomp $x;
given ($x) {
when ([0..99]) {
say "Looking good";
}
when ([100..199]) {
say "That's a bit much";
}
default {
say "This could be a problem";
}
}
This code compare the value of $x with the array's in the when statements.
If $x is between 0 and 99 (inclusive) it will the text Looking good. If it's
between 100 and 199 then it will say That's a bit much. The default block
will be called when the value isn't matched by the when blocks.
Next I will give a more useful example, but not much more.
use 5.010;
my ($x, $y) = (0,0);
LINE: while (<>) {
my @parts = split /\s+/;
for (@parts) {
when (/^x(\d+)/) {
$x = $1;
}
when (/^y(\d+)/) {
$y = $1;
}
when (/^p/) {
say $x + $y;
}
when (/^q/) {
last LINE;
}
}
}
This example reads lists of tokens from STDIN and matches them and executes code
based on the input. In effect it's a small programming language. Notice that
this code doesn't use the given statement. It's not needed here, because the for
already assigns each element of @parts to $_.
It's also possible to use simple expressions like you would use in an if statement.
For example:
use 5.010;
my $age = <>;
chomp $age;
given ($age) {
when (!/^\d+$/) {
say "Not a number";
}
when ($_ > 100) {
say "That's quite old";
}
when (18) {
say "Now your life begins...";
}
when (0) {
say "Just born, and already using the computer.";
}
default {
say "I have nothing useful to say about '$age'";
}
}
As you can see when is quite smart about what to do with different
expressions. The first when clause contains a negated regular expression.
This will be matched using $age !~ m/REGEX/. The second one do what you
expect. The 18 and 0 clauses will match using $age == 18 and $age == 0.
You should watch out with comparing to 0 because this will also match empty
strings or just strings. For example if $age = 'hello', when(0) will match.
Smartmatching is really powerful. With given and when it's easy to use this
power for deciding what to do with the value that you've been given. You
should take a look at the manual for more information about the possible smart
matches and the things you can with given and when.
Posted February 24, 2010
Example of how you should send email in 2010 with Perl.
#!/usr/bin/perl -w
use strict;
use Email::Sender::Simple qw(sendmail);
use Email::Simple;
use Email::Simple::Creator;
my $email = Email::Simple->create(
header => [
To => '"Peter Foo" <foo@example.com>',
From => '"Peter Bar" <bar@example.org>',
Subject => "What's up",
],
body => "Hey, how are you doing?",
);
sendmail($email);
You can see that this is simple enough. First you create an
Email::Simple object. This object will format the message that wil be
sent.
After that sendmail will send this message using it's default transport.
The nice thing about using these modules is that you can, while testing,
replace the default transport with testing modules.
use Test::More;
BEGIN { $ENV{EMAIL_SENDER_TRANSPORT} = 'Test' }
use YourModule;
YourModule->run;
my @deliveries = Email::Sender::Simple->default_transport->deliveries;
is(@deliveries, 1);
This way you can check how many emails were send using in the code your
testing. By picking apart @deliviries you can check out the text of the
emails that were send. See Email::Sender::Transport::Test for more
information about this.
Posted February 17, 2010
I use prove from Test::Harness to test the code of my Perl projects. The
program prove runs tests files and summarizes the output. If tests fail then
it shows the error in its output.
The problem is, my defaults aren't the same as the defaults that prove uses.
I keep my main application files in app/lib and a few vendor libraries in
vendor. The problem with this is that I have type out the options of prove
every time I wanted to test. For the longest time I used a Makefile to solve
this problem.
Then I found out about .proverc. If you put this file in your current
directory of in your home directory, then prove will use each line as an
extra command-line option.
The .proverc in the root dir of my project looks like this:
-Iapp/lib
-Ivendor
-r
It includes two extra directories in the Perl include path; app/lib and
vendor. It also find all test files recursively in the t directory, which
is also not a default setting of prove.
Posted January 22, 2010
Perl 5.11.4 is released:
This is the fifth DEVELOPMENT release in the 5.11.x series leading to a stable
release of Perl 5.12.0. You can find a list of high-profile changes in this
release in the file
perl5114delta.pod
inside the distribution.
Posted January 22, 2010
Plack is a Perl Web Server:
Plack is the superglue interface between perl web application frameworks and
web servers, just like Perl is the duct tape of the internet.
This is interesting and maybe useful in my webshop platform.
Posted January 5, 2010
Let's say I have this line of code
my $numbers = [ 1, 2, 3, 4, 5, 6 ];
and I want to reverse the items in the list. In Perl 5 there is this useful
builtin function which does just that. There problem is, it only works on
lists, not arrayrefs. If you use it on an arrayref, you will reverse the string
representation of the arrayref. Which will return something like this:
)8936f22x0(YARRA
This is not very useful. I think, the only way to reverse a arrayref is by
dereferencing it to an array, then reverse it, and then make it an arrayref
again. Something like this:
my $newarray = [ reverse @$numbers ];
This line contains synthetic code. It should be easier to reverse arrays.
Posted December 27, 2009
osfameron asked:
I've been prototyping ways for customers to import data into our system.
[...] How would you tackle this task? Is there even an elegant way to do it
imperatively?
He is trying to import data into his system, but his information is not all in
a flat structure. It looks like this:
Name Price Tag
Brie 2.00 Dairy
Cheese
Food
Chablis 5.00 Wine
Alcohol
Drink
As you can see the there is a start row and a few extra data rows after that.
The program that I would write to import this data looks like this:
use strict;
use warnings;
use Product;
use Data::Dumper;
my @products;
while (<>) {
if (my ($name, $price, $tag) = m/^(\w+)\s+(\d\.\d{2})\s+(\w+)$/) {
my $product = Product->new(product => $name, price => $price);
$product->add_tag($tag);
push @products, $product;
}
elsif (($tag) = m/^\s+(\w+)$/) {
my $product = $products[$#products];
$product->add_tag($tag);
}
}
print Dumper(\@products);
In the way that I have written this code it doesn't need two pushes. Pushing
it on the array is the first thing that I do after creating the product. The
other parts of the code that need to refer to the product that was added last
use the last product in the array.
My program is different to osfamerons code by using regexes, but this allowed
my to actually test my code.
Posted November 30, 2009
This is a small Perl 5.10 tip. You can use named capture buffers to fill a
hash. Before 5.10 you needed to 'unpack' the values from a regex match into a
hash.
I will give a small example that shows the problem this creates.
my $str = "name=value";
my %result;
if ($str =~ m/(\w+)=(\w+)/) {
$result{name} = $1;
$result{value} = $2;
}
The problem with this is that you need to number the variables and assign each
individually to the hash. In Perl 5.10 you can do the following.
if ($str =~ m/(?<name>\w+)=(?<value>\w+)/) {
%result = %+;
}
This is a lot simpler, cleaner and contains less possible bugs.
Posted November 16, 2009
This week I will show a simple example about how to use the underscore
prototype which is new in Perl 5.10.
This feature was added to allow you to write functions that work like builtin
functions. It's up to you to not abuse it and write unreadable code, of course.
This example shows how to use this new feature. It allows you to use the $_
variable as an argument for a user defined function.
use strict;
use feature 'say';
sub greeting(_) {
my ($greeting) = @_;
$greeting //= 'world';
say "Hello $greeting";
return;
}
for (qw/planet people/) {
greeting();
}
greeting();
The output of the program is:
Hello planet
Hello people
Hello world
As you can see, you don't need to specify an argument to the greeting
function. The first two calls in the for loop use the two values from the
loop.
The call outside the loop uses the default value ('world') that was specified
in the function.
Posted October 25, 2009
This is the first article in a series about new Perl 5.10 features. I'll try to
write one article each week, but you'll never know.
This first article is about the new builtin function say. The say function
is similar to print in how it works. Whenever you want to use say or other
features from Perl 5.10, you need to declare that you want to use these
features.
use 5.010;
After you have done that you can use the function like this.
say 'Hello world';
This will print the text 'Hello, world' to STDOUT and add a newline "\n"
after it, equal to:
print "Hello, world\n";
This doesn't seem like a big feature, but it will also help with the
following.
say for @lines;
This is instead of
print "$_\n" for @lines;
The say function removes some complexity in this example. You don't have
to add the newline anymore.
Posted October 5, 2009
Jesse Vincent
released a new
development version of Perl 5:
It gives me great pleasure to announce the release of Perl 5.11.0.
Perl 5.11.0 is a DEVELOPMENT release. We're making it available to you today
to make it easy for you to test your software on what will eventually become
Perl 5.12.
The nice thing about this release is that there will be releases of Perl every
month, with the dates already scheduled. The next release (5.11.1) is coming on
October 20.
I hope this will increase the development speed of Perl. I also think that
this way it's easier to make new releases.
Posted August 18, 2008
I created a small Perl program to convert relative dates to absolute dates in
the format that I use for my calendar. My current calendar file looks like this.
2008
08
2008-08-12
2008-08-13
09
2008-09-10
...
If I want to add a date and I don't know the actual numbers, I can use the
following program to convert the date. It will also respect the whitespace in
front of the text.
#!/usr/bin/perl -w
use v5.10;
use strict;
use warnings;
use Date::Manip;
Date_Init('Language=Dutch');
my $inp = <>;
if (my ($ws, $date) = $inp =~ m/^(\s*)(.+)$/) {
say $ws . UnixDate($date, "%Y-%m-%d");
}
else {
print $inp;
}
I use Date::Manip for parsing the date. It works with human language style
dates like thursday. I added Date_Init so it will parse Dutch days like
donderdag. Also I use Perl 5.10, because I can. It has some nice features.
To use it put this script in the path. I called it refdate.pl. To use it type
a date and type !!refdate.pl<Enter>. In Vim this will call the program on the
current line. The date will be formatted in the YYYY-MM-DD format.
Posted October 15, 2007
The results for the perl survey are in. You can
download the raw data from the results page. The analysis of the results will
be on that same page in some time.
Posted September 4, 2007
The first survey results are in about the conversion rates. There is still time to complete the questions. For all the reasons mentioned on the page: fill out the perl survey.
Posted December 30, 2006
I was thinking about writing a list of the perl modules that I used most often,
liked best or somehow used. Some are golden oldies, others are new this year.
There is no order in the list, so no Top 10 or anything.
DBIx::DWIW this is a module that helps writing database code. It
contains methods for retrieving data and executing query's. It removes
almost all of the repetitive database coding work. It's nice to be able to
subclass and create a file with the configuration of the databases.
With all this niceness there are also a few hairy parts to this module.
Especially when using it in a web environment. The 'verbose' statements
are printed to STDOUT, which will result in 500 Internal server errors.
The logging itself is really useful, but could use some work still.
View the module on CPAN: DBIx::DWIW
File::Slurp, this is one of those modules that will make you really lazy.
You provide a new and the functions of this module will write or read
them. Not much more to be said.
use File::Slurp qw/read_file/;
my $text = read_file('config.yml');
View the module on CPAN: File::Slurp
POE, this is of course more of a framework for creating applications
than just one module. There is much to be found in POE namespace. When
you're creating network applications are daemons this is the place where
you should start your search. POE has many
different modules that can be reused. Creating networking applications
has never been easier.
View the module on CPAN: POE
YAML, I think more programs should use yaml. It has many different uses: config file,
data transfer between applications. You could use it where you would normally
use XML. But yaml is more consise and humand readable and writable. I use it
for config files and my blogging software uses it as an index to all the
different blog entries.
View the module on CPAN: YAML
svk, not an actual perl module, but still a great perl application.
SVK homepage
plagger, the perl rss aggregator.
Plagger homepage
Posted December 21, 2005
Yesterday I received the Perl Testing Developers Notebook. It only took about six
days to get here, but it still is three months ago, since I won. Of course this is
no problem and I'm very happy with the new book.
I like to thank Josh, for the contest. Which was
actually quite easy :).
I've started reading it and it looks like it contains real nice examples for perl
testing. Some stuff that will be really useful, when I'm working on the webshop and
the concert website.
Posted October 28, 2005
I created a script that allows me to update a page on a media wiki. It
uses WWW::Mechanize,
IO::All and
Config::Std. The last
two modules aren't needed probably. But it makes it all a lot simpler.
#!/usr/bin/perl -w
use IO::All;
use WWW::Mechanize;
use Config::Std;
read_config "$ENV{HOME}/.mediawiki" => my %config;
my $username = $config{login}{username};
my $password = $config{login}{password};
my $hostname = $config{login}{hostname};
my $title = $ARGV[0] or die "usage: mediawiki [page title with under scores]\n\n";
my $url = "http://$hostname/mediawiki/index.php?title=$title&action=edit";
my $ua = WWW::Mechanize->new;
my $login_url = "http://$hostname/mediawiki/index.php?title=Special:Userlogin&returnto=Main_Page";
$ua->get($login_url);
$ua->field('wpName', $username);
$ua->field('wpPassword', $password);
$ua->click('wpLoginattempt');
$ua->get($url);
$ua->form_number(1);
my $form = $ua->current_form();
my $tmpfile = io('/tmp/mediawiki-1');
$form->value('wpTextbox1') > $tmpfile;
system('vi /tmp/mediawiki-1');
$tmpfile->open('r');
my $content = $tmpfile->slurp;
$ua->field('wpTextbox1', $content);
$ua->click('wpSave');
print "Done!\n\n";
The configuration is read from /home/$USER/.mediawiki. It should look something like this.
[login]
username=your name
password=your password
hostname=your wiki hostname
Where hostname should be something like 'localhost' or 'www.somedomain.org'.
Oh, this code comes without any warranty. Use at your own risk.
Posted October 23, 2005
This post is not about perl plugins (what would that be?). But
about programs that use plugins. One of those programs is
Qpsmtpd, a perl mail server. They use
a special and very nice way to use plugins.
Moveable Type is another perl
program, that uses plugins. These plugins are a little bit simpler than
the qpsmtpd plugins, but just as powerful.
And now there is another perl program that uses plugins, my webshop. It is really
nice to see how easy it is to create a new type of shop on top of the current code.
I can't yet show it, but within a few days it will be online.
Posted October 1, 2005
At the company, where I'm doing my internship, I needed a code
generator for some C code. There was some C code needed. So I started
of writing it. After some lines, there started to show some patterns,
which could be easily handled by some description language.
The first description language was a line based format. First a
line with the name of the object. And after that the names of the
functions. This format is easy to parse and easy to use. The output
was done with print statements.
For code generation I always used this method. But why should the
webapplications have all the template fun? So I downloaded Template
(which is the Template Toolkit),
from the CPAN and I tried some things with it. It was really easy to generate
code like this. The code was also much cleaner and shorter.
All nice and easy, but not good enough, because it should be possible
to add C code to the functions. I almost started to try and create
regular expressions for this task, but then the laziness started to
kick in again. I remembered something about a Text::Balanced, which is
written by Damian Conway. I can extract delimeted text from strings. Which
was exactly what I wanted to do.
The code is again much smaller, and writing the regexen would be the
wrong way to go. In this case Text::Balanced is the middle ground between
regexen and some parser written with something like Parse::RecDescent.
Laziness + CPAN is a good thing.
Posted September 7, 2005
I won the Perl Testing Developer Notebook written by Ian Langworth and
chromatic on
perlcast.com. I'm
curious about when the book will arrive.
Posted November 13, 2004
It was hard work, but at last a new version of the
hobby-koopjes.nl website is online. It
has become a nice website, with an interface that's actually quite
pretty.