Write your own Accesslog parser in Perl

Writing an Apache access log parser isn’t that hard. Below is a parser that does just that. It creates Data::Dumper output of all the lines. No warranty.

    use Data::Dumper;
    use Parse::RecDescent;

    $Parse::RecDescent::skip = '';

    my $grammar = q{
        line: ip ws '-' ws user ws datetime ws request ws status ws responsesize ws referrer ws useragent "\n"
            {
                $return = {
                    ip        => $item[1],
                    user      => $item[5],
                    datetime  => $item[7],
                    method    => $item[9]->{method},
                    url       => $item[9]->{url},
                    protocol  => $item[9]->{protocol},
                    status    => $item[11],
                    size      => $item[13],
                    referrer  => $item[15],
                    useragent => $item[17],
                }
            }

        user: '-' | /\w+/

        request: '"' method ws url ws protocol '"'
            { $return = { method => $item[2], url => $item[4], protocol => $item[6] } }

        datetime: '[' date ':' time ws timezone ']'
            { $return = $item[2] ....

April 12, 2011

Apache accesslog reporting tools

Today I tried to create a report of some basic statistics about Abacus downloads. Normally I would use grep, awk and a few other command-line tools to get a rough estimate of these numbers. This time, however, I needed a bit more information than these tools could give me. A problem in need of a solution.

My first question was: how many people have downloaded Abacus? The answer is

    grep '/abacus/files/Abacus' | grep -v '<localip>' \
        | grep -v 'somebots' | awk '{print $1}' | sort | uniq | wc -l

The pattern here is the following....
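To make the counting pipeline above easier to try out, here is a minimal, self-contained sketch that runs it against a tiny sample log. Everything in the sample is an assumption made up for illustration: the log lines, the IP addresses, the file name Abacus-1.0.zip, the stand-in local IP 10.0.0.2 (in place of '<localip>'), and the helper name count_downloaders. Only the shape of the pipeline comes from the post.

```shell
# Hypothetical sample log in combined log format; all values are invented.
log=$(mktemp)
cat > "$log" <<'EOF'
203.0.113.5 - - [12/Apr/2011:10:00:00 +0200] "GET /abacus/files/Abacus-1.0.zip HTTP/1.1" 200 1024 "-" "Mozilla/5.0"
203.0.113.5 - - [12/Apr/2011:10:05:00 +0200] "GET /abacus/files/Abacus-1.0.zip HTTP/1.1" 200 1024 "-" "Mozilla/5.0"
198.51.100.7 - - [12/Apr/2011:11:00:00 +0200] "GET /abacus/files/Abacus-1.0.zip HTTP/1.1" 200 1024 "-" "Mozilla/5.0"
192.0.2.1 - - [12/Apr/2011:11:30:00 +0200] "GET /abacus/files/Abacus-1.0.zip HTTP/1.1" 200 1024 "-" "somebots/1.0"
10.0.0.2 - - [12/Apr/2011:12:00:00 +0200] "GET /abacus/files/Abacus-1.0.zip HTTP/1.1" 200 1024 "-" "Mozilla/5.0"
198.51.100.7 - - [12/Apr/2011:12:30:00 +0200] "GET /index.html HTTP/1.1" 200 512 "-" "Mozilla/5.0"
EOF

# count_downloaders FILE: number of distinct client IPs that fetched Abacus.
count_downloaders() {
    # 1. keep only requests for the download path
    # 2. drop the (assumed) local IP and the bot traffic
    # 3. take the first field (client IP), deduplicate, count
    grep '/abacus/files/Abacus' "$1" \
        | grep -v '^10\.0\.0\.2 ' \
        | grep -v 'somebots' \
        | awk '{print $1}' \
        | sort | uniq \
        | wc -l \
        | tr -d ' '   # some wc implementations pad the number with spaces
}

count=$(count_downloaders "$log")
echo "$count"
rm -f "$log"
```

Note that this counts distinct IP addresses, not people: several users behind one proxy count once, and one user on a changing address counts several times, so the result is still only a rough estimate.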

April 12, 2011