Until today I couldn't use variables in my template that are pieces of code. I added a piece of code that executes the code reference found in the stash and returns its value. In the template it looks like this.

[% FOR p IN products %]
    <p>[% p.name %]
[% END %]

There are two places in this piece of code that could contain code references. The first is products. This could be implemented as follows.

my $stash = {
    products => sub { my $db=shift; return $db->ProductList(); },
};

Here I show the implementation of the template evaluation code.

sub find_value_in_stash {
    my ($db, $stash, $name) = @_;

    my $it = $stash;

    for my $p (split /\./, $name) {
        $it = $it->{$p};

        if (ref($it) eq 'CODE') {
            $it = $it->($db);
        }
    }
    return $it;
}

This code doesn't contain the error checking that is necessary in a production environment. It does allow us to add variables to the stash without knowing their values at the time we add them. The nice thing is that the potentially expensive code for retrieving all the products from the database only runs when the template actually uses the variable.

By adding a simple two-line feature like this to the templating system, we can write simpler controller code. The controllers don't need to retrieve information from the database that isn't used. If the variables are used in the template, the templating engine loads the values automatically.
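To make the laziness visible, here is a minimal sketch that reuses the lookup from above. The FakeDB class and its query counter are hypothetical stand-ins for the real database handle; only ProductList comes from the example.

```perl
use strict;
use warnings;

# Same lookup as above: walk the dotted name, calling any code reference found.
sub find_value_in_stash {
    my ($db, $stash, $name) = @_;
    my $it = $stash;
    for my $p (split /\./, $name) {
        $it = $it->{$p};
        $it = $it->($db) if ref($it) eq 'CODE';
    }
    return $it;
}

# Hypothetical stand-in for the database handle; it counts its queries.
package FakeDB {
    sub new { return bless { queries => 0 }, shift }
    sub ProductList {
        my $self = shift;
        $self->{queries}++;
        return [ { name => 'Tea' }, { name => 'Coffee' } ];
    }
}

my $db    = FakeDB->new;
my $stash = {
    title    => 'Shop',
    products => sub { my $db = shift; return $db->ProductList(); },
};

my $title          = find_value_in_stash($db, $stash, 'title');
my $queries_before = $db->{queries};    # no query yet: products was not used
my $products       = find_value_in_stash($db, $stash, 'products');
my $queries_after  = $db->{queries};    # the code reference ran on first use

print "$title: ", join(', ', map { $_->{name} } @$products), "\n";
print "queries before: $queries_before, after: $queries_after\n";
```

A template that only prints the title never touches the database at all.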

The second place in the template where we can use code is the second line, where we get the name field. This field could be a value in a hash. On the other hand, it could be a method of the object p. By adding a few more lines to find_value_in_stash we can use objects, as well as simple hash values, in templates.

The line that moves to the next value in the stash needs to be changed. The line

        $it = $it->{$p};

becomes

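        # blessed() comes from Scalar::Util: use Scalar::Util 'blessed';
        # Only objects have methods; plain hash references fall through,
        # because calling ->can on an unblessed reference would die.
        if (blessed($it) && (my $meth = $it->can($p))) {
            $it = $it->$meth();
        }
        else {
            $it = $it->{$p};
        }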

This change allows us to call methods on objects. It enables us to write code in classes that is executed when the template needs it, instead of when the controller builds the parameter hash.
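Put together, the lookup could look like the sketch below. The Product class is hypothetical, the function is called find_value here for brevity, and the blessed check from Scalar::Util is there so that plain hash references are never asked for a method.

```perl
use strict;
use warnings;
use Scalar::Util 'blessed';

# Hypothetical product class; only the name field comes from the template.
package Product {
    sub new  { my ($class, %args) = @_; return bless { %args }, $class }
    sub name { my $self = shift; return uc $self->{name} }   # computed on demand
}

sub find_value {
    my ($db, $it, $name) = @_;
    for my $p (split /\./, $name) {
        if (blessed($it) && (my $meth = $it->can($p))) {
            $it = $it->$meth();          # object: call the method
        }
        else {
            $it = $it->{$p};             # plain hash: look up the key
        }
        $it = $it->($db) if ref($it) eq 'CODE';
    }
    return $it;
}

my $stash = {
    object => Product->new(name => 'tea'),
    hash   => { name => 'coffee' },
};

my $from_object = find_value(undef, $stash, 'object.name');   # method call
my $from_hash   = find_value(undef, $stash, 'hash.name');     # hash lookup
print "$from_object $from_hash\n";
```

The template author writes p.name in both cases and doesn't need to know which of the two is behind it.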

To be clear, Template::Toolkit provides both of these features. I have written and seen a few web applications, and most of them didn't create or use many objects, because there was a tendency to think of objects as slow and memory-hungry. I do think we should watch out for creating many unused objects or loading many rows from a database, because that can slow down a web application a lot. Still, I consider avoiding method calls and sub references here a form of premature optimisation.

While browsing around Stack Overflow I found a page about implementing a square root function. One of the answers points to an article containing different implementations of square root.

One of the things I like to do is point out obvious errors in benchmarks. It's not that I have an intimate understanding of these things, but if I can point out the error, it has to be obvious. One of the obvious errors in time measurement can be shown using a short pseudo code example.

Begin = Time();
For (I = N; I > 0; I--) {
     F();
}
End = Time();

AvgTime = (End - Begin) / N

This program starts by retrieving the time at the start. Then the function F is called N times. At the end we set End to the current time. The absolute difference between Begin and End is the time it took to call the function F, N times. We divide by N to find the time it took to call the function once.

A benchmark should be structured like this, because it reduces the error of the measurement. Benchmarks that are not structured like this are wrong.
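The pseudo code translates almost directly to Perl. This is a minimal sketch that uses the core module Time::HiRes for the clock; the function name average_time and the measured sub are my own choices for illustration.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Time $n calls of $f with one clock read before and one after the loop.
sub average_time {
    my ($f, $n) = @_;
    my $begin = time();
    $f->() for 1 .. $n;
    my $end = time();
    return ($end - $begin) / $n;
}

my $calls = 0;
my $avg   = average_time(sub { $calls++; return sqrt($calls) }, 100_000);
printf "%.9f seconds per call\n", $avg;
```

The clock is read exactly twice, no matter how large N is.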

Now back to the code in the article. On the surface the code looks a lot like my pseudo code. I will show why it isn't the same.

for (int j = 0; j < AVG; j++) {   
    dur.Start();
    for (int i=1;i<M;i++) {
        RefTotalPrecision+=sqrt((float) i);
    }
    dur.Stop();
    Temp+=dur.GetDuration();
}
RefTotalPrecision/=AVG;
Temp/=AVG;
RefSpeed=(float)(Temp)/CLOCKS_PER_SEC; 

In this code the variable AVG is set to 10 and M is set to 10,000. After running, the variable RefTotalPrecision will contain the average precision of M sqrt calls. The variable RefSpeed should contain the average speed of M sqrt calls. It doesn't, however.

The M loop makes it look like we average over many calls. We don't, because the value we divide by is AVG, which belongs to the outer loop. To show what happens, I removed the code that is measured. I also removed the averaging code at the end.

for (int j=0;j<AVG;j++) {   
    dur.Start();
    // insert code to be measured
    dur.Stop();
    Temp+=dur.GetDuration();
}

The code measures the time 2 * AVG times, while it should only be measured twice, once at the start, and once at the end. To fix this, we need to move two lines and remove one character.

dur.Start();
for (int j=0;j<AVG;j++)  
{   
    // insert code to be measured
}
dur.Stop();
Temp = dur.GetDuration();

We move the Start and Stop calls outside the loop. The error introduced by the measuring functions is now incurred once instead of AVG times, so after dividing by AVG it is AVG times smaller. I also removed the +, since it isn't needed anymore.
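To spell out the arithmetic, assume (as a simplification) that every Start/Stop pair adds the same fixed overhead ε to the measured duration, and that t_j is the true duration of iteration j with mean t̄. Averaging over AVG iterations then gives:

```latex
\begin{align*}
\text{original:}\quad & \frac{1}{\mathrm{AVG}} \sum_{j=1}^{\mathrm{AVG}} \left( t_j + \varepsilon \right) = \bar{t} + \varepsilon \\
\text{fixed:}\quad & \frac{1}{\mathrm{AVG}} \left( \sum_{j=1}^{\mathrm{AVG}} t_j + \varepsilon \right) = \bar{t} + \frac{\varepsilon}{\mathrm{AVG}}
\end{align*}
```

In the original structure the overhead survives the averaging untouched; in the fixed structure it shrinks as AVG grows.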

In the last two essays we established that there is a dispatcher, multiple controllers and multiple actions. The dispatcher creates a controller and calls the action. Why do we split the application into these parts?

First the Dispatcher. The Dispatcher applies rules to a URL and chooses the corresponding Controller and Action. The Controller is a container for actions and applies some default values to the action. The Action contains the code that's necessary to perform some transformation on the application data.

By structuring the design like this, the only task for the Controller is to be a container for actions. We group each Action together with semantically similar actions: actions that apply to the same type of information. For example, the Guestbook controller contains two actions, list and add_entry. The first action shows a list of guest book entries, the second adds a new entry to the list. On the surface it makes sense to group Actions like this in the controller. The actions in the controller perform operations on the same data type, e.g. the list of guest book entries.

If we begin with Actions instead, then we can structure the application in another way. Each type of Action gets its own class, e.g. the action show is performed by the ShowAction class. The ShowAction contains the code for showing one data item. It could be possible to generalize the code for every type in the system. And like the ShowAction we can also create an EditAction and an UpdateAction.
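As a rough sketch of what such a generalized ShowAction could look like: the constructor arguments type and loader, the run method and the in-memory data are all assumptions made for illustration, not code from the application.

```perl
use strict;
use warnings;

# Hypothetical generic action: shows one item of any type.
package ShowAction {
    sub new {
        my ($class, %args) = @_;
        # type names the data type, loader fetches one item by id.
        return bless { type => $args{type}, loader => $args{loader} }, $class;
    }
    sub run {
        my ($self, $id) = @_;
        my $item = $self->{loader}->($id);
        return { template => "$self->{type}/show", item => $item };
    }
}

# In-memory data standing in for the database.
my %entries = (1 => { text => 'Hello' });
my %orders  = (7 => { total => 12.50 });

my $show_entry = ShowAction->new(type => 'guestbook', loader => sub { $entries{ $_[0] } });
my $show_order = ShowAction->new(type => 'orders',    loader => sub { $orders{ $_[0] } });

my $entry_result = $show_entry->run(1);
my $order_result = $show_order->run(7);
print "$entry_result->{template}: $entry_result->{item}{text}\n";
```

The general case lives in the class; only the type name and the loader differ per data type.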

The controller-based structure enables us to spell out every part of the computation, from beginning to end. We're free to do whatever we want. The action-based structure forces us to become more structured programmers. The action can't contain any specific code. We'd have to write Actions for every action in the system.

I think it's better if we specify and generate code for the specific differences and let the general case be handled by the Action class. On the other hand couldn't we apply these lessons to a Controller based model?

Yesterday we looked at the structure inside of two simple controller actions. Now let us look at the outside. The structure of the two actions can be shown as a tree.

  • Guestbook
    • list
  • Orders
    • list

I hid the rest of the methods that would normally be in these controllers. In front of these controllers there is another class, called the Dispatcher. The Dispatcher takes a request and calls the appropriate controller and action.

The Dispatcher translates a URL into two pieces of information: the controller and the action. The Dispatcher then translates the controller name into a class name like this.

my ($controller, $action, $id) = ($url =~ m{^/(\w+)(?:/(\w+)(?:/(\d+))?)?});
my $classname = 'AppName::Controller::' . ucfirst $controller;
eval "require $classname";
my $obj = $classname->BUILD();
$obj->$action();

This simplified version of the code calls the controller. This contains no error checking, which is really important in a secure web application.
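One possible form of that error checking is a table of allowed controllers and actions, so that a URL can never reach an arbitrary class or method. This is only a sketch: the %routes table, the default action and the 'not found' return value are assumptions, and the controller is defined inline so the example runs on its own.

```perl
use strict;
use warnings;

# Hypothetical controller defined inline so the sketch runs on its own.
package AppName::Controller::Guestbook {
    sub new  { return bless {}, shift }
    sub list { return 'guestbook list' }
}

# Only controllers and actions named here can be reached from a URL.
my %routes = (
    guestbook => {
        class   => 'AppName::Controller::Guestbook',
        actions => { list => 1 },
    },
);

sub dispatch {
    my ($url) = @_;
    my ($controller, $action, $id) =
        ($url =~ m{^/(\w+)(?:/(\w+)(?:/(\d+))?)?$});
    return 'not found' unless defined $controller;

    my $route = $routes{ lc $controller } or return 'not found';
    $action = 'list' unless defined $action;    # assumed default action
    return 'not found' unless $route->{actions}{$action};

    my $obj = $route->{class}->new;
    return $obj->$action($id);
}

print dispatch('/guestbook/list'), "\n";
print dispatch('/guestbook/new'), "\n";
```

With the table in place, a request for /guestbook/new is rejected instead of being turned into a method call.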

While I was thinking about writing this article, I thought about the direction of the calls and which part controls the execution flow. In the example the URL is passed to the Dispatcher, which finds the controller. It looks up the controller class, then looks up the method and calls it. This way the URL and the code are coupled.

The URL determines the class and method that gets called. In this design we can't split up classes, because all controller methods need to be contained in the same class. We can't split or join classes or create smarter software because of this design decision. Furthermore, because the controller is created at the start of a request, we can't use the same controller for different URLs. Each URL needs its own piece of code. I argue that because of these problems we can't even use the techniques we know to improve the design.

We could increase the flexibility of the design by using objects instead of classes. Objects are more flexible: we can replace parts of the system by setting an instance variable to a different object.
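A small sketch of that idea: the dispatcher keeps a table of controller objects, and one class serves two URLs because the part that differs is an instance variable. The ListController class, its loader field and the sample data are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical list controller; the loader is an instance variable,
# so one class can serve several URLs.
package ListController {
    sub new  { my ($class, %args) = @_; return bless { %args }, $class }
    sub list { my $self = shift; return join ', ', @{ $self->{loader}->() } }
}

# The dispatcher maps names to objects rather than to class names.
my %controllers = (
    guestbook => ListController->new(loader => sub { ['first entry', 'second entry'] }),
    orders    => ListController->new(loader => sub { ['order 1'] }),
);

sub dispatch {
    my ($name, $action) = @_;
    my $obj = $controllers{$name} or die "unknown controller: $name";
    return $obj->$action();
}

my $guestbook = dispatch('guestbook', 'list');
my $orders    = dispatch('orders', 'list');

# Replacing an instance variable changes behaviour without touching the class.
$controllers{orders}{loader} = sub { [] };
my $empty = dispatch('orders', 'list');

print "$guestbook | $orders | '$empty'\n";
```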

To find out how we can redesign the code with this new knowledge, we have to take a look at the structure of more controllers.

I still have a few questions about the structure of the system. Where is the boundary between controllers and actions? Is there a boundary? Do we need controllers at all, or is having actions enough? As always we have to consider more sides of this problem.

Alex Stepanov writes in Notes on Programming:

We often get the idea that a mathematical theory is built in a logical way starting from definitions and axioms. This is not the case. The definitions and axioms appear at the very end of the development of a good theory. It invariably starts with simple facts that later on are generalized into theorems, and only at the very end the formal definitions and axioms are developed.

In an effort to become a better programmer (while at the same time rewriting one of my programs), we'll look at a piece of web application code that I wrote a few years ago. The language used in these examples is Perl.

The idea is to find a better abstraction, to make it easier to add and change code related to the guestbook (and other controllers), while at the same time finding the underlying abstraction that isn't yet obvious from the code.

The piece of code in question is the list function of the Guestbook controller. It lists some entries, limited by a constant. The function loads the entries from the database using the guestbook_load_entries function.

The webserver and a few other classes parse the url /guestbook/list and dispatch the request to this function. Inside the function we validate all parameters that are needed.

package WebWinkel::Controller::Guestbook;
use strict;
use base qw/WebWinkel::Controller/;
use WebWinkel::DB::Guestbook 'guestbook_load_entries';

sub list {
    my $self = shift;
    my $page = $self->validate(-as_integer => 'page');
    my $entries = guestbook_load_entries($page, 10);
    return $self->render_template('guestbook/list', {
        page => $page, entries => $entries 
    });
}

Each of the lines in the function performs a small part of the whole action. I don't think it's important to look too much at the syntax. We're interested in the structure of the action.

The first line in the function validates the page parameter. The guestbook_load_entries function retrieves a list of guestbook entries from the database in the second line. In the third line the controller renders a specific template with the parameters we retrieved from the database.

It seems the structure of the process is quite simple.

  1. Validate the query parameters,
  2. Load some database entities using the validated parameters,
  3. Render a template using the database entities.

There are no branches or loops in this piece of code on this level. The template hides the loops we need for rendering a list.

We can simplify the process to a simple graph.

(1) { Validate → Load → Render }

We omit the specifics in this graph, because we're looking for an abstraction. Each node in this graph depends on the values provided by the previous step. The Validate step depends on data outside the controller.
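The graph can be sketched as a chain of steps, where every step receives the information gathered so far and returns the new pieces it adds. The run_steps helper and the three stand-in steps are assumptions for illustration; they are not the real validate, guestbook_load_entries or render_template.

```perl
use strict;
use warnings;

# Run steps in order; each step receives everything gathered so far
# and returns the new pieces of information it adds.
sub run_steps {
    my ($input, @steps) = @_;
    my %info = %$input;
    for my $step (@steps) {
        %info = (%info, %{ $step->(\%info) });
    }
    return \%info;
}

# Hypothetical stand-ins for the three steps of the guestbook action.
my $validate = sub {
    my $info = shift;
    my $page = $info->{page} // '';
    return { page => ($page =~ /^\d+$/ ? $page + 0 : 1) };
};
my $load = sub {
    my $info = shift;
    return { entries => [ map { "entry $_ of page $info->{page}" } 1 .. 2 ] };
};
my $render = sub {
    my $info = shift;
    return { html => join("\n", @{ $info->{entries} }) };
};

my $result = run_steps({ page => '3' }, $validate, $load, $render);
print $result->{html}, "\n";
```

Only the first step looks at data from outside (the raw page parameter); the later steps work purely on what earlier steps produced.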

Now it's time to see whether the model we just created can be applied to other controllers and their actions. Let's take a look and see if we can find at least one action that satisfies this model.

package WebWinkel::Controller::Orders;
use strict;
use WebWinkel::DB::Orders qw/orders_find_all_open/;
use Date::Simple qw/today date/;

use base 'WebWinkel::Controller';

sub list {
    my ($self) = @_;
    my ($orders, $payed) = orders_find_all_open();
    my $today = today();
    return $self->render_template('orders/list', {
        today        => $today->format("%d-%m-%Y"),
        orders       => $orders,
        payed_orders => $payed,
    });
}

I found this piece of code in the Orders controller. Let's see which steps this function contains.

It starts with the retrieval of the open orders from the database on the first line. The second line gets the current date. The third line renders the template with the data gathered in the previous two lines. The structure of the function looks like this.

{ Load → GatherData → Render }

This shows us that our previous model doesn't completely describe all controllers and actions. In this instance the GatherData step doesn't depend on the Load step. We could switch the two statements and the structure would still be the same. This means we should rewrite the model to include this new piece of information. We'll use $ to describe a relation between two steps where the first doesn't depend on the second and vice versa. The model now looks like this.

{ { Load $ GatherData } → Render }

The arrow between the first two steps and Render remains, because all data found in those steps is used in that step. This model can also be written as follows

{ { GatherData $ Load } → Render }

because we declare the operator $ as being commutative. We can't yet know if this is an important property of our model, but it describes the examples better. The model is still quite different from the other model.

How could we combine the two models while still being specific about the steps? What do the steps Validate and Load have in common? If we look at the two code examples we see that both steps create information that is used in a later step. A later step doesn't need to be the next step. The step GatherData also satisfies this property.

{ { { Validate → Load } $ GatherData } → Render }

The variables found by Validate are passed into Load, but not specifically into GatherData. How can I say that this model is the same as model (1)?

Let's say step Validate creates zero or more pieces of information. If a step creates zero pieces of information for the next step, then this is the same as not performing the step at all, at least with the current understanding of the model.
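The two operators of the model can be written down as small combinators. In this sketch a step is a sub that maps the information so far to the new information it adds; the names seq_steps (for →) and par_steps (for $), and the assumption that independent steps produce different keys, are mine.

```perl
use strict;
use warnings;

# seq_steps(): the arrow. The second step sees what the first one produced.
sub seq_steps {
    my ($first, $second) = @_;
    return sub {
        my %in  = %{ $_[0] };
        my %mid = %{ $first->(\%in) };
        my %out = (%mid, %{ $second->({ %in, %mid }) });
        return \%out;
    };
}

# par_steps(): the $ operator. Each step sees only the original input, so
# the argument order doesn't matter as long as the steps use different keys.
sub par_steps {
    my ($left, $right) = @_;
    return sub {
        my ($in) = @_;
        my %out = (%{ $left->($in) }, %{ $right->($in) });
        return \%out;
    };
}

# Render a result as a sorted string so two results can be compared.
sub show { my $h = shift; return join ';', map { "$_=$h->{$_}" } sort keys %$h }

my $validate    = sub { return { page => 2 } };
my $load        = sub { my $in = shift; return { entries => "entries for page $in->{page}" } };
my $gather_data = sub { return { today => '01-01-2010' } };
my $nothing     = sub { return {} };    # creates zero pieces of information

my $model_a = par_steps(seq_steps($validate, $load), $gather_data);
my $model_b = par_steps($gather_data, seq_steps($validate, $load));

print show($model_a->({})), "\n";
print show($model_b->({})), "\n";
print show(seq_steps($nothing, $gather_data)->({})), "\n";
```

Both models give the same result, and putting a step that creates nothing in front of GatherData gives the same result as GatherData alone.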

This concludes our first look at the structure of controllers and actions. From experience I know that there are more ways to write a controller. We will leave those for another time, just like the specifics about finding out which functions to call. Maybe we can discover a pattern there as well?

Steve Yegge wrote (and writes, again) about interesting stuff. Two of my favorite essays are written by him. One is about Compilers and why you should know them, the other about why it's important that programmers learn mathematics. Read them both to better understand where I come from.

I came to the conclusion that saying mathematics has nothing to do with programming, is like saying programming and software contain no patterns or structure.

For me this means I get a little bit happier every time I understand a bit more mathematics, or when I read some essay or see a video that explains how an understanding of mathematics was used in writing a piece of software.

It also explains why I get a little less happy when I have to write another piece of database code, error-checking code for forms or controllers that have every bit of functionality spelled out. Computers are smarter than that. And we should be, too.

My newfound optimism comes from two papers that I read in the last two weeks. In the first paper, Notes for the Programming course at Adobe, Alex Stepanov writes:

If traditional mathematics deals with sets of values and operations on them, value algebras, we have to deal with sets of locations and operations on them: location algebras.

This helps us understand how we can see mathematics together with the use of pointers. A pointer is a location. Stepanov talks about how iterators are an abstraction of location, and how you write general algorithms for sorting and searching on top of these abstractions, without knowing which data structure is used underneath.

In the second paper, about regular expressions, Fischer, Huch and Wilke show how their regular expression algorithms can be abstracted over semirings (a mathematical structure). In practice this means you can use the semiring { S = { False, True }, zero = False, one = True, ⊕=∨, ⊗=∧ } to find out if a regex matches, and the semiring of non-negative integers { S=N0, zero = 0, one = 1, ⊕=+, ⊗=∙ } to find the number of ways a regex matches a particular string. Another semiring finds the leftmost position of a match. These are all practical applications of abstract mathematical concepts.
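To get a feeling for this, here is a naive matcher that is parameterized by a semiring. It is a sketch of the idea only, not the efficient algorithm from the paper: it simply tries every way to split the word. The regex representation as nested arrays and the name accept are my own assumptions.

```perl
use strict;
use warnings;

# A semiring is given by zero, one, add and mul.
my %boolean  = (zero => 0, one => 1, add => sub { $_[0] || $_[1] }, mul => sub { $_[0] && $_[1] });
my %counting = (zero => 0, one => 1, add => sub { $_[0] +  $_[1] }, mul => sub { $_[0] *  $_[1] });

# Weight of matching $word against $re, computed in semiring $sr.
# $re is a nested array: ['sym', 'a'], ['alt', $p, $q], ['seq', $p, $q], ['star', $p].
sub accept {
    my ($sr, $re, $word) = @_;
    my ($kind, $p, $q) = @$re;
    my $len = length $word;

    if ($kind eq 'sym') { return $word eq $p ? $sr->{one} : $sr->{zero} }
    if ($kind eq 'alt') { return $sr->{add}->(accept($sr, $p, $word), accept($sr, $q, $word)) }

    # 'seq' tries every split; 'star' peels off a non-empty first part
    # and matches the rest against the star itself.
    return $sr->{one} if $kind eq 'star' && $len == 0;
    my $rest  = $kind eq 'star' ? $re : $q;
    my $start = $kind eq 'star' ? 1 : 0;
    my $total = $sr->{zero};
    for my $i ($start .. $len) {
        my $weight = $sr->{mul}->(accept($sr, $p,    substr($word, 0, $i)),
                                  accept($sr, $rest, substr($word, $i)));
        $total = $sr->{add}->($total, $weight);
    }
    return $total;
}

my $sym_a = ['sym', 'a'];
my $re    = ['star', ['alt', $sym_a, ['seq', $sym_a, $sym_a]]];   # (a|aa)*

my $matches = accept(\%boolean,  $re, 'aaa');   # 1: the regex matches
my $ways    = accept(\%counting, $re, 'aaa');   # 3: a.a.a, a.aa, aa.a
print "matches: $matches, ways: $ways\n";
```

The matcher itself never changes; swapping the semiring changes the question it answers.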

I think that we can become better programmers, if we better understand what mathematics is. Patterns are hiding in our code and we don't even know it. Mathematics will help us discover these patterns.

I was reading presentation slides from YAPC::EU 2010 and found this one: Web Automation with WWW::Mechanize::Firefox. It explains a module that connects to a running Firefox instance and uses it to follow links, create screenshots and more.

I rewrote the example a little bit to make it work and ended up with this. The original method didn't work for me, but these two methods do.

# The original line in the example
my $png = $mech->content_as_png();

# Method 1
my $png = $mech->element_as_png($mech->selector('html')); 

# Method 2
my $png = $mech->content_as_png(undef, 
                {left=>0,top=>0,width=>200, height=>200});

Now I can automatically create screenshots of web pages. I will still need some way to control the width of the page and maybe a way to crop parts. I could use the element_as_png method for cropping, which allows me to get a screenshot of part of a page. Maybe that's enough.

Here are two talks about Generic Programming that I like very much. Generic Programming is about much more than just C++ templates.

Alexander Stepanov: STL and its Design Principles

In this first talk Stepanov talks about concepts and not much about STL. As he says (paraphrased): STL is maybe the biggest library of its kind, but it's just a small step in Generic Programming. There is much that can be done with Generic Programming, STL is just the beginning.

A Possible Future of Software Development

I really like it when people show there are more ways to bring mathematics and algorithms into programming. In this talk Sean Parent shows one way this can be done. As programmers, we all learn about MVC in school, but it seems we all forget about it as soon as we start writing web applications. However, this talk addresses a deeper problem.

Update on Google Wave:

Wave has taught us a lot, and we are proud of the team for the ways in which they have pushed the boundaries of computer science. We are excited about what they will develop next as we continue to create innovations with the potential to advance technology and the wider web.

It's sad that Google Wave was cancelled. I use it every day. And although I have to admit that there are some problems with it, it is an amazing web application for collaboration between a few people. I wrote some project plans and lists of todo items. Other people commented and rewrote posts and I did the same.

I think I would like it to be more like a realtime wiki than a communication device. So it should be easier to link between posts. And since Google won't create and support a realtime wiki, maybe this is the time to salvage some Google Wave technology and build it myself. As if I have the time for that kind of project.

New study suggests full-fat milk might be better:

We find that the persons who consumed the highest amount of full-fat dairy had a 70 per cent reduced risk of dying from cardiovascular disease, compared to people who do not eat full-fat dairy; so that's quite a large effect.
