On August 31, 2010
Until today I couldn't use variables in my template that are pieces of code. I
added one piece of code that executes a code reference found in the stash and
returns its value. In the template it looks like this.
[% FOR p IN products %]
<p>[% p.name %]</p>
[% END %]
There are two places in this piece of code that could contain code references.
The first is products. This could be implemented as follows.
my $stash = {
    products => sub { my $db = shift; return $db->ProductList(); },
};
Here I show the implementation of the template evaluation code.
sub find_value_in_stash {
    my ($db, $stash, $name) = @_;
    my $it = $stash;
    for my $p (split /\./, $name) {
        $it = $it->{$p};
        if (ref($it) eq 'CODE') {
            $it = $it->($db);
        }
    }
    return $it;
}
This code doesn't contain the error-checking code that's necessary for a
production environment. It does allow us to add variables to the stash
without knowing their values at the time we add them. The nice thing is that
the potentially expensive code for retrieving all the products from the
database only runs when the template actually asks for them.
By adding a simple feature of a few lines to the templating system, we can
write simpler controller code. The controllers don't need to retrieve all the
information from the database if it isn't used. If the variables are used in the
template, then the values will automatically be loaded by the templating engine.
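To see the lazy loading at work, here is a small sketch of my own. The FakeDB class and its query counter are made up for this example; they only stand in for a real database handle. The lookup function is the one shown above.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the database handle; it counts how often
# the "expensive" query is executed.
package FakeDB {
    sub new { return bless { queries => 0 }, shift }
    sub ProductList {
        my $self = shift;
        $self->{queries}++;
        return [ { name => 'Tea' }, { name => 'Coffee' } ];
    }
}

# Same lookup as in the post: walk the dotted name through the stash
# and call any code reference we meet along the way.
sub find_value_in_stash {
    my ($db, $stash, $name) = @_;
    my $it = $stash;
    for my $p (split /\./, $name) {
        $it = $it->{$p};
        if (ref($it) eq 'CODE') {
            $it = $it->($db);
        }
    }
    return $it;
}

my $db    = FakeDB->new;
my $stash = {
    title    => 'Shop',
    products => sub { my $db = shift; return $db->ProductList(); },
};

# A template that only uses "title" never touches the database.
my $title          = find_value_in_stash($db, $stash, 'title');
my $queries_before = $db->{queries};

# A template that uses "products" triggers the query on demand.
my $products      = find_value_in_stash($db, $stash, 'products');
my $queries_after = $db->{queries};

print "$title: $queries_before queries, then $queries_after\n";
```

The query counter stays at zero until something asks for products, which is the whole point of putting a sub in the stash.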
The second place in the template where we can use code is the second line,
where we get the name field. This field could be a value in a hash. On the
other hand it could be a method of the object p. By adding another line to the
find_value_in_stash function we can use objects as well as simple hash values
in templates.
The line that moves to the next value in the stash needs to be changed to the
following. The line
$it = $it->{$p};
becomes
# blessed comes from Scalar::Util; calling can on a plain hash would die
if (blessed($it) && (my $meth = $it->can($p))) {
    $it = $it->$meth();
}
else {
    $it = $it->{$p};
}
This change allows us to use methods on objects. It lets us put code in
classes that runs only when the template needs it, instead of running in the
controller while it builds the parameter hash.
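Put together, the lookup could look like the sketch below. The Product class and its raw_name field are invented for the example, and I check with blessed from Scalar::Util before calling can, so that plain hashes keep working.

```perl
use strict;
use warnings;
use Scalar::Util 'blessed';

# Hypothetical product class; "name" is a method, not a hash key.
package Product {
    sub new  { my ($class, %args) = @_; return bless {%args}, $class }
    sub name { my $self = shift; return ucfirst $self->{raw_name} }
}

sub find_value_in_stash {
    my ($db, $stash, $name) = @_;
    my $it = $stash;
    for my $p (split /\./, $name) {
        # Only objects can answer ->can; plain hashes fall through
        # to the ordinary key lookup.
        my $meth = blessed($it) ? $it->can($p) : undef;
        $it = $meth ? $it->$meth() : $it->{$p};
        if (ref($it) eq 'CODE') {
            $it = $it->($db);
        }
    }
    return $it;
}

my $stash = {
    object => Product->new(raw_name => 'tea'),
    hash   => { name => 'coffee' },
};

my $from_object = find_value_in_stash(undef, $stash, 'object.name');
my $from_hash   = find_value_in_stash(undef, $stash, 'hash.name');
print "$from_object, $from_hash\n";
```

The template author writes p.name in both cases and doesn't need to know whether p is a hash or an object.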
To be clear, Template Toolkit provides both these features. I have written and
seen a few web applications, and most of them didn't create or use many objects,
because there was a tendency to think of objects as slow and memory-hungry.
I do think we should watch out for creating many unused objects or
loading many rows from a database, because that can slow down a web application
a lot. But I consider avoiding method calls and sub references here a form of
premature optimisation.
On August 27, 2010
While browsing around Stack Overflow I found a page about square root function
implementation.
One of the answers points to an article containing different implementations of
square root.
One of the things I like to do is point out obvious errors in benchmarks.
It's not that I have an intimate understanding of these things, but if I
can point out the error, it has to be obvious. One of the obvious errors in
time measurement can be shown using a short pseudo code example.
Begin = Time();
For (I = N; I > 0; I--) {
    F();
}
End = Time();
AvgTime = (End - Begin) / N;
This program starts by retrieving the time at the start. Then the function F
is called N times. At the end we set End to the current time. The absolute
difference between Begin and End is the time it took to call the function
F, N times. We divide by N to find the time it took to call the function
once.
A benchmark should be structured like this. Reading the clock only twice keeps
the measurement error small compared to the total time. Benchmarks that are not
structured like this are wrong.
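In Perl the same structure could look like this. It is only a sketch: the function f and the value of N are placeholders I picked, and I use the time function from Time::HiRes because the built-in time only has a resolution of one second.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Hypothetical function under test.
sub f { return sqrt(2) }

my $n = 100_000;

# Read the clock once before and once after the whole loop.
my $begin = time();
for (my $i = $n; $i > 0; $i--) {
    f();
}
my $end = time();

my $avg_time = ($end - $begin) / $n;
printf "average: %.9f seconds per call\n", $avg_time;
```

The loop overhead is still part of the measurement, so for very small functions the number says more about the loop than about f.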
Now back to the code in the article. On the surface the code looks a lot like
my pseudo code. I will show why it isn't the same.
for (int j = 0; j < AVG; j++) {
    dur.Start();
    for (int i = 1; i < M; i++) {
        RefTotalPrecision += sqrt((float) i);
    }
    dur.Stop();
    Temp += dur.GetDuration();
}
RefTotalPrecision /= AVG;
Temp /= AVG;
RefSpeed = (float)(Temp) / CLOCKS_PER_SEC;
In this code the variable AVG is set to 10 and M is set to 10,000. After running,
the variable RefTotalPrecision will contain the average precision of M sqrt
calls. The variable RefSpeed should contain the average speed of M sqrt
calls. It doesn't, however.
The M loop makes it look like we average over many calls. We don't because
the value we average with is AVG, which is used in the outer loop. To show
what happens we remove the code that is measured. I also removed the averaging
code at the end.
for (int j = 0; j < AVG; j++) {
    dur.Start();
    // insert code to be measured
    dur.Stop();
    Temp += dur.GetDuration();
}
The code reads the clock 2 * AVG times, while it should read it only
twice: once at the start and once at the end. To fix this, we need to move two
lines and remove one character.
dur.Start();
for (int j = 0; j < AVG; j++)
{
    // insert code to be measured
}
dur.Stop();
Temp = dur.GetDuration();
We move the Start and Stop calls outside the loop. The error introduced by
the measuring functions now occurs once instead of AVG times, so after dividing
by AVG its share per iteration is AVG times smaller. I also removed the +,
because it wasn't needed anymore.
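The difference between the two structures can be made visible by counting how often the clock is read. The sketch below uses a made-up read_clock function that only counts its own calls; it doesn't measure any real time.

```perl
use strict;
use warnings;

# Hypothetical clock that counts how often it is read.
my $clock_reads = 0;
sub read_clock { $clock_reads++; return $clock_reads }

my $avg = 10;

# Structure from the article: start and stop inside the loop.
$clock_reads = 0;
for my $j (1 .. $avg) {
    my $start = read_clock();
    # code to be measured
    my $stop = read_clock();
}
my $reads_inside = $clock_reads;

# Corrected structure: start and stop around the loop.
$clock_reads = 0;
my $start = read_clock();
for my $j (1 .. $avg) {
    # code to be measured
}
my $stop = read_clock();
my $reads_outside = $clock_reads;

print "inside: $reads_inside reads, outside: $reads_outside reads\n";
```

With AVG set to 10 the first structure reads the clock 20 times and the second one twice, and every read adds its own small error to the result.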
On August 26, 2010
In the last two essays we established that there is a dispatcher, multiple
controllers and multiple actions. The dispatcher creates a controller and
calls the action. Why do we split the application into these parts?
First the Dispatcher. The Dispatcher applies rules to a URL and chooses the
corresponding Controller and Action. The Controller is a container for actions
and applies some default values to them. The Action contains the code
that's necessary to perform some transformation on the application data.
By structuring the design like this, the only task for the Controller is to be a
container for actions. We group each Action together with semantically similar
actions: actions that apply to the same type of information. For example, the
Guestbook controller contains two actions: list and add_entry. The first action
shows a list of guest book entries, the second action adds a new entry to the
list. On the surface it makes sense to group Actions like this in the
controller. The actions in the controller perform operations on the same
data type, e.g. the list of guest book entries.
If we begin with Actions instead, then we can structure the application in
another way. Each type of Action gets its own class, e.g. the action show is
performed by the ShowAction class. The ShowAction contains the code for
showing one data item. It could be possible to generalize the code for every
type in the system. And like the ShowAction we can also create an
EditAction and an UpdateAction.
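A rough sketch of what such a generalized ShowAction might look like follows. Everything in it is an assumption on my part: the loader callback, the type label and the in-memory hashes that stand in for the database.

```perl
use strict;
use warnings;

# Hypothetical generic action: it knows how to show one item of any
# type, as long as it is given a loader for that type.
package ShowAction {
    sub new {
        my ($class, %args) = @_;
        return bless { type => $args{type}, loader => $args{loader} }, $class;
    }
    sub run {
        my ($self, $id) = @_;
        my $item = $self->{loader}->($id);
        return sprintf '%s %d: %s', $self->{type}, $id, $item->{title};
    }
}

# Hypothetical in-memory data standing in for the database.
my %entries = (1 => { title => 'Hello' });
my %orders  = (7 => { title => 'Two books' });

my $show_entry = ShowAction->new(
    type   => 'entry',
    loader => sub { $entries{ $_[0] } },
);
my $show_order = ShowAction->new(
    type   => 'order',
    loader => sub { $orders{ $_[0] } },
);

my $entry_page = $show_entry->run(1);
my $order_page = $show_order->run(7);
print "$entry_page\n$order_page\n";
```

The class contains no code that is specific to guest book entries or orders; the differences live in the arguments passed to new.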
The controller based structure enables us to spell out every part of the
computation, from beginning to end. We're free to do whatever we want. The
action based structure forces us to become more structured programmers. The
action can't contain any specific code. We'd have to write Actions for every
action in the system.
I think it's better if we specify and generate code for the specific differences
and let the general case be handled by the Action class. On the other hand couldn't
we apply these lessons to a Controller based model?
On August 25, 2010
Yesterday we looked at the structure inside of two simple controller actions.
Now let us look at the outside. The structure of the two actions can be shown as
a tree.
I hid the rest of the methods that would normally be in these controllers. In
front of these controllers there is another class, called the Dispatcher. This
Dispatcher takes a request and calls the appropriate controller and action.
The Dispatcher translates a URL into two pieces of information: the controller
and the action. The Dispatcher then translates the controller name into a class
name like this.
my ($controller, $action, $id) = ($url =~ m{^/(\w+)(?:/(\w+)(?:/(\d+))?)?});
my $classname = 'AppName::Controller::' . ucfirst $controller;
eval "require $classname";
my $obj = $classname->BUILD();
$obj->$action();
This simplified version of the code calls the controller. This
contains no error checking, which is really important in a secure web
application.
While I was thinking about writing this article, at one point I thought about the
direction of the calls and which part controls the execution flow. In the
example the URL is passed to the Dispatcher, which finds the controller. It looks
up the controller class, then looks up the method and calls it. This way the URL
and the code are coupled.
The URL determines the class and method that gets called. In this design we
can't split up classes, because all controller methods need to be contained in
the same class. We can't split or join classes or create smarter software
because of this design decision. Furthermore, because the controller is created
at the start of a request, we can't use the same controller for different URLs.
Each URL needs its own piece of code. I argue that because of these problems we
can't even use the techniques we know to improve the design.
We could increase the flexibility of the design by using objects instead of
classes. Objects are more flexible: we can replace parts of the system by
setting an instance variable to a different object.
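As a sketch of that idea, the dispatcher below looks up a controller object in a table instead of deriving a class name from the URL. The table, the Guestbook class and the '404' return value are assumptions of mine; only the URL pattern is taken from the code above.

```perl
use strict;
use warnings;

# Hypothetical controller; the entries are an instance variable, so
# two objects of the same class can serve different URLs.
package Guestbook {
    sub new  { my ($class, %args) = @_; return bless {%args}, $class }
    sub list { my $self = shift; return join ', ', @{ $self->{entries} } }
}

# Routing table from URL segment to controller object.
my %controllers = (
    guestbook => Guestbook->new(entries => [ 'first', 'second' ]),
    archive   => Guestbook->new(entries => ['old']),
);

sub dispatch {
    my ($url) = @_;
    my ($controller, $action, $id) =
        ($url =~ m{^/(\w+)(?:/(\w+)(?:/(\d+))?)?});
    my $obj = defined $controller ? $controllers{$controller} : undef;
    # Unknown controllers and actions end here instead of dying.
    return '404' unless $obj && defined $action && $obj->can($action);
    return $obj->$action($id);
}

my $current = dispatch('/guestbook/list');
my $old     = dispatch('/archive/list');
my $missing = dispatch('/nope/list');
print "$current | $old | $missing\n";
```

Here the same class handles two URLs, which the class-name based dispatcher can't do. A real version would still need a list of allowed actions, because can also finds methods like new.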
To find out how we can redesign the code with this new knowledge, we have to
take a look at the structure of more controllers.
I still have a few questions about the structure of the system. Where is the
boundary between controllers and actions? Is there a boundary? Do we need
controllers at all, or is having actions enough? As always we have to consider
more sides of this problem.
On August 24, 2010
Alex Stepanov writes in Notes on Programming:
We often get the idea that a mathematical theory is built in a logical way
starting from definitions and axioms. This is not the case. The definitions
and axioms appear at the very end of the development of a good theory. It
invariably starts with simple facts that later on are generalized into
theorems, and only at the very end the formal definitions and axioms are
developed.
In an effort to become a better programmer (while at the same time rewriting one
of my programs), we'll look at a piece of web application code that I
wrote a few years ago. The language used in these examples is Perl.
The idea is to find a better abstraction, to make it easier to add and change
code related to the guestbook (and other controllers), while at the same time
finding the underlying abstraction that isn't yet obvious from the code.
The piece of code in question is the list function of the Guestbook
controller. It lists a number of entries, limited by a constant. The function
loads the entries from the database using the guestbook_load_entries function.
The webserver and a few other classes parse the URL /guestbook/list and
dispatch the request to this function. Inside the function we validate all
parameters that are needed.
package WebWinkel::Controller::Guestbook;
use strict;
use base qw/WebWinkel::Controller/;
use WebWinkel::DB::Guestbook 'guestbook_load_entries';
sub list {
    my $self = shift;
    my $page = $self->validate(-as_integer => 'page');
    my $entries = guestbook_load_entries($page, 10);
    return $self->render_template('guestbook/list', {
        page => $page, entries => $entries
    });
}
Each of the lines in the function performs a small part of the whole action. I
don't think it's important to look too much at the syntax. We're interested in
the structure of the action.
The first line in the function validates the page parameter. The
guestbook_load_entries function retrieves a list of guestbook entries from
the database in the second line. In the third line the controller renders a
specific template with the parameters we retrieved from the database.
It seems the structure of the process is quite simple.
- Validate the query parameters,
- Load some database entities using the validated parameters,
- Render a template using the database entities.
There are no branches or loops in this piece of code on this level. The
template hides the loops we need for rendering a list.
We can simplify the process to a simple graph.
(1)
{ Validate → Load → Render }
We omit the specifics in this graph, because we're looking for an abstraction.
Each node in this graph depends on the values provided by the previous step.
The Validate step depends on data outside the controller.
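Model (1) can be written down as a list of steps where each step receives the result of the previous one. This is a sketch with made-up step bodies; the real validate, load and render code lives in the controller and the database module.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the three steps. Each one receives the
# result of the previous step.
sub validate {
    my ($query) = @_;
    return { page => int($query->{page} // 1) };
}

sub load {
    my ($params) = @_;
    return { %$params, entries => ["entry on page $params->{page}"] };
}

sub render {
    my ($data) = @_;
    return "page $data->{page}: " . join(', ', @{ $data->{entries} });
}

# Run the steps in order, feeding each result into the next step.
sub run_action {
    my ($value, @pipeline) = @_;
    $value = $_->($value) for @pipeline;
    return $value;
}

my $html = run_action({ page => '2' }, \&validate, \&load, \&render);
print "$html\n";
```

The query hash passed to run_action plays the role of the data outside the controller that the Validate step depends on.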
Now it's time to see whether the model we just created can be applied to other
controllers and their actions. Let's see if we can find at least
one action that satisfies this model.
package WebWinkel::Controller::Orders;
use strict;
use WebWinkel::DB::Orders qw/orders_find_all_open/;
use Date::Simple qw/today date/;
use base 'WebWinkel::Controller';
sub list {
    my ($self) = @_;
    my ($orders, $payed) = orders_find_all_open();
    my $today = today();
    return $self->render_template('orders/list', {
        today => $today->format("%d-%m-%Y"),
        orders => $orders,
        payed_orders => $payed,
    });
}
I found this piece of code in the Orders controller. Let's see which steps
this function contains.
It starts with the retrieval of the open orders from the database on the first
line. The second line gets the current date. The third line renders the
template with the data gathered in the previous two lines. The structure of
the function looks like this.
{ Load → GatherData → Render }
This shows us that our previous model doesn't completely describe all
controllers and actions. In this instance the GatherData step doesn't depend
on the Load step. We could switch the two statements and the structure would
still be the same. This means we should rewrite the model to include this new
piece of information. We'll use the $ to describe a relation between two steps
where the first doesn't depend on the second and vice versa. The model now
looks like this.
{ { Load $ GatherData } → Render }
The → between the first two steps and Render remains, because all data found
in those steps is used in that step. This model can also be written as follows
{ { GatherData $ Load } → Render }
because we declare the operator $ as being commutative. We can't yet know if
this is an important property of our model, but it describes the examples
better. The model is still quite different from the other model.
How could we combine the two models while still being specific about the steps?
What do the steps Validate and Load have in common? If we look at the two
code examples we see that both steps create information that is used in a later
step. A later step doesn't need to be the next step. The step GatherData also
satisfies this property.
{ { { Validate → Load } $ GatherData } → Render }
The variables found by Validate are passed into Load, but not specifically
into GatherData. How can I say that this model is the same as model (1)?
Let's say step Validate creates zero or more pieces of information. If a step
creates zero pieces of information for the next step, then this is the same as
not performing the step at all, at least with the current understanding of the
model.
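To check that the combined model behaves as described, the two operators can be sketched as functions that build steps out of other steps. The names seq and par and all step bodies are my own invention; seq stands for → and par for $.

```perl
use strict;
use warnings;

# seq(a, b): b receives everything a produced (the arrow in the model).
sub seq {
    my @steps = @_;
    return sub {
        my ($data) = @_;
        $data = $_->($data) for @steps;
        return $data;
    };
}

# par(a, b): a and b both see the same input and do not depend on each
# other; their results are merged (the $ in the model).
sub par {
    my @steps = @_;
    return sub {
        my ($data) = @_;
        my %merged;
        %merged = (%merged, %{ $_->($data) }) for @steps;
        return \%merged;
    };
}

# Hypothetical step bodies.
my $validate    = sub { return { %{ $_[0] }, page => 1 } };
my $load        = sub { return { %{ $_[0] }, entries => [ 'a', 'b' ] } };
my $gather_data = sub { return { today => '24-08-2010' } };
my $render      = sub { return join ' ', sort keys %{ $_[0] } };

my $model   = seq(par(seq($validate, $load), $gather_data), $render);
my $swapped = seq(par($gather_data, seq($validate, $load)), $render);

my $out         = $model->({});
my $out_swapped = $swapped->({});
print "$out\n";
```

Swapping the arguments of par gives the same result here, which matches the claim that $ is commutative. That only holds as long as the independent steps don't produce the same keys.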
This concludes our first look at the structure of controllers and actions.
From experience I know that there are more ways to write a controller. We will
leave those for another time, just like the specifics about finding out which
functions to call. Maybe we can discover a pattern there as well?
On August 23, 2010
Steve Yegge wrote (and writes, again) about interesting stuff. Two of
my favorite essays are written by him. One is about
Compilers and why you should know them, the other about
why it's important that programmers learn mathematics. Read them both
to better understand where I come from.
I came to the conclusion that saying mathematics has nothing to do with
programming, is like saying programming and software contain no patterns or
structure.
For me this means I get a little bit happier every time I understand a bit more
mathematics, or when I read an essay or see a video that explains how someone
used their understanding of mathematics in writing a piece of software.
It also explains why I get a little less happy when I have to write another
piece of database code, error-checking code for forms or controllers that have
every bit of functionality spelled out. Computers are smarter than that. And we
should be, too.
My newfound optimism comes from two papers that I read in the last two weeks.
In the first paper, Notes for the Programming course at Adobe, Alex
Stepanov writes:
If traditional mathematics deals with sets of values and operations on them,
value algebras, we have to deal with sets of locations and operations on
them: location algebras.
This helps us understand how we can see mathematics together with the use of
pointers. A pointer is a location. Stepanov talks about how iterators are an
abstraction of location, and how you write general algorithms for sorting and
searching on top of this abstraction, without knowing which data structure is
used underneath.
In the second paper, about regular expressions, Fischer, Huch and
Wilke show how their regular expression algorithms can be abstracted over
semirings (a mathematical structure). In practice this means you
can use the semiring { S = { False, True }, zero = False,
one = True, ⊕ = ∨, ⊗ = ∧ } to find out if a regex matches and the semiring
of non-negative integers { S = ℕ₀, zero = 0, one = 1, ⊕ = +, ⊗ = ∙ } to
find the number of times a regex matches for a particular string. Another
semiring finds the leftmost position of a match.
These are all practical applications of abstract mathematical concepts.
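To get a feeling for this I wrote a small sketch in Perl. It is not the algorithm from the paper: it is a naive function that tries every way to split the string, and the representation of regexes as nested arrays is my own choice. The only thing it is meant to show is that the same code answers different questions when you swap the semiring.

```perl
use strict;
use warnings;

# A semiring is given here as a hash with zero, one, plus and times.
my %boolean = (
    zero  => 0,
    one   => 1,
    plus  => sub { ($_[0] || $_[1]) ? 1 : 0 },
    times => sub { ($_[0] && $_[1]) ? 1 : 0 },
);
my %counting = (
    zero  => 0,
    one   => 1,
    plus  => sub { $_[0] + $_[1] },
    times => sub { $_[0] * $_[1] },
);

# Regexes are nested arrays: [sym => 'a'], [alt => $p, $q],
# [seq => $p, $q], [rep => $p]. weight() sums over every way to split
# the string, so it is exponential and only meant as an illustration.
sub weight {
    my ($sr, $re, $str) = @_;
    my ($op, $p, $q) = @$re;
    my $len = length $str;

    if ($op eq 'sym') {
        return $str eq $p ? $sr->{one} : $sr->{zero};
    }
    if ($op eq 'alt') {
        return $sr->{plus}->(weight($sr, $p, $str), weight($sr, $q, $str));
    }
    if ($op eq 'seq') {
        my $sum = $sr->{zero};
        for my $i (0 .. $len) {
            my $left  = weight($sr, $p, substr($str, 0, $i));
            my $right = weight($sr, $q, substr($str, $i));
            $sum = $sr->{plus}->($sum, $sr->{times}->($left, $right));
        }
        return $sum;
    }
    if ($op eq 'rep') {
        return $sr->{one} if $len == 0;
        my $sum = $sr->{zero};
        # The first repetition takes a non-empty prefix, the rest of
        # the string is matched by the repetition itself.
        for my $i (1 .. $len) {
            my $first = weight($sr, $p, substr($str, 0, $i));
            my $rest  = weight($sr, $re, substr($str, $i));
            $sum = $sr->{plus}->($sum, $sr->{times}->($first, $rest));
        }
        return $sum;
    }
    die "unknown operator $op";
}

# The regex (a*)(a*) against the string "aa".
my $a_star = [ rep => [ sym => 'a' ] ];
my $regex  = [ seq => $a_star, $a_star ];

my $matches = weight(\%boolean,  $regex, 'aa');
my $ways    = weight(\%counting, $regex, 'aa');
print "matches: $matches, ways: $ways\n";
```

With the boolean semiring the answer is 1 (it matches). With the counting semiring the answer is 3, one for each way to divide "aa" between the two stars.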
I think that we can become better programmers, if we better understand what
mathematics is. Patterns are hiding in our code and we don't even know it.
Mathematics will help us discover these patterns.
On August 19, 2010
I was reading presentation slides from YAPC::EU 2010 and found this one: Web
Automation with
WWW::Mechanize::Firefox.
It explains a module that connects to a running Firefox instance and uses
it to follow links, create screenshots and more.
I rewrote the example a little bit to make it work and ended up with this.
These are two methods that work. The original method didn't work for me.
# The original line in the example
my $png = $mech->content_as_png();
# Method 1
my $png = $mech->element_as_png($mech->selector('html'));
# Method 2
my $png = $mech->content_as_png(undef,
{left=>0,top=>0,width=>200, height=>200});
Now I can automatically create screenshots of web pages. I will still need
some way to control the width of the page and maybe a way to crop parts. I
could use the element_as_png method for cropping, which allows me to get a
screenshot of part of a page. Maybe that's enough.
On August 5, 2010
Update on Google Wave:
Wave has taught us a lot, and we are proud of the team for the ways in which
they have pushed the boundaries of computer science. We are excited about
what they will develop next as we continue to create innovations with the
potential to advance technology and the wider web.
It's sad that Google Wave was cancelled. I use it every day. And although I have
to admit that there are some problems with it, it is an amazing web application
for collaboration between a few people. I wrote some project plans and lists of
todo items. Other people commented and rewrote posts and I did the same.
I think I would like it to be more like a realtime wiki than a communication
device. So it should be easier to link between posts. And since Google won't
create and support a realtime wiki, maybe this is the time to salvage
some Google Wave technology and build it myself. As if I have the time
for that kind of project.
On August 4, 2010
New study suggests full-fat milk might be better:
We find that the persons who consumed the highest amount of full-fat dairy
had a 70 per cent reduced risk of dying from cardiovascular disease, compared
to people who do not eat full-fat dairy; so that's quite a large effect.