On March 29, 2010
People write blog posts. And when you have written a lot of blog posts, there
comes a time when it becomes necessary to divide the posts into smaller
collections of posts. One way to do this is pagination.
Pagination divides a list of items into a number of pages. Each page
has a URI, contains a few items and links to other pages in the list.
There are many places where this method is used. Two examples are search result
pages and blogs. Search result pages can contain many, many results, sometimes
as many as a few million. Showing all the results is a waste of space and
bandwidth, as most people won't even look past the first page.

For blogs this is a little bit different. The posts are in a reverse
chronological order, thus starting with the latest post. Sometimes the last ten
posts are shown on the same page, sometimes only one post.
A big difference between these two examples is that in search engines, the
list is ephemeral. This list doesn't need to be the same every time you look at
it. Some results move up and some results move down. Search engines shouldn't
even index them, as there is no value in those pages for them.
Blogs, on the other hand, have a lot of value in their older posts. These posts
are useful for search engines and all have permanent URIs. But that isn't always
the way people find them. Sometimes a person finds a post through an archive
page that lists several posts. The problem with this is that the posts move
deeper and deeper into later pages, because the blog orders the posts from new
to old.
For example, take a blog with ten posts that shows ten posts per page.
| Page 1 |
| 10, 9, 8, 7, 6, 5, 4, 3, 2, 1 |
When the author now writes a new post, all the posts move one position to
the right.
| Page 1 | Page 2 |
| 11, 10, 9, 8, 7, 6, 5, 4, 3, 2 | 1 |
The first post was on page 1, but now moves to the second page. A search engine
or user that thought the post was on page 1 now has to find it again, because
the URI of the page that contains it has changed. If the author writes even more
posts, the other posts move as well.
The historical solution
The solution that I describe above was used for a long time, and probably still
is, because (1) it's easy to implement and fits the way the pages are generated
and (2) each page, except for the last, contains the maximum number of
items for a page.
A program that generates the pages for a blog works with a list of all the
posts on the blog in reverse chronological order. It loops through the list
from the first to the last, starting a new page whenever it has shown some
number of posts, for example, ten. The program writes the footer of the last
page when there are no more posts left to show.
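The loop described above can be sketched in a few lines of Perl. This is only an illustration: the function name paginate_newest_first and the use of plain numbers as posts are assumptions, not the code of any particular blog engine.

```perl
use strict;
use warnings;

# Split the posts into pages of $per_page items, newest post first.
# @posts is assumed to be in creation order (oldest first).
sub paginate_newest_first {
    my ($per_page, @posts) = @_;
    my @newest_first = reverse @posts;
    my @pages;
    while (@newest_first) {
        # Take the next $per_page posts (or whatever is left) as one page.
        push @pages, [ splice @newest_first, 0, $per_page ];
    }
    return @pages;
}

my @pages = paginate_newest_first(10, 1 .. 11);
print "Page ", $_ + 1, ": @{ $pages[$_] }\n" for 0 .. $#pages;
```

With eleven posts, post 1 ends up alone on page 2, which is exactly the moving post problem shown in the tables above.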
Other solutions
A better solution takes the moving post problem into account. To solve the
problem, we should find another way to divide the list of posts into
different pages.
One way to divide the pages is by grouping the posts by a value that doesn't
change, for example, the combination of the year and month of the creation date
of the post. You could create a page for each month of posts. How well this
works depends on the number of posts written, because you don't want more than
about ten or twenty posts on one page.
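A minimal sketch of the year-month grouping in Perl could look like this. The post structure (a hash with title and created keys, where created is a YYYY-MM-DD string) and the name group_by_month are assumptions made for the example.

```perl
use strict;
use warnings;

# Group post titles by the year and month of their creation date.
# Each post is a hash ref with hypothetical 'title' and 'created' keys.
sub group_by_month {
    my @posts = @_;
    my %pages;
    for my $post (@posts) {
        # Skip posts whose date does not start with YYYY-MM.
        my ($year_month) = $post->{created} =~ /^(\d{4}-\d{2})/
            or next;
        push @{ $pages{$year_month} }, $post->{title};
    }
    return %pages;
}

my %pages = group_by_month(
    { title => 'Post C', created => '2010-03-29' },
    { title => 'Post B', created => '2010-03-22' },
    { title => 'Post A', created => '2010-02-14' },
);
for my $month (reverse sort keys %pages) {
    print "$month: ", join(', ', @{ $pages{$month} }), "\n";
}
```

Because the creation date of a post never changes, a post stays on the same month page no matter how many posts are written later.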
A third solution would be to create the pages from the first post to the last.
This way a post always stays on the same page, because its index in the list
doesn't change. The problem with this solution is that the homepage, which
shows the newest page, contains fewer items than the other pages nine out of
ten times.
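The arithmetic behind this third solution is small enough to sketch in Perl. The function names stable_page_for and posts_on_newest_page are made up for this example, and posts are assumed to be numbered from 1 in creation order.

```perl
use strict;
use warnings;

# Page number of a post when pages are filled from the oldest post onwards.
# $index is the 1-based position of the post in creation order.
sub stable_page_for {
    my ($index, $per_page) = @_;
    return int( ($index - 1) / $per_page ) + 1;
}

# Number of posts on the newest page, the one the homepage would show.
sub posts_on_newest_page {
    my ($total, $per_page) = @_;
    return 0 unless $total;
    return ( ($total - 1) % $per_page ) + 1;
}

for my $total (10, 11) {
    printf "%d posts: post 1 is on page %d, newest page holds %d post(s)\n",
        $total, stable_page_for(1, 10), posts_on_newest_page($total, 10);
}
```

The page of a post depends only on its own index, so it never moves. The price is visible in the second function: right after the eleventh post is written, the newest page holds a single post.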
The fourth solution, which I'll call the More solution, only works on a
dynamically generated blog. The other three solutions all work for a statically
generated blog as well. Twitter uses the More solution.

First we show a list of the first ten items of the blog. At the end of the
list we show a link or button with the text More. Clicking this link
loads the next ten items from the list of posts. This works because the More
button carries the timestamp or id of the last item in the list, and clicking
it loads the next ten posts that have an id or timestamp smaller than that of
the current last post.
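As a rough sketch in Perl, the query behind the More link could look like the following. The function name more_posts and the use of plain numeric ids as posts are assumptions for the example; a real blog would run a similar query against its database.

```perl
use strict;
use warnings;

# Return up to $limit post ids that are smaller than $before_id, newest first.
# $before_id is undef for the first batch; @$posts holds numeric ids.
sub more_posts {
    my ($posts, $before_id, $limit) = @_;
    my @older  = grep { !defined $before_id || $_ < $before_id } @$posts;
    my @sorted = sort { $b <=> $a } @older;
    splice @sorted, $limit if @sorted > $limit;    # keep only the first $limit
    return @sorted;
}

my @all   = 1 .. 25;
my @first = more_posts(\@all, undef, 10);         # ids 25 down to 16
my @next  = more_posts(\@all, $first[-1], 10);    # ids 15 down to 6
print "The More link carries id $first[-1]\n";
print "Next batch: @next\n";
```

The link only needs to remember the id of the last post shown, so new posts at the top of the list don't shift what a given More link returns.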
When a search engine finds these links, it creates more search results than in
a model where each post can only be added to one page. In this model each post
can be on as many as ten pages at any given time, depending on how often a
search engine (or user) finds a link to such a page.
Conclusion
Each solution works best in a different situation. I prefer blogs that use the
year-month approach for splitting up the pages, because the posts are split in
a natural way.
In a search engine, however, or on other ephemeral pages, the More approach is
better, because most people don't want to go deeper into the results, but if
they do, they can use the More button.
On March 22, 2010
It becomes more and more apparent that we need to work on software and hardware
that will allow us to run our own versions of the services that we use online.
We need a kind of home server computer that acts like a telephone. Every home
needs one and (I'm predicting) will have one in ten years.
This presentation by Eben Moglen about Freedom in the Cloud
talks about how we, as geeks and software developers, can accomplish this goal
a bit faster. It really isn't that hard. We should start building
these home server devices, iterate, and search for useful features that will
help people find what they need and see how it can help them in their lives.
Danny O'Brien presented a similar idea at OpenTech 2008, called
Living on the Edge.
On March 10, 2010
Inspired by the "Beneficially Relating Elements" phrase of Kent Beck, I started
out creating a built-in weblog for my webshop platform. The nice thing about the
idea is that it helps you find relations between elements that you already have.
I started with the question "What is a weblog?" A weblog is a chronological list
of pages. By answering this way, I can reuse two elements that are already
supported in the webshop: collections and pages.
Collections are lists of products and other collections. Nothing more, nothing
less. By increasing the scope of collections a little bit, I can also include
pages.
A page is a piece of text that can be shown in the webshop. It has a URL. By
making a weblog post a page, I can reuse all the infrastructure of pages
for weblog posts. This includes creating, editing, saving and showing. The only
thing missing from a page is the creation date, which is needed to sort the
weblog posts chronologically.
The other two things that simplified the weblog feature are plug-ins and
routes. Plug-ins are small packages of code that are loaded at the start of a
request and hook into the rendering, loading and saving code.
Routes are ways to convert URLs to controllers. The latest release made it
possible to create routes based on regular expressions, and all URLs are now
parsed this way. This allowed me to use the names of the pages in the URLs.
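To give an idea of what regular expression based routes can look like, here is a small Perl sketch. It is not the code of the webshop platform: the route table, the URL layout and the dispatch function are all hypothetical.

```perl
use strict;
use warnings;

# Hypothetical route table: each entry pairs a regular expression
# with a controller (here just a code ref that returns a string).
my @routes = (
    [ qr{^/weblog/?$}        => sub { "weblog index" } ],
    [ qr{^/weblog/([\w-]+)$} => sub { "weblog post '$_[0]'" } ],
    [ qr{^/pages/([\w-]+)$}  => sub { "page '$_[0]'" } ],
);

# Call the first controller whose pattern matches the URL,
# passing the captured parts (such as the page name) as arguments.
sub dispatch {
    my ($url) = @_;
    for my $route (@routes) {
        my ($pattern, $controller) = @$route;
        if (my @captures = $url =~ $pattern) {
            return $controller->(@captures);
        }
    }
    return "not found";
}

print dispatch('/weblog/pagination'), "\n";
print dispatch('/pages/about'), "\n";
```

The captured group is what makes it possible to put the name of a page directly in its URL.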
By relating the pieces, I created a new feature that is useful in itself,
without having to write a lot of new code. Now when I add 'comments' as a
feature to the weblog, I will also automatically add comments as a feature to
pages, because they are the same thing.
Two features for the price of one. I like it.
On March 8, 2010
The next feature I will talk about is the given/when construct. This was
added in Perl 5.10. It works like switch/case in other programming
languages, but is much more powerful. The matching is based on smart matching,
which is another feature added in 5.10.
I will start with a simple example to give you an idea of the syntax that is
used.
    use 5.010;
    my $x = <>;
    chomp $x;
    given ($x) {
        when ([0..99]) {
            say "Looking good";
        }
        when ([100..199]) {
            say "That's a bit much";
        }
        default {
            say "This could be a problem";
        }
    }
This code compares the value of $x with the arrays in the when statements.
If $x is between 0 and 99 (inclusive) it will print the text Looking good. If
it's between 100 and 199 it will say That's a bit much. The default block
will be called when the value isn't matched by any of the when blocks.
Next I will give a more useful example, but not much more.
    use 5.010;
    my ($x, $y) = (0,0);
    LINE: while (<>) {
        my @parts = split /\s+/;
        for (@parts) {
            when (/^x(\d+)/) {
                $x = $1;
            }
            when (/^y(\d+)/) {
                $y = $1;
            }
            when (/^p/) {
                say $x + $y;
            }
            when (/^q/) {
                last LINE;
            }
        }
    }
This example reads lists of tokens from STDIN and matches them and executes code
based on the input. In effect it's a small programming language. Notice that
this code doesn't use the given statement. It's not needed here, because the for
already assigns each element of @parts to $_.
It's also possible to use simple expressions like you would use in an if statement.
For example:
    use 5.010;
    my $age = <>;
    chomp $age;
    given ($age) {
        when (!/^\d+$/) {
            say "Not a number";
        }
        when ($_ > 100) {
            say "That's quite old";
        }
        when (18) {
            say "Now your life begins...";
        }
        when (0) {
            say "Just born, and already using the computer.";
        }
        default {
            say "I have nothing useful to say about '$age'";
        }
    }
As you can see, when is quite smart about what to do with different
expressions. The first when clause contains a negated regular expression.
This will be matched using $age !~ m/REGEX/. The second one does what you
expect. The 18 and 0 clauses will match using $age == 18 and $age == 0.
You should watch out when comparing to 0, because this will also match empty
strings and strings that don't start with a number. For example, if
$age = 'hello', when (0) will match, unless an earlier clause (like the regular
expression check in this example) has already caught the value.
Smart matching is really powerful. With given and when it's easy to use this
power for deciding what to do with the value that you've been given. You
should take a look at the manual for more information about the possible smart
matches and the things you can do with given and when.
On March 2, 2010
When I'm programming, I sometimes need to use a problem solving pattern that
Kent Beck called Parallel.
The pattern tells us that when you're making a change you should also leave the
old way of doing things in the program, so you can gradually move to the new
solution. This helps in situations where you can't just flip a switch to
migrate.
The problem with this, however, is that you do have to migrate all the way to
the new solution, because otherwise you will end up with twice the code, and
with data in various states.
When I'm using Parallel, I like to keep an eye on the progress and I don't like
to go back to the old way. To help me with this I created a unit test that monitors
the progress that I make, while changing the program and the data.
It would be nice to have a test module that keeps track of this, but at the moment
I only have a small test program. It does three things:
- Read the previous count (of old way occurrences).
- Check if the current count is smaller than or equal to the previous count.
- If so, the test succeeds and writes the current count to a file. If not, the test fails.
This way the tests will only succeed if I make improvements or if the code
stays the same. The count of old way occurrences can only go down.
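The three steps can be sketched in Perl like this. It is not my actual test program: the function name ratchet_ok is made up, how the old way occurrences are counted is left to the caller, and accepting any count when no previous count exists is an assumption.

```perl
use strict;
use warnings;

# Compare the current count of old way occurrences with the stored count.
# Returns true, and records the current count, when the count has not grown.
sub ratchet_ok {
    my ($file, $current) = @_;

    # Step 1: read the previous count, if there is one.
    my $previous;
    if (open my $in, '<', $file) {
        $previous = <$in>;
        close $in;
        chomp $previous if defined $previous;
    }

    # Step 2: fail when the count went up. With no previous count
    # on record, any count is accepted (an assumption of this sketch).
    return 0
        if defined $previous && length $previous && $current > $previous;

    # Step 3: the check passed, so remember the current count.
    open my $out, '>', $file or die "Cannot write $file: $!";
    print {$out} "$current\n";
    close $out;
    return 1;
}
```

In a test file this could be used as ok(ratchet_ok('old_way.count', $count), 'old way usage did not grow') with Test::More, where $count comes from whatever counts the old way occurrences in the code or data.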