Posted April 7, 2012
The three problems that need to be fixed are subscription, feed reading and
feed creation.
Subscription
Subscribing to feeds is really hard. It should be really simple. Unsubscribing
should be just as simple.
- See something that you want to subscribe to? Click the subscribe button.
- Done!
Unsubscribing should be just as easy.
- You see a post in your feed reader that you don't want to see.
- Click Hide.
- Show an option for unsubscription.
- Click, gone!
Or:
- Visit the page you want to unsubscribe from in your browser.
- Click the unsubscribe button.
- No step three!
That's easy. Why isn't this possible yet?
Feed reading
Reading a feed should also be really simple. Dave Winer
has the River of News. A small paragraph
with optional title, description and link. That's how simple it should be.
Do you see something that you want to read? Read it. If you don't want to read
it, just scroll further for other posts.
Nothing is remembered or saved, unless you want it to be. Click 'star',
'favorite' or 'bookmark', and it is saved in a to-read feed, so you can find it
later.
Feed creation
You have a feed. Write an update in a textarea. Click 'Publish'. Done. It
appears in your feed and the rivers of everyone who subscribed.
These three parts are easy and important.
Posted March 25, 2011
The use cases of Camlistore are all very
interesting. I'm interested to see how this all pans out.
Posted March 25, 2011
Camlistore seems like a really cool project when it
gets farther along. Or actually it seems really cool already, but it's not
useable at the moment. Something to take a look at sometime in the future.
Posted March 12, 2011
This week I created a way for me to post small messages, as there is not yet a
good decentralized way to make this happen. One part of this research is how I
can create, host and share RSS feeds without using too much bandwidth and
server time.
One way we can do this is by using the features of the HTTP protocol. One of
those features is the Last-Modified header. This header allows web servers
and user agents to see when a resource was last modified. Together with an
If-Modified-Since header, we can let our software check if it needs to send the
whole body or just a simple 304 (Not Modified) response.
However, to make sure that this works as advertised, we need a way to simulate
this situation. I use lwp-mirror to test this, which is included with the
LWP package from Perl. It mirrors a remote resource to a local file.
Let's start the test. First download your resource to a file with the
lwp-mirror command.
lwp-mirror <url> <local_file>
Now download the file again and check that the server answers with a 304 code
this time. Then do something that changes the resource on the server. In my case
this is done by adding a new post. Now download the resource again and see if
the server sends a new version of the file. Once that is downloaded, check that
the next download gets another 304 code.
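The decision the server has to make in this test can be sketched in a few lines of Perl. This is only a minimal sketch under my own assumptions: parse_http_date and status_for are names I made up, only the fixed "Sat, 12 Mar 2011 10:00:00 GMT" date format is accepted, and a real server would have to handle more cases.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

my @MONTHS = qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec);
my %MONTH_INDEX;
@MONTH_INDEX{@MONTHS} = 0 .. 11;

# Parse a date such as "Sat, 12 Mar 2011 10:00:00 GMT" into epoch seconds.
# Returns undef for anything that is not in this fixed format.
sub parse_http_date {
    my ($str) = @_;
    return undef unless defined $str;
    my ($day, $mon, $year, $hour, $min, $sec) =
        $str =~ /\A\w{3}, (\d{2}) (\w{3}) (\d{4}) (\d{2}):(\d{2}):(\d{2}) GMT\z/
        or return undef;
    return undef unless exists $MONTH_INDEX{$mon};
    return eval { timegm($sec, $min, $hour, $day, $MONTH_INDEX{$mon}, $year) };
}

# Decide between 200 (send the whole body) and 304 (send headers only).
sub status_for {
    my ($last_modified, $if_modified_since) = @_;
    my $mtime = parse_http_date($last_modified);
    my $since = parse_http_date($if_modified_since);
    return 200 unless defined $mtime && defined $since;
    return $mtime <= $since ? 304 : 200;
}

my $stamp  = 'Sat, 12 Mar 2011 10:00:00 GMT';
my $first  = status_for($stamp, undef);    # no local copy yet
my $second = status_for($stamp, $stamp);   # nothing changed since the download
my $third  = status_for('Sat, 12 Mar 2011 11:30:00 GMT', $stamp);  # new post
print "$first $second $third\n";           # prints "200 304 200"
```

The three calls follow the same order as the lwp-mirror test: first download, second download, download after a change.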
Posted August 31, 2010
Until today I couldn't use variables in my template that are pieces of code. I
added one piece of code that executes the code reference in the stash and
returns its value. In the template it looks like this.
[% FOR p IN products %]
<p>[% p.name %]
[% END %]
There are two places in this piece of code that could contain code references.
The first is products. This could be implemented as follows.
my $stash = {
    products => sub { my $db = shift; return $db->ProductList(); },
};
Here I show the implementation of the template evaluation code.
sub find_value_in_stash {
    my ($db, $stash, $name) = @_;
    my $it = $stash;
    for my $p (split /\./, $name) {
        $it = $it->{$p};
        if (ref($it) eq 'CODE') {
            $it = $it->($db);
        }
    }
    return $it;
}
This code doesn't contain the error-checking code that's necessary for a
production environment. This code allows us to add variables to the stash
without knowing the value when we add it. The nice thing is that we don't need
to execute the potentially expensive code for retrieving all the products from
the database unless the template actually uses them.
By adding a simple two-line feature like this to the templating system, we can
write simpler controller code. The controllers don't need to retrieve all the
information from the database if it isn't used. If the variables are used in the
template, then the values will automatically be loaded by the templating engine.
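To see the lazy loading at work, here is a small sketch that counts how often the database is asked for the products. The FakeDB class and the title entry in the stash are stand-ins I made up for this example; find_value_in_stash is copied from the listing above.

```perl
use strict;
use warnings;

# Copied from the listing above, still without error checking.
sub find_value_in_stash {
    my ($db, $stash, $name) = @_;
    my $it = $stash;
    for my $p (split /\./, $name) {
        $it = $it->{$p};
        if (ref($it) eq 'CODE') {
            $it = $it->($db);
        }
    }
    return $it;
}

# Hypothetical stand-in for the database; it counts the queries.
package FakeDB {
    sub new { return bless { calls => 0 }, shift }
    sub ProductList {
        my ($self) = @_;
        $self->{calls}++;
        return [ { name => 'Tea' }, { name => 'Coffee' } ];
    }
}

my $db    = FakeDB->new;
my $stash = {
    title    => 'Shop',
    products => sub { my $db = shift; return $db->ProductList(); },
};

# A template that only uses the title never touches the database.
my $title             = find_value_in_stash($db, $stash, 'title');
my $calls_after_title = $db->{calls};

# Only when the template asks for the products is the query executed.
my $products             = find_value_in_stash($db, $stash, 'products');
my $calls_after_products = $db->{calls};
```

After the first lookup the counter is still 0, after the second it is 1.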
The second place in the template where we can use code is in the second line,
where we get the name field. This field could be a value in a hash. On the
other hand, it could be a method on the object p. By adding a few more lines to
the find_value_in_stash function we can use objects, as well as simple hash
values, in templates.
The line that moves to the next value in the stash needs to be changed to the
following. The line
$it = $it->{$p};
becomes
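# Requires "use Scalar::Util 'blessed';" at the top of the file.
# Only objects can be asked for methods; plain hashes use the else branch.
if (blessed($it) && (my $meth = $it->can($p))) {
    $it = $it->$meth();
}
else {
    $it = $it->{$p};
}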
This change allows us to use methods on objects. It enables us to write code in
classes, that is executed when needed, instead of when the controller was
written to build the parameter hash.
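Put together, the lookup could look like the sketch below. The Product class and the stash contents are made up for the example. I check with blessed from Scalar::Util before calling can, because can may only be called on an object and not on a plain hash reference.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

sub find_value_in_stash {
    my ($db, $stash, $name) = @_;
    my $it = $stash;
    for my $p (split /\./, $name) {
        if (blessed($it) && (my $meth = $it->can($p))) {
            $it = $it->$meth();          # object: call the method
        }
        else {
            $it = $it->{$p};             # plain hash: look up the key
        }
        $it = $it->($db) if ref($it) eq 'CODE';
    }
    return $it;
}

# Hypothetical class whose name is computed when it is asked for.
package Product {
    sub new  { my ($class, %args) = @_; return bless {%args}, $class }
    sub name { my ($self) = @_; return uc $self->{title} }
}

my $stash = {
    p => Product->new(title => 'tea'),   # an object with a name method
    q => { name => 'coffee' },           # a plain hash with a name key
};

my $from_method = find_value_in_stash(undef, $stash, 'p.name');
my $from_hash   = find_value_in_stash(undef, $stash, 'q.name');
print "$from_method $from_hash\n";       # prints "TEA coffee"
```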
To be clear, Template::Toolkit provides both these features. I have written and
seen a few web applications and most of them didn't use and create many objects,
because there was a tendency to think of objects as being slow and using much
memory. I do think we should watch out for creating many unused objects or
loading many rows from a database, because it can slow down your web application
a lot. I consider not using method calls and sub references here a form of
premature optimisation.
Posted August 26, 2010
In the last two essays we established that there is a dispatcher, multiple
controllers and multiple actions. The dispatcher creates a controller and
calls the action. Why do we split the application into these parts?
First the Dispatcher. The Dispatcher applies rules to a URL and chooses the
corresponding Controller and Action. The Controller is a container for actions
and applies some default values to the action. The Action contains the code
that's necessary to perform some transformation on the application data.
By structuring the design like this, the only task for the Controller is to be
a container for actions. We group each Action together with semantically similar
actions, that is, actions that apply to the same type of information. For
example, the Guestbook controller contains two actions: list and add_entry. The
first action shows a list of guest book entries, the second action adds a new
entry to the list. On the surface it makes sense to group Actions like this in
the controller. The actions in the controller perform operations on the same
data type, e.g. the list of guest book entries.
If we begin with Actions instead, then we can structure the application in
another way. Each type of Action gets its own class, e.g. the action show is
performed by the ShowAction class. The ShowAction contains the code for
showing one data item. It could be possible to generalize the code for every
type in the system. And like the ShowAction we can also create an
EditAction and an UpdateAction.
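As a rough sketch of what such a generalized ShowAction could look like: the constructor arguments type and loader and the run method are my own invention here, not code from the application.

```perl
use strict;
use warnings;

# Hypothetical generic action: shows one item of any type.
package ShowAction {
    sub new {
        my ($class, %args) = @_;
        # type:   name used to pick the template, e.g. 'guestbook'
        # loader: code reference that returns one item for an id
        return bless { type => $args{type}, loader => $args{loader} }, $class;
    }

    sub run {
        my ($self, $id) = @_;
        my $item = $self->{loader}->($id);
        return { template => "$self->{type}/show", item => $item };
    }
}

# The only type-specific parts are the name and the way to load an item.
my %entries    = (1 => { author => 'Peter', text => 'Hello' });
my $show_entry = ShowAction->new(
    type   => 'guestbook',
    loader => sub { my ($id) = @_; return $entries{$id} },
);

my $result = $show_entry->run(1);
```

The same class could then be reused for orders or products by passing another type and loader.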
The controller based structure enables us to spell out every part of the
computation, from beginning to end. We're free to do whatever we want. The
action based structure forces us to become more structured programmers. A
generic action can't contain code that is specific to one data type. We'd have
to write an Action class for every kind of action in the system.
I think it's better if we specify and generate code for the specific differences
and let the general case be handled by the Action class. On the other hand,
couldn't we apply these lessons to a Controller based model?
Posted August 25, 2010
Yesterday we looked at the structure inside of two simple controller actions.
Now let us look at the outside. The structure of the two actions can be shown as
a tree.
I hid the rest of the methods that would normally be in these controllers. In
front of these controllers there is another class, called the Dispatcher. This
Dispatcher takes a request and calls the appropriate controller and action.
The Dispatcher translates a URL into two pieces of information: the controller
and the action. The Dispatcher then translates the controller name into a class
name like this.
my ($controller, $action, $id) = ($url =~ m{^/(\w+)(?:/(\w+)(?:/(\d+))?)?});
my $classname = 'AppName::Controller::' . ucfirst $controller;
eval "require $classname";
my $obj = $classname->BUILD();
$obj->$action();
This simplified version of the code calls the controller. This
contains no error checking, which is really important in a secure web
application.
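One possible form of that error checking is to accept only controllers and actions from a fixed list before anything is loaded or called. The sketch below is an assumption of mine, not the code I use: %ALLOWED and resolve are made-up names, and the function only resolves the names, it doesn't load the class.

```perl
use strict;
use warnings;

# Hypothetical whitelist: controller name => actions reachable from a URL.
my %ALLOWED = (
    guestbook => { list => 1, add_entry => 1 },
    orders    => { list => 1 },
);

# Returns (class name, action, id) for a known URL, or an empty list.
sub resolve {
    my ($url) = @_;
    # Same pattern as above, but anchored at both ends of the URL.
    my ($controller, $action, $id) =
        ($url =~ m{\A/(\w+)(?:/(\w+)(?:/(\d+))?)?\z});
    return unless defined $controller && defined $action;
    return unless $ALLOWED{$controller} && $ALLOWED{$controller}{$action};
    return ('AppName::Controller::' . ucfirst($controller), $action, $id);
}

my @known   = resolve('/guestbook/list');      # class and action are returned
my @unknown = resolve('/guestbook/DESTROY');   # not in the list: empty
my @broken  = resolve('/guestbook/list/../');  # doesn't match: empty
```

Without such a list any method name that matches \w+ could be called from a URL, including methods that were never meant to be actions.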
While I was thinking about writing this article, at one point I thought about
the direction of the calls and which part controls the execution flow. In the
example the URL is passed to the Dispatcher, which finds the controller. It looks
up the controller class, then looks up the method and calls it. This way the URL
and the code are coupled.
The URL determines the class and method that gets called. In this design we
can't split up classes, because all controller methods need to be contained in
the same class. We can't split or join classes or create smarter software
because of this design decision. Furthermore, because the controller is created
at the start of a request, we can't use the same controller for different URLs.
Each URL needs its own piece of code. I argue that because of these problems we
can't even use the techniques we know to improve the design.
We could increase the flexibility of the design by using objects instead of
classes. Objects are more flexible, we can replace parts of the system by
setting an instance variable to a different object.
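A small sketch of what I mean by using objects: the routes point at controller objects instead of class names derived from the URL. The ListController class, the %routes table and the URLs are all made up for this example.

```perl
use strict;
use warnings;

# Hypothetical controller class; one instance can serve several URLs.
package ListController {
    sub new  { my ($class, %args) = @_; return bless {%args}, $class }
    sub list { my ($self) = @_; return "list of $self->{what}" }
}

# Routes point at objects, so the URL no longer decides the class name.
my $guestbook = ListController->new(what => 'guestbook entries');
my %routes = (
    '/guestbook/list' => [ $guestbook, 'list' ],
    '/gastenboek'     => [ $guestbook, 'list' ],   # same object, second URL
    '/orders/list'    => [ ListController->new(what => 'orders'), 'list' ],
);

sub dispatch {
    my ($url) = @_;
    my $route = $routes{$url} or return;
    my ($obj, $action) = @$route;
    return $obj->$action();
}

my $first  = dispatch('/guestbook/list');
my $second = dispatch('/gastenboek');
```

Replacing a part of the system is now a matter of putting a different object in the table.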
To find out how we can redesign the code with this new knowledge, we have to
take a look at the structure of more controllers.
I still have a few questions about the structure of the system. Where is the
boundary between controllers and actions? Is there a boundary? Do we need
controllers at all, or is having actions enough? As always we have to consider
more sides of this problem.
Posted August 24, 2010
Alex Stepanov writes in Notes on Programming:
We often get the idea that a mathematical theory is built in a logical way
starting from definitions and axioms. This is not the case. The definitions
and axioms appear at the very end of the development of a good theory. It
invariably starts with simple facts that later on are generalized into
theorems, and only at the very end the formal definitions and axioms are
developed.
In an effort to become a better programmer (while at the same time rewriting one
of my software programs), we'll look at a piece of web application code that I
wrote a few years ago. The language used in these examples is Perl.
The idea is to find a better abstraction, to make it easier to add and change
code related to the guestbook (and other controllers), while at the same time
finding the underlying abstraction that isn't yet obvious from the code.
The piece of code in question is the list function of the Guestbook
controller. It lists some entries, limited by a constant. The function loads
the entries from the database using the guestbook_load_entries function.
The webserver and a few other classes parse the URL /guestbook/list and
dispatch the request to this function. Inside the function we validate all
parameters that are needed.
package WebWinkel::Controller::Guestbook;
use strict;
use base qw/WebWinkel::Controller/;
use WebWinkel::DB::Guestbook 'guestbook_load_entries';
sub list {
    my $self = shift;
    my $page = $self->validate(-as_integer => 'page');
    my $entries = guestbook_load_entries($page, 10);
    return $self->render_template('guestbook/list', {
        page => $page, entries => $entries
    });
}
Each of the lines in the function performs a small part of the whole action. I
don't think it's important to look too much at the syntax. We're interested in
the structure of the action.
The first line in the function validates the page parameter. The
guestbook_load_entries function retrieves a list of guestbook entries from
the database in the second line. In the third line the controller renders a
specific template with the parameters we retrieved from the database.
It seems the structure of the process is quite simple.
- Validate the query parameters,
- Load some database entities using the validated parameters,
- Render a template using the database entities.
There are no branches or loops in this piece of code on this level. The
template hides the loops we need for rendering a list.
We can simplify the process to a simple graph.
(1)
{ Validate → Load → Render }
We omit the specifics in this graph, because we're looking for an abstraction.
Each node in this graph depends on the values provided by the previous step.
The Validate step depends on data outside the controller.
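Model (1) can be written down as code that feeds the result of each step into the next one. This is a sketch with made-up stand-ins: the three steps and run_steps are not part of the application, and load returns fixed entries instead of calling guestbook_load_entries.

```perl
use strict;
use warnings;

# Each step takes the data gathered so far and returns it with more added.
sub validate {
    my ($data) = @_;
    my $raw  = $data->{query}{page} // '';
    my $page = $raw =~ /\A(\d+)\z/ ? $1 : 1;   # fall back to page 1
    return { %$data, page => $page };
}

sub load {
    my ($data) = @_;
    # Stand-in for guestbook_load_entries($page, 10).
    my @entries = map { "entry $_" } 1 .. 3;
    return { %$data, entries => \@entries };
}

sub render {
    my ($data) = @_;
    return "page $data->{page}: " . join(', ', @{ $data->{entries} });
}

# Validate → Load → Render: every step depends on the previous one.
sub run_steps {
    my ($input, @steps) = @_;
    my $data = $input;
    $data = $_->($data) for @steps;
    return $data;
}

my $html = run_steps({ query => { page => '2' } }, \&validate, \&load, \&render);
print "$html\n";   # prints "page 2: entry 1, entry 2, entry 3"
```

The query passed to run_steps is the data from outside the controller that the Validate step depends on.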
Now it's time to see whether the model we just created can be applied to other
controllers and their actions. Let's take a look and see if we can find at least
one action that satisfies this model.
package WebWinkel::Controller::Orders;
use strict;
use WebWinkel::DB::Orders qw/orders_find_all_open/;
use Date::Simple qw/today date/;
use base 'WebWinkel::Controller';
sub list {
    my ($self) = @_;
    my ($orders, $payed) = orders_find_all_open();
    my $today = today();
    return $self->render_template('orders/list', {
        today => $today->format("%d-%m-%Y"),
        orders => $orders,
        payed_orders => $payed,
    });
}
I found this piece of code in the Orders controller. Let's see which steps
this function contains.
It starts with the retrieval of the open orders from the database on the first
line. The second line gets the current date. The third line renders the
template with the data gathered in the previous two lines. The structure of
the function looks like this.
{ Load → GatherData → Render }
This shows us that our previous model doesn't completely describe all
controllers and actions. In this instance the GatherData step doesn't depend
on the Load step. We could switch the two statements and the structure will
still be the same. This means we should rewrite this model to include this new
piece of information. We'll use the $ to describe a relation between two steps
where the first doesn't depend on the second and vice versa. The model now
looks like this.
{ { Load $ GatherData } → Render }
The → between the first two steps and Render remains, because all data found
in those steps is used in that step. This model can also be written as follows
{ { GatherData $ Load } → Render }
because we declare the operator $ as being commutative. We can't yet know if
this is an important property of our model, but it describes the examples
better. The model is still quite different from the other model.
How could we combine the two models while still being specific about the steps?
What do the steps Validate and Load have in common? If we look at the two
code examples we see that both steps create information that is used in a later
step. A later step doesn't need to be the next step. The step GatherData also
satisfies this property.
{ { { Validate → Load } $ GatherData } → Render }
The variables found by Validate are passed into Load, but not specifically
into GatherData. How can I say that this model is the same as model (1)?
Let's say step Validate creates zero or more pieces of information. If a step
creates zero pieces of information for the next step, then this is the same as not
performing the step at all, at least with the current understanding of the
model.
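To check my own reasoning I wrote the two relations as small combinators. Everything in this sketch is an assumption: seq stands for →, both stands for $, and the steps return fixed values instead of talking to a database.

```perl
use strict;
use warnings;

# A step maps the data gathered so far to new pieces of information.
# seq(a, b): b sees what a produced (the → relation).
sub seq {
    my @steps = @_;
    return sub {
        my ($data) = @_;
        my %new;
        for my $step (@steps) {
            %new = (%new, %{ $step->({ %$data, %new }) });
        }
        return \%new;
    };
}

# both(a, b): neither step sees the other's output (the $ relation).
sub both {
    my @steps = @_;
    return sub {
        my ($data) = @_;
        my %new;
        %new = (%new, %{ $_->($data) }) for @steps;
        return \%new;
    };
}

my $validate    = sub { return { page => 2 } };
my $load        = sub { my ($d) = @_; return { entries => ["entries for page $d->{page}"] } };
my $gather_data = sub { return { today => '24-08-2010' } };
my $render      = sub { my ($d) = @_; return { html => join(' | ', $d->{today}, @{ $d->{entries} }) } };

# { { { Validate → Load } $ GatherData } → Render } and its mirrored form.
my $model   = seq(both(seq($validate, $load), $gather_data), $render);
my $swapped = seq(both($gather_data, seq($validate, $load)), $render);

# A Validate step that creates zero pieces of information changes nothing.
my $no_validate = sub { return {} };
my $load_open   = sub { return { entries => ['open order'] } };
my $with        = seq($no_validate, $load_open)->({});
my $without     = $load_open->({});
```

Both $model and $swapped produce the same html, and $with contains the same entries as $without.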
This concludes our first look at the structure of controllers and actions.
From experience I know that there are more ways to write a controller. We will
leave those for another time, just like the specifics about finding out which
functions to call. Maybe we can discover a pattern there as well?
Posted March 29, 2010
People write blog posts. And when you have written a lot of blog posts, there
comes a time when it becomes necessary to divide the posts into smaller
collections of posts. One way to do this is pagination.
Pagination divides a list of items into a few pages. Each page
has a URI, contains a few items and links to other pages in the list.
There are many places where this method is used. Two examples are search result
pages and blogs. Search result pages can contain many, many results, sometimes
as many as a few million. Showing all the results is a waste of space and
bandwidth, as most people won't even look past the first page.

For blogs this is a little bit different. The posts are in a reverse
chronological order, thus starting with the latest post. Sometimes the last ten
posts are shown on the same page, sometimes only one post.
A big difference between these two examples is that in search engines, the
list is ephemeral. This list doesn't need to be the same every time you look at
it. Some results move up and some results move down. Search engines shouldn't
even index them, as there is no value in those pages for them.
Blogs on the other hand have a lot of value in the older posts. These posts are
useful for search engines and all have permanent URIs. But that isn't always
the way people find them. Sometimes a person finds an archive page with more
items, that contains the post. The problem with this is that the posts move
deeper and deeper into later pages, because the blog orders the posts from new
to old.
For example, a blog with ten posts.
| Page 1 |
| 10, 9, 8, 7, 6, 5, 4, 3, 2, 1 |
When the author now writes a new post, all the posts move one position to
the right.
| Page 1 | Page 2 |
| 11, 10, 9, 8, 7, 6, 5, 4, 3, 2 | 1 |
The first post was on page 1, but now moves to the second page. A search engine
or user that thought the post was on page 1, now has to find it again, because
the URI has changed. If the author writes even more posts, these posts move as
well.
The historical solution
The solution that I describe above was used for a long time, and probably still
is, because (1) it's easy to implement and fits the way the pages are generated
and (2) each page, except for the last, contains the maximum number of
items for a page.
A program that generates the pages for a blog, has a reversed list of all the
posts that are on the blog. It loops through the list from the first to the
last, starting a new page whenever it has shown some number of posts, for
example, ten. The program writes the footer of the last page when there are no
more posts left to show.
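The loop described above, and the moving post problem it causes, can be shown in a few lines. This is a sketch: paginate and page_of are names I picked, and the posts are just numbers like in the tables above.

```perl
use strict;
use warnings;

# Split a newest-first list of posts into pages of $per_page items.
sub paginate {
    my ($per_page, @posts) = @_;
    my @pages;
    push @pages, [ splice @posts, 0, $per_page ] while @posts;
    return @pages;
}

# Page number (starting at 1) on which a post appears.
sub page_of {
    my ($post, @pages) = @_;
    for my $i (0 .. $#pages) {
        return $i + 1 if grep { $_ == $post } @{ $pages[$i] };
    }
    return;
}

my @before = paginate(10, reverse 1 .. 10);   # ten posts, one page
my @after  = paginate(10, reverse 1 .. 11);   # the author writes post 11

my $page_before = page_of(1, @before);
my $page_after  = page_of(1, @after);
print "post 1: page $page_before, then page $page_after\n";
# prints "post 1: page 1, then page 2"
```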
Other solutions
A better solution takes the moving post problem into account. To solve the
problem, we should find another way to divide the list of posts into
different pages.
A way to divide the pages is by grouping the posts by a value that doesn't
change, for example, the combination of the year and month of the creation date
of the post. You could create a list of pages for each month of posts. Whether
this works depends on the number of posts written, because you don't want more
than about ten or twenty posts on one page.
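A sketch of the grouping by year and month. I assume here that every post is a hash with a date in YYYY-MM-DD form; pages_by_month is a made-up name.

```perl
use strict;
use warnings;

# Group posts by the year and month of their creation date.
sub pages_by_month {
    my (@posts) = @_;
    my %pages;
    for my $post (@posts) {
        my ($month) = $post->{date} =~ /\A(\d{4}-\d{2})-\d{2}\z/
            or next;   # skip posts without a usable date
        push @{ $pages{$month} }, $post;
    }
    return \%pages;
}

my @posts = (
    { id => 1, date => '2010-03-29' },
    { id => 2, date => '2010-08-24' },
    { id => 3, date => '2010-08-25' },
    { id => 4, date => '2010-08-26' },
);

my $before = pages_by_month(@posts);
# A new post in September gets its own page and moves nothing else.
my $after  = pages_by_month(@posts, { id => 5, date => '2010-09-01' });
```

The page for 2010-08 contains the same three posts before and after the new post was added.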
A third solution would be to create the pages from the first post to the last.
This way a post always stays on the same page, because its index in the list
doesn't change. The problem with this solution is that the homepage
contains, nine out of ten times, fewer items than the other pages.
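The arithmetic behind this solution is short. In this sketch stable_page and homepage_size are made-up names, and posts are counted from 1 for the oldest post.

```perl
use strict;
use warnings;

# Page number for the post at position $index, counted from the oldest post.
sub stable_page {
    my ($index, $per_page) = @_;
    return int(($index - 1) / $per_page) + 1;
}

# Number of posts on the homepage, which is the last (newest) page.
sub homepage_size {
    my ($count, $per_page) = @_;
    return 0 unless $count;
    return $count % $per_page || $per_page;
}

my $page_of_first    = stable_page(1, 10);     # 1, whatever is written later
my $page_of_eleventh = stable_page(11, 10);    # 2
my $home_with_11     = homepage_size(11, 10);  # 1 post on the homepage
my $home_with_20     = homepage_size(20, 10);  # 10, a full page
```

With ten posts per page the homepage is only full when the number of posts is a multiple of ten.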
The fourth solution only works on a dynamically generated blog. The other three
solutions all work for a statically generated blog. Twitter uses this solution,
which I'll call the More solution.

First we show a list of the first ten items of the blog. At the end of the
list we show a link or button with the text More. Clicking this link
loads the next ten items from the list of posts. This works because the More
button has the timestamp or id of the last item in the list, and clicking
loads the next ten posts that have an id or timestamp smaller than that of the
current last post.
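The More button could be backed by a function like the one in this sketch. I don't know how Twitter implements it; more_posts and the in-memory list of posts are assumptions, and a real blog would do the same with a database query.

```perl
use strict;
use warnings;

# Return up to $limit posts older than the post with id $before_id
# (or the newest posts when $before_id is undef), newest first.
sub more_posts {
    my ($posts, $before_id, $limit) = @_;
    my @older  = grep { !defined($before_id) || $_->{id} < $before_id } @$posts;
    my @sorted = sort { $b->{id} <=> $a->{id} } @older;
    my $last   = $#sorted < $limit - 1 ? $#sorted : $limit - 1;
    return @sorted[0 .. $last];
}

my @posts = map { +{ id => $_ } } 1 .. 25;

my @first = more_posts(\@posts, undef, 10);            # ids 25 down to 16
my @next  = more_posts(\@posts, $first[-1]{id}, 10);   # ids 15 down to 6

# A new post doesn't change what the same More link returns.
push @posts, { id => 26 };
my @again = more_posts(\@posts, $first[-1]{id}, 10);
```

Because the link contains the id of the last post shown, it keeps pointing at the same posts when new posts are added.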
When a search engine finds these links, it creates more search results than in
a model where each post can only be added to one page. In this model each post
can be on as many as ten pages at any given time, depending on how often a
search engine (or user) finds a link to such a page.
Conclusion
Each solution works best in a different situation. I prefer blogs that use the
year-month approach for splitting up the pages, because the posts are split in
a natural way.
In a search engine, however, or on other ephemeral pages, the More approach is
better, because most people don't want to go deeper into the results, but if
they want to, they can use the More button.
Posted February 3, 2010
I just released a small program that gets the IP address of your computer. The
nice thing is, this service is REST-based.
You can find it at Stuifzand Software Tools.
Posted January 29, 2010
I just read something that seems really nice. It's even strange that it isn't
already used like that. Use an email address-like identifier as pointer to an
account. Let me explain.
Email addresses consist of two parts separated by an @-sign. The first part is
the username, the second part is the domain name of your email provider. For
example, my email address is peter@example.com. The username is peter and
the domain name is example.com.
By taking this approach of username@domain we can dream all kinds of other
combinations that could work.
pstuifzand@twitter.com
peter@wijvervelenons.nl
pstuifzand@flickr.com
You see? And these could all point to all kinds of user accounts. This doesn't
mean that all these addresses should be email addresses. I also don't say that
they shouldn't be. For some things it would make sense, for others it maybe
doesn't.
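Splitting such an identifier into its two parts is simple. This is a sketch with a made-up function name, and it is much less strict than a real email address parser.

```perl
use strict;
use warnings;

# Split "username@domain" into its two parts.
# Returns undef when there isn't exactly one @-sign with text on both sides.
sub parse_identifier {
    my ($identifier) = @_;
    my ($user, $domain) = $identifier =~ /\A([^\@\s]+)\@([^\@\s]+)\z/
        or return undef;
    return { user => $user, domain => $domain };
}

my $account = parse_identifier('pstuifzand@twitter.com');
print "$account->{user} at $account->{domain}\n";
# prints "pstuifzand at twitter.com"
```

The domain part tells the software which service to ask about the account, the username which account to ask for.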
Posted January 22, 2010
Plack is a Perl Web Server:
Plack is the superglue interface between perl web application frameworks and
web servers, just like Perl is the duct tape of the internet.
This is interesting and maybe useful in my webshop platform.
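For reference, an application for Plack is nothing more than a code reference that follows the PSGI interface. The sketch below calls the application directly, so it runs without Plack installed; the path is just an example.

```perl
use strict;
use warnings;

# A PSGI application is a code reference: it receives the request
# environment as a hash reference and returns [status, headers, body].
my $app = sub {
    my ($env) = @_;
    my $path = $env->{PATH_INFO} // '/';
    return [
        200,
        [ 'Content-Type' => 'text/plain' ],
        [ "You asked for $path\n" ],
    ];
};

# Call it the way a server would, with a minimal environment.
my $res = $app->({ PATH_INFO => '/guestbook/list' });
print $res->[2][0];   # prints "You asked for /guestbook/list"
```

Saved in a file such as app.psgi, with $app as the last statement, it could be served with the plackup command that comes with Plack.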
Posted May 25, 2009
Sometimes I have the following ideas about blogging and other writing for the
web. The thing is that these ideas will not apply in the same way to other people.
I think some of the points can be generalized for other people and
applications.
Notes
These blog posts on this website are written with Vim and a few scripts through
an SSH connection to my server.
I would like to use Vim to edit my blog posts.
Vim should be able to edit a URL and use GET and PUT to read and write the page.
Ideally the GET will only supply the content of the page; the navigation
surrounding the content should be added later by the blogging software.
Starting a new blog post should be as easy as POSTing to a URL.
The same as I use Vim for the text parts of the website, I want to use Gimp
to be able to edit pictures.
The text of the blog posts can be written in HTML or Markdown (which is what
I use at the moment).
Navigation should be added later. Content can be edited. The user doesn't
have to update headers and footers for each page.
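The reading and writing part of these notes could be sketched with HTTP::Tiny, which comes with Perl. The functions fetch_post and save_post and the URL in the comments are made up, and the blog software that answers these requests doesn't exist yet.

```perl
use strict;
use warnings;

# $http is any object with HTTP::Tiny's get() and request() methods.

# GET the content of a post, without navigation, headers or footers.
sub fetch_post {
    my ($http, $url) = @_;
    my $res = $http->get($url);
    die "GET $url failed: $res->{status}\n" unless $res->{success};
    return $res->{content};
}

# PUT the edited text back to the same URL.
sub save_post {
    my ($http, $url, $text) = @_;
    my $res = $http->request('PUT', $url, {
        headers => { 'Content-Type' => 'text/plain; charset=utf-8' },
        content => $text,
    });
    die "PUT $url failed: $res->{status}\n" unless $res->{success};
    return 1;
}

# With a real server (hypothetical URL) an edit would look like this:
#   use HTTP::Tiny;
#   my $http = HTTP::Tiny->new;
#   my $text = fetch_post($http, 'http://example.com/posts/42');
#   ... edit $text in Vim ...
#   save_post($http, 'http://example.com/posts/42', $text);
```

An editor plugin would call fetch_post when opening the URL and save_post when writing the buffer.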
General points
All programs should be able to GET, PUT, POST to URLs. DELETE could be
implemented in the browser.
HTML will stay the main language for publishing content on the internet.
The website should add the headers and footers for the user.
Posted November 28, 2007
Maybe I should link to my Pownce profile.
Posted November 27, 2007
I just received my Pownce invitation. Now let's see what this is.
Posted June 26, 2005
At lifehacker I came across this incredible tool for del.icio.us. It is called the del.icio.us Direc.tor. It is a combination of webapp, del.icio.us API and Ajax.
This probably is one of the steps we're going to see on the web: web applications that are written on top of web-based APIs. Web-based tools with nice, user-friendly, clean interfaces that do something that's different enough from the actual data provider's website.
Very impressive. We need more of this.