The writings of Peter Stuifzand

In Use of DBI in Sqitch, Wheeler writes about Sqitch, an application for SQL change management.

Sqitch uses the native database client applications (psql, sqlite3, mysql, etc.). [...] The IPC is a huge PITA. Furthermore, getting things properly quoted is also pretty annoying

I would expect this when you use SQL command-line tools in other tools. Of course, this is Perl, which is better at this kind of thing than other languages.

Still, in a Perl application I wouldn't use SQL command-line tools for anything, except maybe to poke at the database or in a tiny (shell-like) script that just needs one value. DBI is perfect for executing queries on the database.

His main reason for using the database command-line tools is that they are already installed on the servers of the people who want to use this tool. That's probably true.

If I used the DBI instead, then Sqitch would not work at all unless you installed the appropriate DBI driver for your database of choice. This is no big deal for Perl people, of course, but I don’t want this to be a Perl people tool. I want it to be dead simple for anyone to use for any database.

Making this a tool for everybody and not just Perl users is great and important. It shouldn't matter to his audience what language this application is written in. So people shouldn't have to learn how to install modules from the CPAN just to install this software.

I argue that these same people also wouldn't download a tar file and run some strange build instructions. From the README:

perl Build.PL
./Build
./Build test
./Build install

They wouldn't even download the file. Package management is a wonderful thing. And the apt-get install sqitch command should just work.

Do you use servers on the internet? Do you connect to them with ssh? Then you should take a look at SSHFS.

It allows you to mount a directory on a different computer as if it's a directory on your own. And then sshfs takes care of making sure that your commands do the right thing. It's amazing.


In The Hideous Name, Pike and Weinberger write about a similar thing. You profit if you can use the same tools in the same way, regardless of whether you're on your local machine or on a remote machine.

I had a nice idea a few days ago. A simple module that creates the rules part of a Marpa grammar, but without code generation and lexers. Just the bit that Marpa::XS::Grammar uses.

So I wrote it in a few minutes. It's not really complicated and provides an example of how to write a short parser.

Let's start with a short example of what you would write.

use Marpa::XS;
use MarpaX::Simple::Rules 'parse_rules';


my $rules = parse_rules(<<"RULES");
RULES

my $grammar = Marpa::XS::Grammar->new({
    rules => $rules,
    ...
});

At the ... you should add more arguments to Marpa::XS depending on how you want to use it. Now let's write a simple example of some rules.

my $rules = parse_rules(<<"RULES");
expr   ::= term
term   ::= term plus term
term   ::= factor
factor ::= factor mul factor
factor ::= number
RULES

These rules don't yet contain actions, so Marpa will call the default_action or a function named after the left hand side. For more information you should take a look at the Marpa::XS documentation.

Now let's add a specific action to each of the rules. We do this by adding => action_name to the end of each rule.

my $rules = parse_rules(<<"RULES");
expr   ::= term                  => arg_0
term   ::= term plus term        => plus
term   ::= factor                => arg_0
factor ::= factor mul factor     => mul
factor ::= number                => arg_0
RULES
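To give an idea of what happens with this text, here is a minimal sketch in plain Perl. It is not the code of MarpaX::Simple::Rules; sketch_parse_rules is a hypothetical helper that assumes one rule per line and produces the hash-per-rule form (lhs, rhs, action) that Marpa::XS accepts for its rules argument.

```perl
use strict;
use warnings;

# Hypothetical helper, not the module's real code: turn lines of the form
# "lhs ::= rhs => action" into hashes with lhs, rhs and (optional) action keys.
sub sketch_parse_rules {
    my ($text) = @_;
    my @rules;
    for my $line (split /\n/, $text) {
        next if $line =~ /^\s*$/;    # skip blank lines
        my ($lhs, $rest) = $line =~ /^\s*(\w+)\s*::=\s*(.*?)\s*$/
            or die "Cannot parse rule: $line\n";
        my $action;
        # An optional "=> name" at the end of the line names the action.
        $action = $1 if $rest =~ s/\s*=>\s*(\w+)$//;
        my %rule = (lhs => $lhs, rhs => [ split ' ', $rest ]);
        $rule{action} = $action if defined $action;
        push @rules, \%rule;
    }
    return \@rules;
}

my $rules = sketch_parse_rules(<<'RULES');
expr   ::= term                  => arg_0
term   ::= term plus term        => plus
factor ::= number
RULES

# Print the parsed structure back in a normalized form.
printf "%s ::= %s%s\n", $_->{lhs}, join(' ', @{ $_->{rhs} }),
    defined $_->{action} ? " => $_->{action}" : ''
    for @$rules;
```

A rule without "=> name" simply gets no action key, which leaves the choice of action to Marpa.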

No functions are provided for you. The names like arg_0 are just generic functions that you should write yourself. If you create a package that uses this module you could use the actions argument for the grammar.

my $grammar = Marpa::XS::Grammar->new({
    ...
    actions => __PACKAGE__,
    ...
});

If you write the functions like the following, Marpa will call the right thing.

package YYY;
...

sub arg_0 {
    shift; return $_[0];
}
...
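The other actions could follow the same pattern. This is only a sketch: the bodies of plus and mul are my assumption of what you would want for arithmetic, and the calls at the bottom imitate by hand what the parser would do, so it runs without Marpa.

```perl
use strict;
use warnings;

package YYY;

# Each action receives a per-parse object first, then one value per
# right hand side symbol. The operator token itself sits at index 1.
sub arg_0 { shift; return $_[0]; }
sub plus  { shift; return $_[0] + $_[2]; }
sub mul   { shift; return $_[0] * $_[2]; }

package main;

# Call the actions by hand, bottom-up, for the input "2 + 3 * 4".
my $product = YYY::mul(undef, 3, '*', 4);
my $sum     = YYY::plus(undef, 2, '+', $product);
print "$sum\n";    # prints 14
```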

I will release the code to CPAN so it will appear somewhat later, but you can already take a look at it at github/MarpaX::Simple::Rules.

The three problems that need to be fixed are subscription, feed reading and feed creation.

Subscription

Subscribing to feeds is really hard. It should be really simple. Unsubscribing should be just as simple.

  1. See something that you want to subscribe to? Click the subscribe button.
  2. Done!

Unsubscribing should be just as easy.

  1. You see a post in your feed reader that you don't want to see.
  2. Click Hide.
  3. Show an option for unsubscription.
  4. Click, gone!

Or:

  1. Visit the page you want to unsubscribe from in your browser.
  2. Click the unsubscribe button.
  3. No step three!

That's easy. Why isn't this possible yet?

Feed reading

Reading a feed should also be really simple. Dave Winer has the River of News. A small paragraph with optional title, description and link. That's how simple it should be.

Do you see something that you want to read? Read it. If you don't want to read it, just scroll further for other posts.

Nothing is remembered or saved. Unless you want it to. Click 'star', 'favorite' or 'bookmark', and it is saved in a to-read feed, so you can find it later.

Feed creation

You have a feed. Write an update in a textarea. Click 'Publish'. Done. It appears in your feed and the rivers of everyone who subscribed.

These three parts are easy and important.

In my last post I wrote about what the different rewrites of more advanced rules would look like. The question now is, when do you have a grammar that's basic enough?

This is the basic configuration of the Marpa::XS class. In this configuration you can specify left hand sides and right hand sides and the star and plus operator. If your tree looks like this, then it's basic enough.

Parser    ::= Rule+
Rule      ::= Lhs DeclareOp Rhs
Lhs       ::= Name
Rhs       ::= Names
Rhs       ::= Name Star
Rhs       ::= Name Plus
Names     ::= Name+

This grammar consists of only 7 lines, so that's pretty good. These are the basics, but it assumes a few things.

  1. Ignores whitespace. This grammar completely ignores whitespace. It will work if your tokenizer removes whitespace from the front of each token before it passes it to the Recognizer.
  2. DeclareOp, Name, Star and Plus are terminals which are recognized by the tokenizer.
  3. Tokens and actions are declared somewhere else. In the grammar that I used I can specify characters, regex tokens and actions. See github/MarpaX-Parser-Marpa.
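The first two assumptions could be met by a tokenizer like the sketch below. The token regexes are my assumptions, not the ones from MarpaX-Parser-Marpa, and the output is a plain list of [type, value] pairs instead of calls to the Recognizer.

```perl
use strict;
use warnings;

# Token table for the basic rule grammar; the regexes are assumptions.
my @TOKENS = (
    [ DeclareOp => qr/::=/ ],
    [ Star      => qr/\*/ ],
    [ Plus      => qr/\+/ ],
    [ Name      => qr/\w+/ ],
);

# Return a list of [type, value] pairs, skipping whitespace before each token.
sub tokenize {
    my ($input) = @_;
    my @out;
    pos($input) = 0;
    TOKEN: while (1) {
        $input =~ /\G\s+/gc;                      # drop leading whitespace
        last if pos($input) >= length $input;
        for my $t (@TOKENS) {
            my ($type, $re) = @$t;
            if ($input =~ /\G($re)/gc) {
                push @out, [ $type, $1 ];
                next TOKEN;
            }
        }
        die "Unexpected input at position " . pos($input) . "\n";
    }
    return @out;
}

print join(' ', map { "$_->[0]($_->[1])" } tokenize("Names ::= Name+")), "\n";
# prints: Name(Names) DeclareOp(::=) Name(Name) Plus(+)
```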

A grammar rewriter should be able to rewrite the advanced rules to a grammar that looks like this.

I will leave you with the following example, which tries to match 0 or 1 occurrences of B.

A     ::= B?         => A ::= Null
                        A ::= B 

Let's try to write some rewrite rules for a next level Marpa to a lower level Marpa. The syntax is not important, this is about how we can rewrite the rules.

In the actual grammars none of the left side patterns work, but all of the rewritten versions on the right side should work.

Alternations

An alternation is a rule that can match any one of several alternatives, for example B, C or D.

A ::= B | C     =>  A ::= B
                    A ::= C

A ::= B | C | D =>  A ::= B
                    A ::= C
                    A ::= D
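As a sketch, this rewrite is little more than a split on the | character. The function name rewrite_alternation is hypothetical, and it assumes rules are given as text with a single ::= per rule.

```perl
use strict;
use warnings;

# Split "A ::= B | C | D" into one plain rule per alternative.
sub rewrite_alternation {
    my ($rule) = @_;
    my ($lhs, $rhs) = $rule =~ /^\s*(\w+)\s*::=\s*(.*?)\s*$/
        or die "Cannot parse rule: $rule\n";
    # Normalize the whitespace inside each alternative while we are at it.
    return map { "$lhs ::= " . join(' ', split ' ', $_) } split /\s*\|\s*/, $rhs;
}

print "$_\n" for rewrite_alternation('A ::= B | C | D');
# prints:
# A ::= B
# A ::= C
# A ::= D
```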

Plus, Star

Plus means match one or more times, star means match zero or more times. Normally you can't use + or * inside of a rule. Only whole rules can be used like this. If you want to use these you need to create a new rule and use it inside.

A ::= B+ C      =>  A   ::= SR0 C
                    SR0 ::= B+

Subrule

A subrule is a rule that is part of another rule. A subrule behaves as if it is a rule.

A ::= (B C) D   =>  A   ::= SR0 D
                    SR0 ::= B C

A ::= (B C)+ D  =>  A   ::= SR0 D
                    SR0 ::= SR1+
                    SR1 ::= B C
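The plus, star and subrule rewrites can be sketched as one function that hoists every nested item into a generated rule. The input format is an assumption made for this example: a right hand side item is either a symbol name or a hash with a seq (list of items) and an optional quant ('+' or '*'). The names hoist_subrules and format_rule are hypothetical.

```perl
use strict;
use warnings;

# Replace every nested item by a generated SRn symbol and emit a rule for it.
sub hoist_subrules {
    my ($rule) = @_;
    my $n = 0;
    my (@out, @todo);
    @todo = ($rule);
    while (my $r = shift @todo) {
        my @rhs;
        for my $item (@{ $r->{rhs} }) {
            if (!ref $item) { push @rhs, $item; next; }    # plain symbol
            my $name = 'SR' . $n++;
            push @rhs, $name;
            my @seq = @{ $item->{seq} };
            if (!$item->{quant}) {
                # (B C) becomes SRn ::= B C
                push @todo, { lhs => $name, rhs => \@seq };
            }
            elsif (@seq == 1 && !ref $seq[0]) {
                # B+ becomes SRn ::= B+
                push @todo, { lhs => $name, rhs => \@seq, quant => $item->{quant} };
            }
            else {
                # (B C)+ becomes SRn ::= SRm+ with SRm ::= B C
                push @todo, { lhs => $name, rhs => [ { seq => \@seq } ],
                              quant => $item->{quant} };
            }
        }
        push @out, { lhs => $r->{lhs}, rhs => \@rhs, quant => $r->{quant} };
    }
    return @out;
}

sub format_rule {
    my ($r) = @_;
    return sprintf '%s ::= %s%s', $r->{lhs}, join(' ', @{ $r->{rhs} }), $r->{quant} // '';
}

# A ::= (B C)+ D
my $rule = { lhs => 'A', rhs => [ { seq => [ 'B', 'C' ], quant => '+' }, 'D' ] };
print format_rule($_), "\n" for hoist_subrules($rule);
# prints:
# A ::= SR0 D
# SR0 ::= SR1+
# SR1 ::= B C
```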

Count

Match exactly n times.

A ::= B{n}

A ::= B{1}      =>  A   ::= B
A ::= B{2}      =>  A   ::= B B
A ::= B{10}     =>  A   ::= B B B B B B B B B B

Min

Match at least min times.

A ::= B{min,}

A ::= B{2,}     =>  A   ::= B B
                    A   ::= B B SR0
                    SR0 ::= B+

Max

Match no more than max times.

A ::= B{,max}

A ::= B{,10}    =>  A   ::= Null
                    A   ::= B
                    A   ::= B B
                    A   ::= B B B
                    A   ::= B B B B
                    A   ::= B B B B B
                    A   ::= B B B B B B
                    A   ::= B B B B B B B
                    A   ::= B B B B B B B B
                    A   ::= B B B B B B B B B
                    A   ::= B B B B B B B B B B

Min, Max

Match at least min times, but no more than max times.

A ::= B{min, max}

A ::= B{2,5}    =>  A   ::= B B
                    A   ::= B B B
                    A   ::= B B B B
                    A   ::= B B B B B
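The bounded forms (B{n}, B{,max} and B{min,max}) all come down to writing out one rule per allowed count. A minimal sketch, with expand_count as a hypothetical name; it leaves out the unbounded B{min,} form:

```perl
use strict;
use warnings;

# Expand A ::= B{min,max} into plain rules, one per repetition count.
# Pass undef as $min for the {,max} form; $max is required here.
sub expand_count {
    my ($lhs, $sym, $min, $max) = @_;
    die "expand_count needs an upper bound\n" unless defined $max;
    $min //= 0;
    my @rules;
    for my $n ($min .. $max) {
        # Zero repetitions is written as Null, like in the examples above.
        push @rules, "$lhs ::= " . ($n ? join(' ', ($sym) x $n) : 'Null');
    }
    return @rules;
}

print "$_\n" for expand_count('A', 'B', 2, 5);
# prints:
# A ::= B B
# A ::= B B B
# A ::= B B B B
# A ::= B B B B B
```

An exact count B{n} is the same call with min and max both set to n.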

I will keep the namespaces, modules and includes in my brain for another time.

No network joins an internetwork smaller than itself.

source

Yesterday I was working on my realtime feed reader (and writer). I can read feeds with this, but I can also write posts. With it you can write a small post with a title, description and link.

When I was looking around for websites that support WebFinger I found a similar thing on Twitter. It contained the host-meta file, but instead of showing WebFinger related links, it showed something called an OExchange link. So I took a look at what it was.

OExchange is a protocol that allows sharing of URLs with any service. It supports discovery of share URLs. When you have a service that allows sharing of URLs, then OExchange allows tool builders to share URLs with your service.

I created such a service, so I thought: how about using this to share URLs? So I added these files. But I couldn't find any tools that allow me to use this with my own service. Every tool I found could only share with pre-discovered websites. I couldn't even put my own domain name in there. That sucks for a tool that is about discovery.

Someone should write a Firefox extension that uses this. Maybe Firefox Share could do this?

Ok, but let's get back to the point I was trying to make. What about using a similar thing to allow tool developers to discover URLs to subscribe to feeds?

Such a tool or extension could include a list of preconfigured domains, but allow you to add your own domains. The discovery process takes care of the rest.

It is important to know that this file could be cached. It shouldn't be downloaded every time you want to subscribe to a feed.

This could make subscribing to feeds as easy as it is in Twitter. How about that?

Tonight I wrote this parser in Perl with Marpa::XS. The nice thing about Marpa is that it will tell you which terminals (or tokens) it expects at the current position. So instead of guessing what the next token will be, you can ask which tokens are expected and try to match only those.

I got the general idea from the TAP::Spec::Parser. The difference is that it will try all alternatives at once, while mine tries them one after the other until one works.
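The "one after the other" part can be sketched without Marpa. In this example the list of expected terminals is passed in by hand; in the real parser it would come from the recognizer (I believe Marpa::XS offers this as terminals_expected, but check its documentation). The token patterns and the name next_token are assumptions made for the example.

```perl
use strict;
use warnings;

# Hypothetical token patterns, keyed by terminal name.
my %PATTERN = (
    number => qr/\d+/,
    plus   => qr/\+/,
    mul    => qr/\*/,
);

# Try each expected terminal in turn at the current position of $$input_ref.
# Returns (name, value) for the first one that matches, or an empty list.
sub next_token {
    my ($input_ref, $expected) = @_;
    for my $name (@$expected) {
        my $re = $PATTERN{$name} or next;
        # /c keeps the position when a pattern fails, so the next one can try.
        if ($$input_ref =~ /\G\s*($re)/gc) {
            return ($name, $1);
        }
    }
    return;
}

my $input = '12 + 3';
pos($input) = 0;
my @tok = next_token(\$input, [ 'plus', 'number' ]);
print "@tok\n";    # prints: number 12
```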

I hope this will help people to understand this kind of parsing better. Let me know what you think.

I was thinking about implicit algorithms and implicit datastructures and thought how this applied to templates in web applications. I found that there are a few places where I write the same template code with just a few different pieces.

One of the examples is a piece of code for showing a text field with a few labels around it. This piece of template code is used for almost every field in the product edit (and new) form. This is repeated code, but we should not do this, right?

There are a few parts to one of these pieces of code. It has a textual name, a short description, an input field (which depends on the type of the value) and the actual value in the field.

<p><label for="sku">SKU</label> <span class='description'>Stock keeping unit</span>
<input type="text" id="sku" name="sku" value="[% product.sku %]"></p>

It could look like the above. To get an idea about what an implicit algorithm is, you could say that you're the computer and the data structure is in your head.

Write 'p' start tag
Write 'label' start tag with 'for' attribute with value 'id of field'
Write name of field
Write 'label' end tag
Write a space
If field has a description
    Write 'span' start tag with 'class' attribute with value 'description'
    Write description of field
    Write 'span' end tag
End
Write 'input' tag with type, id, name and value.
Write 'p' end tag

This piece of pseudocode shows how we run the piece of code in our head. The strange thing is that computers are actually really good at following instructions. A computer could run this piece of code better than you could.

This piece of code is implicit in the template. Why didn't I write it as a piece of code?

There is also an implicit datastructure in there. Maybe something like this:

Field {
    string title
    string name
    string id
    string description
    string type
}

Field { "SKU", "sku", "sku", "Stock keeping unit", "text" }

If I create another line like this field description I can let the computer create a piece of HTML code for me, without thinking about it.

Field { "Name", "name", "name", "Name of the product", "text" }
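Written out as real code, the pseudocode and the Field structure could look like this sketch. The names render_field and escape_html are hypothetical, the field is a plain hash, and the value is passed in separately instead of coming from the template.

```perl
use strict;
use warnings;

# Minimal HTML escaping for text and attribute values.
sub escape_html {
    my ($s) = @_;
    $s =~ s/&/&amp;/g;
    $s =~ s/</&lt;/g;
    $s =~ s/>/&gt;/g;
    $s =~ s/"/&quot;/g;
    return $s;
}

# Render one form field, following the pseudocode step by step.
sub render_field {
    my ($field, $value) = @_;
    my %f = map { $_ => escape_html($field->{$_} // '') } keys %$field;
    my $html = qq{<p><label for="$f{id}">$f{title}</label> };
    # The description span is only written when there is a description.
    $html .= qq{<span class="description">$f{description}</span>\n}
        if length $f{description};
    $html .= sprintf qq{<input type="%s" id="%s" name="%s" value="%s"></p>},
        @f{qw(type id name)}, escape_html($value // '');
    return $html;
}

my %sku = (title => 'SKU', name => 'sku', id => 'sku',
           description => 'Stock keeping unit', type => 'text');
print render_field(\%sku, 'ABC-123'), "\n";
```

Adding a field to the form is then one more hash, not another copy of the template code.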

The thing is that by starting to use datastructures and algorithms for this, we can start to write more general functions that can be used for even more parts of your web app. Imagine the possibilities.
