The writings of Peter Stuifzand

I had a nice idea a few days ago. A simple module that creates the rules part of a Marpa grammar, but without code generation and lexers. Just the bit that Marpa::XS::Grammar uses.

So I wrote it in a few minutes. It's not really complicated and provides an example how to write a short parser.

Lets start with a short example of what you would write.

use Marpa::XS;
use MarpaX::Simple::Rules 'parse_rules';


my $rules = parse_rules(<<"RULES");
RULES

my $grammar = Marpa::XS::Grammar->new({
    rules => $rules,
    ...
});

At the ... you should add more arguments to Marpa::XS depending on how you want to use it. Now let's write a simple example of some rules.

my $rules = parse_rules(<<"RULES");
expr   ::= term
term   ::= term plus term
term   ::= factor
factor ::= factor mul factor
factor ::= number
RULES

These rules don't yet contain actions, so Marpa will call the default_action or a function named like that left hand side name. For information you should take a look at the Marpa::XS documentation.

Now let's some specific action to all the rules. We do this by adding => action_name to the end of each of the rules.

my $rules = parse_rules(<<"RULES");
expr   ::= term                  => arg_0
term   ::= term plus term        => plus
term   ::= factor                => arg_0
factor ::= factor mul factor     => mul
factor ::= number                => arg_0
RULES

No functions are provided for you. The names like arg_0 are just generic functions that you should write yourself. If you create a package that uses this module you could use the actions argument for the grammar.

my $grammar = Marpa::XS::Grammar->new({
    ...
    actions => __PACKAGE__,
    ...
});

If you write the functions like the following, the Marpa will call the right thing.

package YYY;
...

sub arg_0 {
    shift; return $_[0];
}
...

I will release the code to CPAN so it will appear somewhat later, but you can already take a look at it at github/MarpaX::Simple::Rules.

The three problems that need to be fixed are subscription, feed reading and feed creation.

Subscription

Subscribing to feeds is really hard. It should be really simple. Unsubscribing should be just as simple.

  1. See something that you want to subscribe to? Click the subscribe button.
  2. Done!

Unsubscribing should be just as easy.

  1. You see a post in your feed reader that you don't want to see.
  2. Click Hide.
  3. Show an option for unsubscription.
  4. Click, gone!

Or:

  1. Visit the page you want unsubscribe from in your browser.
  2. Click the unsubscribe button.
  3. No step three!

That's easy. Why isn't this possible yet?

Feed reading

Reading a feed should also be really simple. Dave Winer has the River of News. A small paragraph with optional title, description and link. That's how simple it should be.

Do you see something that you want to read? Read it. If you don't want to read it, just scroll further for other posts.

Nothing is remembered or saved. Unless you want it too. Click 'star', 'favorite' or 'bookmark', and it is saved in a to read feed, so you can find it later.

Feed creation

You have a feed. Write an update in a textarea. Click 'Publish'. Done. It appears in your feed and the rivers of everyone who subscribed.

These three parts are easy and important.

In my last post I wrote about what the different rewrites of more advanced rules would look like. The question now is, when do you have a grammar that's basic enough?

This is the basic configuration of the Marpa::XS class. In this configuration you can specify left hand sides and right hand sides and the star and plus operator. If your tree looks like this, then it's basic enough.

Parser    ::= Rule+
Rule      ::= Lhs DeclareOp Rhs
Lhs       ::= Name
Rhs       ::= Names
Rhs       ::= Name Star
Rhs       ::= Name Plus
Names     ::= Name+

This grammar consists of only 7 lines, so that's pretty good. This are the basics, but it assumes a few things.

  1. Ignores whitespace. This grammar completely ignores whitespace. It will work if your tokenizer removes whitespace from the front of each token before it passes it to the Recognizer.
  2. DeclareOp, Name, Star and Plus are terminals which are recognized by the tokenizer.
  3. Tokens and actions are declared somewhere else. In the grammar that I used I can specify characters, regex tokens and actions. See github/MarpaX-Parser-Marpa.

A grammar rewriter should be able to rewrite the advanced rules to a grammar that looks like this.

I will leave you with the following example, that tries to match 0 or 1 occurences of B.

A     ::= B?         => A ::= Null
                        A ::= B 

Let's try to write some rewrite rules for a next level Marpa to a lower level Marpa. The syntax is not important, this is about how we can rewrite the rules.

In the actual grammars none of the left side patterns work, but all of the right side rewritten version should work.

Alternations

An alternation is a rule that can match either way B, C or D.

A ::= B | C     =>  A ::= B
                    A ::= C

A ::= B | C | D =>  A ::= B
                    A ::= C
                    A ::= D

Plus, Star

Plus means match one or more times, star means match zero of more times. Normally you can't use + or * inside of a rule. Only whole rules can be used like this. If you want to use these you need to create a new rule and use it inside.

A ::= B+ C      =>  A   ::= SR0 C
                    SR0 ::= B+

Subrule

A subrule is a rule that is part of another rule. A subrule behaves as if it is a rule.

A ::= (B C) D   =>  A   ::= SR0 D
                    SR0 ::= B C

A ::= (B C)+ D  =>  A   ::= SR0 D
                    SR0 ::= SR1+
                    SR1 ::= B C

Count

Match exactly n times.

A ::= B{n}

A ::= B{1}      =>  A   ::= B
A ::= B{2}      =>  A   ::= B B
A ::= B{10}     =>  A   ::= B B B B B B B B B B

Min

Match at least min times.

A ::= B{min,}

A ::= B{2,}     =>  A   ::= B B
                    A   ::= B B SR0
                    SR0 ::= B*

Max

Match no more than max times.

A ::= B{,max}

A ::= B{,10}    =>  A   ::= Null
                    A   ::= B
                    A   ::= B B
                    A   ::= B B B
                    A   ::= B B B B
                    A   ::= B B B B B
                    A   ::= B B B B B B
                    A   ::= B B B B B B B
                    A   ::= B B B B B B B B
                    A   ::= B B B B B B B B B
                    A   ::= B B B B B B B B B B

Min, Max

Match at least min times, but no more than max times.

A ::= B{min, max}

A ::= B{2,5}    =>  A   ::= B B
                    A   ::= B B B
                    A   ::= B B B B
                    A   ::= B B B B B

I will keep the namespaces, modules and includes in my brain for another time.

No network joins an internetwork smaller than itself.

source

Yesterday I was working on my realtime feed reader (and writer). I can read feeds with this, but I can also write posts. With it you can write a small post with a title, description and link.

When I was looking around for websites that support WebFinger I found a similar thing on Twitter. It contained the host-meta file, but instead of showing WebFinger related links, it showed something called an OExchange link. So I took a look what it was.

OExchange is a protocol that allows sharing of URLs with any service. It supports discovery of share URLs. When you have a service that allows sharing of URLs, then OExchange allows tool builders to share URLs with your service.

I created such a service, so I thought how about using this to share URLs. So I added these files. Now I couldn't find tools, that allow me to use this with my own service. Every tool I found could only share with pre-discovered websites. I couldn't even put my own domain name in there. That sucks for a tool that is about discovery.

Someone should write a Firefox extension that uses this. Maybe Firefox Share could do this?

Ok, but lets get back to the point I was trying to make. What about using a similar thing to allow tool developers to discover URLs to subscribe to feeds?

Such a tool or extension could include a list of pre configured domains, but allow you to add your own domains. The discovery process takes care of the rest.

It important to know that this file could be cached. It shouldn't be downloaded every time you like to subscribe to a feed.

This could make subscribing to feeds as easy as it is in Twitter. How about that?

Tonight I wrote this parser in Perl with Marpa::XS. The nice thing about Marpa is that it will tell you which terminals (or tokens) it expects at the current position. So instead of guessing what the next token will be you can just try to parse a token and try it.

I got the general idea from the TAP::Spec::Parser. The difference is that it will try all alternatives at once, while mine tries each after the other until one works.

I hope this will help people to understand this kind of parsing better. Let me know what you think.

I was thinking about implicit algorithms and implicit datastructures and thought how this applied to templates in web applications. I found that there are a few places where I write the same template code with just a few different pieces.

One of the examples is a piece of code for showing a text field with a few labels around it. This piece of template code is used for almost every field in the product edit (and new) form. This is repeated code, but we should not do this, right?

There are a few parts to one of these pieces of code. It has a textual name, a short description, an input field (which depends on the type of the value) and the actual value in the field.

<p><label for="sku">SKU</label> <span class='description'>Stock keeping unit</span>
<input type="text" id="sku" name="sku" value="[% product.sku %]"></p>

It could look like the above. To get an idea about what an implicit algorithms is, you could say that you're the computer and the data structure is in your head.

Write 'p' start tag
Write 'label' start tag with 'for' attribute with value 'id of field'
Write name of field
Write 'label' end tag
Write a space
If field has a description
    Write 'span' start tag with 'class' attribute with value 'description'
    Write description of field
    Write 'span' end tag
End
Write 'input' tag with type, id, name and value.
Write 'p' end tag

This piece of pseudocode shows how we run the piece of code in our head. The strange thing is that computers are actually really good at following instructions. A computer could this piece of code better than you could.

This piece of code is implicit in the template. Why didn't I write it as a piece of code?

There is also an implicit datastructure in there. Maybe something like this:

Field {
    string title
    string name
    string id
    string description
    string type
}

Field { "SKU", "sku", "sku", "Stock keeping unit", "text" }

If I create another line like this field description I can let the computer create a piece of HTML code for me, without thinking about it.

Field { "Name", "name", "name", Name of the product", "text" }

The thing is that by starting to use datastructures and algorithms for this, we can start to write more general functions that can be used for even more parts of your web app. Imagine the possibilities.

I was working on a small game that displays some blocks. These blocks need to move. The way to make these blocks move is by changing the x and y position over time.

The way I did this in the game was by calling a tick() method every 1/60th of a second on every block that could be moving. Then the block can change it's position according to some algorithm. I just wrote code that makes these changes to the position.

But if you take a closer look at the code you can see that some lines are connected. These lines are part of the same algorithm, but there is no place where all the parts come together in the same place.

The blocks in the first level have three states.

  1. Do nothing. Stay in the same position.
  2. Falling. When falling the block makes sure its position returns to the default position.
  3. Shaking. Some blocks shake, other don't. When shaking the position is changed randomly around its current position, the position from state 1.

Let's look at shaking. We shake the block by changing the difference from the current position of the block by a small random amount. In the code it looks like this.

if (selected) {
    this.dx = rnd.nextInt(3) - 1;
    this.dy = rnd.nextInt(3) - 1;
}

The fields dx and dy are the difference in pixel position that this object is drawn on screen.

The Level class contains a small part of this algorithm. It makes sure that the object is drawn (dx,dy) pixels from its normal position. Without this line of code the algorithm wouldn't work.

In a way there is also an implicit datastructure that contains the difference position. The code doesn't make that obvious.

Making it obvious

The is a piece of code contained in the game that shakes an Entity. By extracting this into a class we can help other programmers and reuse this piece of code. But we have to look at the concepts that are used in this code.

A few concepts that I think are in this code are: shake, tick, difference in position and entity. There are as always many ways to do the same thing.

public class Shake implements Mover {
    private static final Random rnd = new Random();
    private Movable object;

    public Shake(Movable dm)  { this.object=dm; }

    public void tick() {
        object.moveBy(rnd.nextInt(3)-1, rnd.nextInt(3)-1);
    }
}

Now we need to write multiple objects that do a similar thing. This way we can show that the concept is useful.

public class Falling implements Mover {
    private Movable object;

    public Falling(Movable m) { this.object=dm; }

    public void tick() {
        if (m.y() < 0) {
            m.moveBy(0, 3);
        }
    }
}

We can see here a falling block. This code will make the block fall automatically. Because of the way the game works a block will only fall by a certain amount. Then it needs to move to the next position in the level.

I will write some more about implicit algorithms (and datastructures) later.

I have an article in the pipeline about implicit algorithms (and datastructures), which is a continuation of a few articles I wrote last year and the year before. It explains the same idea, but from another point of view: a small game that I started to write.

The thing is that the article isn't that good. Especially the example is too simple. Now it just hangs there. It doesn't move and I don't want to touch it anymore.

I do think that it about an idea that is really important. I will publish it after publishing this.

View archived entries