Posted April 14, 2012
I had a nice idea a few days ago. A simple module that creates the rules part
of a Marpa grammar, but without code generation and lexers. Just the bit that
Marpa::XS::Grammar uses.
So I wrote it in a few minutes. It's not really complicated and provides an
example how to write a short parser.
Lets start with a short example of what you would write.
use Marpa::XS;
use MarpaX::Simple::Rules 'parse_rules';
my $rules = parse_rules(<<"RULES");
RULES
my $grammar = Marpa::XS::Grammar->new({
rules => $rules,
...
});
At the ... you should add more arguments to Marpa::XS depending on how you
want to use it. Now let's write a simple example of some rules.
my $rules = parse_rules(<<"RULES");
expr ::= term
term ::= term plus term
term ::= factor
factor ::= factor mul factor
factor ::= number
RULES
These rules don't yet contain actions, so Marpa will call the default_action
or a function named like that left hand side name. For information you should
take a look at the Marpa::XS documentation.
Now let's some specific action to all the rules. We do this by adding => action_name
to the end of each of the rules.
my $rules = parse_rules(<<"RULES");
expr ::= term => arg_0
term ::= term plus term => plus
term ::= factor => arg_0
factor ::= factor mul factor => mul
factor ::= number => arg_0
RULES
No functions are provided for you. The names like arg_0 are just generic
functions that you should write yourself. If you create a package that uses
this module you could use the actions argument for the grammar.
my $grammar = Marpa::XS::Grammar->new({
...
actions => __PACKAGE__,
...
});
If you write the functions like the following, the Marpa will call the right
thing.
package YYY;
...
sub arg_0 {
shift; return $_[0];
}
...
I will release the code to CPAN so it will appear somewhat later, but you can already take a look at it at
github/MarpaX::Simple::Rules.
Posted April 7, 2012
The three problems that need to be fixed are subscription, feed reading and
feed creation.
Subscription
Subscribing to feeds is really hard. It should be really simple. Unsubscribing
should be just as simple.
- See something that you want to subscribe to? Click the subscribe button.
- Done!
Unsubscribing should be just as easy.
- You see a post in your feed reader that you don't want to see.
- Click Hide.
- Show an option for unsubscription.
- Click, gone!
Or:
- Visit the page you want unsubscribe from in your browser.
- Click the unsubscribe button.
- No step three!
That's easy. Why isn't this possible yet?
Feed reading
Reading a feed should also be really simple. Dave Winer
has the River of News. A small paragraph
with optional title, description and link. That's how simple it should be.
Do you see something that you want to read? Read it. If you don't want to read
it, just scroll further for other posts.
Nothing is remembered or saved. Unless you want it too. Click 'star',
'favorite' or 'bookmark', and it is saved in a to read feed, so you can find it
later.
Feed creation
You have a feed. Write an update in a textarea. Click 'Publish'. Done. It
appears in your feed and the rivers of everyone who subscribed.
These three parts are easy and important.
Posted April 5, 2012
In my last post I wrote about what the different rewrites of more advanced
rules would look like. The question now is, when do you have a grammar that's
basic enough?
This is the basic configuration of the Marpa::XS class. In this configuration
you can specify left hand sides and right hand sides and the star and plus
operator. If your tree looks like this, then it's basic enough.
Parser ::= Rule+
Rule ::= Lhs DeclareOp Rhs
Lhs ::= Name
Rhs ::= Names
Rhs ::= Name Star
Rhs ::= Name Plus
Names ::= Name+
This grammar consists of only 7 lines, so that's pretty good. This are the
basics, but it assumes a few things.
- Ignores whitespace. This grammar completely ignores whitespace. It will work
if your tokenizer removes whitespace from the front of each token before it
passes it to the Recognizer.
- DeclareOp, Name, Star and Plus are terminals which are recognized by the
tokenizer.
- Tokens and actions are declared somewhere else. In the grammar that I used I
can specify characters, regex tokens and actions. See
github/MarpaX-Parser-Marpa.
A grammar rewriter should be able to rewrite the advanced rules to a grammar
that looks like this.
I will leave you with the following example, that tries to match 0 or 1 occurences of B.
A ::= B? => A ::= Null
A ::= B
Posted April 4, 2012
Let's try to write some rewrite rules for a next level Marpa to a lower level
Marpa. The syntax is not important, this is about how we can rewrite the rules.
In the actual grammars none of the left side patterns work, but all of the
right side rewritten version should work.
Alternations
An alternation is a rule that can match either way B, C or D.
A ::= B | C => A ::= B
A ::= C
A ::= B | C | D => A ::= B
A ::= C
A ::= D
Plus, Star
Plus means match one or more times, star means match zero of more times.
Normally you can't use + or * inside of a rule. Only whole rules can be
used like this. If you want to use these you need to create a new rule and use
it inside.
A ::= B+ C => A ::= SR0 C
SR0 ::= B+
Subrule
A subrule is a rule that is part of another rule. A subrule behaves as if it is
a rule.
A ::= (B C) D => A ::= SR0 D
SR0 ::= B C
A ::= (B C)+ D => A ::= SR0 D
SR0 ::= SR1+
SR1 ::= B C
Count
Match exactly n times.
A ::= B{n}
A ::= B{1} => A ::= B
A ::= B{2} => A ::= B B
A ::= B{10} => A ::= B B B B B B B B B B
Min
Match at least min times.
A ::= B{min,}
A ::= B{2,} => A ::= B B
A ::= B B SR0
SR0 ::= B*
Max
Match no more than max times.
A ::= B{,max}
A ::= B{,10} => A ::= Null
A ::= B
A ::= B B
A ::= B B B
A ::= B B B B
A ::= B B B B B
A ::= B B B B B B
A ::= B B B B B B B
A ::= B B B B B B B B
A ::= B B B B B B B B B
A ::= B B B B B B B B B B
Min, Max
Match at least min times, but no more than max times.
A ::= B{min, max}
A ::= B{2,5} => A ::= B B
A ::= B B B
A ::= B B B B
A ::= B B B B B
I will keep the namespaces, modules and includes in my brain for another time.
Posted March 30, 2012
No network joins an internetwork smaller than itself.
source
Posted March 26, 2012
Yesterday I was working on my realtime feed reader (and writer). I can read
feeds with this, but I can also write posts. With it you can write a small
post with a title, description and link.
When I was looking around for websites that support WebFinger I found a similar
thing on Twitter. It contained the host-meta file, but instead of showing
WebFinger related links, it showed something called an
OExchange link. So I took a look what it was.
OExchange is a protocol that allows sharing of URLs with any service. It
supports discovery of share URLs. When you have a service that allows sharing
of URLs, then OExchange allows tool builders to share URLs with your service.
I created such a service, so I thought how about using this to share URLs. So I
added these files. Now I couldn't find tools, that allow me to use this with my
own service. Every tool I found could only share with pre-discovered websites.
I couldn't even put my own domain name in there. That sucks for a tool that is
about discovery.
Someone should write a Firefox extension that uses this. Maybe Firefox Share
could do this?
Ok, but lets get back to the point I was trying to make. What about using a
similar thing to allow tool developers to discover URLs to subscribe to feeds?
Such a tool or extension could include a list of pre configured domains, but
allow you to add your own domains. The discovery process takes care of the
rest.
It important to know that this file could be cached. It shouldn't be downloaded
every time you like to subscribe to a feed.
This could make subscribing to feeds as easy as it is in Twitter. How about that?
Posted March 14, 2012
Tonight I wrote this parser in Perl with Marpa::XS.
The nice thing about Marpa is that it will tell you which terminals (or tokens)
it expects at the current position. So instead of guessing what the next token
will be you can just try to parse a token and try it.
I got the general idea from the TAP::Spec::Parser. The difference is that it will try all alternatives at once, while mine tries each after the other until one works.
I hope this will help people to understand this kind of parsing better. Let me
know what you think.
Posted March 5, 2012
I was thinking about implicit algorithms and implicit datastructures and
thought how this applied to templates in web applications. I found that there
are a few places where I write the same template code with just a few different
pieces.
One of the examples is a piece of code for showing a text field with a few
labels around it. This piece of template code is used for almost every field in
the product edit (and new) form. This is repeated code, but we
should not do this, right?
There are a few parts to one of these pieces of code. It has a textual name, a
short description, an input field (which depends on the type of the value) and
the actual value in the field.
<p><label for="sku">SKU</label> <span class='description'>Stock keeping unit</span>
<input type="text" id="sku" name="sku" value="[% product.sku %]"></p>
It could look like the above. To get an idea about what an implicit algorithms
is, you could say that you're the computer and the data structure is in your
head.
Write 'p' start tag
Write 'label' start tag with 'for' attribute with value 'id of field'
Write name of field
Write 'label' end tag
Write a space
If field has a description
Write 'span' start tag with 'class' attribute with value 'description'
Write description of field
Write 'span' end tag
End
Write 'input' tag with type, id, name and value.
Write 'p' end tag
This piece of pseudocode shows how we run the piece of code in our head. The
strange thing is that computers are actually really good at following
instructions. A computer could this piece of code better than you could.
This piece of code is implicit in the template. Why didn't I write it as a
piece of code?
There is also an implicit datastructure in there. Maybe something like this:
Field {
string title
string name
string id
string description
string type
}
Field { "SKU", "sku", "sku", "Stock keeping unit", "text" }
If I create another line like this field description I can let the computer
create a piece of HTML code for me, without thinking about it.
Field { "Name", "name", "name", Name of the product", "text" }
The thing is that by starting to use datastructures and algorithms for this, we
can start to write more general functions that can be used for even more parts
of your web app. Imagine the possibilities.
Posted March 5, 2012
I was working on a small game that displays some blocks. These blocks need
to move. The way to make these blocks move is by changing the x and y
position over time.
The way I did this in the game was by calling a tick() method every 1/60th
of a second on every block that could be moving. Then the block can change
it's position according to some algorithm. I just wrote code that makes
these changes to the position.
But if you take a closer look at the code you can see that some lines are
connected. These lines are part of the same algorithm, but there is no place
where all the parts come together in the same place.
The blocks in the first level have three states.
- Do nothing. Stay in the same position.
- Falling. When falling the block makes sure its position returns to the
default position.
- Shaking. Some blocks shake, other don't. When shaking the position is
changed randomly around its current position, the position from state 1.
Let's look at shaking. We shake the block by changing the difference from the
current position of the block by a small random amount. In the code it looks
like this.
if (selected) {
this.dx = rnd.nextInt(3) - 1;
this.dy = rnd.nextInt(3) - 1;
}
The fields dx and dy are the difference in pixel position that this
object is drawn on screen.
The Level class contains a small part of this algorithm. It makes sure that
the object is drawn (dx,dy) pixels from its normal position. Without this
line of code the algorithm wouldn't work.
In a way there is also an implicit datastructure that contains the
difference position. The code doesn't make that obvious.
Making it obvious
The is a piece of code contained in the game that shakes an Entity. By
extracting this into a class we can help other programmers and reuse this
piece of code. But we have to look at the concepts that are used in this
code.
A few concepts that I think are in this code are: shake, tick, difference in
position and entity. There are as always many ways to do the same thing.
public class Shake implements Mover {
private static final Random rnd = new Random();
private Movable object;
public Shake(Movable dm) { this.object=dm; }
public void tick() {
object.moveBy(rnd.nextInt(3)-1, rnd.nextInt(3)-1);
}
}
Now we need to write multiple objects that do a similar thing. This way we can
show that the concept is useful.
public class Falling implements Mover {
private Movable object;
public Falling(Movable m) { this.object=dm; }
public void tick() {
if (m.y() < 0) {
m.moveBy(0, 3);
}
}
}
We can see here a falling block. This code will make the block fall
automatically. Because of the way the game works a block will only fall by a
certain amount. Then it needs to move to the next position in the level.
I will write some more about implicit algorithms (and datastructures) later.
Posted March 5, 2012
I have an article in the pipeline about implicit algorithms (and
datastructures), which is a continuation of a few articles I wrote
last year and the year before.
It explains the same idea, but from another point of view: a small game that I
started to write.
The thing is that the article isn't that good. Especially the example is too
simple. Now it just hangs there. It doesn't move and I don't want to touch it
anymore.
I do think that it about an idea that is really important. I will publish
it after publishing this.