Posted April 14, 2012
I had a nice idea a few days ago. A simple module that creates the rules part
of a Marpa grammar, but without code generation and lexers. Just the bit that
Marpa::XS::Grammar uses.
So I wrote it in a few minutes. It's not really complicated and provides an
example how to write a short parser.
Lets start with a short example of what you would write.
use Marpa::XS;
use MarpaX::Simple::Rules 'parse_rules';
my $rules = parse_rules(<<"RULES");
RULES
my $grammar = Marpa::XS::Grammar->new({
rules => $rules,
...
});
At the ... you should add more arguments to Marpa::XS depending on how you
want to use it. Now let's write a simple example of some rules.
my $rules = parse_rules(<<"RULES");
expr ::= term
term ::= term plus term
term ::= factor
factor ::= factor mul factor
factor ::= number
RULES
These rules don't yet contain actions, so Marpa will call the default_action
or a function named like that left hand side name. For information you should
take a look at the Marpa::XS documentation.
Now let's some specific action to all the rules. We do this by adding => action_name
to the end of each of the rules.
my $rules = parse_rules(<<"RULES");
expr ::= term => arg_0
term ::= term plus term => plus
term ::= factor => arg_0
factor ::= factor mul factor => mul
factor ::= number => arg_0
RULES
No functions are provided for you. The names like arg_0 are just generic
functions that you should write yourself. If you create a package that uses
this module you could use the actions argument for the grammar.
my $grammar = Marpa::XS::Grammar->new({
...
actions => __PACKAGE__,
...
});
If you write the functions like the following, the Marpa will call the right
thing.
package YYY;
...
sub arg_0 {
shift; return $_[0];
}
...
I will release the code to CPAN so it will appear somewhat later, but you can already take a look at it at
github/MarpaX::Simple::Rules.
Posted April 7, 2012
The three problems that need to be fixed are subscription, feed reading and
feed creation.
Subscription
Subscribing to feeds is really hard. It should be really simple. Unsubscribing
should be just as simple.
- See something that you want to subscribe to? Click the subscribe button.
- Done!
Unsubscribing should be just as easy.
- You see a post in your feed reader that you don't want to see.
- Click Hide.
- Show an option for unsubscription.
- Click, gone!
Or:
- Visit the page you want unsubscribe from in your browser.
- Click the unsubscribe button.
- No step three!
That's easy. Why isn't this possible yet?
Feed reading
Reading a feed should also be really simple. Dave Winer
has the River of News. A small paragraph
with optional title, description and link. That's how simple it should be.
Do you see something that you want to read? Read it. If you don't want to read
it, just scroll further for other posts.
Nothing is remembered or saved. Unless you want it too. Click 'star',
'favorite' or 'bookmark', and it is saved in a to read feed, so you can find it
later.
Feed creation
You have a feed. Write an update in a textarea. Click 'Publish'. Done. It
appears in your feed and the rivers of everyone who subscribed.
These three parts are easy and important.
Posted April 5, 2012
In my last post I wrote about what the different rewrites of more advanced
rules would look like. The question now is, when do you have a grammar that's
basic enough?
This is the basic configuration of the Marpa::XS class. In this configuration
you can specify left hand sides and right hand sides and the star and plus
operator. If your tree looks like this, then it's basic enough.
Parser ::= Rule+
Rule ::= Lhs DeclareOp Rhs
Lhs ::= Name
Rhs ::= Names
Rhs ::= Name Star
Rhs ::= Name Plus
Names ::= Name+
This grammar consists of only 7 lines, so that's pretty good. This are the
basics, but it assumes a few things.
- Ignores whitespace. This grammar completely ignores whitespace. It will work
if your tokenizer removes whitespace from the front of each token before it
passes it to the Recognizer.
- DeclareOp, Name, Star and Plus are terminals which are recognized by the
tokenizer.
- Tokens and actions are declared somewhere else. In the grammar that I used I
can specify characters, regex tokens and actions. See
github/MarpaX-Parser-Marpa.
A grammar rewriter should be able to rewrite the advanced rules to a grammar
that looks like this.
I will leave you with the following example, that tries to match 0 or 1 occurences of B.
A ::= B? => A ::= Null
A ::= B
Posted April 4, 2012
Let's try to write some rewrite rules for a next level Marpa to a lower level
Marpa. The syntax is not important, this is about how we can rewrite the rules.
In the actual grammars none of the left side patterns work, but all of the
right side rewritten version should work.
Alternations
An alternation is a rule that can match either way B, C or D.
A ::= B | C => A ::= B
A ::= C
A ::= B | C | D => A ::= B
A ::= C
A ::= D
Plus, Star
Plus means match one or more times, star means match zero of more times.
Normally you can't use + or * inside of a rule. Only whole rules can be
used like this. If you want to use these you need to create a new rule and use
it inside.
A ::= B+ C => A ::= SR0 C
SR0 ::= B+
Subrule
A subrule is a rule that is part of another rule. A subrule behaves as if it is
a rule.
A ::= (B C) D => A ::= SR0 D
SR0 ::= B C
A ::= (B C)+ D => A ::= SR0 D
SR0 ::= SR1+
SR1 ::= B C
Count
Match exactly n times.
A ::= B{n}
A ::= B{1} => A ::= B
A ::= B{2} => A ::= B B
A ::= B{10} => A ::= B B B B B B B B B B
Min
Match at least min times.
A ::= B{min,}
A ::= B{2,} => A ::= B B
A ::= B B SR0
SR0 ::= B*
Max
Match no more than max times.
A ::= B{,max}
A ::= B{,10} => A ::= Null
A ::= B
A ::= B B
A ::= B B B
A ::= B B B B
A ::= B B B B B
A ::= B B B B B B
A ::= B B B B B B B
A ::= B B B B B B B B
A ::= B B B B B B B B B
A ::= B B B B B B B B B B
Min, Max
Match at least min times, but no more than max times.
A ::= B{min, max}
A ::= B{2,5} => A ::= B B
A ::= B B B
A ::= B B B B
A ::= B B B B B
I will keep the namespaces, modules and includes in my brain for another time.