Monday, July 31, 2006

July 31st, 2006 -- Parsec emitter and underlay problems

The main goal this week was to complete the Parsec emitter. However, due to some inherent limitations, it is not possible to truly "complete" it; I am just trying my best to implement as many components as I can, and I hope those will be enough for writing a complete Perl 6 grammar. The grammar constructions newly accepted since the last post are:
  • :sigspace option
  • complete \X syntax (but not \Xxxx)
  • numbered captures
  • subrule with parameters
  • non-capture group

I also mailed pmichaud++ about the difference between the "non-backtracking" semantics of Perl 6 rules and Parsec's parsing strategy. The :ratchet option, which rule and token turn on, makes backtracking over an atom fail, but the other branches of an alternation are still tried even after the first several atoms of one branch have matched. In Parsec, the whole parse fails if it enters a branch, consumes some tokens, and then cannot go any further. This can be changed by wrapping the branch in "try", which, as the name suggests, tries to match the branch but moves on to the other ones if it fails. Parsec's default behavior is thus like adding a "::" after each atom, rather than the ":" that rule and token add in a Perl 6 grammar.
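The difference can be seen in a small Parsec sketch. This is only an illustration, not code produced by the emitter; the names noTry and withTry and the two keywords are made up for the example.

```haskell
import Text.Parsec (parse, string, try, (<|>))
import Text.Parsec.String (Parser)

-- Both branches share the prefix "le". Without try, the first branch
-- consumes "le" before failing on 'x', so the second branch is never tried.
noTry :: Parser String
noTry = string "let" <|> string "lexical"

-- Wrapping the first branch in try rewinds the input when it fails,
-- so the second branch gets a chance to match.
withTry :: Parser String
withTry = try (string "let") <|> string "lexical"
```

On the input "lexical", parse noTry "" "lexical" reports an error, while parse withTry "" "lexical" succeeds.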

One way to solve this is to add "try" everywhere, but that means giving up the high-efficiency parsing that Parsec provides. When the grammar is not LL, "try" is unavoidable. Parsec performs best on LL grammars (according to its official page), so at this stage I will feed it only LL grammars and add no additional "try".
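As a rough illustration of what feeding Parsec an LL grammar means, here is a sketch in which two alternatives with a common prefix are left-factored by hand so that no "try" is needed. The parser name factored and the keywords are hypothetical and chosen only for the example.

```haskell
import Text.Parsec (parse, string, (<|>))
import Text.Parsec.String (Parser)

-- The shared prefix "le" is parsed once; the remaining alternatives
-- start with different characters, so one character of look-ahead is
-- enough to choose a branch and no backtracking is required.
factored :: Parser String
factored = do
  prefix <- string "le"
  rest   <- string "t" <|> string "xical"
  return (prefix ++ rest)
```

Here parse factored "" "let" and parse factored "" "lexical" both succeed without any "try".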

Sunday, July 23, 2006

July 23rd, 2006 -- Pugs::Emitter::Rule::Parsec

I finally escaped from final exams and projects, and from an unexpectedly busy early July. The first check-in of the Pugs::Emitter::Rule::Parsec module was on July 20th. It accepted the yada example given in the README of the MiniPerl6 module and emitted correct Haskell code for it. (By the way, the Pugs::Grammar::MiniPerl6 and Pugs::Compiler::Rule modules have been moved from pX/ to perl5/.)

After three days of hacking, it now accepts many rule constructions, and a test file has been added. Parser.Literal contains 10 parser routines: two take arguments (namedLiteral and possiblyTypeLiteral), which I currently don't know how to express in a Perl 6 rule; one uses the previous parsing state to decide the next action (ruleWordboundary); two use negative look-ahead (ruleDot and ruleLongDot), but <!before pattern> is not ready yet; the other five can be generated easily. In fact, all five of them are already in the test file, and I tested the result by replacing the existing code with the generated code, which showed that it works in Pugs' parser, too.

UPDATE: <!before pattern> support has been added to both Pugs::Compiler::Rule and the Parsec emitter. However, since the existing parser is not LL, I have to put an additional "try" into the generated code by hand to make it work.
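A minimal sketch of how a negative look-ahead and a hand-added "try" can fit together in Parsec, assuming <!before pattern> corresponds to Parsec's notFollowedBy. This is not the actual Pugs parser code or the emitter's output; singleDot and dotOrRange are hypothetical names used for illustration.

```haskell
import Text.Parsec (parse, char, notFollowedBy, string, try, (<|>))
import Text.Parsec.String (Parser)

-- A dot that is not followed by another dot, roughly what a rule
-- using <!before \.> after a literal dot would express.
singleDot :: Parser String
singleDot = do
  _ <- char '.'
  notFollowedBy (char '.')
  return "."

-- Both branches start with '.', so this alternation is not LL.
-- The try is the hand-added part: it lets the ".." branch run after
-- singleDot has consumed a dot and then failed its look-ahead.
dotOrRange :: Parser String
dotOrRange = try singleDot <|> string ".."
```

Without the try, singleDot <|> string ".." fails on the input ".." because the first branch has already consumed a character.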