Hey,

I’m trying to implement a parser for an island grammar using Spoofax (Version 1.0.2.0-R24315). If you are not familiar with island grammars, the idea is to have specific rules for certain interesting code parts (“islands”) and a very general rule catching everything else (“water”).
I started with a newly created Spoofax project and added the following grammar rules:
In Common.sdf:

lexical syntax

  ~[\ \n\t\r]+				-> Water {avoid}

context-free restrictions

  Water	-/- ~[\ \t\n\r]

In the main grammar file:

context-free syntax

  Token*                    -> Start
  "Test" "Statement"        -> Token{cons("Statement"), prefer}
  Water                     -> Token{cons("Water"), avoid}

Now, my program looks like this:

Test Statement
Anything
Test Statement
Water

This should be parsed as

    Statement(), Water("Anything"), Statement(), Water("Water")

Unfortunately, the parse tree looks like this:

[ amb(
    [ [Statement(), Water("Anything"), Statement()]
    , [Statement(), Water("Anything"), Water("Test"), Water("Statement")]
    ]
  )
, Water("Water")
]

Shouldn’t the avoid take care of the disambiguation? I tried to leave out the prefer and avoid in the two Token-rules (the avoid in the lexical syntax should be enough) and I tried using explicit priorities:

  	"Test" "Statement"     -> Token >
    Water                  -> Token

None of that worked. The problem occurs as soon as a Statement occurs after Water. If I only have a statement followed by water everything is fine. But my goal is to filter specific statements/constructs from source code…

Do I misunderstand the disambiguation features or is this a bug?

I understand that I might use Stratego rules to resolve the ambiguities, but it would be much nicer to have an unambiguous grammar in the first place.

Submitted by Johannes on 6 July 2012 at 13:11

On 9 July 2012 at 10:16 Lennart Kats commented:

I’m afraid this is the expected behavior. Basically, your grammar doesn’t specify how to disambiguate

Statement()

vs. a list, which in this case is

[Water("Test"), Water("Statement")]

Disambiguation of nested structures like this can’t really be expressed in SDF. There are heuristic rules for this that would count the avoid-value of a nested structure like the above, but they’re only applied as a post-parse filter. Right now there’s no nice way to enable these filters for an entire language, but this may be a (the?) valid use case. Alternatively, you can just do it from Stratego, if you’re post-parse filtering anyway.
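To make the avoid-counting idea concrete, here is a minimal sketch of such a post-parse filter. It is written in Python purely for illustration and is not Spoofax or Stratego API: the names Node, water_count and resolve are hypothetical, and the tree is hand-built to mirror the amb output quoted in the question. The assumption is that "fewest Water nodes wins" is an acceptable stand-in for the avoid count.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass
class Node:
    """Hypothetical stand-in for an ATerm application such as Water("x")."""
    cons: str        # constructor name: "Statement", "Water" or "amb"
    args: List[Any]  # children: Node, list of terms, or plain strings


def water_count(term) -> int:
    """Count Water nodes anywhere inside a term (our proxy for the avoid count)."""
    if isinstance(term, Node):
        own = 1 if term.cons == "Water" else 0
        return own + sum(water_count(a) for a in term.args)
    if isinstance(term, list):
        return sum(water_count(t) for t in term)
    return 0  # strings and other leaves


def resolve(term):
    """Bottom-up: replace each amb node by its alternative with the fewest Water nodes."""
    if isinstance(term, list):
        out = []
        for t in term:
            r = resolve(t)
            # An amb over a list prefix resolves to a list; splice it into the parent.
            if isinstance(t, Node) and t.cons == "amb" and isinstance(r, list):
                out.extend(r)
            else:
                out.append(r)
        return out
    if isinstance(term, Node):
        args = [resolve(a) for a in term.args]
        if term.cons == "amb":
            # args[0] is the list of alternatives; ties keep the first one.
            return min(args[0], key=water_count)
        return Node(term.cons, args)
    return term


# The ambiguous tree from the question, rebuilt by hand.
stmt = Node("Statement", [])
tree = [
    Node("amb", [[
        [stmt, Node("Water", ["Anything"]), stmt],
        [stmt, Node("Water", ["Anything"]),
         Node("Water", ["Test"]), Node("Water", ["Statement"])],
    ]]),
    Node("Water", ["Water"]),
]

result = resolve(tree)
print([n.cons for n in result])
```

The same logic could be written as a bottom-up traversal in Stratego; the point of the sketch is only that the choice between Statement() and the two Water tokens is made by comparing whole alternatives after parsing, which is what the grammar itself cannot express.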


On 9 July 2012 at 10:34 Johannes commented:

Thanks for your answer!

Alright, I see the problem now. That’s too bad that the disambiguation I imagined is not possible. So I’ll have to go with Stratego filtering.


On 13 November 2012 at 13:11 Maartje closed this issue.
