Disambiguation does not work as expected (avoid)
Hey,
I’m trying to implement a parser for an island grammar using Spoofax (Version 1.0.2.0-R24315). If you are not familiar with island grammars, the idea is to have specific rules for certain interesting code parts (“islands”) and a very general rule catching everything else (“water”).
I started with a newly created Spoofax project and added the following grammar rules:
In Common.sdf:

  lexical syntax
    ~[\ \n\t\r]+ -> Water {avoid}

  context-free restrictions
    Water -/- ~[\ \t\n\r]
In the main grammar file:
  context-free syntax
    Token*             -> Start
    "Test" "Statement" -> Token {cons("Statement"), prefer}
    Water              -> Token {cons("Water"), avoid}
Now, my program looks like this:
Test Statement Anything Test Statement Water
This should be parsed as
Statement(), Water("Anything"), Statement(), Water("Water")
Unfortunately, the parse tree looks like this:
[ amb(
    [ [Statement(), Water("Anything"), Statement()]
    , [Statement(), Water("Anything"), Water("Test"), Water("Statement")]
    ]
  )
, Water("Water")
]
Shouldn’t the avoid take care of the disambiguation? I tried to leave out the prefer and avoid in the two Token-rules (the avoid in the lexical syntax should be enough) and I tried using explicit priorities:
"Test" "Statement" -> Token > Water -> Token
None of that worked. The problem occurs as soon as a Statement occurs after Water. If I only have a statement followed by water everything is fine. But my goal is to filter specific statements/constructs from source code…
Do I misunderstand the disambiguation features or is this a bug?
I understand that I might use Stratego rules to resolve the ambiguities, but it would be much nicer to have an unambiguous grammar in the first place.
Submitted by Johannes on 6 July 2012 at 13:11
Issue Log
I’m afraid this is the expected behavior. Basically, your grammar doesn’t specify how to disambiguate
Statement()
vs. a list, which in this case is
[Water("Test"), Water("Statement")]
Disambiguation of nested structures like this can’t really be expressed in SDF. There are heuristic rules for this that count the avoid-value of a nested structure like the one above, but they’re only applied as a post-parse filter. Right now there’s no nice way to enable these filters for an entire language, but this may be a (the?) valid use case. Alternatively, you can just do it from Stratego, if you’re post-parse filtering anyway.
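To make the counting heuristic concrete, here is a rough sketch of such a post-parse filter. It is written in Python rather than Stratego, purely as an illustration, and it is not Spoofax's actual filter. The names App, Amb, AVOIDED, avoid_count and resolve are all made up for this sketch. The assumption is that each amb node is resolved by keeping the alternative that contains the fewest constructors marked avoid.

```python
from dataclasses import dataclass


@dataclass
class App:
    """Constructor application, e.g. Water("Anything") (hypothetical model)."""
    cons: str
    args: tuple = ()


@dataclass
class Amb:
    """Ambiguity node holding the alternative parses (hypothetical model)."""
    alternatives: list


# Assumption: constructors whose productions carry the avoid annotation.
AVOIDED = {"Water"}


def avoid_count(term):
    """Count avoid-annotated constructors anywhere inside a term."""
    if isinstance(term, App):
        own = 1 if term.cons in AVOIDED else 0
        return own + sum(avoid_count(a) for a in term.args)
    if isinstance(term, (list, tuple)):
        return sum(avoid_count(t) for t in term)
    return 0  # plain strings and other leaves


def resolve(term):
    """Replace each amb node by its alternative with the lowest avoid count."""
    if isinstance(term, Amb):
        resolved = [resolve(alt) for alt in term.alternatives]
        # On a tie, min() keeps the first alternative.
        return min(resolved, key=avoid_count)
    if isinstance(term, App):
        return App(term.cons, tuple(resolve(a) for a in term.args))
    if isinstance(term, list):
        out = []
        for t in term:
            r = resolve(t)
            # An ambiguous sub-list is spliced back into the enclosing list.
            if isinstance(t, Amb) and isinstance(r, list):
                out.extend(r)
            else:
                out.append(r)
        return out
    return term


# The ambiguous tree from the report above.
tree = [
    Amb([
        [App("Statement"), App("Water", ("Anything",)), App("Statement")],
        [App("Statement"), App("Water", ("Anything",)),
         App("Water", ("Test",)), App("Water", ("Statement",))],
    ]),
    App("Water", ("Water",)),
]
result = resolve(tree)
```

For the tree above, the first alternative contains one Water and the second contains three, so the first one is kept and result corresponds to Statement(), Water("Anything"), Statement(), Water("Water"). A real filter would do the same kind of traversal over the amb nodes in the parsed term.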
Thanks for your answer!
Alright, I see the problem now. It’s too bad that the disambiguation I imagined is not possible. So I’ll have to go with Stratego filtering.