I have this small example of an ambiguous grammar that doesn’t return the result I’m expecting:

module Test2

exports
context-free start-symbols
Start

context-free syntax
K -> Start {cons(“K”)}

K 	"=>" K	 -> K {non-assoc, cons("ReplaceK")}
ID 	"=>" ID	 -> ID {non-assoc, cons("ReplaceID")}

ID			-> K {cons("K2ID")}
VARID		-> K {cons("K1Var")}

lexical syntax
[a-zA-Z] -> ID
[A-Z] -> VARID

The returned AST looks like this:


amb(
[ K(K2ID(ReplaceID(“A”, “B”)))
, K(
ReplaceK(
amb([K2ID(“A”), K1Var(“A”)])
, amb([K2ID(“B”), K1Var(“B”)])
)
)
]
)

My question is, why is amb() on top of K? Shouldn’t be like this?:

 
  
K(amb(
[ K2Id(ReplaceK(“A”, “B”))
, ReplaceK(
amb([K2Id(“A”), K1Var(“A”)])
, amb([K2Id(“B”), K1Var(“B”)])
)
]
)
)

Am I doing something wrong? or is that the intended result?

Submitted by Radu Mereuta on 8 July 2011 at 14:22

On 8 July 2011 at 16:03 Lennart Kats commented:

The top-level amb node says that it’s either a ReplaceID or a ReplaceK. That seems sensible. How could both branches be ReplaceK?


On 8 July 2011 at 16:43 Radu Mereuta commented:

Sorry, I’ve messed up the second example:


K(amb(
[ ReplaceK(“A”, “B”))
, K2Id(ReplaceID(
amb([K2Id(“A”), K1Var(“A”)])
, amb([K2Id(“B”), K1Var(“B”)])
)
]
)
)
Now both ambiguities are of sort K.
The problem is that I have an example where I have different sorts next to K, something like this:

amb([ K(..), K(..), Bag(..), List(..), .. ])

This looks inconsistent to me, and I’m finding difficulties in writing a simple disambiguation algorithm.


On 8 July 2011 at 17:04 Lennart Kats commented:

I’m guessing that you mean


K(amb(
[ K2ID(ReplaceID(“A”, “B”))
,
ReplaceK(
amb([K2ID(“A”), K1Var(“A”)])
, amb([K2ID(“B”), K1Var(“B”)])
)

]
))

where ReplaceK still has the ambiguous children.

So what you’re saying is that you don’t expect the ambiguity to be at the level of the ‘K’ constructor, and would expect it one level deeper in the tree?

I don’t get your other example, exactly what is inconsistent?


On 8 July 2011 at 17:20 Radu Mereuta commented:

Yeah, that’s exactly where my problem is.
I guess I should extend the grammar to make the inconsistency more visible (added the Bag sort):

module Test2

exports
context-free start-symbols
    Start

context-free syntax
	K -> Start {cons("K")}
	Bag -> Start {cons("Bag")}

	Bag "=>" Bag -> Bag {non-assoc, cons("ReplaceBag")}
	K 	"=>" K	 -> K {non-assoc, cons("ReplaceK")}
	ID 	"=>" ID	 -> ID {non-assoc, cons("ReplaceID")}

	ID			-> K {cons("K2ID")}
	VARID		-> K {cons("K1Var")}
	VARID		-> Bag {cons("Bag1Var")}

lexical syntax
	[a-zA-Z] -> ID
	[A-Z] -> VARID
Right now, for the input "A=>B" I get this:
amb(
  [ amb(
      [ K(K2ID(ReplaceID("A", "B")))
      , K(
          ReplaceK(
            amb([K2ID("A"), K1Var("A")])
          , amb([K2ID("B"), K1Var("B")])
          )
        )
      ]
    )
  , Bag(ReplaceBag(Bag1Var("A"), Bag1Var("B")))
  ]
)
Why two "K(..)" ? Why not only one and at the same level with "Bag(..)"?

On 8 July 2011 at 17:55 Lennart Kats commented:

Ah, so you have a nested forest of ambiguities? I’m not sure why the parser does it that way. The Java parser tries to follow the results of the C version of SGLR, so if C-SGLR doesn’t do it that way, then JSGLR shouldn’t either. There might be something in the parser or preprocessor that causes the second amb to appear outside the K where it was originally inside it.

Regardless of whether it’s an error in the parse result or not, you can flatten nested ambiguities yourself as follows with a strategy flatten-amb-top:


flatten-amb-top =
io-wrap(
bottomup(try(flatten-amb))
)

flatten-amb:
amb(a*) -> amb(a’)
with
a’
:= <map(try(extract-ambs)); flatten-list> a*

extract-ambs:
amb(a*) -> a*


On 10 July 2011 at 19:40 Radu Mereuta commented:
Thanks for the flatten-amb rule. It's a bit better than what I wrote initially.
As for the ambiguity problem, it seems that the same thing happens in the old parser.
Can you help me with a stratego rule to correct it (my experience with this language is limited)?

What I need is to transform something like this:

amb([X(X'), ... , X(X")]) -> amb([X(amb([X', X"]), ...])

In other words, if there are any duplicate nodes in an ambiguity list,
replace those with only one node, and their children will be united in an amb node.


On 11 July 2011 at 13:48 Lennart Kats commented:

Alright, that should be something like this:


down-with-ambs:
amb(a*) -> amb(a’)
with
a’
:= a*

down-with-ambs-list =
down-with-ambs-list-1 <+ down-with-ambs-list-2

down-with-ambs-list-1:
[parent#([child]) | tail] -> [grouped | tail’]
with
matching-children := <filter(fetch-child(|parent))> tail
where
not(!matching-children => [])
with
grouped := parent#([ amb([child | matching-children]) ]);
tail’ := <filter(not(fetch-child(|parent)))> tail

down-with-ambs-list-2:
[a | tail] -> [a | tail]

down-with-ambs-list-2:
[] -> []

fetch-child(|parent):
parent#([child]) -> child

The main trick there is that you need to use the # generic term (de)construction operator (see the Stratego manual) combined with a bit of functional programming.


On 15 July 2011 at 15:45 Radu Mereuta commented:

Thanks, that did the trick.


On 18 July 2011 at 15:42 Lennart Kats tagged parser

On 18 July 2011 at 15:43 Lennart Kats tagged disambiguator

On 18 July 2011 at 15:43 Lennart Kats tagged workaround

On 8 January 2013 at 14:56 Eelco Visser tagged sdf

Log in to post comments