Currently, the way to do this is by writing kernel syntax, which gives the programmer more control over how LAYOUT?-CF is introduced in a production, since the normalizer does not add this symbol when normalizing kernel productions.
The problem is that this is tedious for the user: they basically have to know how the normalizer works in order to write correct kernel productions (omitting or misplacing the -LEX and -CF suffixes on symbols is a real issue) and to write proper priorities.

There are at least two ways to address it:

1- Allow constructors in lexical syntax productions. I believe this would require a lot of changes in the architecture.

2- Have something to inform the normalizer that one does not want optional layout in a context-free production. I personally would vote for this one, as it might only require changing the normalizer. I guess a short-term solution could even live in the translation from SDF3 to SDF2, where we can do this normalization ourselves and guarantee it is correct.
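
To make option 2 concrete, here is a rough Python sketch of the idea. It is only a toy model, not the actual normalizer: the function name `normalize_cf`, the `no_layout` flag and the `Float` production are made up for illustration. It assumes the usual behaviour that sorts in a context-free production get a -CF suffix, literals are left alone, and LAYOUT?-CF is placed between adjacent symbols.

```python
def normalize_cf(lhs, rhs, no_layout=False):
    """Toy model of normalizing one context-free production `lhs = rhs`.

    `no_layout` is a hypothetical flag standing in for whatever mechanism
    would tell the normalizer to skip the optional layout.
    """
    def lift(sym):
        # Quoted literals stay as they are; sorts get the -CF suffix.
        return sym if sym.startswith('"') else f"{sym}-CF"

    lifted = [lift(s) for s in rhs]
    if no_layout:
        body = lifted
    else:
        body = []
        for i, sym in enumerate(lifted):
            if i > 0:
                body.append("LAYOUT?-CF")  # optional layout between symbols
            body.append(sym)
    return f"{lhs}-CF", body


with_layout = normalize_cf("Float", ["Int", '"."', "Int"])
without_layout = normalize_cf("Float", ["Int", '"."', "Int"], no_layout=True)
print(with_layout)
print(without_layout)
```

With the flag set, the user gets what they would otherwise have to write by hand in kernel syntax, and the suffixes are still added by the tool rather than by the user.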

Submitted by Eduardo Amorim on 17 September 2015 at 14:35

On 17 September 2015 at 14:48 Daco Harkes commented:

I agree with Eduardo.

  • context free: has constructor, introduces layout between all ‘tokens’
  • lexical: has no constructor but combines everything in a string, introduces no layout between ‘tokens’

Having an option where you want the constructor but not the layout is definitely a valid use case.

I see how 2 is an easy solution for this: {no-layout}. It does not cover the case where you want optional layout but get the result as a string instead of a constructor, which would be the 4th combination (but I’m not sure if that would be a valid use case at all). I guess you can already mention LAYOUT explicitly within your lexical syntax if you need it.

I’m in favor of having context free with {no-layout}, but maybe there’s a better abstraction here.


On 17 September 2015 at 15:26 Eelco Visser commented:

Constructors in lexical syntax are not just a solution to optional layout. It is a quite valid requirement to (sometimes) preserve the structure of a lexical production. The floating point example is a good example of that. So eventually, we need to address that. So, I think this issue should be split: optional lexical constructors should be a separate feature request.


On 17 September 2015 at 16:31 Peter Mosses commented:

Presumably {no-layout} should affect also the productions generated from SYMBOL* and SYMBOL+ when normalizing a context-free production.


On 17 September 2015 at 17:45 Jeff Smits commented:

The motivating example for me, to have more control over layout, is Jasmin (the JVM assembly language). When we wrote the grammar for that language we found that it has two kinds of layout: an inter-instruction version that contains spaces, tabs and newlines, and an intra-instruction version that doesn’t contain newlines, because normal instructions don’t span multiple lines. Of course there are edge cases of larger instructions that are multi-line, but the sub-instruction statements are again single-line.

What I think would make sense is if the context-free syntax section can be parametrised with a locally used layout sort. Then for Jasmin you can put single-line instructions in a section where the layout doesn’t have newlines, and put the “list of instructions” rule in another section with the more standard layout.

@pdmosses’s comment about list productions makes sense to me, but it shows at least one difficulty of having more control over layout at all. If you use SYMBOL* in two different places with two different layout rules, those two SYMBOL* are not the same and should not use the same generated rules.
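
A small Python sketch of that difficulty, as a toy model only: the rule shapes are a simplification of what a normalizer generates for lists, and `list_rules`, the `[...]` tag in the generated names and the layout sort `InlineLayout` are all hypothetical. The point is just that the generated list symbols have to be keyed by the layout they were generated with.

```python
def list_rules(symbol, layout):
    """Toy rules for `symbol*` in a section whose layout sort is `layout`.

    Pass layout=None for a section without any layout. The generated
    nonterminals carry the layout sort in their name (made-up naming
    scheme) so that lists from different sections do not share rules.
    """
    tag = layout or "NONE"
    star = f"{symbol}*-CF[{tag}]"
    plus = f"{symbol}+-CF[{tag}]"
    elem = f"{symbol}-CF"
    sep = [f"{layout}?-CF"] if layout else []
    return [
        (star, []),                   # empty list
        (star, [plus]),               # non-empty list
        (plus, [elem]),               # single element
        (plus, [plus, *sep, elem]),   # longer list, layout in between
    ]


standard = list_rules("Arg", "LAYOUT")
inline = list_rules("Arg", "InlineLayout")
for rule in standard + inline:
    print(rule)
```

Both calls are for the same `Arg*`, but they yield two disjoint sets of rules. If the names were not keyed by the layout sort, the two sets would be merged and both kinds of layout would be accepted in both places.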
