#546 Improve indentation-sensitive language support (project SpoofaxLegacy on YellowGrass.org)

I was hoping to make my new language indentation-sensitive, like many new languages are these days (CoffeeScript, Python). Spoofax is extremely cool, but it doesn’t quite support that in a friendly manner.

Here’s a ticket to think about the shortest path to supporting this use case.

The current workaround is to make the line starts into tokens and do post-processing to merge adjacent statements with the same level of indentation into a block. This is OK but it looks like a lot of work and isn’t as flexible.
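As a rough illustration of that post-processing step, here is a minimal Python sketch that groups lines into nested blocks by indentation. It is not Spoofax or Stratego code; the names `indent_of` and `group_blocks` are hypothetical, and it assumes indentation uses spaces only.

```python
def indent_of(line):
    """Count leading spaces (tabs are not handled in this sketch)."""
    return len(line) - len(line.lstrip(" "))


def group_blocks(source):
    """Turn indented lines into a nested list of [text, children] nodes."""
    root = []
    # Stack of (indent, children-list); -1 is a sentinel below any real indent.
    stack = [(-1, root)]
    for line in source.splitlines():
        if not line.strip():
            continue  # blank lines carry no indentation information
        width = indent_of(line)
        # Pop until the top of the stack is a strictly shallower line.
        while stack[-1][0] >= width:
            stack.pop()
        node = [line.strip(), []]
        stack[-1][1].append(node)       # attach to the enclosing block
        stack.append((width, node[1]))  # later, deeper lines nest under this one
    return root
```

Each statement ends up with a list of the statements indented beneath it, which is roughly the "block" structure the merge step has to recover.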

There are a couple of alternative approaches that come to mind.

One is to allow a pre-processing step of some sort that modifies the source code, replacing some characters with others. For example, you could do something like:

if x:
if y:
if z: bla
else:
bug

becomes

if x:
{ if y:
print 'hello'
} if z: bla
else:
{ bug
bla
} boo

(the closing brace goes BEFORE the final statement because there might not be enough whitespace afterwards)
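A minimal Python sketch of such a pre-processing pass, under some assumptions: the function name `add_braces` is hypothetical, indentation uses spaces only, and the marker placement (an opening brace in front of the first line of a deeper block, one closing brace per closed level in front of the next shallower line) is one possible scheme rather than necessarily the exact one intended above. For simplicity it also drops the original indentation instead of overwriting whitespace in place.

```python
def add_braces(source):
    """Insert '{ ' / '} ' markers where the indentation level changes."""
    out = []
    levels = [0]  # stack of indentation widths currently open
    for line in source.splitlines():
        if not line.strip():
            out.append(line)  # leave blank lines untouched
            continue
        width = len(line) - len(line.lstrip(" "))
        prefix = ""
        if width > levels[-1]:
            # Deeper than the enclosing block: open a new one.
            levels.append(width)
            prefix = "{ "
        else:
            # Shallower: close one block per level left behind.
            # (Inconsistent dedents are not diagnosed in this sketch.)
            while width < levels[-1]:
                levels.pop()
                prefix += "} "
        out.append(prefix + line.strip())
    # Close any blocks still open at end of input.
    closing = "} " * (len(levels) - 1)
    if closing:
        out.append(closing.rstrip())
    return "\n".join(out)
```

The output keeps one statement per line, so line numbers are preserved, but column positions shift by the width of the inserted markers.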

Another preprocessing option might be to change the character set used for newlines, so that the preprocessor emits a different newline character depending on whether the next line has a change in indentation. This keeps the file layout similar but requires changing the line number calculations.

Another approach is to modify the parser so that you can alter or disable the LAYOUT processing between tokens, thus making whitespace significant where you want it to be, and still automatically insignificant elsewhere. This might allow some kind of bracket matching … but I haven’t thought this through so I can’t tell immediately if this would actually solve the problem.

The ideal scenario would be to be able to insert some sort of “virtual” tokens into the stream by overriding a tokenizer somewhere.
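For comparison, Python’s own tokenizer works along these lines: it synthesizes INDENT and DEDENT tokens from changes in leading whitespace. The short sketch below uses the standard `tokenize` module to show those tokens for a small input. It only illustrates the idea and says nothing about how a tokenizer hook would look in Spoofax or JSGLR; the helper name `layout_tokens` is hypothetical.

```python
import io
import tokenize


def layout_tokens(source):
    """Return the names of the INDENT/DEDENT tokens Python's tokenizer emits."""
    readline = io.StringIO(source).readline
    return [
        tokenize.tok_name[tok.type]
        for tok in tokenize.generate_tokens(readline)
        if tok.type in (tokenize.INDENT, tokenize.DEDENT)
    ]
```

A grammar written against such a stream can treat INDENT and DEDENT like ordinary opening and closing brackets.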

Any thoughts / hints / suggestions?
Submitted by Dobes Vandermeer on 6 October 2012 at 06:16

improvement 1.2 !eelcovisser sdf

On 6 October 2012 at 08:32 Eelco Visser commented:

This problem is solved, in theory. At the recent SLE 2012 conference Sebastian Erdweg presented an extension of SDF/JSGLR with layout sensitive parsing:

Sebastian Erdweg, Tillmann Rendel, Christian Kästner and Klaus Ostermann. Layout-sensitive Generalized Parsing. In Conference on Software Language Engineering (SLE), 2012. To appear.

However, the extension is implemented in a branch that needs to be merged back into the trunk before we can deploy it in Spoofax. Also, it is not yet clear what the interaction between layout-sensitive parsing and error recovery will be.

On 6 October 2012 at 21:16 Dobes Vandermeer commented:

Great news!

I’ll see if I can figure out how to make it work.

On 13 October 2012 at 20:32 Dobes Vandermeer commented:

It looks like the merge is a bit of a task …

Is someone tasked with this merge already or should I embark upon it? If I do the merge, is there someone willing to review the results?

On 15 October 2012 at 13:05 Sebastian Erdweg commented:

I started to merge the layout-sensitive parser implementation back into the JSGLR trunk. I hope to get back to it this week. I’ll comment here once the merge is done.

On 7 November 2012 at 14:16 Sebastian Erdweg commented:

I reintegrated the changes I made for layout-sensitive parsing into the SGLR svn repo. The merge can be found in branch jsglr-layout-merge.

The merge is currently under review by the Spoofax team. I hope the changes will be integrated into the Spoofax trunk soon, so you can start using layout-sensitive syntax in your Spoofax-based DSLs.

On 7 November 2012 at 14:22 Lennart Kats commented:

Great, thanks for the effort Sebastian.

On 8 January 2013 at 14:06 Eelco Visser tagged sdf

On 8 January 2013 at 14:06 Eelco Visser tagged 1.2

On 9 January 2013 at 23:38 Eelco Visser tagged !eelcovisser

On 13 February 2013 at 14:06 Maartje commented:

The layout-sensitive parser is now merged back into the trunk.

On 13 February 2013 at 14:06 Maartje closed this issue.
