Use start symbol specified in main.esv
Even though some other issues (Spoofax/66 in particular) suggest this setting is likely to be deprecated, I think with the introduction of the testing language and its start symbol header, the start symbol setting has a use again.
Start symbols specified in the testing language must be declared as start symbol in the grammar.
This can have the effect that the grammar becomes ambiguous, e.g. toplevel ambiguity between File/Chunk/Statement for a file with a single statement in a language that doesn’t require any syntax to start the toplevel File/Chunk.
Currently, this can (AFAIK) only be resolved by writing a custom disambiguator to resolve any toplevel ambiguity by picking the AST node belonging to the actually desired start-symbol.
It would be nice to be able to declaratively specify the start symbol (i.e., in main.esv), so that 1) such custom disambiguation isn’t needed, and 2) parser performance isn’t affected by the ambiguous grammar.
This is also consistent IMHO: SDF specifies the sorts that can be used as start symbol, ESV specifies the start symbol used by default, and SPT can override it for particular testsuites.
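To make the proposed layering concrete, here is a minimal sketch in Java. It is not Spoofax code: `StartSymbolResolver` and its parameters are hypothetical names, and the rule that an undeclared symbol is rejected outright is an assumption taken from the requirement above that test start symbols must be declared in the grammar.

```java
import java.util.Optional;
import java.util.Set;

// Hypothetical helper, not part of Spoofax: picks the start symbol for one parse.
final class StartSymbolResolver {

    /**
     * @param declaredInGrammar sorts the grammar (SDF) allows as start symbols
     * @param esvDefault        default start symbol from the editor descriptor (ESV)
     * @param sptOverride       start symbol from a test suite header (SPT), if any
     */
    static String resolve(Set<String> declaredInGrammar,
                          String esvDefault,
                          Optional<String> sptOverride) {
        // The test suite override wins; otherwise fall back to the ESV default.
        String chosen = sptOverride.orElse(esvDefault);

        // Assumption: a symbol the grammar does not declare is an error.
        if (!declaredInGrammar.contains(chosen)) {
            throw new IllegalArgumentException(
                "Start symbol not declared in grammar: " + chosen);
        }
        return chosen;
    }
}
```

With File, Chunk and Statement all declared, an editor parse would resolve to the ESV default, while a test suite that names Statement would get Statement without making the editor's toplevel ambiguous.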
Caveat: may break editors that rely on the start symbol setting not being used.
Submitted by Tobi Vollebregt on 14 September 2011 at 14:28
Issue Log
“This is also consistent IMHO: SDF specifies the sorts that can be used as start symbol, ESV specifies the start symbol used by default, and SPT can override it for particular testsuites.”
I think you’re right. Some notion of a default start symbol is more useful now.
Doesn’t the ParseController already use the start symbol, though? (Note that the ParseController is not used by the testing language for fragments.)
The bug was in Descriptor: it always thought there were no StartSymbols in the descriptor, even if there actually were. That is fixed now (SVN r23356).
There’s still inconsistency with the behaviour of SGLRParseController: when the top sort filter can’t find the start symbol specified in the descriptor, it will throw a StartSymbolException. This is caught by the controller, which then writes a warning to the log and re-parses the file with a null startSymbol.
In the past it set the startSymbol to null permanently for that editor; I’ve changed it now to restore the original startSymbol after every parse. Otherwise if you accidentally enter something that parses with a start symbol you need for e.g. testing, but doesn’t parse with the main start symbol, you brick the editor permanently, i.e. the top sort filter is not applied anymore until you close and re-open the file.
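The fallback described above can be sketched roughly as follows. This is an illustration only, not the actual SGLRParseController: `SketchParser`, `SketchParseController` and the `StartSymbolException` defined here are hypothetical stand-ins, and the real exception may well be a checked one.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the exception thrown by the top sort filter.
class StartSymbolException extends RuntimeException {
    StartSymbolException(String message) { super(message); }
}

// Hypothetical parser interface; a null startSymbol means "no top sort filter".
interface SketchParser {
    String parse(String input, String startSymbol);
}

final class SketchParseController {
    private final SketchParser parser;
    private String startSymbol;
    // Stands in for the log file that receives the warnings.
    final List<String> warnings = new ArrayList<>();

    SketchParseController(SketchParser parser, String startSymbol) {
        this.parser = parser;
        this.startSymbol = startSymbol;
    }

    String getStartSymbol() { return startSymbol; }

    String parse(String input) {
        String original = startSymbol;
        try {
            return parser.parse(input, startSymbol);
        } catch (StartSymbolException e) {
            // Log a warning and re-parse without a start symbol.
            warnings.add("Start symbol not found, re-parsing: " + e.getMessage());
            startSymbol = null;
            return parser.parse(input, null);
        } finally {
            // Restore the configured start symbol after every parse, so one
            // failed parse does not disable the filter for the whole editor.
            startSymbol = original;
        }
    }
}
```

The `finally` block is the part that changed: without it, the field stays null after the first fallback and the filter is never applied again for that editor.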
The downside, of course, is that it now parses twice more often (unmanaged parse tables?) and/or spams more warnings in certain situations.
Also this behavior means that you don’t get an error for something that really is invalid syntax in your language, but just happens to be parseable because you want to export some start symbol for testing purposes… You just get a warning in the log file if you’re lucky, but the editor happily accepts an input that can only be parsed with a start symbol not specified in the descriptor…
I’ve no clue yet how to make these things better… It seems the choice is either to brick the editor permanently (with possible spurious toplevel ambiguities) in exchange for better performance and/or fewer warnings when a wrong start symbol is in the ESV or an unmanaged parse table is used, or to get more correct editor behavior at the cost of lower performance and/or more warnings…