String escaping in SDF and Stratego is not conventional
Submitted by Oskar van Rest on 7 September 2016 at 18:57
In all languages I've checked (Python, Java, Groovy, SPARQL, etc.), the rules for String escaping are as follows:
- single quotes are optionally escaped inside double-quoted Strings (i.e. "'" == "\'")
- double quotes are optionally escaped inside single-quoted Strings (i.e. '"' == '\"')
However, the String "\'", which should be valid according to the rules above, produces a syntax error in both SDF and Stratego. In SDF you can work around this by using the unescaped form "'", but in Stratego this is an actual limitation, since there is no way to produce the String "\'". This is not a huge issue, since it can be replaced with "'" for most target languages. However, it makes the Stratego rules for processing String literals less straightforward.
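As a quick illustration of the conventional rules described above, here is a minimal Python check (Python being one of the languages listed). The variable and function names are only illustrative.

```python
# In Python, escaping the "other" quote character is optional:
# the escaped and unescaped forms denote the same one-character string.
double_plain = "'"      # single quote inside a double-quoted string
double_escaped = "\'"   # same string, with the quote escaped
single_plain = '"'      # double quote inside a single-quoted string
single_escaped = '\"'   # same string, with the quote escaped


def escapes_are_optional() -> bool:
    """Return True if both escaped forms equal their unescaped forms."""
    return double_plain == double_escaped and single_plain == single_escaped


print(escapes_are_optional())  # prints True
```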
Issue Log
This is probably what you want to do in the SDF and Stratego grammars:
context-free syntax
  String.String = STRING
lexical syntax
  STRING = SINGLE-QUOTED-STRING | DOUBLE-QUOTED-STRING
  SINGLE-QUOTED-STRING = "'" (~[\'\n\\] | ESCAPE-CHAR)* "'"
  DOUBLE-QUOTED-STRING = '"' (~[\"\n\\] | ESCAPE-CHAR)* '"'
  CHAR = "'" (~[\'\n\\] | ESCAPE-CHAR) "'"
  ESCAPE-CHAR = "\\" [tnrbf\"\'\\]
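To see which literals the proposed lexical syntax would accept, the STRING and ESCAPE-CHAR rules can be approximated with a regular expression. This is only a sketch in Python, not the SDF parser itself; the names ESCAPE, SINGLE, DOUBLE and is_string_literal are hypothetical, and the CHAR rule is left out.

```python
import re

# ESCAPE-CHAR = "\\" [tnrbf\"\'\\]
ESCAPE = r"""\\[tnrbf"'\\]"""
# SINGLE-QUOTED-STRING = "'" (~[\'\n\\] | ESCAPE-CHAR)* "'"
SINGLE = r"'(?:[^'\n\\]|" + ESCAPE + r")*'"
# DOUBLE-QUOTED-STRING = '"' (~[\"\n\\] | ESCAPE-CHAR)* '"'
DOUBLE = r'"(?:[^"\n\\]|' + ESCAPE + r')*"'

STRING_LITERAL = re.compile(f"(?:{SINGLE})|(?:{DOUBLE})")


def is_string_literal(text: str) -> bool:
    """Return True if the whole text is a literal under the sketched rules."""
    return STRING_LITERAL.fullmatch(text) is not None


# Raw strings keep the backslashes, so these are the literal source texts.
print(is_string_literal(r'"\'"'))  # escaped single quote in a double-quoted String
print(is_string_literal(r'"\x"'))  # \x is not one of the listed escapes
```

Under these rules both the escaped and the unescaped quote are accepted, while a backslash that does not start one of the listed escapes is rejected.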
And in Stratego, normalize as follows to get the right Strings in case e.g. Java is your target language:
norm-string =
  String(
    (un-double-quote + un-single-quote)
    ; string-replace(|"\\\"", "\"")
    ; string-replace(|"\\'", "\'")
    ; string-replace(|"\\\\", "\\")
  )
edit: ignore the Stratego code. That is only meaningful if you’re writing an AST interpreter but not if you’re generating Java code.
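For the AST-interpreter case, the same normalization steps can be mirrored in Python to make their effect easier to follow. This is a direct, hedged translation of the rule above and not Stratego library code; norm_string is a hypothetical name, and only the three escapes handled by the original rule are covered.

```python
def norm_string(literal: str) -> str:
    """Strip the surrounding quotes, then unescape quotes and backslashes.

    Mirrors the three string-replace steps of the Stratego rule, in the
    same order. Other escapes such as \\n or \\t are left untouched.
    """
    # un-double-quote + un-single-quote: drop one matching pair of quotes.
    if len(literal) >= 2 and literal[0] == literal[-1] and literal[0] in "\"'":
        literal = literal[1:-1]
    literal = literal.replace('\\"', '"')    # \" becomes "
    literal = literal.replace("\\'", "'")    # \' becomes '
    literal = literal.replace("\\\\", "\\")  # \\ becomes \
    return literal


print(norm_string(r'"\'"'))  # prints '
```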
I never investigated this. Would be good to fix.
Oskar:
- what is the impact of this change on client code and on the implementation?
- pull request(s) realizing this would be welcome
For SDF and Stratego I think the change would be fully backwards compatible because we would only add additional character sequences that were previously not allowed. I’m not sure about TS and NaBL.
I also noticed that the Common.sdf3 that is generated as part of a new Spoofax project has yet another set of escaping rules. It allows backslashes anywhere in the String (e.g. "\" or "\x"), while in other languages they are normally escaped (i.e. "\\" or "\\x").
Ok. I would say go ahead and implement this.