In all languages I’ve checked (Python, Java, Groovy, SPARQL, etc.) the rules for String escaping are as follows:

  • single quotes are optionally escaped inside double-quoted Strings (i.e. "'" == "\'")
  • double quotes are optionally escaped inside single-quoted Strings (i.e. '"' == '\"')
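
The equivalence is easy to confirm in Python, one of the languages listed above. A minimal check (the variable names are only illustrative):

```python
# In Python, escaping the "other" quote character is optional:
# both spellings denote the same one-character string.
single_in_double = "\'"   # escaped single quote inside double quotes
double_in_single = '\"'   # escaped double quote inside single quotes

assert single_in_double == "'"
assert double_in_single == '"'
assert len(single_in_double) == 1 and len(double_in_single) == 1
```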

However, the String "\'", which should be valid according to the rules above, produces a syntax error in both SDF and Stratego. In SDF you can work around this by using the unescaped form "'", but in Stratego it is a real limitation, since there is no way to produce the String "\'". This is not a huge issue, because it can be replaced with "'" for most target languages. However, it makes the Stratego rules for processing String literals less straightforward.

Submitted by Oskar van Rest on 7 September 2016 at 18:57

On 7 September 2016 at 22:33 Oskar van Rest commented:

This is probably what you want to do in the SDF and Stratego grammars:

context-free syntax

  String.String = STRING

lexical syntax

  STRING               = SINGLE-QUOTED-STRING
  STRING               = DOUBLE-QUOTED-STRING

  SINGLE-QUOTED-STRING = "'" (~[\'\n\\] | ESCAPE-CHAR)* "'"
  DOUBLE-QUOTED-STRING = '"' (~[\"\n\\] | ESCAPE-CHAR)* '"'

  CHAR                 = "'" (~[\'\n\\] | ESCAPE-CHAR) "'"

  ESCAPE-CHAR          = "\\" [tnrbf\"\'\\]
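
To see which literals these lexical rules would accept, here is a rough Python translation using regular expressions. It is only a sketch for experimenting, not the SDF implementation; the names `ESCAPE_CHAR`, `SINGLE_QUOTED`, `DOUBLE_QUOTED` and `is_string_literal` are made up for this example.

```python
import re

# ESCAPE-CHAR = "\\" [tnrbf\"\'\\]
ESCAPE_CHAR = r"""\\[tnrbf"'\\]"""

# SINGLE-QUOTED-STRING = "'" (~[\'\n\\] | ESCAPE-CHAR)* "'"
SINGLE_QUOTED = re.compile(r"'(?:[^'\n\\]|" + ESCAPE_CHAR + r")*'")

# DOUBLE-QUOTED-STRING = '"' (~[\"\n\\] | ESCAPE-CHAR)* '"'
DOUBLE_QUOTED = re.compile(r'"(?:[^"\n\\]|' + ESCAPE_CHAR + r')*"')


def is_string_literal(text: str) -> bool:
    """Return True if text is a complete single- or double-quoted literal."""
    return bool(SINGLE_QUOTED.fullmatch(text) or DOUBLE_QUOTED.fullmatch(text))


# The literal from the issue description is accepted under these rules.
assert is_string_literal(r'"\'"')
assert is_string_literal(r"'\"'")
# A backslash must start a known escape, so "\x" is rejected.
assert not is_string_literal('"\\x"')
```

Under these rules a lone backslash is never allowed, so `"\"` is read as an unterminated literal rather than a one-character String.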

And in Stratego, normalize as follows to get the right Strings in case e.g. Java is your target language:

norm-string = String((un-double-quote + un-single-quote)
                   ; string-replace(|"\\\"", "\"")
                   ; string-replace(|"\\'", "'")
                   ; string-replace(|"\\\\", "\\"))

Edit: ignore the Stratego code. It is only meaningful if you’re writing an AST interpreter, not if you’re generating Java code.
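
For the AST-interpreter case, note that the chained replacements handle only the quote and backslash escapes, while ESCAPE-CHAR also allows t, n, r, b and f. Adding those as further replacements becomes order-sensitive: the raw text `\\n` (an escaped backslash followed by the letter n) could be misread as a newline escape. A single left-to-right pass avoids that. The sketch below is in Python rather than Stratego, and `ESCAPES` and `unescape` are hypothetical names:

```python
# Mapping for the escapes allowed by ESCAPE-CHAR = "\\" [tnrbf\"\'\\]
ESCAPES = {
    "t": "\t", "n": "\n", "r": "\r", "b": "\b", "f": "\f",
    '"': '"', "'": "'", "\\": "\\",
}


def unescape(body: str) -> str:
    """Decode escapes in a literal's body (quotes already stripped) in one pass."""
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\":
            # A backslash must be followed by one of the known escape characters.
            if i + 1 >= len(body) or body[i + 1] not in ESCAPES:
                raise ValueError(f"invalid escape at index {i}")
            out.append(ESCAPES[body[i + 1]])
            i += 2  # consume the backslash and the escaped character together
        else:
            out.append(ch)
            i += 1
    return "".join(out)


assert unescape(r"\'") == "'"
assert unescape(r"\\n") == "\\n"  # backslash followed by 'n', not a newline
```

Because each backslash is consumed together with the character it escapes, the result does not depend on the order in which the escapes are listed.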

On 8 September 2016 at 10:30 Eelco Visser commented:

I never investigated this. Would be good to fix.

- what is the impact of this change on client code and on the implementation?
- pull request(s) realizing this would be welcome

On 12 September 2016 at 23:02 Oskar van Rest commented:

For SDF and Stratego I think the change would be fully backwards compatible because we would only add additional character sequences that were previously not allowed. I’m not sure about TS and NaBL.

I also noticed that the Common.sdf3 that is generated as part of a new Spoofax project uses yet another set of escaping rules. It allows backslashes anywhere in the String (e.g. "\" or "\x"), while in other languages they normally have to be escaped (i.e. "\\" or "\\x").

On 12 September 2016 at 23:13 Eelco Visser commented:

Ok. I would say go ahead and implement this.

On 13 September 2016 at 13:13 Gabriël Konat tagged sdf

On 13 September 2016 at 13:13 Gabriël Konat tagged stratego

On 13 September 2016 at 13:13 Gabriël Konat tagged improvement

On 13 September 2016 at 13:13 Gabriël Konat removed tag error
