Generic term deconstruction and construction:

  1. behave differently in the compiler and the interpreter
  2. aren’t symmetric in the compiler

with respect to escaping / un-escaping of special characters.

In short, the issue is that:

  1. the interpreter doesn’t bother with any escaping or un-escaping, while the compiler does.
  2. the compiler un-escapes \n\t\b\f\r\\\'\", while it escapes only backslash, double-quote, \n and \r.

Whether generic term construction should be exactly reversible using generic term deconstruction is debatable, but at the very least the different behavior between compiler and interpreter is a bug.

Here are my observations of the current behavior:

Generic term deconstruction (as used in explode-aterm)

Compiler

  • compiler uses SSL_get_constructor and SSL_get_arguments
  • SSL_get_constructor uses env.setCurrent(factory.makeString(current.toString()));
  • current.toString invokes StrategoString.writeAsString
  • StrategoString.writeAsString double-quotes and escapes backslash, double-quote, \n and \r

Interpreter

  • interpreter uses Match.getTermConstructor and Match.getTermArguments
  • Match.getTermConstructor returns env.getFactory().makeString("\"" + ((IStrategoString)t).stringValue() + "\"");
  • No escaping!
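
A minimal Java sketch of the two deconstruction behaviors observed above. The class and method names are made up for illustration; this models only the quoting and escaping rules listed in the bullets and is not the actual runtime code.

```java
// Sketch only: mimics the two behaviors described above, not the real Stratego runtime.
final class DeconstructSketch {

    // Compiler-style: wrap in double quotes and escape backslash, double-quote, \n and \r.
    static String compilerStyle(String value) {
        StringBuilder out = new StringBuilder("\"");
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '\\': out.append("\\\\"); break;
                case '"':  out.append("\\\""); break;
                case '\n': out.append("\\n");  break;
                case '\r': out.append("\\r");  break;
                default:   out.append(c);      // tabs and everything else pass through raw
            }
        }
        return out.append('"').toString();
    }

    // Interpreter-style: wrap in double quotes, no escaping at all.
    static String interpreterStyle(String value) {
        return "\"" + value + "\"";
    }

    public static void main(String[] args) {
        String value = ".\t.\n.";  // dot, real tab, dot, real newline, dot
        System.out.println(compilerStyle(value).length());     // 8: the newline became two characters
        System.out.println(interpreterStyle(value).length());  // 7: only the quotes were added
    }
}
```

For the same input the two results differ as soon as the string contains a backslash, double-quote, newline or carriage return, which is the compiler/interpreter mismatch reported here.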

Generic term construction (as used in implode-aterm)

Compiler

  • compiler uses SSL_mkterm
  • SSL_mkterm fails if string does not start with a double-quote
  • SSL_mkterm uses env.setCurrent(env.getFactory().parseFromString(value + "\""));
  • parseFromString invokes(?) TAFTermReader.parseString
  • TAFTermReader.parseString un-escapes \n\t\b\f\r\\\'\" and throws on \0\1\2\3\4\5\6\7\8\9
  • TAFTermReader.parseString throws if string does not end with a double-quote (which is probably why the double-quote is added to the end of the string before parseFromString is called)
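
The un-escaping rules listed above can be sketched roughly as follows. This is an illustrative model with made-up names, not the TAFTermReader source. How the real parser treats escapes other than the listed ones, or an unescaped double-quote in the middle of the string, is not covered by the observations above, so those branches are assumptions and are marked as such in the code.

```java
// Sketch only: models the un-escaping described above, not the real TAFTermReader code.
final class ParseStringSketch {

    // Parses a double-quoted string literal and returns its un-escaped content.
    static String parseString(String literal) {
        if (literal.isEmpty() || literal.charAt(0) != '"') {
            throw new IllegalArgumentException("string must start with a double-quote");
        }
        StringBuilder out = new StringBuilder();
        int i = 1;
        while (i < literal.length()) {
            char c = literal.charAt(i++);
            if (c == '"') {
                if (i != literal.length()) {
                    // Assumption: an unescaped quote before the end is an error.
                    throw new IllegalArgumentException("unescaped double-quote inside string");
                }
                return out.toString();
            }
            if (c != '\\') {
                out.append(c);
                continue;
            }
            if (i == literal.length()) {
                break;  // dangling backslash: falls through to the missing-quote error
            }
            char e = literal.charAt(i++);
            switch (e) {
                case 'n':  out.append('\n'); break;
                case 't':  out.append('\t'); break;
                case 'b':  out.append('\b'); break;
                case 'f':  out.append('\f'); break;
                case 'r':  out.append('\r'); break;
                case '\\': out.append('\\'); break;
                case '\'': out.append('\''); break;
                case '"':  out.append('"');  break;
                default:
                    if (e >= '0' && e <= '9') {
                        throw new IllegalArgumentException("numeric escapes are rejected");
                    }
                    out.append('\\').append(e);  // Assumption: unknown escapes are kept verbatim.
            }
        }
        throw new IllegalArgumentException("string must end with a double-quote");
    }

    public static void main(String[] args) {
        String literal = "\".\\t.\\n.\"";  // the 9 characters ".\t.\n." with literal backslashes
        System.out.println(parseString(literal).length());  // 5: \t and \n became single characters
    }
}
```

Note that the two-character sequence \t is turned into a real tab here, while the compiler-side deconstruction only re-escapes backslash, double-quote, \n and \r. Constructing a term from "a\tb" and deconstructing it again would therefore not give back the original text, which is the asymmetry described at the top of this report.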

Interpreter

  • interpreter uses Build.doBuildExplode
  • Build.doBuildExplode un-double-quotes and makes a string iff the passed cons starts with a double-quote
  • Build.doBuildExplode makes an appl iff the passed cons does not start with a double-quote
  • No un-escaping!
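
For comparison, a rough Java sketch of the interpreter-side decision described above. The `Term` class and all names are hypothetical stand-ins, and the exact way the real Build.doBuildExplode strips the quotes is an assumption.

```java
// Sketch only: models the decision described above, not the real Build.doBuildExplode code.
final class BuildExplodeSketch {

    // Hypothetical stand-in for a term: either a string or a constructor application.
    static final class Term {
        final boolean isString;
        final String text;

        Term(boolean isString, String text) {
            this.isString = isString;
            this.text = text;
        }
    }

    static Term build(String cons) {
        if (cons.startsWith("\"")) {
            // Assumption: the closing quote is dropped only when it is actually there.
            boolean closed = cons.length() > 1 && cons.endsWith("\"");
            int end = closed ? cons.length() - 1 : cons.length();
            // No un-escaping: backslash sequences stay exactly as written.
            return new Term(true, cons.substring(1, end));
        }
        // Anything not starting with a double-quote is treated as a constructor name.
        return new Term(false, cons);
    }

    public static void main(String[] args) {
        Term s = build("\".\\t.\\n.\"");  // ".\t.\n." with literal backslashes
        System.out.println(s.isString + " " + s.text.length());  // true 7: \t and \n are still two characters each
        Term a = build("Foo");
        System.out.println(a.isString + " " + a.text);           // false Foo
    }
}
```

Given the same quoted input, the compiler-side path described earlier would yield a 5-character string with a real tab and newline, whereas this path keeps the 7 characters unchanged.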

Example

And here is a little bit of test code:


// interpreter | compiler
<debug(!"Test generic deconstruction 1: ")> <?#()> ".\t.\n."; // adds quotes | adds quotes + escapes
<debug(!"Test generic deconstruction 2: ")> <?#()> "\".\t.\n.\""; // adds quotes | adds quotes + escapes
<debug(!"Test generic construction 1: ")> <!#([])> ".\t.\n."; // nothing | replaces by ()
<debug(!"Test generic construction 2: ")> <!#([])> "\".\t.\n.\""; // removes quotes | removes quotes + unescapes

One place where this issue manifests itself (at least, I think it is the root cause) is the testing language.
Note the following inconsistency:


// this test succeeds
test << >> // tab char!
parse to Template([Layout(" ")]) // tab char!

// this test fails
test << >> // tab char!
parse to Template([Layout("\t")])

// but this test fails(!)
test [[ <<

]]
parse to Template([Newline("
")])

// and this test succeeds(!)
test [[ <<

]]
parse to Template([Newline("\n")])

Submitted by Tobi Vollebregt on 28 September 2011 at 15:07

On 28 September 2011 at 15:08 Tobi Vollebregt tagged stratego
