#771 Automated testing reference error (project SpoofaxLegacy on YellowGrass.org)

In the current Spoofax nightly build an error exists with respect to the automated reference testing. Spoofax version: Spoofax/IMP 1.2.0.0-s40785

Take the following testfile https://gist.github.com/rogierslag/7373320

The first test fails, since the second ‘x’ does not resolve to the first ‘x’, while the second test does works. It seems there is an issue when a parameter is used in a statement context; it works perfectly when it is referenced inside an expression.
Submitted by Rogier Slag on 11 November 2013 at 15:25

error1.2lab

On 11 November 2013 at 16:17 Guido Wachsmuth commented:

There are two important observations:

The test case works on the left-hand side as well, if you add a local variable to the method.

The error message indicates, that the AST associated with [[x]] is wrong. It seems to be the whole method AST.

Looks like an origin tracking issue to me.

On 11 November 2013 at 16:17 Guido Wachsmuth tagged lab

On 12 November 2013 at 14:27 Mick commented:

While working on the name resolution only I found that introducing a statement makes the test succeed:
test reference to parameter with equally named field available [[
class A {
	int i;
	public int m(int [[i]]) {
		System.out.println(0); // makes the test succeed
		[[i]] = 0;
		return 0;
	}
}
]] resolve #2 to #1

On 15 January 2014 at 20:44 Volker commented:

Some more background info which might help solve this issue:

Consider the following test suite:
module names-classes

language MiniJava
start symbol Start

setup header [[
        class Main {
                public static void main(String[] args) {
                        System.out.println(1);
                }
        }
]]

test field name resolution in array assignment [[
        class Foobar {
        int[] [[x]];
        public int method() {
                [[x]][0] = 3;
                return 1;
        }
        }
]] resolve #2 to #1
This test case has the same issue. The issue is related to how SPT tokenizes the fragments and how the SelectionFetcher.fetch() method deals with quotes that have more than 1 token inside.
Let’s take a closer look at this. The SPT file is parsed to an SPT AST, then the fragment (the minijava code in this case) is parsed to an AST and added as annotation to the SPT AST. Then the selections (or markers or quotes) are ‘fetched’ from the minijava AST and also appended as annotation to the SPT AST.

The error that causes this issue happens in the fetching of the selections. This fetching happens in the SelectionFetcher.fetch(parsedFragment) method. The parsedFragment for the test case in the example above is:
Program(MainClass("Main","args",Print(IntValue("1"))),
[Class("Foobar",None,[Field(IntArray,"x")],[
    Method(
        Int, /* return type of method  */
        "method", /* method name */
        [], /* parameters */
        [], /* local variable declarations */
        [ArrayAssign(VarRef("x"),IntValue("0"),IntValue("3"))], /* expressions */
        IntValue("1") /* return expression */
    )
])])
This parsedFragment has an ImploderAttachment, linking this minijava AST to the actual SPT file’s contents.
The SPT file’s content are tokenized and the SelectionFetcher finds a selection by looking at the tokens directly to the left and right of each term.
Let’s look at how the SPT example above get’s tokenized:
Note that each <<x>> is a token, where x is the actual token and << and >> are just delimiters I added to allow you to identify ‘empty’ tokens (i.e. <<>>).
<<>><<>><<module>><< >><<names-classes>><<\n>>
<<\n>>
<<language>><< >><<MiniJava>><<\n>>
<<start>><< >><<symbol>><< >><<Start>><<\n>>
<<\n>>
<<setup>><< >><<header >><<[[>><<       >><<class>><< >><<Main>><< >><<{>><<\n>>
<<              >><<public>><< >><<static>><< >><<void>><< >><<main>><<(>><<String>><<[>><<]>><< >><<args>><<)>><< >><<{>><<\n>>
<<                      >><<System>><<.>><<out>><<.>><<println>><<(>><<1>><<)>><<;>><<\n>>
<<              >><<}>><<\n>>
<<      >><<}>><<\n>>
<<]]>><<\n>>
<<\n>>
<<test>><< >><<field name resolution in array assignment >><<[[>><<     >><<class>><< >><<Foobar>><< >><<>><<{>><<\n>>
<<      >><<int>><<[>><<]>><< [[>><<x>><<]]>><<;>><<\n>>
<<      >><<public>><< >><<int>><< >><<method>><<(>><<>><<)>><< >><<{>><<\n>>
<<                      [[>><<>><<x>><<]]>><<[>><<0>><<]>><< >><<=>><< >><<3>><<;>><<\n>>
<<                      >><<return>><< >><<1>><<;>><<\n>>
<<      >><<}>><<\n>>
<<      >><<}>><<\n>>
As I said before, the SelectionFetcher recognises selections by looking at the tokens to the left and right. If the token on the left contains a [ and the token on the right contains a ] the current term is added to the list of selections.
Let’s look at how this works in the example: the first selection corresponding to int[] [[x]]; is tokenized as << >><<int>><<[>><<]>><< [[>><<x>><<]]>><<;>><<\n>>.
Note that the <<x>> token has a << [[>> token to the left and a <<]]>> token to the right and is thus recognised as a selection in this step.

If the left token contains a [ but the right token does not contain a ] the current term is remembered. As soon as a term with a ] to the right is found, the nearest common ancestor of both the remembered term and this current term is added as a selection.
Let’s look at how this works in the example: the second selection corresponding to [[x]][0] = 3; is tokenized as << [[>><<>><<x>><<]]>><<[>><<0>><<]>><< >><<=>><< >><<3>><<;>><<\n>>. Note that the <<x>> token now has an empty token <<>> to the left and a closing <<]]>> token on the right. The empty token corresponds to the empty list of VarDecls in the minijava method. The SPT parser finds no VarDecls and the VarDecl list is tokenized as an empty token, but SPT has no way of knowing we want the selection quote to come after this empty VarDecl list, and instead tokenizes the [[ together with the layout.

So the SelectionFetcher find the empty token <<>> with a token containing [ to the left, but no token with ] on the right. So as described above, it will remember the empty VarDecl term (i.e. the [] from the minijava AST). It then moves on to the next token, which is the <<x>> and has a token containing a ] to the right. It now adds the nearest common ancestor of the x term and the remembered VarDecl term []. As you can see in the MiniJava AST this corresponds to the Method(...) term.

Now that we know what the selections are, we can trivially see why the test case fails: Method(...) does not resolve to the int array field x.

So how can we solve this?

We can change how the file gets tokenized, to avoid situations where more than 1 term is inside a selection.

We can change how the selections are extracted from the SPT file, finding a proper way to handle multiple terms inside a selection.

We can change where the selections are extracted from. Instead of working with the tokenized SPT file, we could work with the SPT AST, where selections are nicely located inside Quotepart(term) terms.

I have no clue how to do 1 or what a ‘proper way’ is to handle multiple terms inside a selection. So my suggestion would be to go for 3.
But feel free to come up with better ideas, as this is not my area of expertise.

I hope this bug breakdown was useful and I am willing to help solve this, as I am somewhat familiar with the SPT code.

On 11 February 2014 at 14:32 Gabriël Konat commented:

Thank you for your thorough explanation Volker, this made it a lot easier to figure out a solution!

This issue was fixed in https://github.com/metaborg/spt/compare/d6f36a183647ad79c6eded6f5aaf50401eeb547e…38a3d119b4. I indeed went with option 3, instead of using the tokenizer to find marked terms, the QuotePart(str) is used instead. The term inside the QuotePart is just a string however, not an AST. For resolution tests the analysed term is needed, so the string is not enough. To get the analysed term I use the origin location of the string to find the innermost analysed term with the same origin, and use that to execute the resolution service.

Unfortunately the retokenization in SpoofaxTestingJSGLR.java messes up the origins of the marked strings. To work around this the marked strings had to be cloned so that their origins stayed untouched. The cloning function is now in SpoofaxTestingJSGLR.java, but this should probably a be a feature of the ITermFactory in the future.

On 11 February 2014 at 14:32 Gabriël Konat closed this issue.

On 28 July 2014 at 22:42 Gabriël Konat tagged 1.2

On 8 October 2015 at 15:33 Martijn commented:

For documentation purposes: apparently this issue still exists. I applied a quick ’n dirty fix that ignores empty IStrategoLists within a selection. This solves the problem in the specific case of an empty list of VarDecl (like Volker described), but it does not solve the problem in general. It’s also dirty code, so it should only be used as a workaround until something better is available.

https://github.com/MartijnDwars/spt/commit/d54087470658bcdd3fb809506ad3c8668612541d

On 8 October 2015 at 15:44 Guido Wachsmuth commented:

There are some conceptual issues in the current approach.

A selection can currently select an AST node which covers more text than the selection. This should never happen.

It is unclear what you actually can select. The assumption here seems to be single AST nodes. SPT should give an error, if you select something different.

Similar to selection in Spoofax editors, there can be more than one AST node representing the selected test. There is no general rule if you should pick the outermost or innermost node here, different applications require different approaches.

A cleaner approach can be to get the origin of the selection in the SPT AST, and find all nodes with the same origin in the object language AST. If there is no such node, SPT should give an error. Apparently, this is what Gabriël did in his fix, more or less. But this should be discussed in more detail with Volker.

Log in to post comments

Automated testing reference error

Issue Log