Property annotations make analyses and transformations fragile (2)
Currently, we annotate properties to terms. This makes every transformation involving equality fragile. The current strategy is either to use a less strict equality strategy or to annotate everything before checking equality. This is tedious and tends to break everywhere.
How can we avoid these annotations and keep terms clean? URI annotations are fine, because they distinguish different things. All others are not.
Can we cache properties of terms without annotating terms, maybe in the index? How expensive is this caching? It should take less memory than annotating it to the tree. What is the runtime overhead? Related to that: How expensive is calculating a hash for a term? How likely are collisions? Is there a way to handle collisions and still store and lookup properties efficiently?
Submitted by Guido Wachsmuth on 13 July 2013 at 01:15
Issue Log
Caching a term is O(n) where n is the number of subterms for that term. Caching leafs like a string is constant, caching applications or lists is linear. Collisions are likely as we experienced in the task engine. In Java, if you properly implement hashCode and equals, then a HashMap will take care of collisions automatically. However in the task engine we used the cached value (always an int in Java) as a pointer, which does not take care of collisions.
Storing properties of arbitrary terms in the index (or any Map like datastructure) is not possible, since a
Term()
may have annotations a* in position 1 in the AST, and annotations b* in another position in the AST, but both will have the same hashCode and equals results since the position ofTerm()
in the AST is not taken into account. If you add this position information then you can distinguish, but if you then transform the tree the positions will change.One way of hiding the properties is to store them as attachments instead of annotations, they can still be set and retrieved from nodes like annotations, but are not visible in the tree or equality any more.
Just a remark: I was assuming the same term with the same URI annotations would have the same properties at any position in the AST. But this might be indeed not the case.
Right, for example a VarRef(“a”) may have different types in different scopes.
But this would have either different URI annotations (which should survive, because they really distinguish terms) or should have the same type. But there might be properties which are propagated downwards, while most types are propagated upwards.
This is an example which caused some frustration. I’m trying to match both a definition and a use of a variable at the same time.
gm-to-gmcore: |[ Foreach(name3 : itersrc) { exp0 <= exp1 @ name3; } ]| -> ...
In the above example, note the
name3
both in the definition and use positions. I would expect this to match things like:Foreach(n : G.Nodes) { n.p <= 42 @ n; }
However this is only the case on non-annotated ASTs. Analysed ASTs will annotate
name3
on the definition site andname3
on the use sites and they will no longer match as same variable. For me the solution was ugly:gm-to-gmcore: |[ Foreach(name3 : itersrc) { exp0 <= exp1 @ name31; } ]| -> <fail> where name3-def := <nabl-collect-one-resolved-def> name3; name31-def := <nabl-collect-one-resolved-def> name31; <eq> (name3-def, name31-def)
The purpose of the
where
clause is to no longer compare the strings for equality but instead compare the definitions. I obtained the definition from the definition and definition from the use site and compare the two definitions for structural equality.In my case this works, but in other languages it may have different semantics. I’m thinking that for anything with aliases or scoped renames/imports this will not have the desired effect as it will go directly to the definition instead of performing a lexical comparison.
Note that generally you want to compare definitions anyway, since even if both uses have the same name, they could resolve to different definitions. I guess in this specific case you can be sure that they both refer to the same thing because the fragment is so small.
The best alternative that I can think of right now is to use attachments, any other ideas?
What is the purpose of your example anyway, Vlad? Why does the rule fail?
Comment removed
I’ve updated the post above
Completely fine, Vlad. I just like to get what you try to achieve. Because one solution is to support this systematically.
I think we have two problems here. First, we like to compare defs with uses without caring about what they are, as long as they have the same URI. Thus, a use should be the same as a def, if it resolves to this def. There are two variants here: First, it resolves only to this def. Second, it resolves to this def, but might also resolve to other defs. Second, we want to compare annotated terms modulo property annotations, but not modulo URI annotations.
Both can be supported by a comparison library.
Ok, so fixing https://yellowgrass.org/issue/NaBL/44 would give you easy comparisons between definitions and uses by comparing their URI’s.
And property annotations can be switched to attachments with a library to read/write those attachments?
Log in to post comments