Currently, we annotate properties to terms. This makes every transformation involving equality fragile. The current strategy is either to use a less strict equality strategy or to annotate everything before checking equality. This is tedious and tends to break everywhere.

How can we avoid these annotations and keep terms clean? URI annotations are fine, because they distinguish different things. All others are not.

Can we cache properties of terms without annotating terms, maybe in the index? How expensive is this caching? It should take less memory than annotating it to the tree. What is the runtime overhead? Related to that: How expensive is calculating a hash for a term? How likely are collisions? Is there a way to handle collisions and still store and lookup properties efficiently?

Submitted by Guido Wachsmuth on 13 July 2013 at 01:15

On 13 July 2013 at 01:15 Guido Wachsmuth tagged @gohla

On 13 July 2013 at 01:15 Guido Wachsmuth tagged !guwac

On 13 July 2013 at 01:16 Guido Wachsmuth tagged rfc

On 13 July 2013 at 01:28 Gabriël Konat commented:

Caching a term is O(n) where n is the number of subterms for that term. Caching leafs like a string is constant, caching applications or lists is linear. Collisions are likely as we experienced in the task engine. In Java, if you properly implement hashCode and equals, then a HashMap will take care of collisions automatically. However in the task engine we used the cached value (always an int in Java) as a pointer, which does not take care of collisions.

Storing properties of arbitrary terms in the index (or any Map like datastructure) is not possible, since a Term() may have annotations a* in position 1 in the AST, and annotations b* in another position in the AST, but both will have the same hashCode and equals results since the position of Term() in the AST is not taken into account. If you add this position information then you can distinguish, but if you then transform the tree the positions will change.

One way of hiding the properties is to store them as attachments instead of annotations, they can still be set and retrieved from nodes like annotations, but are not visible in the tree or equality any more.


On 13 July 2013 at 01:42 Guido Wachsmuth commented:

Just a remark: I was assuming the same term with the same URI annotations would have the same properties at any position in the AST. But this might be indeed not the case.


On 13 July 2013 at 01:48 Gabriël Konat commented:

Right, for example a VarRef(“a”) may have different types in different scopes.


On 13 July 2013 at 01:57 Guido Wachsmuth commented:

But this would have either different URI annotations (which should survive, because they really distinguish terms) or should have the same type. But there might be properties which are propagated downwards, while most types are propagated upwards.


On 18 July 2013 at 18:40 Vlad Vergu commented:

This is an example which caused some frustration. I’m trying to match both a definition and a use of a variable at the same time.

  gm-to-gmcore:
    |[
      Foreach(name3 : itersrc) {
        exp0 <= exp1 @ name3;
      }
    ]| -> ...

In the above example, note the name3 both in the definition and use positions. I would expect this to match things like:

  Foreach(n : G.Nodes) {
  	n.p <= 42 @ n;
  }

However this is only the case on non-annotated ASTs. Analysed ASTs will annotate name3 on the definition site and name3 on the use sites and they will no longer match as same variable. For me the solution was ugly:

  gm-to-gmcore:
    |[
      Foreach(name3 : itersrc) {
        exp0 <= exp1 @ name31;
      }
    ]| -> <fail>
    where
      name3-def := <nabl-collect-one-resolved-def> name3;
      name31-def := <nabl-collect-one-resolved-def> name31;
      <eq> (name3-def, name31-def)

The purpose of the where clause is to no longer compare the strings for equality but instead compare the definitions. I obtained the definition from the definition and definition from the use site and compare the two definitions for structural equality.

In my case this works, but in other languages it may have different semantics. I’m thinking that for anything with aliases or scoped renames/imports this will not have the desired effect as it will go directly to the definition instead of performing a lexical comparison.


On 18 July 2013 at 18:41 Vlad Vergu tagged !vvergu

On 18 July 2013 at 18:59 Gabriël Konat commented:

Note that generally you want to compare definitions anyway, since even if both uses have the same name, they could resolve to different definitions. I guess in this specific case you can be sure that they both refer to the same thing because the fragment is so small.

The best alternative that I can think of right now is to use attachments, any other ideas?


On 18 July 2013 at 19:02 Guido Wachsmuth commented:

What is the purpose of your example anyway, Vlad? Why does the rule fail?


On 18 July 2013 at 19:04 Vlad Vergu commented:

Comment removed
I’ve updated the post above


On 18 July 2013 at 19:09 Guido Wachsmuth commented:

Completely fine, Vlad. I just like to get what you try to achieve. Because one solution is to support this systematically.


On 24 July 2013 at 00:35 Guido Wachsmuth commented:

I think we have two problems here. First, we like to compare defs with uses without caring about what they are, as long as they have the same URI. Thus, a use should be the same as a def, if it resolves to this def. There are two variants here: First, it resolves only to this def. Second, it resolves to this def, but might also resolve to other defs. Second, we want to compare annotated terms modulo property annotations, but not modulo URI annotations.
Both can be supported by a comparison library.


On 26 July 2013 at 22:33 Gabriël Konat commented:

Ok, so fixing https://yellowgrass.org/issue/NaBL/44 would give you easy comparisons between definitions and uses by comparing their URI’s.

And property annotations can be switched to attachments with a library to read/write those attachments?

Log in to post comments