#57 Lookup all visible variables (project NaBL on YellowGrass.org)

How can i get (1) all variables visible and (2) all variables visible having a particular property? By visible i mean visible at a particular spot in an AST. I need to be able to do this during collection and after evaluation.

My concrete example are GM deferred assignments without an explicit iterator references. There i need to lookup all visible variables that are iterated (that’s the property). I need to (1) do this during collect to display error messages and (2) post-evalution to add explicit iterator references.
Submitted by Vlad Vergu on 16 July 2013 at 19:50

question

On 16 July 2013 at 19:50 Vlad Vergu tagged !vvergu

On 16 July 2013 at 19:59 Guido Wachsmuth commented:

For now, you can construct a resolution task for this, basically in the same way as it is done for refers to Variable x of prop p clauses. Gabriël should be able to point out how this works. The error messages should then be tasks which depend on this resolution task. I don’t get the second point, but if it is about error messages, there should be message modifiers which can do this. Again, Gabriël should be able to explain this in detail.

On 16 July 2013 at 20:32 Vlad Vergu commented:

Ok. Gabriël was already helping with this, but i created an issue because i think it’s important to have something that easy to use and clear.

The (1) is for displaying error messages and (2) is for transformations post-analysis. Apparently the lookup would be different in the two cases.

On 16 July 2013 at 20:45 Guido Wachsmuth commented:

Ok, (1) will be part of the declarative constraints. I have already a sketch for this, but you need to wait for the end of the month to see this ending up in master. For (2), a property might be interesting. Do you have a definition site where you can associate the data with?

On 16 July 2013 at 20:53 Vlad Vergu commented:

Ok. For (1) Gabriël has a compromise implementation that might work until the end of the month. For (2) i do not have a definition site. The transformation rewrites an assignment statement to mention a reference to the only iterated variable in scope. For & Foreach statements already associate a iterated property to the variable they define. The only issue is looking this up.

On 16 July 2013 at 20:56 Guido Wachsmuth commented:

Can you sketch the transformation and the required analysis in an example?

On 16 July 2013 at 21:01 Gabriël Konat commented:

Actually a normal resolution task won’t do because it will always pick the first one that succeeds and you require a name. But we actually have a ResolveAllDefs task that takes a scope and namespace, and will retrieve all definitions in that scope and namespace. Then instead of choosing the task results are concatenated.

The ‘hard’ part is that we need to get the current URI at any node. During collection this information is available in uri* but this is not passed to type-of. If we add this many interfaces change.

On 16 July 2013 at 21:07 Guido Wachsmuth commented:

@Gabriël: Can you have a look on the prop-list branch, if the basic tasks there meet your needs better?

On 16 July 2013 at 21:20 Vlad Vergu commented:

@Guido:
Foreach(t : G.Nodes){
  ...  
  t.pg_rank <= 42;
  ...
}
Gets transformed to:
Foreach(t : G.Nodes){
  ...  
  t.pg_rank <= 42 @ t;
  ...
}
If there is only one definition visible of a variable which has the iterated property.

On 16 July 2013 at 22:40 Gabriël Konat commented:

@Guido: No, you actually removed the only task that does what we want.

On 16 July 2013 at 22:47 Guido Wachsmuth commented:

@Vlad: We might store this with the URI of the scope (see the annotated AST). But there is no API for this.
@Gabriël: Which was? Maybe I just renamed it.

On 16 July 2013 at 22:51 Vlad Vergu commented:

@Guido: i’m not sure what you mean. Currently all Variables have an isiterated property. The ones declared by For & Foreach have this property True() the others have it False():
  ForEach(var, iter, filter, body):
    defines
      Variable var
        of type t
        of graph g
        of isiterated True()
      // in filter, body
      where iter has type t
      where t has graph g

On 16 July 2013 at 22:57 Gabriël Konat commented:

@Guido: ResolveAllDefs, you commented it out.

On 17 July 2013 at 02:05 Gabriël Konat commented:

I think using a property will work for this. You can attach a property, say alliterators to any deferred assignment. If you define this property in your NaBL specification the collection algorithm will generate alliterators-of(|ctx) etc., and the collection will call this on every node in the tree. Then you have something like:
alliterators-of(|lang, ctx, uri*): 
  DeferredAssignment(...) -> defs-task
  with
      uri       := <lookup-uri(|lang, NablNsVariable())> uri*;
      defs-task := <nabl-resolve-all-defs-subtasks(|ctx, NablNsVariable(), [Prop(NablProp_isiterated(), True(), [])], All())> uri;
      <task-create-error-on-multiple(|ctx, defs-task, "Ambiguous deferring iterator (multiple available)")> exp1;
      <task-create-error-on-failure(|ctx, defs-task, "No surrounding iterator to defer to (none available)")> exp1
the DeferredAssignment will be annotated with the result of defs-task, which you can then use during desugaring or normalization. One problem here is that NaBL does not generate property-of with term arguments lang, ctx, uri* yet, but we can add that.

On 18 July 2013 at 22:22 Vlad Vergu commented:

RE: obtaining all variables during analysis.

The fix from https://yellowgrass.org/issue/NaBL/53 seems to break the solution we had, namely:

  type-of(|lang, ctx, uri*):
    |[ exp0 assignop exp1; ]| -> <fail>
    where
      <not(?Assign())> assignop
    with
      uri       := <lookup-uri(|lang, NablNsVariable())> uri*;
      defs-task := <nabl-resolve-all-defs-subtasks(|ctx, NablNsVariable(), [Prop(NablProp_isiterated(), True(), [])], All())> uri;
      <task-create-error-on-multiple(|ctx, defs-task, "Ambiguous deferring iterator (multiple available)")> exp1;
      <task-create-error-on-failure(|ctx, defs-task, "No surrounding iterator to defer to (none available)")> exp1

On 22 July 2013 at 23:07 Gabriël Konat commented:

Fixed in https://github.com/metaborg/runtime-libraries/commit/b58fd95bcbe64d935c2c826a78ff7c796a34486e. Has this issue been solved?

On 23 July 2013 at 00:35 Vlad Vergu commented:

@Gabriel: Thanks, that works. No, the issue is not solved. The issue of how to lookup variables after analysis (e.g. during compilation) still remains.

On 23 July 2013 at 00:40 Gabriël Konat commented:

My solution does not work then?

On 23 July 2013 at 00:41 Vlad Vergu commented:

Which solution?

On 23 July 2013 at 00:44 Vlad Vergu commented:

The issue is by all means not limited to iterators. Take an example from compilation. One needs to find all Graphs and for each one of the graphs find all declarations of Variables that are on the graph. This is not possible because (1) no URI is available and (2) many times one wants really all declarations of variables, regardless of scope. Really: give me all the variables in this procedure that have a graph property, wherever they may be.

On 23 July 2013 at 01:04 Guido Wachsmuth commented:

@Vlad: What you try to achieve here is tempting, but you should be aware that this was never considered this way. What you want is a data-driven compilation, where you query collected data and generate code from this. The classic setup was a AST-driven compilation, where you might collect some stuff in dynamic rules or pass-around arguments, but the generation is guided by the tree. The index is a very specialised data structure when it comes to queries. It is constructed to support common lookups in name analysis, not arbitrary queries.

I see the need for your proposed scheme, but I don’t see a chance to make a fundamental change there now. This would also be a quite uninformed decision. The plan is to revisit each phase after the other. We need to see what compilation schemes are useful and decide which of those we want to support.

On 23 July 2013 at 01:17 Vlad Vergu commented:

Indeed Guido. So i’ll try to stay down to eARTH in expectations. Even in the AST-driven approach there is the need to lookup whatever is in scope at a particular position, whether that position is a definition, a use or a nothing. This is not possible because there is no URI available. Of course this is possible if you move three steps earlier in the pipeline and add properties on everything. But this is one (1) cumbersome, (2) a violation of concerns, (3) bloat-ware. You know that what i’m really concerned with is (2); i shouldn’t have to touch name or type system to compile.

IMHO, what we really need is the ability to retrieve the URI of any AST position without knowing anything else. This will open the world of names/types to read access during compilation.

On 23 July 2013 at 01:24 Guido Wachsmuth commented:

But only named elements have URIs. Am I right that you like to know the URI of the surrounding scope of an AST node? You can achieve this by stopping traversal at scopes (named scopes have the URI on the name, anonymous scopes have an annotation with their URI) and passing the URI down to subtrees.

On 23 July 2013 at 01:30 Vlad Vergu commented:

I think you understood what i would like. Perhaps i’m using the URI term wrong, but i think of it as the unique address of a scope. Manual traversal and manual pushing of URI down to subtrees is what i try to avoid. I also try to avoid manually retrieving the URI from a term. We should avoid that the compilation phase understands (or needs to understand) how names and URIs work.

On 23 July 2013 at 01:37 Guido Wachsmuth commented:

For named scopes, there are strategies to first get the name and second get the URI from this name. This can be wrapped into an API to hide the details, which can also provide the URI of an anonymous scope.

The situation is different in inner nodes. What you would need there is something to walk up to the node that establishes the scope. I am not sure if there is basic support for this, but I remember discussion about how expensive this is/would be.

On 23 July 2013 at 02:14 Vlad Vergu commented:

I think this is part of a wider issue that we’ve seen before when working on the LWC submission. There we constantly had the need to navigate the AST using the named and typed references. Here we need the same, with the addition that we first need to discover the entry points.

This would be the ideal solution from my point of view:

Lookup variables matching properties at any place in the AST, automatically.

Navigate the AST using resolution to get the definition term of a name.

Traverse name links in both directions (use* -> def and def -> use*)

On 23 July 2013 at 02:37 Gabriël Konat commented:

Traversing up requires parent pointers, which we should avoid. Like Guido said, we can make an API that traverses the tree and keeps the ‘current URI’ available for you, this would be very similar to the collection phase.

For 1. you can either do this lookup beforehand and store it in a property as I said before, or if that does not work you can create new resolve tasks and execute them during compilation. I don’t understand point 2 completely, but if you want to get the term of a name you can store it as a property of the definition. For 3 we can add a mapping to the index that maps definitions to all their uses, we currently don’t do this to reduce space overhead.

On 23 July 2013 at 02:54 Vlad Vergu commented:

It does not require parent references, which we should avoid. It requires name resolution pointers.

1: I think neither of your approaches works because it requires creating a task for every term in the AST. That or annotating every term.
2: Why not do this automatically? The AST is no longer a tree after name and type analysis, why are we trying to pretend it is? It’s a graph and i think it should be navigable as such: uniformly.

I guess what i’m trying to say is:

I can make GM work with anything, also without anything.

Currently it’s easier to forget that name binding results exist and do manual resolution during compilation, which is what i do.

Name and type information is currently meta-AST information which i don’t think it should be. Either it’s all easy to query or it should be directly part of the graph.

A program is a directed graph, not a tree. The fact that i need a property to distinguish whether some variable is iterated or not is a smell, if i could get it’s defining term it would start smelling better. NaBL properties are data pieces attached to edges.

On 23 July 2013 at 03:18 Gabriël Konat commented:

Ah, now I understand what you mean with 2. Definitions have origin information so you can go from a use, to a def, to its originating term, but this term is the name of the definition for reference resolution reasons. Where should it actually point to? The parent of the name will not always work, and if we use the NaBL pattern as the term to point to we have a violation of concerns again, albeit a small one. We can either make this a different kind of attachment so we get the actual term object that it points to, or the AST of that term can be stored in the index, but it will be a copy.

For 1, why does it require a task for every term in the AST? Do you need to query for all variables on each AST node?

On 23 July 2013 at 03:25 Vlad Vergu commented:

@Gabriel:

We can either make this a different kind of attachment so we get the actual term object that it points to, or the AST of that term can be stored in the index, but it will be a copy.

I vote for a different attachment such that you don’t get back a copy.

On 23 July 2013 at 10:39 Guido Wachsmuth commented:

@Vlad: The wider issue will be addressed, but later. Compilation is a whole subtopic in the VICI. Given your list, I see the following problems:

Lookup variables matching properties at any place in the AST, automatically.

This can only be achieved with a generic compilation scheme. But this will not work in most cases, since your compilation strategy will interfere. Let’s take MiniJava as an example: to-jbc traverses the AST, thus, it cannot hook into a generic traversal that propagates scope URIs.

Navigate the AST using resolution to get the definition term of a name.
Traverse name links in both directions (use* -> def and def -> use*)

I read in later comments that you basically like to navigate a graph. This is very unlikely to happen and I am totally against it. At uses, you can currently access all relevant properties of the definition. This approach tries to avoid exactly your proposal. This is for a reason: When you switch from trees to graphs, the whole underlying paradigm changes.We would need AGraphs instead of ATerms and a graph transformation language instead of Stratego. And this is not about trees with additional edges, the whole underlying theory changes and would require a whole new design of the underlying infrastructure. Not going to happen, IMO. Even if you use tricks like attachments to navigate the tree as a graph, you will run into problems related to tree vs. graph.
BTW, such a navigation would still not help you, because you would navigate edges on names, that is, you will navigate from referring name to referred name. But typically you want to get to the enclosing AST. Parent pointers again.

What I do see is the need to query the index more flexible and that there should be a scope URI available at each node. But this is also a midterm effort. It requires additional ways to store data. The find all definitions scenario is a common use case we should support. But remember, we store things in hash tables, not in a relational database. The scope can become an annotation (or attachment). However, I predict that this will just cause different issues. Terms created during compilation will not have annotations, and I guess this is the cause for great pain.

The fact that i need a property to distinguish whether some variable is iterated or not is a smell, if i could get it’s defining term it would start smelling better. NaBL properties are data pieces attached to edges.

No, this is not a smell, but a standard concept. You find the same approach in attribute grammars which assign (and propagate) properties at AST nodes and in formal semantics, where environments map names to properties.

Finally, I really don’t see why a manual URI propagation does not work for you. I agree that name analysis should be transparent to compilation phases. But the results of name analysis should not. It is completely fine to think during compilation in terms of scopes and properties of these scopes (all variables in a scope is a property of that scope). And often scopes are a unit of compilation. In your case with a for loop, the external definition introduces small scopes. Why can’t you use them?

Currently it’s easier to forget that name binding results exist and do manual resolution during compilation, which is what i do.

Honestly, I don’t like this attitude. It is jeopardising the group effort. We can work on things you need. But we cannot change the whole game for one use case. Particular not when the new game is based on your mindset during this use case. I would really like to see you stay in the current scheme. This would be important to see

What is still needed?

Where are conceptual issues?

How can we address this systematically?

This is part of our research. What you currently do is assuming a future answer, without finding solutions in the current setup. Thus, please try to find solutions in the current setup. Work with name analysis results and see what needs to be done to work with them. If this requires boilerplate, write the boilerplate. If this requires unpleasant properties everywhere, write the properties. I know this is sometimes unpleasant and hacky, but it helps us to understand. We first need to see what is currently needed, before we can make decisions how to approach this. Currently you make the decisions and we do not see how it works in the current setup. When you decide to bypass this, the whole Greenmarl case study is a nice exercise for you, but without value for the group and the group research.

Long story short: We are interested in solutions with the current setup. We are interested in problems in these solutions. If this results in imperfect code, that is totally fine.

On 23 July 2013 at 18:44 Vlad Vergu commented:

Honestly, I don’t like this attitude. It is jeopardising the group effort. We can work on things you need. But we cannot change the whole game for one use case. Particular not when the new game is based on your mindset during this use case. I would really like to see you stay in the current scheme.

ok

This is part of our research. What you currently do is assuming a future answer, without finding solutions in the current setup. Thus, please try to find solutions in the current setup. Work with name analysis results and see what needs to be done to work with them. If this requires boilerplate, write the boilerplate. If this requires unpleasant properties everywhere, write the properties. I know this is sometimes unpleasant and hacky, but it helps us to understand. We first need to see what is currently needed, before we can make decisions how to approach this.

ok

What I was suggesting was not necessarily to have a graph, but to be able to navigate the tree using name resolution. This does introduce the issue of obtaining surrounding terms, but it is not necessary. Something that is in between worlds is to navigate scopes up and down, i.e. have scope pointers. This would allow one to navigate from a use site to a definition site to its surrounding scope and form there navigate scopes. Then from each scope go to a list of variable definitions in that scope. From the definition go to the use sites. It’s still THE tree but with an alternative navigation scheme to generic traversal and syntactic matching.

I’ve started on the compiler for GM and it begins with obtaining each graph and graph member definition in order to compile this separately. I was imagining this to be SELECT * FROM Variable as g JOIN Variable as v ON v.graph = g.graph WHERE v.graph != NIL. Yes, this falls into flexible querying, and i gave up hope. But between flexible queries and pure syntactic matching (which seems to work in GM but will not work in many other languages) there’s not much to go on.

Let’s use a concrete example, somewhat related to the issue at hand. Given a procedure we need to lookup all the variables g with type DGraphTy() + UGraphTy(). Then for each g we need to lookup variables v with the property g.graph = v. I do this with the following code (full source here):
  gm2j-extract-graphs(ast):
  	proc@|[ Proc name(args outargs) retype { sentences } ]| -> graphs-member*
    with
      graph*         := <collect-om(where(type-of-decl; type-is-graph))> proc;
      graphs-member* := <map(!(<id>, proc); gm2j-extract-graphs(ast))> graph*

  gm2j-extract-graphs(ast):
  	(graph-decl, proc) -> (<arg-to-decl> graph-decl, graph-member*)
  	with
  		graph          := <name-of-decl; graph-of> graph-decl;
  		graph-member*  := <collect-all(not(?graph-decl); where(type-of-decl; graph-of; ?graph); try(arg-to-decl))> proc
After having worked with NaBL, tasks and lookups it’s frustrating to begin compilation with collect-om. In particular, we gather graph*, the list of all graph declarations in the procedure without any use of the name information already there, because it’s easier. Once we have this we look for member declarations - graph-member* - of each graph. This is done similarly, but we actually do retrieve the graph property of each type of each variable we encounter. It works, but it’s ugly. IMO the reason it’s ugly is because we don’t care what the program looks like, i.e. we’re not traversing program structures. I think compilation of programs that do not have control flow and extraction of program meta-data from the others has always been quite painful with traversals. WebDSL does it with dynamic rules, today we have NaBL.

I’m quite against the idea of attaching a property or a URI to every node in the tree. I was doing something similar initially with the graph property of every variable. This was propagated to every expression and to every statement and it led to annotation galore which made the tree unworkable: it was big and it broke equality and matching in unexpected ways.

On 23 July 2013 at 19:10 Vlad Vergu commented:

I’m currently investigating whether compilation could be done as part of a single topdown traversal of the AST. Will report when i have an answer. A topdown compilation would be more natural to perform name lookups.

On 23 July 2013 at 20:02 Vlad Vergu commented:

Ok. I’m reporting it’s not feasible to do compilation of GM in a single traversal. It is possible but not feasible because it requires incremental rewriting of the ASTs in the target language. Back to needing to lookup variables.

On 24 July 2013 at 00:25 Guido Wachsmuth commented:

Ok, so what do we have. First, a navigation possibility, where you can access properties of definition sites at use sites. Second, a scope navigation facility by manipulating URIs. We can associate properties with scope URIs, if this helps. This can include all variables and will only affect scope nodes, not all nodes. Third, a possibility to get all variables in a scope. This might not work currently, but this should be there and will be there, because it is needed for code completion. Finally, you point out a need to find subscopes. This seems reasonable and can be achieved, I think. Maybe you can sketch the need for this in a separate issue.

I see the problem with annotations of properties, and identified this, just as you, as a weakness of our approach. A first solution which can make things easier would be a library of equality strategies, which ignore property annotations. Would this already help?

I see the point that you need a working version for Oracle and I see valuable insights in your work. We just need to find a way to get this out of your head. Can you sketch, maybe in separate issues, what you try to achieve? Then we can discuss best practices for typical problems.

For this issue, I totally agree that a “give me all namespace instances in a scope” should be supported. I think this should be related to a scope URI and that compilation can involve traversals which search for scope URIs. Would this be fine for you?

On 24 July 2013 at 01:14 Vlad Vergu commented:

Statement

I don’t actually need anything new currently. I’m making GM work with what we’ve got and I intend to continue this way. The way this is turning out is old-fashioned with the exception of (1) no dynamic rules and (2) name and property lookups. I need to demo a compiler which produces somewhat runnable programs for this Fri morning. After I get the GM compiler some half-way I intend to stop and shift attention slightly. My GM process is to half-implement every feature in the stack.

Another statement

I am interested in the problem of compilation. All compilers are ugly and i don’t think they need to. Wasteful traversals are pity. Template fragmentation is something particularly annoying in both design and maintenance. I’m willing to support the evolution of our approaches by trying different approaches in GM; just not for this Fri.

Wish list

My wish list for navigation is short:

Navigate the AST by references, definitions, scopes. We can call it: name-bound AST shortcuts or name-driven traversal.

I’ve made a separate issue for this: https://yellowgrass.org/issue/NaBL/66

I think this is useful for compilation/transformation that is not localised in the AST. The use case in GM is only for gathering and emitting the graph structures, slightly too weak to justify a full on approach to this. I can picture more of the compiler benefiting from this though.

Annotations

The annotation issue got me so confused a few weeks ago that i don’t remember how I worked around it or how big of an issue it was. So far i have not encountered any related issues in compilation. Since i’m no longer encountering the issue i think we should avoid “featurizing” the issue into a library.

Lookups

I think the actual issue is getting the scope URI at a random position in the AST. Once one has this it’s possible (Gabriël had something) to look up all visible variables. This would be looking outwards in a scope. Something else would be looking inwards in a scope: give me all definitions (with some property) that are somewhere in this scope (indirectly & directly).

I think the actual initial issue from the original post has been (1) partially solved and (2) partially worked around. I propose we close this issue w/o requiring any additional implementation and use the comments above to spawn other issues.

On 24 July 2013 at 01:30 Guido Wachsmuth commented:

I still think an issue here is that you have a particular compilation scheme in mind, without trying other compilation schemes before. The whole ugliness might be hidden in generated code, just as it is now for name analysis code. As for annotations, this needs to become a feature, because your compiler is not the only one built.

I am totally fine with you going for the Friday deadline any way you want. I just like you to think of decisions such as “Currently it’s easier to forget that name binding results exist and do manual resolution during compilation” in a bigger context and discuss this with us, before you follow this approach. This is because we rely on your insights. And your research also does. Analysis of a problem is as important as proposing a solution. Your comments here proposed a solution. I am still missing an analysis of the issue in the current scheme. Part of this is a missing execution of the current scheme for GM. This is at least my impression. Maybe I am wrong.

On 24 July 2013 at 01:59 Vlad Vergu removed tag !vvergu

Log in to post comments

Lookup all visible variables

Issue Log

Statement

Another statement

Wish list

Annotations

Lookups