Hi, I'm trying to add Javadoc-style comments to my language, but I'm getting some weird behavior.
I've narrowed it down to a simple grammar that shows the glitch:
context-free syntax
	Module*					-> Start {cons("ModuleList")}
	"start" Sentence* "end"	-> Module {cons("Module")}
	LATEX					-> Sentence {cons("Latex")}

lexical syntax
	[\ \t\n\r] -> LAYOUT
	[\*]								-> CommentChar
	-> EOF
	"/*"  (~[\*] | CommentChar)* "*/"	-> LAYOUT
	"/**" (~[\*] | CommentChar)* "*/"	-> LATEX
	"//"  ~[\n\r]* ([\n\r] | EOF)		-> LAYOUT
	"//*" ~[\n\r]* ([\n\r] | EOF)		-> LATEX

lexical restrictions
	CommentChar   -/- [\/]
	EOF  -/- ~[]
	"/*" -/- [\*]
	"//" -/- [\*]

context-free restrictions
	LAYOUT? -/- [\ \t\n\r]
	LAYOUT? -/- [\/].[\/].~[\*]
	LAYOUT? -/- [\/].[\*].~[\*]

This is a sample text:
start
  /* 1 */
end
start
  /** 2 */
end
start
  // 3
end
start
  //* 4
end

And the AST for this:
ModuleList(
  [ amb([Module([]), Module([])])
  , Module([Latex("/** 2 */")])
  , Module([])
  , Module([])
  ]
)

The basic effect is that a comment of type 1 generates an ambiguity (one more for each occurrence),
although it shouldn't, because I have follow restrictions on LAYOUT.
Comments of types 2 and 3 are parsed correctly.
Comments of type 4 are even more surprising: they generate a parse error.

I hope I'm doing this correctly. There may be a special way of dealing with this kind of comment.
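For reference, the outcome the grammar above presumably aims for can be sketched outside SDF. The following is a small hand-written scanner in Python, not SGLR and not part of Spoofax; the names `TOKEN` and `parse` are made up for this illustration. It treats `/* */` and `//` comments as layout and keeps `/** */` and `//*` comments as sentences of the enclosing module.

```python
import re

# Doc-comment forms are listed first, so "/**" and "//*" are never read as
# plain comments (the role the follow restrictions play in the grammar).
TOKEN = re.compile(
    r"""
    (?P<latex_block>/\*\*.*?\*/)   # /** ... */  doc comment, kept as a sentence
  | (?P<latex_line>//\*[^\n\r]*)   # //* ...     doc comment, kept as a sentence
  | (?P<block>/\*.*?\*/)           # /* ... */   plain comment, layout
  | (?P<line>//[^\n\r]*)           # // ...      plain comment, layout
  | (?P<keyword>start|end)
  | (?P<space>[\ \t\n\r]+)
    """,
    re.VERBOSE | re.DOTALL,
)

def parse(text):
    """Return one list of doc comments per start ... end module."""
    modules, current, pos = [], None, 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected input at offset {pos}")
        kind, pos = m.lastgroup, m.end()
        if kind == "keyword" and m.group() == "start":
            if current is not None:
                raise ValueError("nested 'start'")
            current = []
        elif kind == "keyword":  # "end"
            if current is None:
                raise ValueError("'end' without 'start'")
            modules.append(current)
            current = None
        elif kind.startswith("latex"):
            if current is None:
                raise ValueError("doc comment outside a module")
            current.append(m.group())
        # plain comments and whitespace are layout and are skipped
    return modules

SAMPLE = (
    "start\n  /* 1 */\nend\n"
    "start\n  /** 2 */\nend\n"
    "start\n  // 3\nend\n"
    "start\n  //* 4\nend\n"
)
print(parse(SAMPLE))  # [[], ['/** 2 */'], [], ['//* 4']]
```

Under that reading, the sample text should yield four unambiguous modules, with the second and fourth each holding one Latex sentence.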
Submitted by Radu Mereuta on 4 October 2011 at 22:15

On 5 October 2011 at 00:55 Vlad Vergu commented:

Hi Radu,

Would this work?


context-free syntax
Module* -> Start {cons("ModuleList")}
%% "start" Sentence* "end" -> Module {cons("Module")}
LATEX -> Sentence {cons("Latex")}

context-free priorities
{
"start" "end" -> Module {cons("Module")}
}>
{
"start" Sentence+ "end" -> Module {cons("Module")}
}
}

lexical syntax
    [\ \t\n\r] -> LAYOUT
    [\*]                                -> CommentChar
    %% "*" -> CommentChar
    
    -> EOF
    "/*"  (~[\*] | CommentChar)* "*/"   -> LAY
    "/**" (~[\*] | CommentChar)* "*/"   -> LATEX
    "//"  ~[\n\r]* ([\n\r] | EOF)       -> LAY
    "//*" ~[\n\r]*       		-> LATEX

    LAY -> LAYOUT

lexical restrictions
    CommentChar   -/- [\/]
    EOF  -/- ~[]
    "/*" -/- [\*]
    "//" -/- [\*]
    LAY -/- [\/].[\/].~[\*]
    LAY -/- [\/].[\*].~[\*]
    
context-free restrictions
    LAYOUT? -/- [\ \t\n\r]
    %% LAYOUT? -/- [\/].[\/].~[\*]
    %% LAYOUT? -/- [\/].[\*].~[\*]


On 5 October 2011 at 11:47 Lennart Kats tagged needsinfo

On 6 October 2011 at 09:48 Maartje commented:


Another solution could be to use the built-in Stratego primitives to extract the comments and then
inspect (or parse) the returned strings to extract the Javadoc info.

//Returns succeeding comments that attach to this node (heuristically determined)
origin-comments-after = origin-support-sublist(prim("SSL_EXT_origin_comments_after", ))

//Returns preceding comments that attach to this node (heuristically determined)
origin-comments-before = origin-support-sublist(prim("SSL_EXT_origin_comments_before", ))

//Extracts all block comments (see regex in MyLang-Syntax.esv)
//between the previous sibling and the current node,
//and all line comments between the current node and the next sibling (see lib/editor-common.generated)
origin-surrounding-comments = prim("SSL_EXT_origin_surrounding_comments", "My-Lang", )
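
The "inspect the returned strings" step could look roughly like the following sketch. It is written in Python only to illustrate the string handling and assumes that each comment comes back as raw text including its delimiters, which the thread does not confirm. `doc_comment_body` is a hypothetical helper, not part of Stratego or Spoofax.

```python
def doc_comment_body(comment):
    """Return the inner text of a /** ... */ or //* ... comment.

    Returns None for plain comments. Assumes the comment string still
    carries its delimiters (an assumption, see the note above).
    """
    text = comment.strip()
    # len >= 5 excludes "/**/", which is an empty plain block comment.
    if text.startswith("/**") and text.endswith("*/") and len(text) >= 5:
        return text[3:-2].strip()
    if text.startswith("//*"):
        return text[3:].strip()
    return None

for c in ["/* 1 */", "/** 2 */", "// 3", "//* 4"]:
    print(repr(c), "->", repr(doc_comment_body(c)))
```

Only the second and fourth comments yield a body ('2' and '4'); the plain comments map to None and can be dropped.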


On 6 October 2011 at 13:42 Radu Mereuta commented:

@Vlad. Thank you for that, interesting idea, but in my modules I also have other sentences. I tried to disambiguate with Stratego. I managed that, but the performance became unacceptable (for an 800-line file, the time jumped from half a second to almost 10). The number of ambiguities generated is just too big.

@Maartje. Looks interesting and simple. How can I use those functions? Where can I find them? What should I import? And is this a feature only for Eclipse? Because I want to create a runnable jar.


On 6 October 2011 at 14:37 Maartje commented:


1. How can I use those functions?

origin-comments-before = origin-support-sublist(prim("SSL_EXT_origin_comments_before", )) //adds primitive strategy to project

node-in-ast //returns preceding comment string associated with the node-in-ast

2. Where can I find them?

origin-comments-before / origin-comments-after:
https://svn.strategoxt.org/repos/StrategoXT/spoofax/trunk/spoofax/org.spoofax.interpreter.library.jsglr/src/org/spoofax/interpreter/library/jsglr/origi/OriginCommentsBeforePrimitive.java

origin-surrounding-comments:
https://svn.strategoxt.org/repos/StrategoXT/spoofax-imp/trunk/org.strategoxt.imp.runtime/src/org/strategoxt/imp/runtime/stratego/OriginSurroundingCommentsPrimitive.java

3. What should I import? Nothing special.

4. Is this a feature only for Eclipse?
    origin-comments-before / origin-comments-after do not use anything Eclipse-specific;
    origin-surrounding-comments currently uses the Syntax.esv file to find the expressions for block comments and line comments.


On 6 October 2011 at 14:42 Radu Mereuta commented:

Ah, sorry Maartje for the previous message. I found the missing functions in the svn repository and just copy-pasted them into my code, and they work, but not really the way I expected.
Here is another simple grammar:

context-free syntax
    Module*                 -> Start {cons("ModuleList")}
    "start" Sentence* "end" -> Module {cons("Module")}
    "rule"					-> Sentence {cons("Rule")}

lexical syntax
    [\ \t\n\r]	-> LAYOUT
    [\*]                            -> CommentChar
    -> EOF
    "/*"  (~[\*] | CommentChar)* "*/"   -> LAYOUT
    "//"  ~[\n\r]* ([\n\r] | EOF)       -> LAYOUT

lexical restrictions
    CommentChar   -/- [\/]
    EOF  -/- ~[]

context-free restrictions
    LAYOUT? -/- [\ \t\n\r]
    LAYOUT? -/- [\/].[\/]
    LAYOUT? -/- [\/].[\*]

And a sample input:

//0
start
  /* 1 */
  rule
  //2
  rule
  //3
end
//4

I’ve tried a topdown traversal, printing the before and after terms for every term. I could only catch #1 and #2 with the origin-comments-before function.
I also tried origin-surrounding-comments, which misses #0.


On 6 October 2011 at 15:32 Maartje commented:

The comment-after and comment-before functions were implemented to support layout/comment preservation for textual transformations. Some heuristics are used to determine whether a comment is associated with the preceding or succeeding node, with a sublist of nodes, or with no node at all (for example, a commented-out function or statement). The heuristics that are used are described in http://swerl.tudelft.nl/twiki/pub/Main/TechnicalReports/TUD-SERG-2011-027.pdf.

For Javadoc it may make sense to implement an all-comments-before function and then filter the Javadoc comments later using Stratego.
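
As a rough illustration of that idea, here is a sketch in Python, not the Stratego or Java it would take inside Spoofax. It assumes the origin information can supply two character offsets: where the previous token ends and where the node starts. `all_comments_between` and `javadoc_only` are hypothetical names, and no heuristics are applied: every comment in the gap is returned.

```python
import re

# Block comments or line comments; scanning left to right means a "//"
# inside a block comment is not mistaken for a line comment.
COMMENT = re.compile(r"/\*.*?\*/|//[^\n\r]*", re.DOTALL)

def all_comments_between(text, prev_end, node_start):
    """Every comment found in text[prev_end:node_start], in source order."""
    return [m.group() for m in COMMENT.finditer(text, prev_end, node_start)]

def javadoc_only(comments):
    """Keep only /** ... */ and //* ... comments ("/**/" is a plain comment)."""
    return [
        c for c in comments
        if c.startswith("//*") or (c.startswith("/**") and len(c) >= 5)
    ]

SAMPLE = "//0\nstart\n  /* 1 */\n  rule\n  //2\n  rule\n  //3\nend\n//4\n"

start_at = SAMPLE.index("start")
first_rule = SAMPLE.index("rule")

# Comment #0 sits before the module itself.
print(all_comments_between(SAMPLE, 0, start_at))  # ['//0']
# Comment #1 sits between "start" and the first rule.
print(all_comments_between(SAMPLE, start_at + len("start"), first_rule))  # ['/* 1 */']
print(javadoc_only(["/* 1 */", "/** doc */", "//* doc", "// x"]))  # ['/** doc */', '//* doc']
```

Collecting everything first and filtering afterwards keeps the collection step free of attachment heuristics, which seems to be the point of the suggestion.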
