javadoc comments vs normal comments generate ambiguities and parse errors
Hi, I'm trying to integrate some Javadoc-style comments into my language, but I get some weird behavior. I've narrowed it down to a simple grammar that shows the glitch:

  context-free syntax
    Module* -> Start {cons("ModuleList")}
    "start" Sentence* "end" -> Module {cons("Module")}
    LATEX -> Sentence {cons("Latex")}

  lexical syntax
    [\ \t\n\r] -> LAYOUT
    [\*] -> CommentChar
    -> EOF
    "/*" (~[\*] | CommentChar)* "*/" -> LAYOUT
    "/**" (~[\*] | CommentChar)* "*/" -> LATEX
    "//" ~[\n\r]* ([\n\r] | EOF) -> LAYOUT
    "//*" ~[\n\r]* ([\n\r] | EOF) -> LATEX

  lexical restrictions
    CommentChar -/- [\/]
    EOF -/- ~[]
    "/*" -/- [\*]
    "//" -/- [\*]

  context-free restrictions
    LAYOUT? -/- [\ \t\n\r]
    LAYOUT? -/- [\/].[\/].~[\*]
    LAYOUT? -/- [\/].[\*].~[\*]

This is a sample text:

  start /* 1 */ end
  start /** 2 */ end
  start // 3
  end
  start //* 4
  end

And the AST for this:

  ModuleList(
    [ amb([Module([]), Module([])])
    , Module([Latex("/** 2 */")])
    , Module([])
    , Module([])
    ]
  )

The basic effect is that a comment of type 1 generates an ambiguity (one more for each use), although it shouldn't, because I have follow restrictions for LAYOUT. Comments of type 2 and 3 are parsed correctly. Comments of type 4 are even more surprising, as they generate a parse error. I hope I'm doing this correctly. There may be a special way of dealing with this kind of comment.

Submitted by Radu Mereuta on 4 October 2011 at 22:15
Issue Log
Hi Radu,
Would this work?
  context-free syntax
    Module* -> Start {cons("ModuleList")}
    %% "start" Sentence* "end" -> Module {cons("Module")}
    LATEX -> Sentence {cons("Latex")}

  context-free priorities
    {
      "start" "end" -> Module {cons("Module")}
    } >
    {
      "start" Sentence+ "end" -> Module {cons("Module")}
    }

  lexical syntax
    [\ \t\n\r] -> LAYOUT
    [\*] -> CommentChar
    %% "*" -> CommentChar
    -> EOF
    "/*" (~[\*] | CommentChar)* "*/" -> LAY
    "/**" (~[\*] | CommentChar)* "*/" -> LATEX
    "//" ~[\n\r]* ([\n\r] | EOF) -> LAY
    "//*" ~[\n\r]* -> LATEX
    LAY -> LAYOUT

  lexical restrictions
    CommentChar -/- [\/]
    EOF -/- ~[]
    "/*" -/- [\*]
    "//" -/- [\*]
    LAY -/- [\/].[\/].~[\*]
    LAY -/- [\/].[\*].~[\*]

  context-free restrictions
    LAYOUT? -/- [\ \t\n\r]
    %% LAYOUT? -/- [\/].[\/].~[\*]
    %% LAYOUT? -/- [\/].[\*].~[\*]
Another solution could be to use the built-in Stratego primitives to extract the comments and then inspect (or parse) the returned strings to extract the Javadoc info.

  // Returns succeeding comments that attach to this node (heuristically determined)
  origin-comments-after = origin-support-sublist(prim("SSL_EXT_origin_comments_after", ))

  // Returns preceding comments that attach to this node (heuristically determined)
  origin-comments-before = origin-support-sublist(prim("SSL_EXT_origin_comments_before", ))

  // Extracts all block comments (see regex in MyLang-Syntax.esv)
  // between the previous sibling and the current node,
  // and all line comments between the current node and the next sibling (see lib/editor-common.generated)
  origin-surrounding-comments = prim("SSL_EXT_origin_surrounding_comments", "My-Lang", )
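Once the comment strings are available, the "inspect the returned strings" step is simple string handling. The following is a minimal sketch in Python (not Stratego), assuming each comment arrives as one string including its delimiters. The function name doc_comment_body is hypothetical and not part of Spoofax. It treats "/**/" as a plain empty comment, as Java does.

```python
from typing import Optional


def doc_comment_body(comment: str) -> Optional[str]:
    """Return the inner text of a doc comment, or None for a plain comment."""
    text = comment.strip()
    # Block form: "/**" ... "*/". "/**/" is too short and counts as plain.
    if text.startswith("/**") and text.endswith("*/") and len(text) >= 5:
        return text[3:-2].strip()
    # Line form: "//*" up to the end of the line.
    if text.startswith("//*"):
        return text[3:].strip()
    # Anything else ("/* ... */" or "// ...") is an ordinary comment.
    return None
```

Applied to the four comment types from the original post, only types 2 and 4 yield a body; types 1 and 3 yield None and can be dropped as layout.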
@Vlad. Thank you for that, interesting idea, but in my modules I also have other sentences. I tried to disambiguate with Stratego. I managed that, but the performance became unacceptable (for an 800-line file, the time jumped from half a second to almost 10 seconds). The number of ambiguities generated is just too big.
@Maartje. Looks interesting and simple. How can I use those functions? Where can I find them? What should I import? And is this a feature only for Eclipse? I ask because I want to create a runnable jar.
1. How can I use those functions?

  origin-comments-before = origin-support-sublist(prim("SSL_EXT_origin_comments_before", )) // adds primitive strategy to project
  node-in-ast // returns preceding comment string associated to the node-in-ast

2. Where can I find them?

origin-comments-before / origin-comments-after:
https://svn.strategoxt.org/repos/StrategoXT/spoofax/trunk/spoofax/org.spoofax.interpreter.library.jsglr/src/org/spoofax/interpreter/library/jsglr/origi/OriginCommentsBeforePrimitive.java

origin-surrounding-comments:
https://svn.strategoxt.org/repos/StrategoXT/spoofax-imp/trunk/org.strategoxt.imp.runtime/src/org/strategoxt/imp/runtime/stratego/OriginSurroundingCommentsPrimitive.java

3. What should I import?

Nothing special.

4. Is this a feature only for Eclipse?

origin-comments-before / origin-comments-after do not use anything specific to Eclipse. origin-surrounding-comments currently uses the Syntax.esv file to find the expressions for block comments and line comments.
Ah, sorry Maartje for the previous message, I found on the svn the missing functions and just copy pasted them into my code and they work. But not really the way I’m expecting.
Here is another simple grammar:

  context-free syntax
    Module* -> Start {cons("ModuleList")}
    "start" Sentence* "end" -> Module {cons("Module")}
    "rule" -> Sentence {cons("Rule")}

  lexical syntax
    [\ \t\n\r] -> LAYOUT
    [\*] -> CommentChar
    -> EOF
    "/*" (~[\*] | CommentChar)* "*/" -> LAYOUT
    "//" ~[\n\r]* ([\n\r] | EOF) -> LAYOUT

  lexical restrictions
    CommentChar -/- [\/]
    EOF -/- ~[]

  context-free restrictions
    LAYOUT? -/- [\ \t\n\r]
    LAYOUT? -/- [\/].[\/]
    LAYOUT? -/- [\/].[\*]

And a sample input:

  //0
  start /* 1 */
  rule //2
  rule //3
  end //4

I've tried a topdown traversal printing the before and after terms for every term. I could only catch #1 and #2 with the origin-comments-before function.
I also tried origin-surrounding-comments, which misses #0.
The origin-comments-after and origin-comments-before functions were implemented to support layout/comment preservation for textual transformations. Some heuristics are used to determine whether a comment is associated with the preceding node, the succeeding node, a sublist of nodes, or no node at all (for example, a commented-out function or statement). The heuristics that are used are described in http://swerl.tudelft.nl/twiki/pub/Main/TechnicalReports/TUD-SERG-2011-027.pdf.
For Javadoc, it may make sense to implement an all-comments-before function and then filter the Javadoc comments afterwards using Stratego.
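The "collect everything, filter later" idea can be illustrated outside Spoofax. The sketch below is plain Python over raw source text, not a Stratego primitive: it gathers every comment between two offsets (for example, the end of the previous node and the start of the current one) without any attachment heuristics, and filters the doc comments in a second step. The names comments_between and is_doc_comment are hypothetical, and the regex assumes the toy grammar from this thread, where comment delimiters cannot occur inside string literals.

```python
import re
from typing import List

# Block comments "/* ... */" (non-greedy) or line comments "// ..." to end of line.
COMMENT = re.compile(r"/\*.*?\*/|//[^\r\n]*", re.DOTALL)


def comments_between(source: str, start: int, end: int) -> List[str]:
    """Return every comment that lies entirely within source[start:end]."""
    return [m.group(0) for m in COMMENT.finditer(source, start, end)]


def is_doc_comment(comment: str) -> bool:
    """True for "//*..." line comments and "/** ... */" blocks (not "/**/")."""
    if comment.startswith("//*"):
        return True
    return comment.startswith("/**") and len(comment) >= 5


# Hypothetical input in the style of the samples above.
SOURCE = "// 0\nstart\n/* 1 */\n/** 2 */\nrule // 3\nend\n"
node_start = SOURCE.index("rule")

all_before = comments_between(SOURCE, 0, node_start)
doc_before = [c for c in all_before if is_doc_comment(c)]
```

Separating collection from filtering keeps the first step free of heuristics, so a comment such as "// 0" at the top of the file is not lost before the filter gets to see it.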