GC overhead limit exceeded in Task Engine
When parsing many programs per second, via the Java API, we run into the following error:
java.lang.OutOfMemoryError: GC overhead limit exceeded at java.util.HashMap.resize(HashMap.java:703) at java.util.HashMap.putVal(HashMap.java:628) at java.util.HashMap.put(HashMap.java:611) at com.google.common.collect.AbstractMapBasedMultimap$WrappedCollection.addToMap(AbstractMapBasedMultimap.java:417) at com.google.common.collect.AbstractMapBasedMultimap$WrappedCollection.addAll(AbstractMapBasedMultimap.java:538) at com.google.common.collect.AbstractMultimap.putAll(AbstractMultimap.java:80) at com.google.common.collect.ArrayListMultimap.putAll(ArrayListMultimap.java:66) at org.metaborg.runtime.task.TaskInsertion.createResultMapping(TaskInsertion.java:205) at org.metaborg.runtime.task.TaskInsertion.getResultsOf(TaskInsertion.java:235) at org.metaborg.runtime.task.TaskInsertion.createResultMapping(TaskInsertion.java:198) at org.metaborg.runtime.task.TaskInsertion.insertResultCombinations(TaskInsertion.java:175) at org.metaborg.runtime.task.TaskInsertion.taskCombinations(TaskInsertion.java:127) at org.metaborg.runtime.task.evaluation.BaseTaskEvaluator.evaluate(BaseTaskEvaluator.java:40) at org.metaborg.runtime.task.evaluation.TaskEvaluationQueue.evaluateQueuedTasks(TaskEvaluationQueue.java:318) at org.metaborg.runtime.task.evaluation.TaskEvaluationQueue.evaluate(TaskEvaluationQueue.java:182) at org.metaborg.runtime.task.engine.TaskEngine.evaluateScheduled(TaskEngine.java:279) at org.metaborg.runtime.task.primitives.task_api_evaluate_scheduled_3_0.call(task_api_evaluate_scheduled_3_0.java:21) at org.metaborg.runtime.task.primitives.TaskEnginePrimitive.call(TaskEnginePrimitive.java:27) at org.strategoxt.lang.Context.invokePrimitive(Context.java:227) at org.strategoxt.lang.Context.invokePrimitive(Context.java:216) at pgqllang.trans.task_api_evaluate_scheduled_3_0.invoke(task_api_evaluate_scheduled_3_0.java:28) at pgqllang.trans.task_evaluate_scheduled_0_0.invoke(task_evaluate_scheduled_0_0.java:29) at pgqllang.trans.lifted266.invoke(lifted266.java:32) at pgqllang.trans.measure_time_2_0.invoke(measure_time_2_0.java:39) at pgqllang.trans.analyze_all_no_builtins_4_1.invoke(analyze_all_no_builtins_4_1.java:143) at pgqllang.trans.analyze_all_4_1.invoke(analyze_all_4_1.java:34) at pgqllang.trans.analyze_all_3_1.invoke(analyze_all_3_1.java:29) at pgqllang.trans.editor_analyze_0_0.invoke(editor_analyze_0_0.java:34) at org.strategoxt.lang.Strategy.invokeDynamic(Strategy.java:30) at org.strategoxt.lang.InteropSDefT.evaluate(InteropSDefT.java:192) at org.strategoxt.lang.InteropSDefT.evaluate(InteropSDefT.java:183) at org.strategoxt.lang.InteropSDefT$StrategyBody.evaluate(InteropSDefT.java:245)
Is there any way we can disable this cache or clean it up after parsing a program?
This is how I’m using the API:
Submitted by Oskar van Rest on 3 October 2016 at 17:49
https://github.com/oracle/pgql-lang/blob/master/pgql-lang/src/main/java/oracle/pgql/lang/Pgql.java
Issue Log
Possibly this indicates a memory leak. I’m not sure if the cache ever gets cleaned.
How much is many programs per second?
You’re not just parsing stuff right? You’re also analyzing, otherwise the task engine would not even be used?
What cache do you want to disable? Do you mean the data in the task engine?
If so, what you can try is creating a temporary context instead (see IContextService), using that for analyzing a parsed file, and closing that (calling the close method on the temporary context). This should not persist any data, and memory can be freed immediately.
Indeed this is parsing + name analysis + type analysis but no code generation.
This was reported by one of our users and I think they were stress-testing the system to see if there’s any memory leaks by executing queries inside a
while(true)
loop.Our queries are always Strings and I write these to an in-memory VFS file system.
The file names for the queries are randomly generated, which seems where the problem lies. After some initial warmup, the parsing+analyzing of the query would take 5ms, but after repeatedly parsing+analyzing the same query, this time would go up to 20ms after a minute or so. From there on, it just keeps increasing.I found out that this slowdown does not happen when I reuse the same file names for queries, rather than random file names. Given that insight as well as the exception message, this indicates that Spoofax keeps around objects for each of these randomly generated files. The HashMap (this is what I meant with “cache” / task engine data) seems to grow so large that most of the time is spend in inserting new objects into it. At some point, there are so many objects that the JVM takes too much time to perform GC and throws the exception.
For PGQL, I’ve now fixed it by generating only a single random file name per Spoofax instance rather than per query: https://github.com/oracle/pgql-lang/commit/6b8f174d067d1d061a7c19790557eb4fd048febc
I’m not sure if for Spoofax you want to provide some mechanism for cleaning up resources on a per-file basis to avoid such leakage. Although the temporary context probably also does the trick already, so I’ll just close the issue.
The task engine will store data about all files (both in memory, and persisted to disk), so if you keep feeding it random file names it will indeed leak memory. In practice, when files are deleted we do clean up the task engine by sending it an empty tuple as AST for that file (kind of a hack right now), but since you keep generating new files, this does not happen.
For now, you can either use the same filename (as you just implemented), or use a temporary context. Using a temporary context is preferred because it is more efficient and elegant. If you use the same filename every time with a regular context, the task engine will incrementally update result every time there is analysis, which has some overhead.
In the future, we should add API to the analyzer for removing files, instead of the hack of sending an empty tuple that we have now.
Oskar, in case you implement a solution using temporary contexts, can you add an example how to process strings with temporary contexts to https://github.com/MetaBorgCube/metaborg-api-usage? This would help to document a working solution.
The temporary context works: https://github.com/oracle/pgql-lang/commit/829f8cd4d5f5dbc88296317b05e35d97608b63f9
@Guido: I wanted to add the temporary context to
org.metaborg.examples.api.analysis
but this requires a dummy project. I’m not sure if this is the right way to do it but I can add it if you want:dummyProject = new Project(dummyProjectDir, new IProjectConfig() { @Override public Collection<LanguageIdentifier> sourceDeps() { Set<LanguageIdentifier> sourceDeps = new HashSet<>(); sourceDeps.add(id); return sourceDeps; } @Override public Collection<LanguageIdentifier> javaDeps() { return new HashSet<>(); } @Override public Collection<LanguageIdentifier> compileDeps() { return new HashSet<>(); } @Override public String metaborgVersion() { return null; } @Override public boolean typesmart() { return false; } });
Also, I believe the following repository should be added to
org.metaborg.examples.api.analysis/pom.xml
otherwise users are required to add the repo to~/.m2/settings.xml
but that shouldn’t be the preferred way.<repositories> <repository> <id>metaborg-release-repo</id> <url>http://artifacts.metaborg.org/content/repositories/releases/</url> <releases> <enabled>true</enabled> </releases> <snapshots> <enabled>false</enabled> </snapshots> </repository> </repositories>
Log in to post comments