PermGen space leak on redeploys (1)
First a build fails with
BUILD FAILED
java.lang.VerifyError: (class: org/apache/tools/ant/BuildException, method: setLocation signature: (Lorg/apache/tools/ant/Location;)V) Illegal constant pool index
Then, subsequent builds fail directly with
java.lang.OutOfMemoryError: PermGen space
Workaround is restarting Eclipse.
Solution space
Submitted by Danny Groenewegen on 19 January 2011 at 16:35
- http://cdivilly.wordpress.com/2012/04/23/permgen-memory-leak/
- Debug which root objects are still referenced after an app has been undeployed
- Try to move reference/management of these objects to the root classloader
- On init of an app, remove older references in the root class loader, possibly also checking for other undeployed, not redeployed apps somehow.
Issue Log
We might want to set it up differently in the Eclipse Tomcat, which is not meant as a production environment.
We can move the jars to tomcat’s shared library {catalina.home}/shared/lib, such that they won’t get reloaded on each redeploy.
For this to work, we might need to change the war creation, i.e. leave the jars out of the lib folder. Or is the addressed issue different from the PermGen space errors we get from Tomcat after several redeploys?
First attempt: close hibernate session factory and our recurring timer thread @ r5721
Result: No more tomcat messages about leaking/forced stopped hibernate (search) threads such as index writer threads, but still a leaking timer:
SEVERE: The web application [/yellowgrass] registered the JDBC driver [com.mysql.jdbc.Driver] but failed to unregister it when the web application was stopped. To prevent a memory leak, the JDBC Driver has been forcibly unregistered.
Jun 06, 2013 11:41:05 AM org.apache.catalina.loader.WebappClassLoader clearReferencesThreads
SEVERE: The web application [/yellowgrass] appears to have started a thread named [MySQL Statement Cancellation Timer] but has failed to stop it. This is very likely to create a memory leak.
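For the first SEVERE message, a minimal sketch of deregistering JDBC drivers ourselves on undeploy, using only `java.sql.DriverManager`. The class name `JdbcDriverCleanup` and its method are made up for illustration and are not part of the WebDSL code base; the assumption is that the driver jar lives in the webapp's own lib folder, so the driver class is loaded by the webapp class loader.

```java
import java.sql.Driver;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Enumeration;

// Hypothetical helper, not taken from the WebDSL sources.
final class JdbcDriverCleanup {
    private JdbcDriverCleanup() {}

    /**
     * Deregisters every JDBC driver whose class was loaded by the given
     * class loader and returns how many drivers were removed.
     */
    static int deregisterDriversLoadedBy(ClassLoader webappLoader) {
        int removed = 0;
        Enumeration<Driver> drivers = DriverManager.getDrivers();
        while (drivers.hasMoreElements()) {
            Driver driver = drivers.nextElement();
            // Leave drivers that belong to shared or system class loaders alone.
            if (driver.getClass().getClassLoader() != webappLoader) {
                continue;
            }
            try {
                DriverManager.deregisterDriver(driver);
                removed++;
            } catch (SQLException e) {
                System.err.println("cleanup: could not deregister " + driver + ": " + e);
            }
        }
        return removed;
    }
}
```

This would probably be called from the application's destroy hook with the webapp's own class loader, for example `JdbcDriverCleanup.class.getClassLoader()` when the helper itself is packaged inside the war.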
r5722: Updated Mysql connector to fix the statement cancellation timer.
Current state: No more GCRoots for the WebappClassLoader, but still increasing permgen
(!) We need to check if calling com.mysql.jdbc.AbandonedConnectionCleanupThread.shutdown(); is safe. The source code of this class shows that it sets its classloader to null -> not sure if it becomes shared between multiple webapps and may break others on undeploy.
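One way to sidestep the sharing question would be to call shutdown() only when the class was loaded by our own webapp class loader. The following is a rough sketch under that assumption; `MysqlCleanupGuard` is a made-up name, and reflection is used only so the snippet compiles without Connector/J on the classpath. It relies on shutdown() being a public static no-argument method, as the call above suggests.

```java
import java.lang.reflect.Method;

// Hypothetical guard, not taken from the WebDSL sources.
final class MysqlCleanupGuard {
    static final String CLEANUP_CLASS =
            "com.mysql.jdbc.AbandonedConnectionCleanupThread";

    private MysqlCleanupGuard() {}

    /** Returns true only if shutdown() was actually invoked. */
    static boolean shutdownIfOwnedBy(ClassLoader webappLoader) {
        try {
            Class<?> cls = Class.forName(CLEANUP_CLASS, false, webappLoader);
            if (cls.getClassLoader() != webappLoader) {
                // Loaded by a parent loader, so other webapps may share it.
                return false;
            }
            Method shutdown = cls.getMethod("shutdown");
            shutdown.invoke(null); // static method, no receiver
            return true;
        } catch (ClassNotFoundException e) {
            return false; // driver not on the classpath at all
        } catch (ReflectiveOperationException e) {
            System.err.println("cleanup: could not stop cleanup thread: " + e);
            return false;
        }
    }
}
```

If the connector jar is ever moved to Tomcat's shared library folder, this guard would skip the call instead of stopping a thread that other applications might still use.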
…. but it gets GCed when it reaches max perm gen… so I think it is fixed by now :)
Here is the jvisualvm permgen graph of repeatedly redeploying (touching) the yellowgrass webapp and reloading the page:
Fixed by now. Additionally added some checks for scheduled tasks to finish. Outputs something like this (in case something is still running at undeploy):
[12:38:19 researchr] WARN Undeploying application
[12:38:19 researchr] cleanup: canceling future scheduled task
[12:38:19 researchr] cleanup: done
[12:38:19 researchr] cleanup: Waiting for recurring task to finish: 'invoke updateSuggestionIndex() every 12 hours'
[12:38:20 researchr] Done
[12:38:24 researchr] cleanup: recurring task: 'invoke updateSuggestionIndex() every 12 hours' has ended
[12:38:24 researchr] cleanup: closing Hibernate session factory
[12:38:24 researchr] cleanup: done
[12:38:24 researchr] cleanup: stopping JDBC AbandonedConnectionCleanupThread
[12:38:24 researchr] cleanup: done
[12:38:24 researchr] WARN Application has been undeployed
[12:38:24 researchr] Application is destroyed
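The waiting step in that log could look roughly like the sketch below. This is not the actual WebDSL cleanup code; `RecurringTaskCleanup` and its methods are invented for illustration, built on the standard `ScheduledExecutorService`.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical wrapper around the recurring task scheduler.
final class RecurringTaskCleanup {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    void scheduleRecurring(Runnable task, long period, TimeUnit unit) {
        scheduler.scheduleAtFixedRate(task, 0, period, unit);
    }

    /**
     * Stops scheduling and waits for a run that is already in progress.
     * Returns true if the scheduler terminated within the timeout.
     */
    boolean shutdown(long timeout, TimeUnit unit) throws InterruptedException {
        System.out.println("cleanup: canceling future scheduled tasks");
        // With the default policy, shutdown() cancels further periodic runs
        // but lets a run that has already started finish.
        scheduler.shutdown();
        System.out.println("cleanup: waiting for recurring task to finish");
        boolean finished = scheduler.awaitTermination(timeout, unit);
        if (!finished) {
            // Last resort: interrupt the worker so its thread can exit.
            scheduler.shutdownNow();
        }
        System.out.println("cleanup: done");
        return finished;
    }
}
```

Waiting before closing the Hibernate session factory matters here, since a task that is still running at that point would otherwise lose its session mid-run.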
Still the case in (for example) researchr app, reopening this issue
I’m not sure if this helps, and it is probably already known, but here goes anyway.
Tomcat loads each application in a separate instance of the WebappClassLoader. Each class loader is created with the jar file of the application and resolves application classes locally first: if a class is available in the queried class loader, it is loaded by that class loader; otherwise the request is forwarded to the parent class loader. Even with garbage collection of PermGen space enabled, garbage collection of a class is only possible in the following circumstances:
- A class is only garbage collected when its class loader is garbage collected. This is because the class loader maintains a hard reference cache to the loaded classes.
- A class loader is only garbage collectible if there are no references to it.
The last criterion is of course crucial. Since every object has a reference to its class, and every class holds a reference to its class loader, a class loader is collectible only if there are no instances of the classes it has loaded. Additionally, a class loader has children in the hierarchy. If a child class loader is not collectible, neither is its parent.
In the case of WebDSL, Hibernate probably adds its own class loader as a child of the WebappClassLoader (so that it can magically provide object proxies). If there is a single entity instance that is not garbage collectible, none of the classes in the jar file of the web application are garbage collectible.
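The collectibility rule above can be probed with a weak reference to the class loader. A small sketch, with `ClassLoaderLeakProbe` as a made-up helper name: as long as anything still strongly references the loader (directly, or through an instance and its class), the weak reference is never cleared.

```java
import java.lang.ref.WeakReference;

// Hypothetical probe for checking whether a class loader was collected.
final class ClassLoaderLeakProbe {
    private ClassLoaderLeakProbe() {}

    /**
     * Requests garbage collection a few times and reports whether the
     * referenced class loader has been collected.
     */
    static boolean isCollected(WeakReference<ClassLoader> ref, int attempts)
            throws InterruptedException {
        for (int i = 0; i < attempts && ref.get() != null; i++) {
            System.gc();      // only a hint, the JVM may ignore it
            Thread.sleep(50); // give the collector a moment
        }
        return ref.get() == null;
    }
}
```

A result of false after an undeploy would only say that something still holds the loader (or that no collection has run yet); finding out what holds it still needs a heap dump.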
My hypothesis is that there is an actual memory leak:
- Either there’s an object that is statically or dynamically referenced
- Or somehow a parent class loader (parent of the WebappClassLoader) holds a reference to the child class loader (inverse delegation). This IS required by Tomcat to get an application started, but this reference should be released on undeploy.
- Or somehow a class from the application gets leaked outside of the web application container.
I think JVisualVM can assist in this. I would run the application which causes the leaks then undeploy it (without redeploying a new instance), then pause the machine, do a full memory dump to a file. Then JVisualVM can execute a garbage collection on the memory dump and has a query language which can query the heap (very slow, but gives good results). It can for example show all non-garbage collectible instances of classes which match a particular regex filter on the package & class name. From every instance one can navigate up the reference chain to find out what is holding the object back.
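The dump step can also be triggered from code, which may be handy right after an undeploy. A minimal sketch assuming a HotSpot JVM, where `com.sun.management.HotSpotDiagnosticMXBean` is available; `HeapDumper` is a made-up name, and the code dumps the JVM it runs in, so for Tomcat it would have to run inside the Tomcat process.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper; HotSpot-specific.
final class HeapDumper {
    private HeapDumper() {}

    /** Writes a heap dump of the current JVM to the given .hprof file. */
    static Path dumpLiveHeap(Path target) throws IOException {
        // dumpHeap refuses to overwrite an existing file.
        Files.deleteIfExists(target);
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        // live = true: dump only objects that are still reachable.
        bean.dumpHeap(target.toString(), true);
        return target;
    }
}
```

The resulting file can then be opened in JVisualVM or Eclipse MAT to walk the reference chain up from the class loader.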
Vlad, thank you for your descriptive explanation. The previous attempts to fix the permgen leak were indeed based on looking for refs to the classloader(s) using Eclipse MAT (very powerful for these kinds of issues).
In its current state, most WebDSL applications show no permgen leaks. However, researchr still has the permgen issues, so it probably uses some special functionality that causes one of the three kinds of leak mentioned. I will analyze the heap dumps of researchr shortly to find the culprit.
It’s the cleaner thread of Bobo-Browse, therefore only problematic in applications using faceted search. I will try to find this thread during application destroy and stop it.
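Finding such a thread at destroy time could be sketched as below. The actual name of the Bobo-Browse cleaner thread is not given here, so the name fragment is a parameter, and `ThreadCleanup` is a made-up helper. Note that interrupt() only helps if the thread reacts to interruption, which would need to be checked for Bobo-Browse.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for stopping leftover webapp threads.
final class ThreadCleanup {
    private ThreadCleanup() {}

    /**
     * Interrupts all live threads whose context class loader is the given
     * loader and whose name contains the given fragment.
     * Returns the threads that were interrupted.
     */
    static List<Thread> interruptThreads(ClassLoader webappLoader, String nameFragment) {
        List<Thread> interrupted = new ArrayList<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (t == Thread.currentThread()) {
                continue; // never interrupt the thread doing the cleanup
            }
            if (t.getContextClassLoader() != webappLoader) {
                continue; // belongs to another webapp or to the container
            }
            if (!t.getName().contains(nameFragment)) {
                continue;
            }
            t.interrupt();
            interrupted.add(t);
        }
        return interrupted;
    }
}
```

Calling join() with a timeout on the returned threads afterwards would confirm that they really ended before the application is reported as destroyed.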