There are a number of issues that may degrade the performance of reading terms from files:

  • The SSL_read_term_from_stream class doesn't use the Java NIO channels yet

  • The method used in BasicTermFactory.parseFromStream() may be inefficient

  • The Java API can currently only write textual ATerms to files. The binary ATerm format (BAF) is known to perform very poorly, though, and I haven't seen much difference with the new streaming ATerm format (SAF)

The first two issues can be addressed by writing our own PushbackInputStream-like class that uses channels.
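
A minimal sketch of what such a class could look like, assuming a single byte of pushback is enough for the term parser. The class name ChannelPushbackStream and its methods are hypothetical and not part of the existing code base; it reads through a plain ByteBuffer rather than a memory-mapped one.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ReadableByteChannel;

/** Hypothetical channel-backed reader with one byte of pushback (illustration only). */
class ChannelPushbackStream {
    private final ReadableByteChannel channel;
    private final ByteBuffer buffer;
    private int pushedBack = -1; // -1 means the pushback slot is empty

    ChannelPushbackStream(ReadableByteChannel channel, int bufferSize) {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        this.channel = channel;
        this.buffer = ByteBuffer.allocate(bufferSize);
        this.buffer.flip(); // start empty so the first read() refills the buffer
    }

    /** Returns the next byte as 0..255, or -1 at end of input. */
    int read() throws IOException {
        if (pushedBack != -1) {
            int b = pushedBack;
            pushedBack = -1;
            return b;
        }
        while (!buffer.hasRemaining()) {
            buffer.clear();
            int n = channel.read(buffer); // blocking channels return at least 1 byte or -1
            buffer.flip();
            if (n == -1) {
                return -1;
            }
        }
        return buffer.get() & 0xFF;
    }

    /** Pushes one byte back so that the next read() returns it again. */
    void unread(int b) {
        if (b < 0) {
            return; // nothing to push back at end of input
        }
        if (pushedBack != -1) {
            throw new IllegalStateException("pushback slot already in use");
        }
        pushedBack = b & 0xFF;
    }
}
```

A caller would obtain the channel from, for example, FileInputStream.getChannel() or Channels.newChannel(InputStream), and peek at the next character with read() followed by unread().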

Update: apparently, memory-mapped IO should be considered harmful: Spoofax/106.

Submitted by Lennart Kats on 20 March 2010 at 11:49

On 29 March 2010 at 12:44 Lennart Kats commented:

Added caching of ReadFromFile in r20723, which should help with this issue.

On 7 January 2011 at 16:27 Nathan Bruning commented:

Writing textual ATerm files was very memory-inefficient because of in-memory string construction. This was fixed by streaming the terms to an OutputStream in revision 21582.
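
To illustrate the idea (this is not the code from revision 21582), here is a rough sketch of writing a term straight to an Appendable instead of building the full string first. SimpleTerm and writeTo are hypothetical names made up for this example; the real term classes look different.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

/** Hypothetical minimal term type for illustration; not one of the ATerm library classes. */
final class SimpleTerm {
    final String name;
    final List<SimpleTerm> children;

    SimpleTerm(String name, SimpleTerm... children) {
        this.name = name;
        this.children = Arrays.asList(children);
    }

    /** Streams the textual form to out piece by piece; no full-size String is built. */
    void writeTo(Appendable out) throws IOException {
        out.append(name);
        if (children.isEmpty()) {
            return;
        }
        out.append('(');
        for (int i = 0; i < children.size(); i++) {
            if (i > 0) {
                out.append(',');
            }
            children.get(i).writeTo(out); // recursion depth equals term depth
        }
        out.append(')');
    }
}
```

Since java.io.Writer implements Appendable, the same method can write to a file by passing a BufferedWriter wrapped around an OutputStreamWriter, or to a StringBuilder when a string really is needed.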

The implementation of streaming ATerm reading and writing was completed (I hope) in r21640.
The writer, however, calculates sharing on its own... couldn't it use existing term sharing for terms with storage type = MAXIMALLY_SHARED? Just a thought.

Writing binary ATerms remains unimplemented, but shouldn't be too hard to implement either, reusing parts of the binary ATerm reader and the streaming ATerm writer.

On 7 January 2011 at 16:57 Lennart Kats commented:

Nathan: great about the SAF support. Note that in the new-terms branch terms also gained an efficient writeAsString(Appendable, int) method. I also changed the description of this issue a bit to reflect that memory-mapped IO is 3vil.

On 24 January 2013 at 22:43 removed tag java
