parsing performance issues [message #4646] |
Tue, 22 May 2007 00:59 |
Eclipse User |
Originally posted by: jgangemi.yahoo.com
i'm noticing that my source parser is being invoked twice during a
reconciliation period:
- build the source module
- compute folding
i haven't implemented any code assist, etc features yet, but a quick
search through the code shows ISourceParser.parse being called quite a
bit.
i also notice that ISourceElementParser.parseSourceModule is invoked
three times when saving the document:
- 'committing' a working copy
- same as above
- indexing the document
this seems very inefficient to me, and in my case, extremely expensive
b/c the test case i am working with is extremely large and takes ~5-6
seconds to parse.
after doing some investigation on what happens during the
reconciliation period, it seems that the SourceModule object just
throws away the ModuleDeclaration object created. i see some caching
type classes in use as well (ISourceModuleInfoCache,
ISourceModuleInfo), but i'm not 100% sure what they do (some of the
code is commented out) and there are no javadocs. :) the
ASTFoldingProvider gets the 'source' contents from an IModelElement
(fInput) cast to an ISourceReference, so in that case, it seems the
FoldingProvider could be handed the ModuleDeclaration that had just
been created. as a failsafe, the code could resort to reparsing if the
ModuleDeclaration was null. i'm also not sure which interface would
best be suited to implement the method and cause little disruption
elsewhere.
it appears it could be added to the ISourceModule interface - all of
those classes could stand some refactoring anyway since there appears
to be a lot of duplication (more than happy to submit that patch if
this method is agreed upon).
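To make the proposal above concrete, here is a minimal sketch of "reuse the last AST, reparse only as a failsafe". It does not use the DLTK API: `SourceUnit`, `Ast`, `CountingParser`, `getLastAst` and `astForFolding` are hypothetical stand-ins for ISourceModule, ModuleDeclaration, the source parser, the proposed accessor and the folding step.

```java
public class FoldingReuseSketch {

    /** Hypothetical stand-in for a parsed AST (not the real ModuleDeclaration). */
    static final class Ast {
        final String source;
        Ast(String source) { this.source = source; }
    }

    /** Hypothetical stand-in for a source module that remembers its last AST. */
    static final class SourceUnit {
        private final String source;
        private Ast lastAst; // null until something has parsed this unit

        SourceUnit(String source) { this.source = source; }

        String getSource() { return source; }
        Ast getLastAst() { return lastAst; }      // the proposed accessor
        void setLastAst(Ast ast) { this.lastAst = ast; }
    }

    /** Counts parser invocations so that reuse can be observed. */
    static final class CountingParser {
        int calls = 0;

        Ast parse(String source) {
            calls++;
            return new Ast(source);
        }
    }

    /** Folding step: reuse the stored AST if present, otherwise reparse. */
    static Ast astForFolding(SourceUnit unit, CountingParser parser) {
        Ast ast = unit.getLastAst();
        if (ast == null) {
            // failsafe: nothing was stored, so parse again and remember the result
            ast = parser.parse(unit.getSource());
            unit.setLastAst(ast);
        }
        return ast;
    }

    public static void main(String[] args) {
        CountingParser parser = new CountingParser();
        SourceUnit unit = new SourceUnit("def foo; end");

        // reconcile step 1: build the source module and keep the AST
        unit.setLastAst(parser.parse(unit.getSource()));
        // reconcile step 2: folding reuses the AST instead of parsing again
        astForFolding(unit, parser);

        System.out.println("parser calls: " + parser.calls);
    }
}
```

In this sketch the stored AST has to be cleared whenever the source changes, otherwise folding would work from a stale tree; that invalidation is left out here.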
the save period has different issues. every class that invokes
buildStructure on SourceModule causes the document's source to be
handled directly. it seems that the cache classes may play a role here,
but i am not sure how. i do see that the parseSourceModule on the
ISourceElementParser class takes an ISourceModuleInfo object, but i'm
not sure what its intended use is.
it seems that i could store a ModuleDeclaration inside the cache
object, but the only key that is available before doing an actual parse
is the source code itself, and it seems there would never be a way
to clear out old entries, so that object would just eat memory.
thoughts?
--
-jae
Re: parsing performance issues [message #4784 is a reply to message #4646] |
Wed, 23 May 2007 08:36 |
Andrei Sobolev Messages: 17 Registered: July 2009 |
Junior Member |
Hi Jae,
We know about these problems.
The code originally comes from JDT, but the Java parser is quite fast and can
provide different AST representations for different situations, so
different parsers are used in different situations.
We use only one parser for all situations, but we also have some
performance problems.
We discussed some other ways to do it, but for now we have only added caching.
SourceModuleInfoCache can cache any kind of source module information.
The cache cleans itself up automatically by listening for model changes.
The cache also uses SoftReferences, so it is memory safe.
You could look at RubySourceElementParser for an example.
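As a rough illustration of the idea described above (not the actual SourceModuleInfoCache code), a cache can be keyed by a module handle rather than by source text, hold its values through SoftReferences, and drop an entry when a change is reported. `SoftAstCache`, `get` and `invalidate` are hypothetical names, and the string keys and values stand in for module handles and parse results.

```java
import java.lang.ref.SoftReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

public class SoftAstCacheSketch {

    /** Hypothetical cache keyed by a module handle (for example its path). */
    static final class SoftAstCache<K, V> {
        private final Map<K, SoftReference<V>> entries = new ConcurrentHashMap<>();

        /** Returns the cached value, or runs the parser and caches its result. */
        V get(K key, Supplier<V> parse) {
            SoftReference<V> ref = entries.get(key);
            V value = (ref != null) ? ref.get() : null;
            if (value == null) {
                // missing, or cleared by the garbage collector under memory pressure
                value = parse.get();
                entries.put(key, new SoftReference<>(value));
            }
            return value;
        }

        /** Called by a (hypothetical) model change listener when a module changes. */
        void invalidate(K key) {
            entries.remove(key);
        }
    }

    public static void main(String[] args) {
        SoftAstCache<String, String> cache = new SoftAstCache<>();
        int[] parses = {0};
        Supplier<String> parser = () -> {
            parses[0]++;
            return "ast#" + parses[0];
        };

        // keep strong references so the soft references are not cleared in between
        String first = cache.get("/proj/big.rb", parser);
        String second = cache.get("/proj/big.rb", parser); // served from the cache
        cache.invalidate("/proj/big.rb");                  // simulated model change
        String third = cache.get("/proj/big.rb", parser);  // parsed again

        System.out.println(first + " " + second + " " + third + ", parses: " + parses[0]);
    }
}
```

Keying by handle and invalidating on change is what avoids the unbounded growth that a cache keyed by source text would have; the SoftReferences only add a safety net under memory pressure.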
SourceModuleInfoCache is used to build the model and mixins.
The index doesn't use this cache yet.
We plan to improve indexing in the future to support caching.
We plan to move the index to separate plugins, make it clearer, and
improve its performance, because it is quite slow for complex code completion.
For ASTFoldingProvider, we will add use of the cache before the RC2 release.