Eclipse Community Forums
Large containment reference performance [message #1847810] Tue, 09 November 2021 05:22
Denis Nikiforov
Hi

We store a simulation log in a model. The log contains 10 to 100 thousand entries. Log generation using QVTo is fast enough. However, Sirius takes a lot of time to update a table representation of the log. Profiling shows that most of the time is spent in the BasicEList.contains() method:

[Image: profiler screenshot (attachment index.php/fa/41284/0/)]

What are the best practices for large containment references?

I think I can't replace EObject by EDataType, because log entries have several fields, which I need to show in Sirius tables. Log entries also have cross-references to some previous entries.

Maybe I can group some entries in a log... But I guess that it could complicate log generation, processing, and display.

I tried to set unique = false for the containment reference, but I got an error saying that a containment reference must be unique.

Here is an interesting quote:
Quote:
Furthermore, it is a good idea to avoid a containment design where a large number of EObjects is contained in the same EReference. Not only is it difficult to display, but the performance of operations on this containment list will be poor in general. There are of course ways to tweak performance of large lists of EObjects contained in one container (and I will provide hints in one of the future blogs of this series), but often large containment lists are the sign of a poor containment design.

However, it seems that this follow-up blog post was never published.
Re: Large containment reference performance [message #1847811 is a reply to message #1847810] Tue, 09 November 2021 05:54
Ed Willink
Hi

A large EList has List rather than Set performance, which can be an issue, so don't do it. Usually it's not worth worrying about, since sizes are not large and the potential improvements are dwarfed by other factors. However, you've profiled; well done.

Since this is your model, it shouldn't be too hard to avoid.

One possibility is to replace the one-level containment by a two-level containment in which each new intermediate container has at most 100 children, but this requires changes to the add/remove methods.
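The two-level idea can be sketched in plain Java, without any EMF types. This is only an illustration under stated assumptions: `LogChunk` and `ChunkedLog` are hypothetical names, the limit per chunk is a constructor parameter, and only appending is handled. In a real model the chunk would be an extra EClass owning its own containment reference.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical intermediate container holding a bounded number of entries. */
final class LogChunk<E> {
    final List<E> entries = new ArrayList<>();
}

/** Hypothetical top-level log that spreads its entries over bounded chunks. */
final class ChunkedLog<E> {
    private final int maxPerChunk;
    private final List<LogChunk<E>> chunks = new ArrayList<>();
    private int size;

    ChunkedLog(int maxPerChunk) {
        if (maxPerChunk <= 0) {
            throw new IllegalArgumentException("maxPerChunk must be positive");
        }
        this.maxPerChunk = maxPerChunk;
    }

    /** Appends to the last chunk, opening a new chunk when the last one is full. */
    void add(E entry) {
        if (chunks.isEmpty() || chunks.get(chunks.size() - 1).entries.size() >= maxPerChunk) {
            chunks.add(new LogChunk<>());
        }
        chunks.get(chunks.size() - 1).entries.add(entry);
        size++;
    }

    /** Index lookup; valid only because the log is append-only, so every chunk but the last is full. */
    E get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return chunks.get(index / maxPerChunk).entries.get(index % maxPerChunk);
    }

    int size() { return size; }

    int chunkCount() { return chunks.size(); }
}
```

With a limit of 100, any list operation inside a chunk touches at most 100 elements. Supporting removal would need extra bookkeeping, because chunks could then be partially filled and the simple index arithmetic in `get` would no longer hold.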

Another solution, transparent to clients, is to take advantage of "@generated NOT" to replace the creation of the problematic BasicEList by a custom variant that can be tuned to your taste.


/**
 * <!-- begin-user-doc -->
 * <!-- end-user-doc -->
 * @generated NOT FIXME workaround BUG 89325
 */
@SuppressWarnings("serial")
@Override
public List<CGClass> getTemplateParameters() {
    if (templateParameters == null) {
        templateParameters = new EObjectResolvingEList<CGClass>(CGClass.class, this, CGModelPackage.Literals.CG_CLASS__TEMPLATE_PARAMETERS.getFeatureID())
        {
            @Override
            protected boolean isUnique() {
                return false;
            }
        };
    }
    return templateParameters;
}

For instance, something like org.eclipse.ocl.examples.xtext.build.analysis.UniqueList is a List that also implements Set. Internally, it maintains a copy of a non-tiny List as a Set to avoid quadratic contains() costs.
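A minimal sketch of that idea follows. It is not the UniqueList source: `IndexedUniqueList` is a hypothetical name, it always maintains the Set (the "only when non-tiny" optimisation is left out), and it supports only appending and removal.

```java
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical list that rejects duplicates and answers contains() from a hash index. */
final class IndexedUniqueList<E> extends AbstractList<E> {
    private final List<E> items = new ArrayList<>(); // keeps insertion order
    private final Set<E> index = new HashSet<>();    // mirrors items for fast membership tests

    @Override
    public E get(int i) {
        return items.get(i);
    }

    @Override
    public int size() {
        return items.size();
    }

    /** Appends the element unless it is already present; the duplicate check is a hash lookup. */
    @Override
    public boolean add(E element) {
        if (!index.add(element)) {
            return false; // duplicate, list unchanged
        }
        items.add(element);
        modCount++;
        return true;
    }

    /** Removes by position and keeps the index in sync. */
    @Override
    public E remove(int i) {
        E removed = items.remove(i);
        index.remove(removed);
        modCount++;
        return removed;
    }

    /** Hash lookup instead of the linear scan inherited from AbstractList. */
    @Override
    public boolean contains(Object o) {
        return index.contains(o);
    }
}
```

Adding n elements then costs roughly n hash lookups rather than n linear scans. Positional insertion (`add(int, E)`) and `set` are not overridden here, so they throw UnsupportedOperationException as inherited from AbstractList.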

Regards

Ed Willink
Re: Large containment reference performance [message #1847813 is a reply to message #1847811] Tue, 09 November 2021 07:10
Ed Merks
Maybe this implementation of org.eclipse.emf.common.util.AbstractEList.getNonDuplicates(Collection<? extends E>) would be better:
  /**
   * Returns the collection of objects in the given collection that are not also contained by this list.
   * @param collection the other collection.
   * @return the collection of objects in the given collection that are not also contained by this list.
   */
  protected Collection<E> getNonDuplicates(Collection<? extends E> collection)
  {
    Collection<E> result = new LinkedHashSet<E>(collection);
    for (E object : this)
    {
      result.remove(object);
    }
    return new ArrayList<E>(result);
  }
Or this one:
  /**
   * Returns the collection of objects in the given collection that are not also contained by this list.
   * @param collection the other collection.
   * @return the collection of objects in the given collection that are not also contained by this list.
   */
  protected Collection<E> getNonDuplicates(Collection<? extends E> collection)
  {
    Collection<E> result = new LinkedHashSet<E>(collection);
    result.removeAll(new HashSet<E>(this));
    return new ArrayList<E>(result);
  }

Which implementation is better likely depends on the relative sizes of each of the lists. I.e., if the "collection" is small, very few contains tests will be done on the "this" collection, in which case the first implementation will do many removes that are no-ops.

Note that the implementation of org.eclipse.emf.ecore.util.EcoreEList.contains(Object) ensures that contains testing on a containment list is O(1) not O(n) as one would expect. Also note that the measurement you show has nothing to do with your list implementation but rather with the implementation of the org.eclipse.emf.ecore.change.impl.ChangeDescriptionImpl.objectsToDetach list.
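The reason a containment list can answer contains() in constant time is that a contained object knows its own container, so the list can look at the candidate instead of scanning itself. The sketch below shows that principle with plain Java classes. `Parent` and `Child` are hypothetical names, and the model is simplified to a single containment feature, whereas EMF also has to check which feature contains the object.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical contained object that records its single container. */
final class Child {
    Parent container; // null while the child is not contained anywhere
}

/** Hypothetical container whose membership test uses the child's back-pointer. */
final class Parent {
    private final List<Child> children = new ArrayList<>();

    /** Adds the child, detaching it from any previous container first. */
    void add(Child child) {
        if (child.container == this) {
            return; // already contained here, keep the list unique
        }
        if (child.container != null) {
            child.container.remove(child); // containment is exclusive
        }
        children.add(child);
        child.container = this;
    }

    /** Removes the child; this is still a linear scan, only contains() is constant-time. */
    void remove(Child child) {
        if (child.container != this) {
            return;
        }
        children.remove(child);
        child.container = null;
    }

    /** Constant-time membership test: inspect the candidate, not the list. */
    boolean contains(Child child) {
        return child != null && child.container == this;
    }

    int size() {
        return children.size();
    }
}
```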


Ed Merks
Professional Support: https://www.macromodeling.com/
Re: Large containment reference performance [message #1847821 is a reply to message #1847813] Tue, 09 November 2021 10:38
Denis Nikiforov
Thanks for the answers!

I've got it, the main source of performance problems is this method in Sirius:
    private boolean analyseNotifications(ResourceSetChangeEvent event,
            DRepresentation currentRep, List<DRepresentationElement> keptNotifiedElements) {
        boolean elementsToSelectUpdated = false;
        Collection<EObject> attachedEObjects = null;
        for (Notification n : event.getNotifications()) {
            // ...

It processes each of the thousands of model change notifications separately, creates lists, calls the contains() method, and so on. The problem is not that contains() is slow, but that it is called so many times.
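The general pattern, and one common way around it, can be sketched as follows. This is not Sirius code: `NotificationBatching` and both method names are hypothetical, and the "notified" and "displayed" lists merely stand in for whatever the real method compares. The first variant does a linear contains() per notification, while the second builds a hash index once for the whole batch.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper contrasting per-notification scans with one indexed pass. */
final class NotificationBatching {

    /** Per-notification handling: linear contains() scans for every notified element. */
    static <T> List<T> keepDisplayedNaive(List<T> notified, List<T> displayed) {
        List<T> kept = new ArrayList<>();
        for (T element : notified) {
            if (displayed.contains(element) && !kept.contains(element)) {
                kept.add(element);
            }
        }
        return kept;
    }

    /** Batched handling: index the displayed elements once, then use hash lookups. */
    static <T> List<T> keepDisplayedBatched(List<T> notified, List<T> displayed) {
        Set<T> displayedIndex = new HashSet<>(displayed);
        Set<T> seen = new HashSet<>();
        List<T> kept = new ArrayList<>();
        for (T element : notified) {
            // seen.add() returns false for repeats, so each element is kept once
            if (displayedIndex.contains(element) && seen.add(element)) {
                kept.add(element);
            }
        }
        return kept;
    }
}
```

Both variants return the same elements in the same order. With n notifications and m displayed elements, the naive version does on the order of n × m comparisons, while the batched one does about n + m hash operations.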
Re: Large containment reference performance [message #1848041 is a reply to message #1847821] Thu, 18 November 2021 07:55
Pierre-Charles David
For the record, Denis has created https://bugs.eclipse.org/bugs/show_bug.cgi?id=577165 on Sirius to track this.

Pierre-Charles David - Obeo

Need training or professional services for Sirius?
http://www.obeodesigner.com/sirius