Some questions regarding performance [message #1833637]
Tue, 20 October 2020 08:26
Hans van der Laan (Member; Messages: 34; Registered: February 2020)
Hey,
I have a few questions regarding performance. I thought it would be easier to group them into one thread; however, if people prefer, I could create separate threads for each of them.
Are match sets of patterns which are initialized but which are not being used by other patterns and which have no match update listeners attached fully updated upon a model change? Or are they ignored? Does this hold both for the Rete engine and the local search engine?
Is there an easy way to see how much time is spent computing/updating the match sets of individual patterns upon initialization/after a model change? So that we know how much time is spent dealing with pattern A, dealing with pattern B, etc.
If pattern A uses pattern B and the model is modified such that pattern B has many added/removed matches, do both engines wait until the match set of B has stabilized before updating the match set of pattern A? Is it possible to enable this? E.g. something like a priority/delayUpdatePropagation, but for propagating matches between patterns?
Regarding setting parameters as incoming/outgoing: what exactly does it mean for a parameter to be "bound when the pattern matcher initializes"? Does this increase performance?
We get the warning "The pattern body contains constraints which are only loosely connected. This may negatively impact performance. The weakly dependent partitions are: [......]". However, we know the partitions are connected through another pattern used in an aggregator. Can we add this information somehow?
For example, I have the following pattern:
pattern RSD(role: Role, scenario: java Scenario, demarcation: Demarcation) {
    find Scenarios(scenario);
    find connectedByTemporalGrantRule(role, demarcation);
    maxEnabledPriority == max find EnabledPriority(scenario, role, demarcation, #p);
    maxDisabledPriority == max find DisabledPriority(scenario, role, demarcation, #p2);
    check(maxEnabledPriority > maxDisabledPriority);
} or {
    find Scenarios(scenario);
    find connectedByTemporalGrantRule(role, demarcation);
    find EnabledPriority(scenario, role, demarcation, _);
    neg find DisabledPriority(scenario, role, demarcation, _);
}
Here, we get the warning "The pattern body contains constraints which are only loosely connected. This may negatively impact performance. The weakly dependent partitions are: [maxEnabledPriority][role, demarcation][scenario][maxDisabledPriority]". However, we know that maxEnabledPriority and maxDisabledPriority are computed for given [role, demarcation, scenario] tuples in the EnabledPriority/DisabledPriority patterns. Can we add this information somehow?
Kind regards,
Hans
[Updated on: Wed, 21 October 2020 06:21]
Re: Some questions regarding performance [message #1833665 is a reply to message #1833637]
Tue, 20 October 2020 18:47
Zoltan Ujhelyi (Senior Member; Messages: 392; Registered: July 2015)
Hi Hans,
It is totally fine to group these questions together; they do indeed seem related to each other.
Quote:Are match sets of patterns which are initialized but which are not being used by other patterns and which have no match update listeners attached fully updated upon a model change? Or are they ignored? Does this hold both for the Rete engine and the local search engine?
There are three separate cases to consider here:
1. If the Rete network is initialized for a pattern, it always listens to model changes and updates its indexes. This means both the match sets and partial results might be updated on model changes.
2. VIATRA uses a component called the Base indexer that is responsible for enumerating instances of some types and edges. This is used by the Rete algorithm to calculate its input nodes. Furthermore, the local search algorithm can also use this index to make some operations execute faster, e.g. when traversing edges backwards (if opposites are not set up in the metamodel) or when looking for a good starting point for a given search. This can be turned off with an appropriate option if necessary (but some operations become slower).
3. Finally, local search itself does not cache anything, so there is nothing to update on model changes there.
Quote:Is there an easy way to see how much time is spent computing/updating the match sets of individual patterns upon initialization/after a model change? So that we know how much time is spent dealing with pattern A, dealing with pattern B, etc.
There is no simple way: the Rete network is global, e.g. some nodes may be reused by multiple patterns. If you want to get an idea of which pattern is problematic, the only way is to initialize only the pattern in question and then execute the model modifications.
Quote:If pattern A uses pattern B and the model is modified such that pattern B has many added/removed matches, do both engines wait until the match set of B has stabilized before updating the match set of pattern A? Is it possible to enable this? E.g. something like a priority/delayUpdatePropagation, but for propagating matches between patterns?
Not really. We never had a case where this seemed important.
Quote:Regarding setting parameters as incoming/outgoing: what exactly does it mean for a parameter to be "bound when the pattern matcher initializes"? Does this increase performance?
If a parameter is incoming, the pattern must only be called with that parameter bound to a given model element. On the other hand, an outgoing parameter must not be bound when the pattern is called. A parameter not marked as incoming or outgoing can be used both ways.
The Rete algorithm basically ignores these parameter directions, as it always calculates all matches (regardless of input parameter binding) and then filters the end results. This means binding the parameters has no real effect on the performance of pattern matching.
On the other hand, the best search plan depends very much on whether a parameter is bound or not, and the fact that a selected parameter is always bound may result in some types not getting indexed when preparing the pattern. Another case where the parameter direction affects performance is when a pattern is prepared: at that point it is not known which bindings will be necessary later, so by default all possible search plans are calculated. This is not an issue when a pattern has only a few parameters (3-5), but for patterns with a large number of parameters it is highly recommended to give the engine a hint about which parameter bindings are important, to avoid creating hundreds of unnecessary plans.
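For illustration, a minimal sketch of how such a direction could be declared in VQL (the pattern name grantedDemarcation is made up for this example; it reuses your connectedByTemporalGrantRule pattern and assumes the in/out parameter keywords):
pattern grantedDemarcation(in role: Role, out demarcation: Demarcation) {
    // 'role' is assumed to be always bound by the caller, 'demarcation' never;
    // only search plans for this binding need to be prepared.
    find connectedByTemporalGrantRule(role, demarcation);
}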
Quote:We get the warning "The pattern body contains constraints which are only loosely connected. This may negatively impact performance. The weakly dependent partitions are: [......]". However, we know the partitions are connected through another pattern used in an aggregator. Can we add this information somehow?
For example, I have the following pattern:
pattern RSD(role: Role, scenario: java Scenario, demarcation: Demarcation) {
    find Scenarios(scenario);
    find connectedByTemporalGrantRule(role, demarcation);
    maxEnabledPriority == max find EnabledPriority(scenario, role, demarcation, #p);
    maxDisabledPriority == max find DisabledPriority(scenario, role, demarcation, #p2);
    check(maxEnabledPriority > maxDisabledPriority);
} or {
    find Scenarios(scenario);
    find connectedByTemporalGrantRule(role, demarcation);
    find EnabledPriority(scenario, role, demarcation, _);
    neg find DisabledPriority(scenario, role, demarcation, _);
}
Here, we get the warning "The pattern body contains constraints which are only loosely connected. This may negatively impact performance. The weakly dependent partitions are: [maxEnabledPriority][role, demarcation][scenario][maxDisabledPriority]". However, we know that maxEnabledPriority and maxDisabledPriority are computed for given [role, demarcation, scenario] tuples in the EnabledPriority/DisabledPriority patterns. Can we add this information somehow?
To be entirely honest, this is one of the trickiest aspects of the VQL language, where it is very easy to make mistakes. What is important to note here is that the engine cannot say for sure which combinations of scenario/role/demarcation it needs to check to calculate the aggregated value. The easiest way to understand this is to consider negation: if you say `neg find Scenarios(scenario)`, it is fulfilled by every model element that is not a Scenario, and even worse, by every attribute value (e.g. the number 2 is not a Scenario); but if you have a given model element (or data value), you can always check whether it holds. In other words, negation is not enumerable.
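As a small illustration (the pattern name is made up, reusing patterns from your post): in the body below, the positive constraints bind scenario, role and demarcation first, and the non-enumerable negation is then only checked for those bound values; it could never be used to enumerate candidates on its own.
pattern roleWithoutDisabledPriority(scenario: java Scenario, role: Role, demarcation: Demarcation) {
    // Enumerable, positive constraints bind the variables...
    find Scenarios(scenario);
    find connectedByTemporalGrantRule(role, demarcation);
    // ...so the negation only has to check already bound values.
    neg find DisabledPriority(scenario, role, demarcation, _);
}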
If you consider that the previous constraint is equivalent to `0 == count find Scenarios(scenario)`, it is easy to see that in general aggregations are not enumerable either: the engine has to consider all combinations of scenario/role/demarcation not disallowed by the other constraints and evaluate the aggregate for each of them. It might be that some aggregators are actually free from this issue, but we did not have the time to figure out which cases those are and how to make sure the engine handles them correctly, so we opted for the safe approach and declared all aggregators as non-enumerable constraints.
On the other hand, if you are only interested in tuples that have some connection to each other, you are free to declare this connection as an additional constraint in your pattern body. The somewhat counterintuitive aspect of VIATRA is that adding extra constraints may even improve performance as those extra constraints can greatly reduce the number of objects to consider. Here you could say you are interested only in scenario/role/demarcation sets that have an EnabledPriority by adding a call to that pattern without the aggregator. This would reduce the number of elements the engine has to check, and would also remove the warning.
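For instance, a sketch of this idea applied to your RSD pattern (the second body is unchanged); the only addition is the plain, non-aggregated call to EnabledPriority in the first body:
pattern RSD(role: Role, scenario: java Scenario, demarcation: Demarcation) {
    find Scenarios(scenario);
    find connectedByTemporalGrantRule(role, demarcation);
    // Extra enumerable constraint: only consider scenario/role/demarcation tuples
    // that have at least one EnabledPriority match; this connects the partitions.
    find EnabledPriority(scenario, role, demarcation, _);
    maxEnabledPriority == max find EnabledPriority(scenario, role, demarcation, #p);
    maxDisabledPriority == max find DisabledPriority(scenario, role, demarcation, #p2);
    check(maxEnabledPriority > maxDisabledPriority);
} or {
    find Scenarios(scenario);
    find connectedByTemporalGrantRule(role, demarcation);
    find EnabledPriority(scenario, role, demarcation, _);
    neg find DisabledPriority(scenario, role, demarcation, _);
}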
Hope this was understandable; if not, feel free to ask for clarification.
Best regards,
Zoltán
Re: Some questions regarding performance [message #1833752 is a reply to message #1833665]
Fri, 23 October 2020 09:51
Hans van der Laan (Member; Messages: 34; Registered: February 2020)
Hey Zoltán,
Thanks for the quick reply and the detailed responses! Besides clarifying a lot of questions, it also helped me squeeze a bit more performance out of the engine over the last couple of days.
Quote:Not really. We never had a case where this seemed important.
We have some performance problems when deleting elements from our model; I thought this might help there.
In one part of our model, roles are assigned groups of permissions. Which roles are assigned which groups of permissions, and when, is inferred by a set of rules, e.g. rules like: Grant/Revoke [Permission Group X] to [Role X] during [Time X] with [Priority X].
However, when we remove an element, not all references to that element are removed (see [1]). Thus, when we remove a role, all rules referring to that role have to be removed manually. Usually this is very fast. However, we have one role whose removal takes almost as long as reinitializing the whole Rete engine, and one role whose removal takes twice as long as reinitializing the whole Rete engine (~27s, with DRED and without any recursion).
(Please note, these results are not final. Demarcations is the name we chose for permission groups.)
Quote:The somewhat counterintuitive aspect of VIATRA is that adding extra constraints may even improve performance as those extra constraints can greatly reduce the number of objects to consider.
When testing your suggestion on this specific pattern, the network initialization time did not decrease. Generally, I use the initialization time to check whether a change has made a pattern more/less efficient. Can it be the case that a pattern becomes more/less efficient to recheck without this being reflected in the initialization time? If so, what is an easy way to detect this? (So that I do not have to rerun all my benchmark tests for every small change.)
Lastly, I also have a more open-ended question. At the moment our results are quite good; I'm not fully done with the evaluation, but in general the tool I've built on top of VIATRA is quite fast (thanks to VIATRA!). However, so far I have found two exceptions. One is the case I presented just above. The other case is when I take reachability into account. In my model, permissions give access to zones, and I compute reachability recursively.
To illustrate, removing various elements now takes much more time. Also, the time to initialize the engine went up from ~27s to ~140s (with DRED).
I originally planned to just present these results and leave this as future work. However, since it is just one pattern that is drastically decreasing performance, and since computing conditional reachability sounds like a common use case, I was wondering whether you might know a way to rewrite the pattern and/or another way to compute conditional reachability with graph patterns. If not, I'm just going to leave it as future work; I figured asking never hurts.
The pattern computing reachability currently looks like this. I've refactored SecurityZoneAccessibleIntermediate1 and SecurityZoneAccessibleIntermediate2 out of SecurityZoneAccessible to increase performance (based on our discussions in [2]).
pattern SecurityZoneAccessible(user: User, scenario: java Scenario, zone: SecurityZone) {
    SecurityZone.public(zone, true);
    find SecurityZoneAccessStatus(scenario, zone, 0);
    User(user);
} or {
    find USO(user, scenario, zone);
    SecurityZone.public(zone, true);
    find SecurityZoneAccessStatus(scenario, zone, 1);
} or {
    find SecurityZoneAccessible(user, scenario, prev);
    find SecurityZoneAccessibleIntermediate1(scenario, prev, zone);
} or {
    find SecurityZoneAccessible(user, scenario, prev);
    find SecurityZoneAccessibleIntermediate2(user, scenario, prev, zone);
}

pattern SecurityZoneAccessibleIntermediate1(scenario: java Scenario, prev: SecurityZone, zone: SecurityZone) {
    find SecurityZoneAccessStatus(scenario, zone, 0);
    SecurityZone.reachable(prev, zone);
}

pattern SecurityZoneAccessibleIntermediate2(user: User, scenario: java Scenario, prev: SecurityZone, zone: SecurityZone) {
    SecurityZone.reachable(prev, zone);
    find SecurityZoneAccessStatus(scenario, zone, 1);
    find USO(user, scenario, zone);
}
Kind regards,
Hans
[EDIT: apparently smileys remove all the text after the smiley.]
[1] https://bugs.eclipse.org/bugs/show_bug.cgi?id=567920
[2] https://www.eclipse.org/forums/index.php/t/1105472/
[Updated on: Fri, 23 October 2020 09:58]