OCL query from file [message #1803239]
Mon, 25 February 2019 10:43
Eclipse User
Hi,
I was wondering whether there is a way to execute an OCL query (or a series of queries) from an OCL document (i.e. Complete OCL) but without using invariants. I am trying to replicate an EOL script in OCL, but from my understanding OCL does not provide any top-level construct for executing pure queries (e.g. a "main" method).
If I'm not mistaken, the API provides a way to execute queries from code by passing a String, but since my query refers to some other declared operations I'm not sure whether that would work.
I have attached the script as an example. Ideally I want to print the value of "result", but since it's an operation (and there is no way of declaring a "let" outside of an "inv") there is no way to force the execution.
To be clear, my intention is to use this to benchmark performance: although ModelElementType.allInstances()->select(...) can be rewritten as an inv, I want to test the performance of the select operation itself.
I suspect I would have to write my own parser, or extend an existing one, to be able to do this?
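For reference, the String-based API I have in mind is roughly the following (a minimal sketch against the Pivot OCL API; whether such a query can see the operations declared in a Complete OCL document is exactly what I'm unsure about):

import org.eclipse.emf.ecore.EObject;
import org.eclipse.ocl.pivot.ExpressionInOCL;
import org.eclipse.ocl.pivot.utilities.OCL;

// Sketch: parse an OCL expression from a String and evaluate it on a model element.
static Object queryFromString(EObject context) throws Exception {
    OCL ocl = OCL.newInstance();
    ExpressionInOCL query = ocl.createQuery(context.eClass(), "self.oclType().name");
    Object result = ocl.evaluate(context, query);
    ocl.dispose();
    return result;
}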
Thanks,
Sina
Re: OCL query from file [message #1803352 is a reply to message #1803348]
Wed, 27 February 2019 04:45
Eclipse User
Hi
First problem: you did not emulate the example, in which EXTLibraryPackage.Literals.LIBRARY, a type, is passed to createQuery, and then "library", an instance of that type, is passed to evaluate. You passed an instance to both, causing the parser to fail miserably.
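In other words, the pairing should look roughly like this (a sketch; "library" and the "books->size()" query body are illustrative):

ExpressionInOCL query = ocl.createQuery(EXTLibraryPackage.Literals.LIBRARY, "books->size()");  // a type
Object count = ocl.evaluate(library, query);                                                   // an instance of that type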
Next problem: metamodel schizophrenia. Your movies model conforms to http://movies/1.0, whereas your OCL complements movies.ecore, so EMF evaluation crashes. Change your OCL to http://movies/1.0.
The following is a bit simpler and gives the answer "8".
// Imports (plus an import for your generated MoviesPackage):
import org.eclipse.emf.common.util.URI;
import org.eclipse.emf.ecore.EObject;
import org.eclipse.emf.ecore.resource.Resource;
import org.eclipse.emf.ecore.resource.ResourceSet;
import org.eclipse.ocl.pivot.ExpressionInOCL;
import org.eclipse.ocl.pivot.resource.ASResource;
import org.eclipse.ocl.pivot.utilities.OCL;
import org.eclipse.ocl.xtext.completeocl.CompleteOCLStandaloneSetup;

public static void main(String... args) throws Exception {
    CompleteOCLStandaloneSetup.doSetup();
    OCL ocl = OCL.newInstance();
    ResourceSet resourceSet = ocl.getResourceSet();
    resourceSet.getPackageRegistry().put(MoviesPackage.eNS_URI, MoviesPackage.eINSTANCE);
    // Model
    String modelFilePath = "model/imdb-small.xmi";
    Resource modelResource = resourceSet.getResource(URI.createURI(modelFilePath), true);
    // Script: parsing the Complete OCL document registers its declared
    // operations (e.g. areCoupleCoactors) for use in subsequent queries.
    String scriptFilePath = "model/script.ocl";
    ASResource scriptResource = ocl.parse(URI.createURI(scriptFilePath));
    // Query: a type goes to createQuery, an instance of it to evaluate.
    EObject contextElement = modelResource.getContents().get(0);
    ExpressionInOCL asQuery = ocl.createQuery(contextElement.eClass(),
        "Person.allInstances()->select(a | a.coactors->exists(areCoupleCoactors(a)))->size()");
    System.out.println(ocl.evaluate(contextElement, asQuery));
    ocl.dispose();
}
Regards
Ed Willink
Re: OCL query from file [message #1803396 is a reply to message #1803372]
Wed, 27 February 2019 18:39
Eclipse User
Hi Ed,
Apologies for the long post. This is possibly a long shot / a problem on my end, but I thought I'd share on the off chance that it's an issue with OCL.
I just ran a preliminary benchmark as a proof of concept (with Eclipse running, however) and found that with a 500k-element model, the call to ocl.dispose() takes a whopping 2 minutes. I don't actually need this call, since the application terminates after executing the query, but I included it out of curiosity. In my profiling I also call System.gc() both before and after timing the query and the dispose; even taking the GC pause time into account (5 seconds for the entirety of the run), I wonder whether this is some sort of bug or within the realm of reasonable expectation. I am happy to provide a repro if required. The observed output under Java 8 is as follows (my timing harness is sketched at the end of this post):
Windows 10
Java HotSpot(TM) 64-Bit Server VM 25.192-b12
Intel(R) Core(TM) i5-3470 CPU @ 3.20GHz
Logical processors: 4
Xms: 1963 MB
Xmx: 7282 MB
Starting execution at 27-Feb-2019 19:50:47
-----------------------------------------------------
Profiled processes:
Prepare model: 04.308 (4308 ms)
Prepare validator: 722 (722 ms)
Parse script: 818 (818 ms)
execute query: 12:14.661 (734661 ms)
GARBAGE_COLLECTION: 05.068 (5068 ms)
dispose: 02:00.388 (120388 ms)
Finished at 27-Feb-2019 20:05:13
-----------------------------------------------------
Result: 11718
-----------------------------------------------------
Using Java 11 with G1 and MaxGCPauseMillis=500:
Java HotSpot(TM) 64-Bit Server VM 11.0.1+13-LTS
Xms: 2048 MB
Xmx: 8192 MB
Starting execution at 27 Feb 2019, 20:59:34
-----------------------------------------------------
Profiled processes:
Prepare model: 04.779 (4779 ms)
Prepare validator: 986 (986 ms)
Parse script: 778 (778 ms)
execute query: 14:28.021 (868021 ms)
GARBAGE_COLLECTION: 02.193 (2193 ms)
dispose: 02:00.412 (120412 ms)
Finished at 27 Feb 2019, 21:16:11
-----------------------------------------------------
Result: 11718
-----------------------------------------------------
I noticed that there was a period of high (90%) CPU usage and 8 GB memory usage from the process. The CPU usage can only be explained by garbage collection, since this single-threaded application itself only uses 30% under normal operation. I tried this with Epsilon (single-threaded), which took about 10 minutes to execute and encountered no such issues, so I conclude that the excess memory usage must be in dispose(). For reference, the application seems to use about 4.2 - 4.5 GB of memory during normal execution with OCL (similar to Epsilon). To be sure, I gave it the maximum memory with the same GC arguments. This is what came out:
Xms: 2048 MB
Xmx: 12288 MB
Starting execution at 27 Feb 2019, 23:02:19
-----------------------------------------------------
Java HotSpot(TM) 64-Bit Server VM warning: INFO: os::commit_memory(0x00000007b4a00000, 253755392, 0) failed; error='The paging file is too small for this operation to complete' (DOS error/errno=1455)
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (mmap) failed to map 253755392 bytes for Failed to commit area from 0x00000007b4a00000 to 0x00000007c3c00000 of length 253755392.
I tried omitting the call to dispose, but again ran out of memory. This never happens with Epsilon using the same model and query, so I suspect there must be something wrong with either my standalone OCL setup or perhaps some obscure bug (possibly even a memory leak).
I have tried with a smaller (100k) model and the dispose time is close to 6 seconds (where query time is around 2 min 41 s). Since I ran this with Oracle JDK 8, the default GC is the parallel one, so System.gc() calls are stop-the-world (at least in my experience). Here is the output with the 200k model:
Java HotSpot(TM) 64-Bit Server VM 25.192-b12
Prepare model: 02.374 (2374 ms)
Prepare validator: 720 (720 ms)
Parse script: 841 (841 ms)
execute query: 04:40.535 (280535 ms)
GARBAGE_COLLECTION: 02.894 (2894 ms)
dispose: 19.750 (19750 ms)
-----------------------------------------------------
Result: 4674
-----------------------------------------------------
I ran the 200k model again with Java 11 using ParallelOldGC, with the following results:
Java HotSpot(TM) 64-Bit Server VM 11.0.1+13-LTS
-----------------------------------------------------
Profiled processes:
Prepare model: 02.039 (2039 ms)
Prepare validator: 669 (669 ms)
Parse script: 820 (820 ms)
execute query: 04:38.066 (278066 ms)
GARBAGE_COLLECTION: 01.906 (1906 ms)
dispose: 18.897 (18897 ms)
-----------------------------------------------------
I ran with Java 11 again, using G1 and MaxGCPauseMillis=500 with the following results:
Prepare model: 02.456 (2456 ms)
Prepare validator: 804 (804 ms)
Parse script: 847 (847 ms)
execute query: 04:53.900 (293900 ms)
GARBAGE_COLLECTION: 831 (831 ms)
dispose: 19.059 (19059 ms)
-----------------------------------------------------
Same again without the call to ocl.dispose():
Starting execution at 27 Feb 2019, 23:20:13
-----------------------------------------------------
Profiled processes:
Prepare model: 02.840 (2840 ms)
GARBAGE_COLLECTION: 982 (982 ms)
execute query: 04:50.155 (290155 ms)
Finished at 27 Feb 2019, 23:25:09
-----------------------------------------------------
What's strange is how dispose() manages to crawl to a finish when Xmx is 8 GB or less under Java 8, yet for some reason under Java 11 with G1 and 12 GB the application runs out of memory despite running exactly the same program on the same inputs! Perhaps this is a HotSpot issue, but I thought I'd share my experience just in case there is an issue with OCL. I will test further and report back with a repro if need be.
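For completeness, the timing harness behind the "Profiled processes" output boils down to something like this (a sketch; loadModel() and parseScript() are hypothetical stand-ins for my real code):

long t0 = System.nanoTime();
Resource model = loadModel();                         // "Prepare model"
long t1 = System.nanoTime();
ExpressionInOCL query = parseScript();                // "Parse script"
long t2 = System.nanoTime();
Object result = ocl.evaluate(contextElement, query);  // "execute query"
long t3 = System.nanoTime();
System.gc();                                          // "GARBAGE_COLLECTION"
long t4 = System.nanoTime();
ocl.dispose();                                        // "dispose"
long t5 = System.nanoTime();
System.out.println("execute query: " + (t3 - t2) / 1_000_000 + " ms");
System.out.println("dispose: " + (t5 - t4) / 1_000_000 + " ms");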
Thanks,
Sina
Re: OCL query from file [message #1803423 is a reply to message #1803419]
Thu, 28 February 2019 09:44
Eclipse User
I have also tested on a different machine running Fedora 29 with 32 GB of memory, setting Xmx25g, MaxGCPauseMillis=650, G1 and OpenJDK 11.0.2, but dispose() still took 1 min 41 s. I tried again with the 500k model, and over time the memory usage of the application ballooned to 13 GB using ParallelGC instead of G1. I saw spikes in CPU usage followed by memory increases; I can verify these are GCs, as I launched VisualVM to confirm.
Looking at the actual live objects and bytes, nothing seemed out of the ordinary: mostly int[] and byte[] as well as String, which is perfectly normal. Though I was surprised to see that EcoreUtil$4 had more objects than DynamicEObjectImpl (which, as expected, was 500k). There were also over a million EcoreUtil$ProperContentIterator objects.
Anyway, what I found really surprising is the discrepancy between G1 and ParallelGC even though I was using the same Java version and Xmx: ParallelGC seems to be about a minute slower than G1 to dispose. Here's the output with ParallelGC:
Linux 4.20.11-200.fc29.x86_64
OpenJDK 64-Bit Server VM 11.0.2+7
AMD Ryzen Threadripper 1950X 16-Core Processor
Logical processors: 32
Xms: 481 MB
Xmx: 22756 MB
Starting execution at 28 Feb 2019, 14:19:41
-----------------------------------------------------
Profiled processes:
Prepare model: 04.259 (4259 ms)
setup: 343 (343 ms)
Parse script: 543 (543 ms)
Check for query: 112 (112 ms)
execute query: 13:40.331 (820331 ms)
GARBAGE_COLLECTION: 06.040 (6040 ms)
dispose: 02:42.391 (162391 ms)
Finished at 28 Feb 2019, 14:36:15
-----------------------------------------------------
Result: 11718
-----------------------------------------------------
How can a model which is less than 38 MB on disk with only 2 types (Actor and Movie) end up consuming so much memory?
Re: OCL query from file [message #1803432 is a reply to message #1803423]
Thu, 28 February 2019 10:48
Eclipse User
Hi
I wouldn't pay too much attention to GC variation. Hopefully Oracle have done a plausible job; they have certainly invested much more time developing and testing it, and of course they have far more users than EMF.
Your results clearly show something rubbish in EMF / OCL / ... Once the EMF / OCL side is respectable, it may then be worth contrasting the good / bad characteristics of GC approaches.
From Fig 16 of http://www.eclipse.org/mmt/qvt/docs/ICMT2017/MicromappingMoC.pdf we get a limit of about 10,000,000 model elements in a 4 GB 64-bit VM, which for a transformation may be about an input + middle + output object per element, suggesting 133 bytes per EObject. Actually there may be a proportionate overhead of HashMap$Node to manage test harness / shutdown Sets/Maps; these are 44 bytes at a time. On a 64-bit VM, any unpacked boolean takes 8 bytes. EMF was developed in 16/32-bit days, for which inefficient booleans inspired an EFlags packing option. I have suggested to Dimitris that a new packing capability for 64-bit days could be an interesting research topic; no object identity needs more than 32 bits ... When you see how expensive 64 bits is, and how gratuitous some allocations are, the memory vanishes quite quickly.
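As a back-of-envelope check of those numbers (plain arithmetic on the figures above, nothing measured):

long heapBytes       = 4_000_000_000L;  // ~4 GB VM
long modelElements   = 10_000_000L;     // the Fig 16 limit
long objectsPerElem  = 3L;              // input + middle + output
long bytesPerEObject = heapBytes / (modelElements * objectsPerElem);  // ~133
// One 44-byte HashMap$Node per tracked object plus a few 8-byte booleans
// and most of that 133-byte budget is gone before any real data is stored.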
EMF EList features are particularly expensive. An empty list needs at least 48 bytes for its fields and 8 for its existence.
public @NonNull List<CollectionLiteralPart> getOwnedParts()
{
    if (ownedParts == null)
    {
        ownedParts = new EObjectContainmentEList<CollectionLiteralPart>(CollectionLiteralPart.class, this, 10);
    }
    return ownedParts;
}
I suggested a basicGet capability so that all-feature traversal algorithms could avoid creating a small list just in order to traverse its empty content. Once created, these lists are permanent bloat. Your Movie and Person each have an EList, and ocl.dispose() does at least one traversal of everything.
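The suggested accessor would amount to something like this (a sketch of the idea, not actual generated EMF code):

// Return the raw field so that a traversal can test for null/empty without
// forcing an empty EObjectContainmentEList into existence as getOwnedParts() does.
public List<CollectionLiteralPart> basicGetOwnedParts()
{
    return ownedParts;  // may be null; the caller treats null as "no parts"
}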
The total size of names can be surprising. For the /org.eclipse.qvtd.doc.bigmde2016.tests/src/org/eclipse/qvtd/doc/bigmde2016/tests/FamiliesGenerator.java test I used artificial short prefixed numeric names.
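For instance (a hypothetical generator loop; createFamily() and setName() stand in for the generated API):

for (int i = 0; i < familyCount; i++) {
    Family family = factory.createFamily();
    family.setName("f" + i);  // short prefixed numeric name rather than realistic text
}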
EcoreUtil$ProperContentIterator is worrying; it should be one per algorithm invocation. It may be that you have messed up / upset EMF by having totally flat models, giving each ProperContentIterator nothing to do.
Once you start looking at these size / performance issues, there is a huge amount to find. Quite interesting if you're that way inclined, but also time consuming. For OCL, I have made some, but not nearly enough, efforts to improve normal execution; ocl.dispose() is not normal execution, so I have only fixed leaks and StackOverflows therein.
Regards
Ed Willink