OCL query from file [message #1803239]
Mon, 25 February 2019 10:43
Eclipse User
Hi,
I was wondering whether there is a way to execute an OCL query (or a series of queries) from an OCL document (i.e. Complete OCL) but without using invariants. I am trying to replicate an EOL script in OCL, but from my understanding OCL does not provide any top-level construct for executing pure queries (e.g. a "main" method).
If I'm not mistaken, the API provides a way to execute queries from code by passing a String, but since my query refers to some other declared operations I'm not sure whether that would work.
I have attached the script as an example. Ideally I want to print the value of "result", but since it's an operation (and there is no way of declaring a "let" outside of an "inv") there is no way to force the execution.
To be clear, my intention is to use this to benchmark performance: although ModelElementType.allInstances()->select(...) can be rewritten as an inv, I want to test the performance of the select operation itself.
I suspect I would have to write my own parser, or extend an existing one, to be able to do this?
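For reference, the String-based API I have in mind is roughly the following (a minimal sketch against the Pivot OCL API; whether such a query can see the operations declared in a Complete OCL document is exactly what I'm unsure about):

import org.eclipse.emf.ecore.EObject;
import org.eclipse.ocl.pivot.ExpressionInOCL;
import org.eclipse.ocl.pivot.utilities.OCL;

// Sketch: parse an OCL expression from a String and evaluate it on a model element.
static Object queryFromString(EObject context) throws Exception {
    OCL ocl = OCL.newInstance();
    ExpressionInOCL query = ocl.createQuery(context.eClass(), "self.oclType().name");
    Object result = ocl.evaluate(context, query);
    ocl.dispose();
    return result;
}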
Thanks,
Sina
Re: OCL query from file [message #1803352 is a reply to message #1803348]
Wed, 27 February 2019 04:45
Eclipse User
Hi
First problem: you did not emulate the example, in which EXTLibraryPackage.Literals.LIBRARY, a type, is passed to createQuery, and then "library", an instance of that type, is passed to evaluate. You passed an instance to both, causing the parser to fail miserably.
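In other words, the pairing should look roughly like this (a sketch; "library" and the "books->size()" query body are illustrative):

ExpressionInOCL query = ocl.createQuery(EXTLibraryPackage.Literals.LIBRARY, "books->size()");  // a type
Object count = ocl.evaluate(library, query);                                                   // an instance of that type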
Next problem: metamodel schizophrenia. Your movies model conforms to http://movies/1.0, whereas your OCL complements movies.ecore, so EMF evaluation crashes. Change your OCL to http://movies/1.0.
The following is a bit simpler and gives the answer "8".
// Imports (plus an import for your generated MoviesPackage):
import org.eclipse.emf.common.util.URI;
import org.eclipse.emf.ecore.EObject;
import org.eclipse.emf.ecore.resource.Resource;
import org.eclipse.emf.ecore.resource.ResourceSet;
import org.eclipse.ocl.pivot.ExpressionInOCL;
import org.eclipse.ocl.pivot.resource.ASResource;
import org.eclipse.ocl.pivot.utilities.OCL;
import org.eclipse.ocl.xtext.completeocl.CompleteOCLStandaloneSetup;

public static void main(String... args) throws Exception {
    CompleteOCLStandaloneSetup.doSetup();
    OCL ocl = OCL.newInstance();
    ResourceSet resourceSet = ocl.getResourceSet();
    resourceSet.getPackageRegistry().put(MoviesPackage.eNS_URI, MoviesPackage.eINSTANCE);
    // Model
    String modelFilePath = "model/imdb-small.xmi";
    Resource modelResource = resourceSet.getResource(URI.createURI(modelFilePath), true);
    // Script: parsing the Complete OCL document registers its declared
    // operations (e.g. areCoupleCoactors) for use in subsequent queries.
    String scriptFilePath = "model/script.ocl";
    ASResource scriptResource = ocl.parse(URI.createURI(scriptFilePath));
    // Query: a type goes to createQuery, an instance of it to evaluate.
    EObject contextElement = modelResource.getContents().get(0);
    ExpressionInOCL asQuery = ocl.createQuery(contextElement.eClass(),
        "Person.allInstances()->select(a | a.coactors->exists(areCoupleCoactors(a)))->size()");
    System.out.println(ocl.evaluate(contextElement, asQuery));
    ocl.dispose();
}
Regards
Ed Willink
Re: OCL query from file [message #1803396 is a reply to message #1803372]
Wed, 27 February 2019 18:39
Eclipse User
Hi Ed,
Apologies for the long post. This is possibly a long shot / a problem on my end, but I thought I'd share on the off chance that it's an issue with OCL.
I just ran a preliminary benchmark as a proof of concept (with Eclipse running, however) and found that with a 500k-element model, the call to ocl.dispose() takes a whopping 2 minutes. I don't actually need this call, since the application terminates after executing the query, but I included it out of curiosity. In my profiling I also call System.gc() both before and after timing the query and the dispose; even taking the GC pause time into account (5 seconds for the entirety of the run), I wonder whether this is some sort of bug or within the realm of reasonable expectation. I am happy to provide a repro if required. The observed output under Java 8 is as follows (my timing harness is sketched at the end of this post):
Windows 10
Java HotSpot(TM) 64-Bit Server VM 25.192-b12
Intel(R) Core(TM) i5-3470 CPU @ 3.20GHz
Logical processors: 4
Xms: 1963 MB
Xmx: 7282 MB
Starting execution at 27-Feb-2019 19:50:47
-----------------------------------------------------
Profiled processes:
Prepare model: 04.308 (4308 ms)
Prepare validator: 722 (722 ms)
Parse script: 818 (818 ms)
execute query: 12:14.661 (734661 ms)
GARBAGE_COLLECTION: 05.068 (5068 ms)
dispose: 02:00.388 (120388 ms)
Finished at 27-Feb-2019 20:05:13
-----------------------------------------------------
Result: 11718
-----------------------------------------------------
Using Java 11 with G1 and MaxGCPauseMillis=500:
Java HotSpot(TM) 64-Bit Server VM 11.0.1+13-LTS
Xms: 2048 MB
Xmx: 8192 MB
Starting execution at 27 Feb 2019, 20:59:34
-----------------------------------------------------
Profiled processes:
Prepare model: 04.779 (4779 ms)
Prepare validator: 986 (986 ms)
Parse script: 778 (778 ms)
execute query: 14:28.021 (868021 ms)
GARBAGE_COLLECTION: 02.193 (2193 ms)
dispose: 02:00.412 (120412 ms)
Finished at 27 Feb 2019, 21:16:11
-----------------------------------------------------
Result: 11718
-----------------------------------------------------
I noticed that there was a period of high (90%) CPU usage and 8 GB memory usage from the process. The CPU usage can only be explained by garbage collection, since this single-threaded application itself only uses 30% under normal operation. I tried this with Epsilon (single-threaded), which took about 10 minutes to execute and encountered no such issues, so I conclude that the excess memory usage must be in dispose(). For reference, the application seems to use about 4.2 - 4.5 GB of memory during normal execution with OCL (similar to Epsilon). To be sure, I gave it the maximum memory with the same GC arguments. This is what came out:
Xms: 2048 MB
Xmx: 12288 MB
Starting execution at 27 Feb 2019, 23:02:19
-----------------------------------------------------
Java HotSpot(TM) 64-Bit Server VM warning: INFO: os::commit_memory(0x00000007b4a00000, 253755392, 0) failed; error='The paging file is too small for this operation to complete' (DOS error/errno=1455)
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (mmap) failed to map 253755392 bytes for Failed to commit area from 0x00000007b4a00000 to 0x00000007c3c00000 of length 253755392.
I tried omitting the call to dispose, but again ran out of memory. This never happens with Epsilon using the same model and query, so I suspect there must be something wrong with either my standalone OCL setup or perhaps some obscure bug (possibly even a memory leak).
I have tried with a smaller (100k) model and the dispose time is close to 6 seconds (where query time is around 2 min 41 s). Since I ran this with Oracle JDK 8, the default GC is the parallel one, so System.gc() calls are stop-the-world (at least in my experience). Here is the output with the 200k model:
Java HotSpot(TM) 64-Bit Server VM 25.192-b12
Prepare model: 02.374 (2374 ms)
Prepare validator: 720 (720 ms)
Parse script: 841 (841 ms)
execute query: 04:40.535 (280535 ms)
GARBAGE_COLLECTION: 02.894 (2894 ms)
dispose: 19.750 (19750 ms)
-----------------------------------------------------
Result: 4674
-----------------------------------------------------
I ran the 200k model again with Java 11 using ParallelOldGC, with the following results:
Java HotSpot(TM) 64-Bit Server VM 11.0.1+13-LTS
-----------------------------------------------------
Profiled processes:
Prepare model: 02.039 (2039 ms)
Prepare validator: 669 (669 ms)
Parse script: 820 (820 ms)
execute query: 04:38.066 (278066 ms)
GARBAGE_COLLECTION: 01.906 (1906 ms)
dispose: 18.897 (18897 ms)
-----------------------------------------------------
I ran with Java 11 again, using G1 and MaxGCPauseMillis=500 with the following results:
Prepare model: 02.456 (2456 ms)
Prepare validator: 804 (804 ms)
Parse script: 847 (847 ms)
execute query: 04:53.900 (293900 ms)
GARBAGE_COLLECTION: 831 (831 ms)
dispose: 19.059 (19059 ms)
-----------------------------------------------------
Same again without the call to ocl.dispose():
Starting execution at 27 Feb 2019, 23:20:13
-----------------------------------------------------
Profiled processes:
Prepare model: 02.840 (2840 ms)
GARBAGE_COLLECTION: 982 (982 ms)
execute query: 04:50.155 (290155 ms)
Finished at 27 Feb 2019, 23:25:09
-----------------------------------------------------
What's strange is how dispose() manages to crawl to a finish when Xmx is 8 GB or less under Java 8, yet for some reason under Java 11 with G1 and 12 GB the application runs out of memory despite running exactly the same program on the same inputs! Perhaps this is a HotSpot issue, but I thought I'd share my experience just in case there is an issue with OCL. I will test further and report back with a repro if need be.
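For completeness, the timing harness behind the "Profiled processes" output boils down to something like this (a sketch; loadModel() and parseScript() are hypothetical stand-ins for my real code):

long t0 = System.nanoTime();
Resource model = loadModel();                         // "Prepare model"
long t1 = System.nanoTime();
ExpressionInOCL query = parseScript();                // "Parse script"
long t2 = System.nanoTime();
Object result = ocl.evaluate(contextElement, query);  // "execute query"
long t3 = System.nanoTime();
System.gc();                                          // "GARBAGE_COLLECTION"
long t4 = System.nanoTime();
ocl.dispose();                                        // "dispose"
long t5 = System.nanoTime();
System.out.println("execute query: " + (t3 - t2) / 1_000_000 + " ms");
System.out.println("dispose: " + (t5 - t4) / 1_000_000 + " ms");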
Thanks,
Sina
Re: OCL query from file [message #1803423 is a reply to message #1803419]
Thu, 28 February 2019 09:44
Eclipse User
I have also tested on a different machine running Fedora 29 with 32 GB of memory, setting Xmx25g, MaxGCPauseMillis=650, G1 and OpenJDK 11.0.2, but dispose() still took 1 min 41 s. I tried again with the 500k model, and over time the memory usage of the application ballooned to 13 GB using ParallelGC instead of G1. I saw spikes in CPU usage followed by memory increases; I can verify these are GCs, as I launched VisualVM to confirm.
Looking at the actual live objects and bytes, nothing seemed out of the ordinary: mostly int[] and byte[] as well as String, which is perfectly normal. Though I was surprised to see that EcoreUtil$4 had more objects than DynamicEObjectImpl (which, as expected, was 500k). There were also over a million EcoreUtil$ProperContentIterator objects.
Anyway, what I found really surprising is the discrepancy between G1 and ParallelGC even though I was using the same Java version and Xmx: ParallelGC seems to be about a minute slower than G1 to dispose. Here's the output with ParallelGC:
Linux 4.20.11-200.fc29.x86_64
OpenJDK 64-Bit Server VM 11.0.2+7
AMD Ryzen Threadripper 1950X 16-Core Processor
Logical processors: 32
Xms: 481 MB
Xmx: 22756 MB
Starting execution at 28 Feb 2019, 14:19:41
-----------------------------------------------------
Profiled processes:
Prepare model: 04.259 (4259 ms)
setup: 343 (343 ms)
Parse script: 543 (543 ms)
Check for query: 112 (112 ms)
execute query: 13:40.331 (820331 ms)
GARBAGE_COLLECTION: 06.040 (6040 ms)
dispose: 02:42.391 (162391 ms)
Finished at 28 Feb 2019, 14:36:15
-----------------------------------------------------
Result: 11718
-----------------------------------------------------
How can a model which is less than 38 MB on disk with only 2 types (Actor and Movie) end up consuming so much memory?
Re: OCL query from file [message #1803432 is a reply to message #1803423]
Thu, 28 February 2019 10:48
Eclipse User
Hi
I wouldn't pay too much attention to GC variation. Hopefully Oracle have done a plausible job; they have certainly invested much more time developing and testing it, and of course they have far more users than EMF.
Your results clearly show something rubbish in EMF / OCL / ... Once the EMF / OCL side is respectable, it may then be worth contrasting the good / bad characteristics of GC approaches.
From Fig 16 of http://www.eclipse.org/mmt/qvt/docs/ICMT2017/MicromappingMoC.pdf we get a limit of about 10,000,000 model elements in a 4 GB 64-bit VM, which for a transformation may be about an input + middle + output object per element, suggesting 133 bytes per EObject. Actually there may be a proportionate overhead of HashMap$Node to manage test harness / shutdown Sets/Maps; these are 44 bytes at a time. On a 64-bit VM, any unpacked boolean takes 8 bytes. EMF was developed in 16/32-bit days, for which inefficient booleans inspired an EFlags packing option. I have suggested to Dimitris that a new packing capability for 64-bit days could be an interesting research topic; no object identity needs more than 32 bits ... When you see how expensive 64 bits is, and how gratuitous some allocations are, the memory vanishes quite quickly.
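As a back-of-envelope check of those numbers (plain arithmetic on the figures above, nothing measured):

long heapBytes       = 4_000_000_000L;  // ~4 GB VM
long modelElements   = 10_000_000L;     // the Fig 16 limit
long objectsPerElem  = 3L;              // input + middle + output
long bytesPerEObject = heapBytes / (modelElements * objectsPerElem);  // ~133
// One 44-byte HashMap$Node per tracked object plus a few 8-byte booleans
// and most of that 133-byte budget is gone before any real data is stored.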
EMF EList features are particularly expensive. An empty list needs at least 48 bytes for its fields and 8 for its existence.
public @NonNull List<CollectionLiteralPart> getOwnedParts()
{
    if (ownedParts == null)
    {
        ownedParts = new EObjectContainmentEList<CollectionLiteralPart>(CollectionLiteralPart.class, this, 10);
    }
    return ownedParts;
}
I suggested a basicGet capability so that all-feature traversal algorithms could avoid creating a small list just in order to traverse its empty content. Once created, these lists are permanent bloat. Your Movie and Person each have an EList, and ocl.dispose() does at least one traversal of everything.
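The suggested accessor would amount to something like this (a sketch of the idea, not actual generated EMF code):

// Return the raw field so that a traversal can test for null/empty without
// forcing an empty EObjectContainmentEList into existence as getOwnedParts() does.
public List<CollectionLiteralPart> basicGetOwnedParts()
{
    return ownedParts;  // may be null; the caller treats null as "no parts"
}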
The total size of names can be surprising. For the /org.eclipse.qvtd.doc.bigmde2016.tests/src/org/eclipse/qvtd/doc/bigmde2016/tests/FamiliesGenerator.java test I used artificial short prefixed numeric names.
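For instance (a hypothetical generator loop; createFamily() and setName() stand in for the generated API):

for (int i = 0; i < familyCount; i++) {
    Family family = factory.createFamily();
    family.setName("f" + i);  // short prefixed numeric name rather than realistic text
}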
EcoreUtil$ProperContentIterator is worrying; it should be one per algorithm invocation. It may be that you have messed up / upset EMF by having totally flat models, giving each ProperContentIterator nothing to do.
Once you start looking at these size / performance issues, there is a huge amount to find. Quite interesting if you're that way inclined, but also time consuming. For OCL, I have made some, but not nearly enough, efforts to improve normal execution; ocl.dispose() is not normal execution, so I have only fixed leaks and StackOverflows therein.
Regards
Ed Willink