EMF-IncQuery

The project has been created.

The EMF-IncQuery project is a proposed open source project under the Eclipse Modeling Framework Technology (EMFT) Container Project.

This proposal is in the Project Proposal Phase (as defined in the Eclipse Development Process) and is written to declare its intent and scope. We solicit additional participation and input from the Eclipse community. Please send all feedback to the Eclipse Proposals Forum.

Background

The EMF-IncQuery project supports (1) the declarative and reusable specification of queries over EMF models and (2) the efficient search and retrieval of EMF model elements by maintaining an incremental cache of query results. The motivation for EMF-IncQuery is simple: while the EMF API already provides basic functionality, it lacks several features that are crucial from both the functionality and the performance viewpoints. As a result, such features are frequently implemented on an ad-hoc, case-by-case basis in today's EMF-based applications.

From a technical viewpoint, EMF-IncQuery addresses three key challenges:

  • Simple backward getters, which are essential in many modeling applications, are missing from the EMF API.
  • There is no standard way to define complex queries that concern multi-object configurations.
  • Existing EMF query technologies are not specifically optimized for in-memory scalability, and thus have performance issues when used with complex queries and large instance models.

As queries are called frequently in many applications (such as UI back-ends and model transformations), an incremental, online evaluation approach is desirable to provide fast response times even for large models.

EMF-IncQuery provides a light-weight framework built on an adaptation of the RETE algorithm that is well-known from expert systems. The key contributions of EMF-IncQuery are the following:

  • It provides efficient, incremental support for evaluating backward getters over EMF models, aligned with existing types and conventions of EMF's APIs. With the IncQuery API, enumerating the instances of a specific type and navigating backwards along references are instant, O(1) operations with minimal model modification overhead.
  • It features a high performance incremental query engine for complex queries. As a key advantage, the evaluation times are practically independent of the complexity of the query and the size of the model. As a result, EMF-IncQuery can scale up to handling models containing millions of model elements with fast response times, even for very complex queries.
  • It provides a user friendly, declarative graph pattern-based query language and tools for easy definition of queries. The query language is built on Xtext2 and can be integrated into a hierarchy of such DSLs.
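
To make the first contribution more concrete, the following is a minimal sketch of an index that answers "instances of a type" and "who refers to this element" with a single hash lookup. It is an illustration only and does not use EMF or the IncQuery API: the names Node, ModelIndex, nodeAdded, referenceAdded, referenceRemoved, instancesOf and referrersOf are all hypothetical, and the sketch assumes a single kind of reference with at most one reference between any two elements.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a model element; not an EMF or IncQuery type.
final class Node {
    final String type;
    final String name;

    Node(String type, String name) {
        this.type = type;
        this.name = name;
    }

    @Override
    public String toString() {
        return type + ":" + name;
    }
}

// Hypothetical index that is updated on every model change, so that
// lookups never need to traverse the model.
final class ModelIndex {
    private final Map<String, Set<Node>> instancesByType = new HashMap<>();
    private final Map<Node, Set<Node>> referrersByTarget = new HashMap<>();

    // Record a newly added element under its type.
    void nodeAdded(Node n) {
        instancesByType.computeIfAbsent(n.type, k -> new LinkedHashSet<>()).add(n);
    }

    // Record that 'source' now refers to 'target'.
    void referenceAdded(Node source, Node target) {
        referrersByTarget.computeIfAbsent(target, k -> new LinkedHashSet<>()).add(source);
    }

    // Forget a reference that was removed from the model.
    void referenceRemoved(Node source, Node target) {
        Set<Node> sources = referrersByTarget.get(target);
        if (sources != null) {
            sources.remove(source);
            if (sources.isEmpty()) {
                referrersByTarget.remove(target);
            }
        }
    }

    // Enumerate instances of a type with one hash lookup.
    Set<Node> instancesOf(String type) {
        return Collections.unmodifiableSet(
                instancesByType.getOrDefault(type, Collections.emptySet()));
    }

    // Backward getter: all elements that refer to 'target', with one hash lookup.
    Set<Node> referrersOf(Node target) {
        return Collections.unmodifiableSet(
                referrersByTarget.getOrDefault(target, Collections.emptySet()));
    }
}
```

The design trade-off in this sketch is the one the list above describes: each model modification pays a small constant cost to update the maps, and in exchange every lookup avoids a traversal of the whole model.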

EMF-IncQuery conceptually builds upon the research and development on incremental graph pattern matching carried out within the scope of the VIATRA2 Eclipse GMT subproject, which started in 2004 and has been continuously maintained since then.

Scope

EMF-IncQuery provides a new API for querying EMF models, an implementation of the API that leverages an incremental cache for fast retrieval, and a DSL and tools for specifying and testing queries. The emphasis is on scalability, which means fast (re-)execution even for complex queries and large models. EMF-IncQuery is intended to be a technological enabler, to provide a robust and scalable infrastructure on which higher level features such as well-formedness validation, view maintenance, or model synchronization can be built.

Description

The architecture of the EMF-IncQuery framework and tools is shown in Figure 1.

Figure 1: The EMF-IncQuery architecture

Runtime and API

The complex query evaluator of the EMF-IncQuery framework is built on a graph pattern matching engine that uses the RETE algorithm, adapted from expert systems to facilitate the efficient storage and retrieval of partial views of graph-like models. EMF-IncQuery also features the Base component that provides incremental support for backward getters as well as transitive closures for the efficient computation of e.g. reachability regions. EMF-IncQuery relies on the EMF Notification facility to incrementally update its internal cache to guarantee the consistency of the result set with respect to the actual contents of the model.
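
The idea of keeping a result set consistent through change notifications can be illustrated with a small, self-contained sketch. It is not the RETE implementation and does not use the EMF Notification API: it maintains the matches of one fixed pattern, "x reaches z in exactly two steps", in the spirit of a single join node. The class TwoStepPathCache and its methods edgeAdded, edgeRemoved and matches are hypothetical names, and elements are represented by plain strings for brevity.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; none of these names come from EMF or IncQuery.
final class TwoStepPathCache {
    private final Map<String, Set<String>> out = new HashMap<>();
    private final Map<String, Set<String>> in = new HashMap<>();
    // Match (x, z) -> number of intermediate elements y with x -> y -> z.
    private final Map<List<String>, Integer> support = new HashMap<>();

    // Called when the model reports that edge a -> b was added.
    void edgeAdded(String a, String b) {
        if (out.getOrDefault(a, Collections.emptySet()).contains(b)) {
            return; // edge already known
        }
        // Derivations using the new edge as the second step: p -> a -> b.
        for (String p : in.getOrDefault(a, Collections.emptySet())) {
            adjust(p, b, +1);
        }
        out.computeIfAbsent(a, k -> new HashSet<>()).add(b);
        in.computeIfAbsent(b, k -> new HashSet<>()).add(a);
        // Derivations using the new edge as the first step: a -> b -> c.
        for (String c : out.getOrDefault(b, Collections.emptySet())) {
            adjust(a, c, +1);
        }
    }

    // Called when the model reports that edge a -> b was removed.
    void edgeRemoved(String a, String b) {
        if (!out.getOrDefault(a, Collections.emptySet()).contains(b)) {
            return; // edge not known
        }
        // Retract derivations that used the edge as the first step.
        for (String c : out.getOrDefault(b, Collections.emptySet())) {
            adjust(a, c, -1);
        }
        out.get(a).remove(b);
        in.get(b).remove(a);
        // Retract derivations that used the edge as the second step.
        for (String p : in.getOrDefault(a, Collections.emptySet())) {
            adjust(p, b, -1);
        }
    }

    // Retrieval only reads the cache; it does not traverse the graph.
    Set<List<String>> matches() {
        return Collections.unmodifiableSet(support.keySet());
    }

    private void adjust(String x, String z, int delta) {
        List<String> key = List.of(x, z);
        support.merge(key, delta, Integer::sum);
        support.remove(key, 0); // drop matches that lost all support
    }
}
```

The support counter is what makes removal cheap in this sketch: a match disappears only when its last derivation is retracted, so no re-evaluation of the pattern is needed after a change.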

As an interface over the core algorithms, the IncQuery runtime consists of:

  • a query / pattern matching API that provides a typesafe wrapper for instantiating and executing queries over EMF Notifiers (Resources, ResourceSets and EObject hierarchies)
  • add-ons such as the Validation Framework (for the specification and live checking of well-formedness rules that correspond to Eclipse Resource Markers) or Derived feature support (for the efficient evaluation of derived EAttributes and EReferences with IncQuery queries as back-ends)

The runtime is designed to be compatible with standard Eclipse-EMF configurations as well as a headless, standalone execution mode. The efficiency of the execution engine has been demonstrated in an industrial setting as well as in several academic research case studies.

Query language and tools

The IncQuery tools consist of an Xtext-based pattern language editor and a development UI that can be used to execute, test and debug queries within the Eclipse IDE. The tools can use the interpretative runtime API for on-the-fly query execution and testing, and they can also generate type-safe wrappers that ease the integration of queries into application code. The tools also interface with IncQuery's add-ons to generate feature-specific adapter code.

IncQuery's declarative queries can be evaluated over EMF models without manual (programmed) traversal. The query language is built upon graph patterns (a key concept in many transformation tools) to provide a concise, reusable and easy way to specify complex structural model queries. The key features are:

  • complex interconnections of EMF objects can be easily formulated as a graph pattern,
  • the language is highly expressive and provides powerful features such as negation or counting,
  • queries can be evaluated with great freedom, i.e. input and output parameters can be bound freely at evaluation time,
  • graph patterns are composable, making it possible to build reusable query libraries.
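
The free binding of parameters, together with counting and negation, can be pictured with the following sketch. It is written in plain Java rather than in the pattern language, and it is not the IncQuery API: TupleMatcher, allMatches, countMatches and hasNoMatch are hypothetical names, and the match set is a fixed list of tuples instead of the result of evaluating a graph pattern. A null argument marks a parameter as unbound (an output), and a non-null argument marks it as bound (an input).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical matcher over stored parameter tuples; not the IncQuery API.
final class TupleMatcher {
    private final List<String[]> tuples = new ArrayList<>();

    // Store one match, i.e. one value per pattern parameter.
    void add(String... tuple) {
        tuples.add(tuple.clone());
    }

    // Return every stored match that agrees with all bound (non-null) parameters.
    List<List<String>> allMatches(String... bound) {
        List<List<String>> result = new ArrayList<>();
        for (String[] t : tuples) {
            if (t.length != bound.length) {
                continue; // arity mismatch, cannot match
            }
            boolean ok = true;
            for (int i = 0; i < t.length && ok; i++) {
                ok = bound[i] == null || bound[i].equals(t[i]);
            }
            if (ok) {
                result.add(Arrays.asList(t.clone()));
            }
        }
        return result;
    }

    // Counting: how many matches are compatible with the binding.
    int countMatches(String... bound) {
        return allMatches(bound).size();
    }

    // Negation: true when no match is compatible with the binding.
    boolean hasNoMatch(String... bound) {
        return allMatches(bound).isEmpty();
    }
}
```

With tuples of the form (author, book), the same matcher answers "which books did alice write" by binding the first parameter and "who wrote book2" by binding the second, which is the kind of freedom the list above refers to.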

Available documentation already discusses numerous demonstrative examples as well as advanced issues.

Why Eclipse?

With EMF-IncQuery, our aim is to gain access to a much wider audience by designing the technology around the EMF ecosystem from the ground up. Based on OptXware's experience gained in a number of industrial tool development projects, we believe that IncQuery can add significant value to the foundation layer, that is, EMF itself. Additionally, many other EMF technologies can consume IncQuery technology, including:

  • EMF Search, EMF Query 2 and Eclipse OCL: using IncQuery as an efficient back-end and index provider;
  • EMF Validation: interfacing with IncQuery's validation engine to provide more expressive and faster well-formedness rules;
  • EEF: using IncQuery as a back-end for UI components such as views, forms and editors.

Relation to Other Eclipse Projects

EMF-IncQuery builds on the foundations developed in another Eclipse.org project, the VIATRA2 model transformation framework. Our experience with VIATRA2, both in BUTE's academic research and in OptXware's industrial projects, has already proven the value and benefits of open sourcing a versatile, enabling framework under the EPL, with Eclipse.org's guarantee of being IP risk free.

EMF-IncQuery's key distinguishing features compared to existing EMF model query technologies are:

  • EMF-IncQuery focuses on high-performance queries of in-memory models, whereas EMF Query and Query2 are concerned with querying workspace files or models persisted in databases;
  • EMF-IncQuery features a unique, graph pattern-based, fully declarative query language that is more expressive than the languages of Query and Query2; its expressive power is comparable to OCL, but IncQuery supports query reuse and dynamic parameterization, making it better suited for general purpose queries;
  • In contrast with both Query/Query2 and OCL, EMF-IncQuery is an incremental query evaluation engine: once the results of a query have been computed, they are stored in a cache and continuously updated as the model changes, so that subsequent result retrievals are instant operations.

Initial Contribution

The initial contribution will be a set of plug-ins already under the EPL, consisting of the components and features described in the Description section (no external dependencies).

These are currently available at two sites and described in detail here:

Legal Issues

There are no outstanding legal issues concerning EMF-IncQuery.

  • The core IP (that is, the RETE implementation) stems directly from VIATRA2, which is already part of Eclipse and thus has been IP checked thoroughly.
  • Other components have been developed from scratch and licensed under the EPL from the start. The entire research and development process has a proven track record in peer-reviewed academic publications.
  • All previous and current contributors have obtained Eclipse-compatible employer permissions.

Committers

The following individuals are proposed as initial committers to the project:

Dr. István Ráth (rath@mit.bme.hu), Budapest University of Technology and Economics
He has provided significant contributions both to the GMT/VIATRA2 project and to the existing codebase of EMF-IncQuery. He will be the development and technical lead in this new project.
Dr. Dániel Varró (varro@mit.bme.hu), Budapest University of Technology and Economics
He is already a committer on the GMT/VIATRA2 project where he has made significant contributions over many years. He will be the research lead in this new project.
Ákos Horváth (ahorvath@mit.bme.hu), Budapest University of Technology and Economics
He has provided significant contributions both to VIATRA2 and to the existing codebase. He will contribute as an architect and researcher.
Gábor Bergmann (bergmann@mit.bme.hu), Budapest University of Technology and Economics
He has provided significant contributions both to VIATRA2 and to the existing codebase, developed the RETE implementation of both VIATRA2 and IncQuery, and is going to work on further core query evaluation engine improvements.
Zoltán Ujhelyi (ujhelyiz@mit.bme.hu), Budapest University of Technology and Economics
He has provided significant contributions to the existing codebase of EMF-IncQuery, and is already a committer in the GEF/Zest project. He will be primarily responsible for developing the query language and the Xtext-based tools.
Ábel Hegedüs (hegedusa@mit.bme.hu), Budapest University of Technology and Economics
He has provided significant contributions to the existing codebase of EMF-IncQuery, and he will primarily work on IncQuery add-ons such as the Validation Engine and derived feature integration.
Zoltán Balogh (zoltan.balogh@optxware.com), OptXware Research and Development Ltd.
He has provided significant contributions to the existing codebase, and he will primarily work on tools, core APIs and testing.
András Ökrös (andras.okros@optxware.com), OptXware Research and Development Ltd.
He has provided significant contributions to the existing codebase, and he will primarily work on performance-oriented optimization and testing.

We welcome additional committers and contributions.

Mentors

The following Architecture Council members will mentor this project:

  • Ed Merks
  • Cédric Brun

Interested Parties

The following individuals, organisations, companies and projects have expressed interest in this project:

  • Dr. András Pataricza, professor, Budapest University of Technology and Economics, Department of Measurement and Information Systems, Fault Tolerant Systems Research Group (Hungary)
  • Dr. György Csertán, CEO, OptXware Research and Development Ltd. (Hungary)
  • Jose Ricardo Parizi Negrão, Technology Development Engineer, Embraer (Brazil)
  • Dr. Zsolt Szepessy and Zoltán Theisz, EvoPro Informatics and Automation Ltd. (Hungary)

Project Scheduling

  • June 2012: initial contribution, core components
  • August 2012: first build
  • June 2013: 1.0

Changes to this Document

Date Change
30-May-2012 Updated schedule.
08-February-2012 István Ráth: Document created