[xtext-dev] Discussion: Syntax of Xtext-Files

Hi,

Our roadmap for the next milestone of TMF-Xtext contains an item (https://bugs.eclipse.org/bugs/show_bug.cgi?id=263699) that reads '[Grammar Language] Prepare syntax for grammar mixins'. Sven and I talked about it and came up with the following suggestions:

1. Language or Grammar
The keyword reads 'language', but according to the abstract syntax what is actually defined is a grammar. We propose renaming the keyword to 'grammar', although this breaks existing clients, who will have to change their Xtext files.

2. Abstract language
We think that it is not necessary to have abstract languages. We couldn't find any code that calls Grammar#isAbstract except for the generator. The generator can be configured as required if you don't want some services to be generated for your specific language. Removing the abstract flag does not seem to have any drawbacks.

3. One language per file
We want to stay with the one language per file paradigm.

4. No default inheritance
Currently every language inherits some terminal tokens from XtextBuiltin. This is rather counterintuitive, as some rules seem to appear from nowhere. We prefer to explicitly extend org.eclipse.xtext.defaults.DefaultTerminals (or something like this). The wizard would generate an appropriate language stub.

5. Mixing instead of extending
To prepare language mixins, we want to rename the 'extends' keyword to 'with' and allow multiple used grammars, at least on the syntax level. For now, a check will report an error if you use more than one grammar.

6. Allow zero rules per grammar
We should allow grammars without any new rule. This enables clients to mix several languages and actually not define any new rule.
This requires a way to mark the entry rule. We decided to allow the entry rule to be declared explicitly with the 'main' keyword.


Summing up #1-6, the grammar rule will look like this:

Grammar:
  'grammar' name=GrammarID ('with' usedGrammars+=[Grammar|GrammarID] (',' usedGrammars+=[Grammar|GrammarID])*)?
  (definesHiddenTokens?='hidden' '(' (hiddenTokens+=[AbstractRule] (',' hiddenTokens+=[AbstractRule])*)? ')' )?
  ('main' entryRule=[AbstractRule])?
  metamodelDeclarations+=AbstractMetamodelDeclaration* 
  rules+=AbstractRule*
;
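
To make the combined effect of #1-6 more concrete, a grammar file written against the proposed syntax might start roughly as follows. This is only a sketch: the grammar name org.example.MyDsl, the metamodel declaration, the rules Model and Greeting, and the terminals WS, ML_COMMENT, SL_COMMENT and ID are made up for illustration, and DefaultTerminals is the tentative name from #4.

```xtext
// Hypothetical grammar file using the proposed 'grammar', 'with', 'hidden' and 'main' keywords.
grammar org.example.MyDsl with org.eclipse.xtext.defaults.DefaultTerminals
hidden(WS, ML_COMMENT, SL_COMMENT)
main Model

generate myDsl "http://www.example.org/MyDsl"

Model:
  greetings+=Greeting*;

Greeting:
  'Hello' name=ID '!';
```

Under #6 the two rules could be dropped entirely, in which case 'main' would have to point to a rule from the used grammar.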


7. Lexer rule vs. terminal rule
Lexer rules are obsolete and superseded by terminal rules. TerminalRule will become a subtype of AbstractRule.

8. Alternatives and Groups
Since Alternatives and Groups support more than two children on the abstract syntax level, we want to change the grammar slightly to create flattened trees.
The rule for alternatives will look like this: 

Alternatives returns AbstractElement:
    Group ({Alternatives.groups+=current} ('|' groups+=Group)+)?
;

Note the underlined part of the rule, which will lead to only one alternative for any parsed group.
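
For Groups the same pattern could be applied. The following is only a sketch of what that might look like; the feature name 'tokens' and the rule name AbstractToken are assumptions and not part of the proposal above.

```xtext
// Sketch only: a Group rule following the same pattern as Alternatives.
// 'tokens' and 'AbstractToken' are assumed names.
Group returns AbstractElement:
    AbstractToken ({Group.tokens+=current} (tokens+=AbstractToken)+)?
;
```

With the Alternatives rule above, an input such as 'a' | 'b' | 'c' yields a single Alternatives object whose groups feature holds three elements instead of nested binary Alternatives, and an input without any '|' yields no Alternatives wrapper at all.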

9. Assignments
We allow any AbstractTerminal (type AbstractElement) to be assigned on the concrete syntax level. However, most of the services (linker, transformer etc.) can only deal with a smaller subset of AbstractElements. We are considering restricting the tokens on the concrete syntax level to assignable elements (nested Alternatives, CrossReferences, Keywords, calls to datatype rules and terminal rules without any cardinality defined) while staying with AbstractElement on the abstract syntax level.
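
One way such a restriction might be expressed in the grammar is sketched below. The rule names AssignableTerminal and AssignableAlternatives are hypothetical, and the exact set of alternatives is an assumption based on the list above, not a decided design.

```xtext
// Sketch only: AssignableTerminal and AssignableAlternatives are hypothetical rule names.
Assignment:
    feature=ID operator=('=' | '+=' | '?=') terminal=AssignableTerminal
;

// The assigned element itself carries no cardinality.
AssignableTerminal returns AbstractElement:
    Keyword | RuleCall | CrossReference | '(' AssignableAlternatives ')'
;

// Nested alternatives, flattened in the same way as in #8.
AssignableAlternatives returns AbstractElement:
    AssignableTerminal ({Alternatives.groups+=current} ('|' groups+=AssignableTerminal)+)?
;
```

The metamodel would be unaffected, since every rule in the sketch still returns AbstractElement.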

10. Syntax for actions
Actions should be simplified to:

Action: '{' newType=[TypeRef] '.' feature=ID operator=('=' | '+=') 'current' '}';

The noisier, optional version 'current=' .. did not help in understanding the concept of actions. So we try to stay as close as possible to what happens internally while minimizing variability and the number of characters to type. Furthermore, we want to allow Actions to be defined without a feature assignment. The use case is to create instances based on concrete syntax without assigning any feature.
One example can be found in the Xtext grammar itself:

Wildcards in terminal rules are currently defined as: Wildcard: isWildcard?='.';
The preferable derived metamodel should not contain the feature 'isWildcard', as it is obviously true for any parsed instance of Wildcard. We discussed two possibilities to achieve this. The first, more limited suggestion was to create an instance of Wildcard when the rule is consumed completely, even if no feature has been assigned. Wildcard: '.';
The other solution would be to use unassigned actions: Wildcard: {Wildcard} '.';
This does not clash with type inference for datatype rules and is more flexible since it allows rules like: Rule returns SuperType: ('foo' {SubTypeFoo} | 'bar' {SubTypeBar}) zonk=NextRule;
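
Putting the simplified syntax and unassigned actions together, the Action rule might end up looking roughly like this. It is a sketch only; making the feature part optional in exactly this way is an assumption.

```xtext
// Sketch only: the feature assignment is made optional to permit unassigned actions.
Action:
    '{' newType=[TypeRef] ('.' feature=ID operator=('=' | '+=') 'current')? '}'
;
```

Both {Wildcard} and {Alternatives.groups+=current} would be accepted by such a rule.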

11. Rename UpToToken to UntilToken
'UpTo' clashes with the intuitive understanding of 'up to', which is a construct for describing loops in a few languages.

12. Allow AbstractElements as terminal part of a CrossReference
The concrete syntax should reflect the fact that actions and parser RuleCalls are not allowed here. The permitted elements should be similar, if not equal, to the assignable elements from #9.

Deferred:
We chatted a little bit about 
- syntax for fragments, 
- qualified rule calls, 
- aliases for languages, 
- super calls when overriding rules, 
- abstract rules, 
- annotations for grammar elements and 
- higher order production rules. 
However, these features are out of scope for the given ticket, milestone and time constraint. We think many of them can be added in the future without breaking (too many) clients.

Please reply to this mail or, preferably, comment in Bugzilla (https://bugs.eclipse.org/bugs/show_bug.cgi?id=263699).

Regards
Sebastian
--
Sebastian Zarnekow


itemis AG
Schauenburgerstraße 116
24118 Kiel
Germany
