Re: [xtext] is lexical analysis over syntactic analysis in xtext? [message #660406 is a reply to message #660276] |
Fri, 18 March 2011 10:12 |
hanys Messages: 188 Registered: July 2009 |
Senior Member |
|
|
Yes, 'for' is our keyword.
Actually, we are going to build an editor for extended JavaScript.
What is your opinion: is that possible with Xtext? I noticed that JavaScript
has automatic semicolon insertion in its syntax.
Thanks,
Jan
"Jan Koehnlein" <jan.koehnlein@itemis.de> wrote in message
news:ilta73$g72$1@news.eclipse.org...
> The context-free lexing (a priori lexing) is a property of the ANTLR
> parser generator we use in the backend of Xtext. It is a tradeoff for a
> lot of beautiful things we get from using ANTLR, such as error recovery.
>
> As a result, you must be careful about what kind of terminal rules you
> define and in which order.
>
> An easy workaround for your example should be to switch from a terminal
> rule to a datatype rule (by leaving out the 'terminal' keyword). That way,
> it won't be the lexer that decides whether it's a token or an identifier.
> 'for' is a keyword in your language anyway, isn't it?
>
> On 17.03.11 15:35, Jan wrote:
>> Hello,
>>
>> I have noticed that Xtext first identifies lexical tokens and only then
>> tries to combine them according to the syntactic rules of the grammar.
>> I think that this approach is bad.
>>
>> Does identifying lexemes really happen before syntactic analysis?
>>
>>
>> Example: we had defined the token FOR:
>> terminal FOR: "for";
>>
>> But the problem was that in the following example Xtext identifies 'for'
>> as a terminal, and not the whole word:
>> ////////// parsed file:start
>> ...
>> format
>> ...
>> ////////// parsed file:end
>>
>>
>> Xtext marked 'for' inside the word 'format' as a LEXICAL TOKEN (terminal symbol).
>> This is wrong IMHO.
>>
>>
>> And we had to do the following:
>> ForKeyword: F O R;
>>
>> terminal F: ('f' | 'F');
>>
>>
>> terminal O: ('o' | 'O');
>>
>>
>> terminal R: ('r' | 'R');
>>
>> But parsing with such a grammar is slower, I would say.
>>
>> BR,
>> Jan
>>
>>
>
>
> --
> Need professional support for Eclipse Modeling?
> Go visit: http://xtext.itemis.com
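A minimal sketch of the workaround quoted above, assuming a grammar that inherits the standard ID terminal from org.eclipse.xtext.common.Terminals. The grammar name, the namespace URI and the rules Model, Statement, ForStatement and Call are hypothetical and only serve as illustration; the relevant part is that 'for' lives in a datatype rule (no 'terminal' keyword) instead of a terminal rule.

```xtext
grammar org.example.MiniJs with org.eclipse.xtext.common.Terminals

generate miniJs "http://www.example.org/MiniJs"

Model:
    statements+=Statement*;

Statement:
    ForStatement | Call;

// Datatype rule: same body as the old terminal rule, but without the
// 'terminal' keyword, so 'for' becomes an ordinary keyword of the grammar.
ForKeyword:
    'for';

// Hypothetical, heavily simplified 'for' statement.
ForStatement:
    ForKeyword '(' counter=ID ')' ';';

// Hypothetical call statement; an input such as "format();" is expected
// to end up here, with "format" lexed as a single ID.
Call:
    name=ID '(' ')' ';';
```

Writing 'for' inline in ForStatement, without the extra ForKeyword rule, should behave the same way; the separate rule only mirrors the wording of the advice.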
|
|
|
Re: [xtext] is lexical analysis over syntactic analysis in xtext? [message #660408 is a reply to message #660406] |
Fri, 18 March 2011 10:24 |
Jan Koehnlein Messages: 760 Registered: July 2009 Location: Hamburg |
Senior Member |
|
|
Not sure, because I don't know extended JavaScript too well.
The challenge in such projects is usually to get the grammar right and
free of ambiguities. You might have to enable backtracking or use
syntactic predicates (Xtext 2 only).
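For reference, a sketch of what a syntactic predicate looks like in an Xtext 2.x grammar, using the classic dangling-else ambiguity rather than anything JavaScript-specific. The grammar name, the namespace URI and all rule and feature names are hypothetical; only the '=>' operator is the actual predicate syntax.

```xtext
grammar org.example.Predicates with org.eclipse.xtext.common.Terminals

generate predicates "http://www.example.org/Predicates"

Program:
    statements+=Statement*;

Statement:
    IfStatement | Print;

IfStatement:
    'if' '(' condition=ID ')' then=Statement
    // '=>' is the syntactic predicate: if 'else' comes next, the parser
    // commits to this optional group, so the 'else' binds to the innermost
    // 'if' instead of being reported as an ambiguity.
    (=>'else' otherwise=Statement)?;

// Hypothetical leaf statement so that the example is complete.
Print:
    'print' value=ID ';';
```

Compared with enabling backtracking for the whole parser, a predicate resolves one specific decision and leaves the rest of the grammar checked for ambiguities.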
---
Get professional support from the Xtext committers at www.typefox.io
|
|
|
Re: [xtext] is lexical analysis over syntactic analysis in xtext? [message #660409 is a reply to message #660408] |
Fri, 18 March 2011 10:36 |
hanys Messages: 188 Registered: July 2009 |
Senior Member |
|
|
Basically it's JSON + JavaScript.
Is there any documentation about "syntactic predicates"?
Thanks,
Jan
|
|
|
Re: [xtext] is lexical analysis over syntactic analysis in xtext? [message #660453 is a reply to message #660409] |
Fri, 18 March 2011 13:59 |
Jan Koehnlein Messages: 760 Registered: July 2009 Location: Hamburg |
Senior Member |
|
|
Syntactic predicates are new in Xtext 2.0 and the documentation is not
yet finished. But there's a thread "syntatic predicates in Xtext 2.0" in
this newsgroup.
|
|
|
Re: [xtext] is lexical analysis over syntactic analysis in xtext? [message #660509 is a reply to message #660409] |
Fri, 18 March 2011 17:20 |
Henrik Lindberg Messages: 2509 Registered: July 2009 |
Senior Member |
|
|
I don't think syntactic predicates are enough to specify a JS
parser; you probably also need semantic predicates (which are not
supported in Xtext). JS is a *bitch* to parse if you aim to correctly
cover the entire language.
I suggest you get the ANTLR book and look at some JS parser samples
written for ANTLR, so you know what sort of challenges you will encounter
before you start.
Regards
- henrik