Eclipse Community Forums: TMF (Xtext) » Unicode literals in Xtext grammar

Unicode literals in Xtext grammar [message #54617]

Thu, 02 July 2009 01:55

Eclipse User

Hello,

Newbie question: how do I express Unicode literals in an Xtext grammar?

What I want to do is to map some W3C recommendations into Xtext and they
contain production rules such as this one:

PN_CHARS_BASE ::= [A-Z] | [a-z] | [#x00C0-#x00D6] | [#x00D8-#x00F6] |
[#x00F8-#x02FF] | [#x0370-#x037D] | [#x037F-#x1FFF] | [#x200C-#x200D] |
[#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] |
[#xFDF0-#xFFFD] | [#x10000-#xEFFFF]

(http://www.w3.org/TR/2008/REC-rdf-sparql-query-20080115/#rPN_CHARS_BASE)

I expected to be able to use the Java literal notation as in
"('\u00C0'..'\u00d6')" for the third clause, but that doesn't seem to work.
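For reference, here is a minimal Java sketch of the character set that production describes, independent of Xtext. The class and method names are made up for illustration. One thing it makes visible: the last range, [#x10000-#xEFFFF], lies outside the Basic Multilingual Plane, so it cannot be written as a single four-digit Java character escape and has to be handled as an int code point.

```java
final class PnCharsBase {
    // Inclusive code point ranges copied from the PN_CHARS_BASE production.
    private static final int[][] RANGES = {
        {'A', 'Z'}, {'a', 'z'},
        {0x00C0, 0x00D6}, {0x00D8, 0x00F6}, {0x00F8, 0x02FF},
        {0x0370, 0x037D}, {0x037F, 0x1FFF}, {0x200C, 0x200D},
        {0x2070, 0x218F}, {0x2C00, 0x2FEF}, {0x3001, 0xD7FF},
        {0xF900, 0xFDCF}, {0xFDF0, 0xFFFD}, {0x10000, 0xEFFFF},
    };

    /** Returns true if the code point lies in one of the ranges above. */
    static boolean matches(int codePoint) {
        for (int[] range : RANGES) {
            if (codePoint >= range[0] && codePoint <= range[1]) {
                return true;
            }
        }
        return false;
    }

    /** Returns true if the string is non-empty and every code point matches. */
    static boolean allMatch(String s) {
        return !s.isEmpty() && s.codePoints().allMatch(PnCharsBase::matches);
    }

    public static void main(String[] args) {
        System.out.println(matches('\u00C0')); // start of the third clause
        System.out.println(matches(0x00D7));   // gap between the third and fourth clause
        System.out.println(matches(0x10000));  // supplementary plane, needs an int
        // A supplementary code point occupies two UTF-16 code units in Java.
        System.out.println(Character.toChars(0x10000).length);
    }
}
```

Working on code points rather than char values keeps the supplementary range from being split into surrogate halves.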

BTW: is there some reference documentation that I am missing (e.g. the
grammar for Xtext grammars)?

Regards,
Peter

Re: Unicode literals in Xtext grammar [message #54734 is a reply to message #54617]

Thu, 02 July 2009 13:44

Eclipse User

Hi Peter,

The 0.7.0 release of Xtext has no out-of-the-box support for escaped
Unicode characters in a string literal.
You may want to override the value converter to handle Unicode gracefully.

See here:
http://www.eclipse.org/Xtext/documentation/0_7_0/xtext.html#valueconverter
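
As a rough illustration of the decoding step such a converter could delegate to, here is a plain Java sketch. It assumes escapes take the Java-style form of a backslash, the letter u, and exactly four hex digits. The class and method names are hypothetical, and the wiring into Xtext's value converter service is deliberately left out (see the linked documentation for that). Malformed escapes are copied through unchanged, and a doubled backslash gets no special treatment.

```java
final class UnicodeUnescape {
    /**
     * Replaces each backslash-u escape followed by four hex digits with the
     * UTF-16 code unit it denotes; all other text is copied unchanged.
     */
    static String unescape(String raw) {
        StringBuilder out = new StringBuilder(raw.length());
        int i = 0;
        while (i < raw.length()) {
            boolean isEscape = raw.startsWith("\\u", i)
                    && i + 6 <= raw.length()
                    && isHex(raw, i + 2, i + 6);
            if (isEscape) {
                // Decode the four hex digits into a single char.
                out.append((char) Integer.parseInt(raw.substring(i + 2, i + 6), 16));
                i += 6;
            } else {
                out.append(raw.charAt(i));
                i++;
            }
        }
        return out.toString();
    }

    /** Returns true if every character in s[from, to) is a hex digit. */
    private static boolean isHex(String s, int from, int to) {
        for (int i = from; i < to; i++) {
            if (Character.digit(s.charAt(i), 16) < 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // The argument holds a literal backslash followed by u00E9.
        System.out.println(unescape("caf\\u00E9"));
    }
}
```

Note that this only affects the values parsed from your own DSL's string literals, not what can be written in the grammar file.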

Regards,
Sebastian

Re: Unicode literals in Xtext grammar [message #55062 is a reply to message #54734]

Thu, 02 July 2009 21:41

Eclipse User

Hello Sebastian,

Thanks for that hint, but I am a bit confused. As far as I understand, a
value converter is equivalent to a production rule in the grammar. If that
is correct, I would have to write a value converter per unicode character or
range referenced, wouldn't I? As far as I can tell there is no way to
parametrize the converter.

With the BIG_DECIMAL example, the BIG_DECIMAL rule means "any big decimal".
But what I would need is "a unicode character between $1 and $2" -- I would
either need to pass the two values defining the range or define a separate
converter per range.

I might just skip the unicode ranges for now -- I'm just using OWL2
Functional Syntax as a test case anyway, no real need for unicode just yet.
I assume there is already a feature request for escaped unicode literals
somewhere? Or should I file something?

Regards,
Peter

Re: Unicode literals in Xtext grammar [message #55113 is a reply to message #55062]

Fri, 03 July 2009 02:45

Eclipse User

Hi Peter,

Sorry, I got your initial post wrong.

Value converters are a way to go if you want to use Unicode characters
in your own DSL, but they cannot help with the grammar language itself.

You may want to subscribe to this bug:
https://bugs.eclipse.org/bugs/show_bug.cgi?id=280659
That way you'll be notified when it's done.

Regards,
Sebastian

