dealing with multi-byte characters in editor [message #329369] |
Sat, 21 June 2008 13:00  |
Eclipse User |
|
|
|
Hello there.
I'd like to ask how the Eclipse editor deals with multi-byte characters.
I think a fixed-width font is necessary for almost all developers,
especially those working with multi-byte character sets, so handling
multi-byte characters correctly is really important.
IMHO, multi-byte characters such as Korean, Chinese, or Japanese should
be treated as two columns wide, so that one such character counts the
same as two Latin letters. Eclipse doesn't seem to work that way: it
counts multi-byte and single-byte characters in the same manner. This
may not look significant, but it sometimes causes annoying problems when
working with multi-byte characters.
When I use the TeXlipse plug-in to edit LaTeX documents, some options
such as word wrapping work incorrectly. In addition, although we
normally don't use multi-byte characters in variable names, I
deliberately tested something similar in code and got the same wrapping
problem.
Any opinions or comments?
|
|
|
Re: dealing with multi-byte characters in editor [message #329373 is a reply to message #329369] |
Sat, 21 June 2008 14:30   |
Eclipse User |
|
|
|
Originally posted by: merks.ca.ibm.com
Kiwon,
Given that Eclipse edits Java characters, which are Unicode, the number
of bytes used for a character is generally irrelevant. Also, byte length
has nothing to do with character width; that is controlled by the font
used to render the character. So it seems most of your issue comes down
to choosing a font with the properties you desire...
The issue with wrapping sounds different. There may well be issues with
some code not properly recognizing some characters as letters in a
word. For that type of problem, you should make a specific test case
and report the issue in Bugzilla.
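To make the bytes-versus-characters distinction concrete, here is a minimal Java sketch. It only uses standard `String` methods; the class name `ByteVsCharDemo` and its helper names are made up for illustration. It shows that the byte count of a character depends on the encoding chosen, while the character count does not, and that neither says anything about how wide the glyph is drawn.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
class ByteVsCharDemo {
    /** Bytes needed when the text is encoded as UTF-8. */
    static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    /** Bytes needed when the text is encoded as UTF-16 (big-endian, no BOM). */
    static int utf16Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_16BE).length;
    }

    /** Number of Unicode code points in the text. */
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // "a", a Hangul syllable (U+D55C) and a Han ideograph (U+6F22),
        // written as escapes so the source file encoding does not matter.
        String[] samples = {"a", "\uD55C", "\u6F22"};
        for (String s : samples) {
            System.out.printf("%s: utf8=%d utf16=%d codePoints=%d%n",
                    s, utf8Bytes(s), utf16Bytes(s), codePoints(s));
        }
    }
}
```

Each sample is one code point, yet the UTF-8 size differs between the Latin letter and the two East Asian characters, while the UTF-16 size is the same for all three. The on-screen width is a separate, third quantity that none of these numbers capture.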
|
|
|
Re: dealing with multi-byte characters in editor [message #329375 is a reply to message #329373] |
Sun, 22 June 2008 01:18   |
Eclipse User |
|
|
|
Thanks for your thoughtful response.
I fully agree with you; in a Unicode context, one Unicode character
should count as one character. And I'm not really talking about the font
issue; there are many fonts that support fixed-width display.
My issue comes from the fact that a double-width character, such as a
Korean, Chinese, or Japanese one, normally occupies the space of two
Latin letters. So column-related options such as 'print margin column'
or 'number of characters in line' can cause problems. Since these
options count only the number of characters, not the space they occupy
on screen, I'm wondering how we can deal with this in Eclipse. The
problems get worse when multi-byte characters and Latin letters are
mixed, which unfortunately is really common. It would, of course, be odd
to display double-width characters and single-width letters in the same
amount of space. :)
I don't think this is really a bug. It may be a matter of principle in
how Eclipse deals with Unicode. So I wanted to bring these issues up.
|
|
|
Re: dealing with multi-byte characters in editor [message #329381 is a reply to message #329375] |
Sun, 22 June 2008 05:31   |
Eclipse User |
|
|
I think you will have an easier time thinking about this if you stop
thinking about "multi-byte characters" and instead think about double-width
glyphs.
As Ed pointed out, the byte count is not the issue: in Java's in-memory
representation, with rare exceptions (supplementary characters, which need
two), every character is a single two-byte char, regardless of what
language it comes from. However, as you say, some characters take more
space than others to draw. For instance, even in a "fixed-width" font,
kanji characters often take twice the space of romaji. So, some character
glyphs take more space than others on the screen. (A related issue is the
Thai language, in which multiple logical characters can be composed into a
single glyph; I do not know whether Eclipse supports Thai.)
This issue affects rendering, text selection, and word wrapping algorithms.
To the best of my knowledge (which is not very much), Eclipse has a goal of
being able to gracefully handle characters of arbitrary glyph width.
So I think that if you are encountering specific problems, whether they have
to do with failures of existing methods or with APIs that are not adequate
to handle variable-width glyphs, these should be reported as bugs.
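The idea of counting display columns instead of characters can be sketched in plain Java. This is a rough approximation and not how Eclipse or SWT measures text: it guesses the width from the Unicode script of each code point, whereas a careful implementation would use the East Asian Width property from Unicode Standard Annex #11 or ask the font for real metrics. The class name `DisplayWidth` and its methods are hypothetical.

```java
// Hypothetical helper class; the width rules below are a deliberate simplification.
class DisplayWidth {
    /** Approximate number of fixed-width columns one code point occupies. */
    static int columnsOf(int codePoint) {
        int type = Character.getType(codePoint);
        // Combining marks (used heavily in Thai, for example) are drawn
        // on top of the preceding character and take no column of their own.
        if (type == Character.NON_SPACING_MARK || type == Character.ENCLOSING_MARK) {
            return 0;
        }
        // Halfwidth Katakana and halfwidth Hangul forms are narrow by design.
        if (codePoint >= 0xFF61 && codePoint <= 0xFFDC) {
            return 1;
        }
        // Fullwidth ASCII variants and fullwidth currency signs are wide.
        if ((codePoint >= 0xFF01 && codePoint <= 0xFF60)
                || (codePoint >= 0xFFE0 && codePoint <= 0xFFE6)) {
            return 2;
        }
        // Assumption: characters of these scripts are drawn double-width.
        switch (Character.UnicodeScript.of(codePoint)) {
            case HAN:
            case HIRAGANA:
            case KATAKANA:
            case HANGUL:
            case BOPOMOFO:
                return 2;
            default:
                return 1;
        }
    }

    /** Approximate number of columns the whole string occupies. */
    static int columns(String s) {
        return s.codePoints().map(DisplayWidth::columnsOf).sum();
    }

    public static void main(String[] args) {
        // Latin letters, two Hangul syllables, and a Thai base letter with two marks.
        String[] samples = {"abc", "\uD55C\uAE00", "\u0E17\u0E35\u0E48"};
        for (String s : samples) {
            System.out.println(s + " -> " + columns(s) + " columns");
        }
    }
}
```

With a function like `columns`, a setting such as a print margin could be expressed in display columns rather than in characters. The script-based guess is wrong for some characters (decomposed Hangul jamo and some CJK punctuation, for instance), which is one reason a table-driven or font-driven approach is preferable in real code.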
|
|
|
Re: dealing with multi-byte characters in editor [message #329384 is a reply to message #329381] |
Sun, 22 June 2008 07:39   |
Eclipse User |
|
|
|
Originally posted by: merks.ca.ibm.com
Walter,
I totally agree with your comments. The issue of how many bytes are
used to represent a character encoded as bytes is a red herring that
distracts from the actual issue. Best not to even think about the
number of bytes. It's interesting to me that a fixed-width font
can still have variable-width characters (i.e., some
characters take double the width of other characters), if that's indeed
true. I'm sure there's a lot of code that assumes that if you use a
fixed-width font, things will just line up properly. I'm not sure if
there are any SWT metrics that would allow one to determine whether a
character has a double-wide rendering. And even if there are, I'm not
sure what the code would do to compensate for that.
Some concrete examples of specific problems, perhaps with suggestions
about how to deal better with those, would help. (I'm pretty sure
Eclipse supports Thai because I seem to recall fixing translation
problems related to Thai, but maybe it was some other very pretty
looking script.)
|
|
|
Re: dealing with multi-byte characters in editor [message #329386 is a reply to message #329384] |
Sun, 22 June 2008 10:11   |
Eclipse User |
|
|
|
Walter and Ed,
Exactly, that's what I mean.
Double-width glyphs... ah, that is the best term. ;) That is exactly my
issue. FYI, I attach a screenshot from Eclipse. In this figure, all four
lines (#23 to #26) contain 80 characters each: digits, Korean
characters, Latin letters, and a mix. But as you can see, they take up
different widths on screen. So when Eclipse applies word wrapping, the
result is a zigzag right edge. This is not what we want. When the wrap
setting is 80 columns, we generally want a line to be limited to 80
columns of display width, not 80 characters.
As far as I know, since many languages, especially Asian ones, are
rendered with double-width glyphs, many other editors (like vim) take
this characteristic into account.
What do you guys think about this? Is it really worth reporting in
Bugzilla? :)
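The behaviour asked for here, breaking a line after 80 display columns rather than after 80 characters, could look roughly like the following Java sketch. It is an assumption-laden illustration, not Eclipse's or vim's algorithm: it breaks between characters rather than between words, and it guesses glyph width from the Unicode script instead of consulting the font or the East Asian Width property. `ColumnWrap`, `columnsOf`, and `wrap` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class showing a hard wrap measured in display columns.
class ColumnWrap {
    /** Simplified width guess: CJK scripts count as two columns, the rest as one. */
    static int columnsOf(int codePoint) {
        switch (Character.UnicodeScript.of(codePoint)) {
            case HAN:
            case HIRAGANA:
            case KATAKANA:
            case HANGUL:
                return 2;
            default:
                return 1;
        }
    }

    /** Splits the text into pieces that each occupy at most maxColumns columns. */
    static List<String> wrap(String text, int maxColumns) {
        List<String> lines = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int used = 0;
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            int width = columnsOf(cp);
            // Start a new line when this character would cross the margin.
            if (used + width > maxColumns && current.length() > 0) {
                lines.add(current.toString());
                current.setLength(0);
                used = 0;
            }
            current.appendCodePoint(cp);
            used += width;
            i += Character.charCount(cp);
        }
        if (current.length() > 0) {
            lines.add(current.toString());
        }
        return lines;
    }

    public static void main(String[] args) {
        // Two Latin letters followed by two Hangul syllables, wrapped at 4 columns.
        for (String line : wrap("ab\uD55C\uAE00", 4)) {
            System.out.println(line);
        }
    }
}
```

Under these assumptions, a line of 80 Korean syllables wrapped at 80 columns comes out as two lines of 40 characters, while 80 Latin letters stay on one line, so the right edge lines up in a fixed-width font whose CJK glyphs are exactly twice as wide as its Latin glyphs.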
[Attachment: 80chs.png, inline PNG screenshot]
|
|
|
Re: dealing with multi-byte characters in editor [message #329394 is a reply to message #329386] |
Sun, 22 June 2008 20:48  |
Eclipse User |
|
|
Yes, I definitely think that is worth reporting in Bugzilla. Bugzilla is a
good place to track feature requests as well as bugs; the line between them
is often hard to define.
To me, this looks like a bug, although it might be the sort of bug that is a
consequence of design choices that are hard to change. The only way to find
out is to post the bug and let the devs who know something about it make
their comments.
Make sure you include that screenshot in the bug report. If you can be
specific about a few particular cases and say exactly what you think the
behavior in those cases would ideally be, that will help developers
understand the situation.
|
|
|