.html files always Shift_JIS regardless of preferences or content [message #334193]
Mon, 26 January 2009 07:36
Ed Wright Messages: 11 Registered: July 2009
Junior Member
Eclipse 3.4 on Linux:
If I create a new *empty* .html file, it is always created with a file
encoding of Shift_JIS, "determined from content".
If I copy an existing html snippet (no headers), which has UTF-8
content, for which I have manually set the encoding to UTF-8, to another
location, the encoding is reset to Shift_JIS.
An annoying side note is that when I do reset the encoding type, Eclipse
tells me UTF-8 conflicts with the content type *even if the file is
empty or contains UTF-8 content*.
This is a major headache for me, apart from the need to manually reset
the content type on every such file, in that editing and saving a file
results in permanently corrupted Japanese text.
settings:
---------
env var LANG = ja_JP.utf8
parent folder text file encoding set to UTF-8
preferences:web:html files encoding set to UTF-8
preferences:general:content types:text:html default encoding for all
file associations set to UTF-8
Am I missing something somewhere? Is this a bug? (Ain't no feature in my
book :) )
Thanks,
Ed
Re: .html files always Shift_JIS regardless of preferences or content [message #334196 is a reply to message #334193]
Mon, 26 January 2009 11:45
Ed Merks Messages: 33252 Registered: July 2009
Senior Member
Ed,
Comments below.
Ed Wright wrote:
> Eclipse 3.4 on Linux:
>
> If I create a new *empty* .html file, it is always created with a file
> encoding of Shift_JIS, "determined from content".
How do you create it? Are you using WTP's editor or just Eclipse's text
editor (which isn't XML aware)?
>
> If I copy an existing html snippet (no headers), which has UTF-8 content
Once you're working with text, you're working with characters (or
code points), and there is simply no such notion as those being UTF-8.
UTF-8 is a concept that applies only to the manner in which those
characters are encoded as bytes.
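To illustrate the distinction between characters and their byte encodings, here is a minimal Python sketch (not Eclipse code; the sample string is just an illustration). The same three characters produce different byte sequences depending on the encoding chosen, and each sequence only decodes back correctly with the encoding that produced it.

```python
# Three characters (code points); no encoding is involved at this stage.
text = "日本語"

# Encoding is the step that turns characters into bytes.
utf8_bytes = text.encode("utf-8")
sjis_bytes = text.encode("shift_jis")

print(len(text))            # number of characters
print(utf8_bytes.hex(" "))  # bytes as they would be stored in a UTF-8 file
print(sjis_bytes.hex(" "))  # bytes as they would be stored in a Shift_JIS file

# Decoding with the matching encoding recovers the original characters.
assert utf8_bytes.decode("utf-8") == text
assert sjis_bytes.decode("shift_jis") == text
```

The character data is identical in both cases; only the stored bytes differ, which is why the encoding recorded for a file matters when it is read back.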
> , for which I have manually set the encoding to UTF-8, to another
> location, the encoding is reset to Shift_JIS.
I don't get this comment. The characters should be copied into the
editor without affecting the encoding the editor remembers. That's not
the case?
>
> An annoying side note is that when I do reset the encoding type,
How do you do that?
> Eclipse tells me UTF-8 conflicts with the content type *even if the
> file is empty or contains UTF-8 content*.
An empty file could contain a byte order marker that would be hard to
notice...
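One way to rule a byte order mark in or out is to look at the raw bytes. The following is a rough Python sketch, run outside Eclipse; the helper names `leading_bom` and `write_temp_html` are made up for this example. It compares a truly empty file with one that contains only a UTF-8 BOM.

```python
import codecs
import os
import tempfile

# Longer BOMs come first: the UTF-32 LE BOM begins with the UTF-16 LE BOM.
BOMS = [
    ("UTF-32 BE", codecs.BOM_UTF32_BE),
    ("UTF-32 LE", codecs.BOM_UTF32_LE),
    ("UTF-8", codecs.BOM_UTF8),
    ("UTF-16 BE", codecs.BOM_UTF16_BE),
    ("UTF-16 LE", codecs.BOM_UTF16_LE),
]

def leading_bom(path):
    """Return the name of the BOM at the start of the file, or None."""
    with open(path, "rb") as f:
        head = f.read(4)
    for name, bom in BOMS:
        if head.startswith(bom):
            return name
    return None

def write_temp_html(data):
    """Write raw bytes to a temporary .html file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".html")
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    return path

truly_empty = write_temp_html(b"")           # comparable to `touch test.html`
bom_only = write_temp_html(codecs.BOM_UTF8)  # shows no text in an editor

# Record (size in bytes, detected BOM) for each file, then clean up.
results = [(os.path.getsize(p), leading_bom(p)) for p in (truly_empty, bom_only)]
for p in (truly_empty, bom_only):
    os.remove(p)

print(results)
```

A file with a size of zero bytes cannot contain a BOM, so checking the size alone is enough for a freshly touched file.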
>
> This is a major headache for me, apart from the need to manually reset
> the content type on every such file, in that editing and saving a file
> results in permanently corrupted Japanese text.
>
> settings:
> ---------
> env var LANG = ja_JP.utf8
>
> parent folder text file encoding set to UTF-8
>
> preferences:web:html files encoding set to UTF-8
>
> preferences:general:content types:text:html default encoding for all
> file associations set to UTF-8
>
> Am I missing something somewhere? Is this a bug? (Ain't no feature in
> my book :) )
It might well be, but if this is a question about WTP's editor, it's
better to ask there on their newsgroup.
>
> Thanks,
> Ed
Ed Merks
Professional Support: https://www.macromodeling.com/
Re: .html files always Shift_JIS regardless of preferences or content [message #334198 is a reply to message #334196]
Mon, 26 January 2009 13:37
Ed Wright Messages: 11 Registered: July 2009
Junior Member
Hi Ed,
Also comments below...
On 2009/01/26 20:45, Ed Merks wrote:
>> Eclipse 3.4 on Linux:
>>
>> If I create a new *empty* .html file, it is always created with a file
>> encoding of Shift_JIS, "determined from content".
> How do you create it? Are you using WTP's editor or just Eclipse's text
> editor (which isn't XML aware)?
Doesn't matter. I can "touch test.html" from the Linux command line,
creating a stone empty file; "refresh" in Eclipse; and the file is
marked Shift_JIS. However, generally, I use Eclipse Navigator's New/ File.
>> If I copy an existing html snippet (no headers), which has UTF-8 content
> Once you're working with text, you're working with characters (or
> codepoints) where there is simply no such notion as it being UTF-8 which
> is a concept that applies to the manner in which those characters are
> encoded as bytes.
Understood. The main point I wanted to make was that I manually set the
encoding to UTF-8 on the original file, but when I used copy/paste
within Eclipse, the encoding gets changed to Shift_JIS. I would expect a
file copy (within Navigator) to copy properties as well as contents.
>> , for which I have manually set the encoding to UTF-8, to another
>> location, the encoding is reset to Shift_JIS.
> I don't get this comment. The characters should be copied into the
> editor without affecting the encoding the editor remembers. That's not
> the case?
I'm not copy/pasting file contents from file to file, rather I am
copy/pasting the file itself from directory A to directory B.
The actual contents are correctly/accurately copied. The problem is that
Eclipse now thinks the file is Shift_JIS encoded and will open and edit
and save the file as Shift_JIS unless I manually reset the encoding.
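The corruption described here can be reproduced outside Eclipse with a small Python sketch. It assumes that undecodable bytes are substituted with a replacement character on open and on save; how Eclipse's editors actually handle such bytes is not confirmed in this thread, so treat this as an illustration of the failure mode only.

```python
original = "日本語"
on_disk = original.encode("utf-8")  # the file really contains UTF-8 bytes

# Step 1: the file is opened with the wrong encoding.
# Assumption: bytes that are invalid in Shift_JIS are replaced, not rejected.
shown = on_disk.decode("shift_jis", errors="replace")

# Step 2: the (already wrong) characters are saved back as Shift_JIS.
saved = shown.encode("shift_jis", errors="replace")

# Step 3: later, the file is read as UTF-8 again.
recovered = saved.decode("utf-8", errors="replace")

print(repr(shown))
print(on_disk.hex(" "))
print(saved.hex(" "))
print(recovered == original)
```

Once any byte has been replaced during the wrong-encoding round trip, the saved bytes no longer match the original UTF-8 data, so resetting the encoding afterwards cannot bring the text back.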
>>
>> An annoying side note is that when I do reset the encoding type,
> How do you do that?
In the "Navigator" I right click the file name, select "properties"
which displays "Resource" where the file encoding is set/settable.
>> Eclipse tells me UTF-8 conflicts with the content type *even if the
>> file is empty or contains UTF-8 content*.
> An empty file could contain a byte order marker that would be hard to
> notice...
Could, but doesn't :) - See my first note above.
>> Am I missing something somewhere? Is this a bug? (Ain't no feature in
>> my book :) )
> It might well be, but if this is a question about WTP's editor, it's
> better to ask there on their newsgroup.
I wouldn't *think* it would be an editor issue as the encoding seems to
be set regardless of what editor I use. However, it *does* seem to be
specific to .html files. (If I repeat the procedure in my first note
above, only touch "test.txt" instead, the file is marked as UTF-8
encoded, as expected.)
Any additional thoughts greatly appreciated. And if you think it is
likely to be an editor issue, I'll repost over there. (Would that be
eclipse.webtools?)
Thanks again.
Ed
Re: .html files always Shift_JIS regardless of preferences or content [message #334204 is a reply to message #334198]
Mon, 26 January 2009 16:02
Ed Merks Messages: 33252 Registered: July 2009
Senior Member
Ed,
Yes, if it's specific to the WTP editor, I'd ask on eclipse.webtools.
Certainly for XML files, I'd expect the content type to be determined by
the XML header...
Ed Merks
Professional Support: https://www.macromodeling.com/