[udig-devel] Re: Why utf-8 (0.8 feedback)
Hey Jesse,
I read your answer to be: "ASCII was not enough so I guessed UTF-16
would be better, seemed one step beyond UTF-8".
The above statement is totally *wrong*.
UTF-16 is not bigger than UTF-8! It's a completely different mapping of
the same character set (think ASCII vs. EBCDIC), and is not used that
often. UTF-8 maps the whole Universal Character Set! UTF-8 is
brilliant! There are a few (very few) reasons to use something else but
probably not ones that are relevant here.
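To make the "same character set, different mapping" point concrete, here is a
minimal sketch in Python (not the language uDig is written in, just the
shortest way to show it). The sample strings and the helper names
encoded_sizes and round_trips are made up for illustration. Every sample
survives a round trip through both encodings; only the byte counts differ.

```python
# Sample strings are illustrative only; escapes keep this file pure ASCII.
SAMPLES = [
    "udig",                # plain ASCII
    "caf\u00e9",           # Latin letter with an accent
    "\u65e5\u672c\u8a9e",  # three CJK characters
    "\U0001F600",          # a character outside the Basic Multilingual Plane
]


def encoded_sizes(text):
    """Return (utf8_bytes, utf16_bytes) for text, not counting a UTF-16 BOM."""
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))


def round_trips(text):
    """True if text decodes back unchanged from both UTF-8 and UTF-16."""
    return all(
        text.encode(enc).decode(enc) == text for enc in ("utf-8", "utf-16")
    )


if __name__ == "__main__":
    for sample in SAMPLES:
        utf8_len, utf16_len = encoded_sizes(sample)
        print(ascii(sample), utf8_len, utf16_len, round_trips(sample))
```

ASCII-heavy text comes out about half the size in UTF-8, while CJK-heavy text
is somewhat smaller in UTF-16. Neither encoding can represent a character the
other cannot.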
If you don't have time for this, pick UTF-8. Check that it works on
Windows and you should be fine. XML people have thought a lot about this
and UTF-8 is becoming the world-wide standard.
If you didn't know what UTF-16 was, you owe yourself a few hours of
education. Seriously. This is not meant as a criticism. We all need to
understand Unicode. You can start with Joel's column which is, for me,
not quite enough. Budget a few hours, over separate days, on the FAQ.
Think of this as refreshing your education on what ASCII was all
about :-).
The Absolute Minimum Every Software Developer Absolutely,
Positively Must Know About Unicode and Character Sets (No
Excuses!)
http://www.joelonsoftware.com/articles/Unicode.html
UTF-8 and Unicode FAQ for Unix/Linux
http://www.cl.cam.ac.uk/~mgk25/unicode.html
Unicode
http://en.wikipedia.org/wiki/Unicode
ciao,
adrian
Jody Garnett wrote:
> Adrian Custer wrote:
>
>> 8) Why are .udig files utf-16?
>> Wouldn't utf-8 be more readable, compact and generally desirable?
>>
>>
> Unsure - may be an oversight or it may be something required as we
> move between Windows and Linux. The whole point of project files is to
> be able to email them to others.
>
> Let's ask Jesse as this is EMF generated code.
>
I am only concerned about the restrictions of UTF-8 because there is a
more limited number of characters permitted, but I don't know too much
about the specification. I acknowledge that it is more readable and
compact, but how much would it impact our ability to internationalize?
From a programming point of view it is not important; it requires a
total of 1 line of change. If we could have a vote so I can get some
opinions that'd be great.
The default was ASCII and I found that certain URLs couldn't be encoded
correctly, so I received a number of bug reports about maps not being
saved correctly. I bumped it to UTF-16 to virtually guarantee that that
problem would no longer occur, but maybe that was overkill.
Jesse
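
The failure described above can be reproduced with a small Python sketch. The
thread does not say which URLs failed, so this assumes the cause was a
non-ASCII character in the URL. The URL and the helper try_encode are
hypothetical.

```python
def try_encode(text, encoding):
    """Return the encoded bytes, or None if the encoding cannot hold text."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None


# Hypothetical URL with one non-ASCII character (u-umlaut, U+00FC).
URL = "http://example.org/wms?layer=Z\u00fcrich"

if __name__ == "__main__":
    for enc in ("ascii", "utf-8", "utf-16"):
        data = try_encode(URL, enc)
        print(enc, "cannot encode" if data is None else f"{len(data)} bytes")
```

Under that assumption ASCII rejects the string outright, while UTF-8 and
UTF-16 both store it. UTF-8 spends one extra byte on the accented letter;
UTF-16 spends two bytes on every character plus a byte-order mark.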