Re: [equinox-dev] Equinox and UTF-8
Hello,
On Thu, 10 Jul 2008 03:01:40 -0400, BJ Hargrave <hargrave@xxxxxxxxxx>
wrote:
> Well you should not be getting bytes from a String. A String is a set of 
> Characters. Some characters may fit into bytes, but some are wider.
That is correct.
 
> Also, remember that the length of  a String is the number of characters 
> not the number of bytes into which those characters may be encoded.
I agree with you. The output
=== cut ===
§ length() = 1
§ cast to byte = -89 
§ getBytes() = -62 -89
=== cut ===
is correct. However, I get wrong output when I run the same code outside
Eclipse, as a standalone Equinox application:
=== cut ===
+é-º length() = 2
+é-º cast to byte = -62 -89
+é-º getBytes() = -61 -126 -62 -89
=== cut ===
The results of both length() and getBytes() are wrong.
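
The second set of numbers can be reproduced in isolation. The sketch below assumes (this is a guess, not a confirmed diagnosis) that the UTF-8 bytes of § get decoded as ISO-8859-1 somewhere before the String is created, for example because the standalone launch uses a different default charset. The class and method names are hypothetical and exist only for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class EncodingDemo {
    // Hypothetical helper: encode as UTF-8, then decode those bytes
    // with the wrong charset (ISO-8859-1), one byte per character.
    static String misdecode(String s) {
        byte[] utf8Bytes = s.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    // Encode with an explicit charset instead of the platform default.
    static byte[] utf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Unicode escape, so the result does not depend on the source file encoding.
        String good = "\u00A7";
        String bad = misdecode(good);

        System.out.println(good.length());                // 1
        System.out.println(Arrays.toString(utf8(good)));  // [-62, -89]
        System.out.println(bad.length());                 // 2
        System.out.println(Arrays.toString(utf8(bad)));   // [-61, -126, -62, -89]
    }
}
```

If the assumption holds, the String is already wrong before length() or getBytes() is called, which would match both symptoms.
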
I discovered this behaviour while encoding Strings as Base64. The Base64
class in Apache Commons Codec takes a byte[] as input, so the resulting
Base64 String differs between the two execution environments because
getBytes() returns different results in each.
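
A minimal sketch of how the byte[] feeds into the Base64 result. It uses the JDK's java.util.Base64 as a stand-in for the Commons Codec class (an assumption made to keep the example self-contained) and passes an explicit charset to getBytes() so that the platform default does not matter. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class Base64Demo {
    // Hypothetical helper: Base64 of the UTF-8 bytes of s,
    // independent of the platform default charset.
    static String encodeUtf8(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        return Base64.getEncoder().encodeToString(bytes);
    }

    public static void main(String[] args) {
        // The single character § (bytes -62 -89).
        System.out.println(encodeUtf8("\u00A7"));        // wqc=
        // The two-character String seen in the standalone run
        // (bytes -61 -126 -62 -89).
        System.out.println(encodeUtf8("\u00C2\u00A7"));  // w4LCpw==
    }
}
```

An explicit charset only helps if the String itself is intact. If it already contains two characters, as in the second call, the Base64 output still differs.
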
-- 
Holger Mense