Re: [cdt-dev] Of Char[] and String

Hi, I just want to add my two cents to this discussion.


1) A char[] "constant" is not actually constant. There is nothing stopping anyone from changing the contents of the array. The 'final' modifier affects only the reference to the array, not the array elements.

For example

private static final char[] ONE = "1".toCharArray(); //$NON-NLS-1$

There's nothing stopping you from writing some evil code...

ONE[0] = '2';
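
A minimal sketch of the point above, assuming a hypothetical class name (FinalArrayDemo) and a hypothetical accessor (one()) that are not part of CDT. It shows that a final array can still have its elements overwritten, and that handing out a copy is one way to keep a shared constant intact.

```java
class FinalArrayDemo {
    // 'final' fixes the reference; the element ONE[0] can still be overwritten.
    private static final char[] ONE = "1".toCharArray();

    /** Hypothetical accessor: returns a defensive copy of the shared constant. */
    static char[] one() {
        return ONE.clone();
    }

    /** Mutates an element of a final local array and returns the new value. */
    static char mutateLocalFinal() {
        final char[] local = "1".toCharArray();
        local[0] = '2';          // legal: an element is assigned, not the reference
        // local = new char[0];  // would not compile: the reference is final
        return local[0];
    }

    public static void main(String[] args) {
        System.out.println(mutateLocalFinal()); // prints 2

        char[] copy = one();
        copy[0] = '9';                          // changes only the copy
        System.out.println(one()[0]);           // prints 1
    }
}
```

The copy costs an allocation per call, so it is a trade-off rather than a free fix.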


2) The fact that strings are immutable and backed by a char[] actually makes certain string operations, like substring(), very fast and space efficient. In the Java 6 class library, a String object consists of a char[] plus an offset and a count that mark where the string starts and how long it is. When you call substring(), the String object that is returned shares the underlying char[] but has a different offset and count.

http://stackoverflow.com/questions/93091/why-cant-string-be-mutable-in-java-and-net
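
The sharing scheme in point 2 can be sketched with a small class. CharSlice and its methods are hypothetical names made up for illustration; this is not the JDK's String source, and later JDK releases changed substring() to copy the characters instead of sharing them.

```java
/** Hypothetical immutable view over a shared char[] (offset + length). */
final class CharSlice {
    private final char[] chars;  // shared between slices, never modified
    private final int offset;    // index of the first character of this slice
    private final int length;    // number of characters in this slice

    CharSlice(String s) {
        this(s.toCharArray(), 0, s.length());
    }

    private CharSlice(char[] chars, int offset, int length) {
        this.chars = chars;
        this.offset = offset;
        this.length = length;
    }

    /** Returns the slice [begin, end) without copying any characters. */
    CharSlice subSlice(int begin, int end) {
        if (begin < 0 || end > length || begin > end) {
            throw new IndexOutOfBoundsException();
        }
        return new CharSlice(chars, offset + begin, end - begin);
    }

    /** True if both slices are backed by the very same array object. */
    boolean sharesStorageWith(CharSlice other) {
        return chars == other.chars;
    }

    @Override
    public String toString() {
        return new String(chars, offset, length);
    }
}
```

Sharing is only safe because nothing ever writes to the array after construction. The cost is that a tiny slice keeps the whole backing array reachable.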


In any case, like Markus said, it's really not been a problem. Yes, having char[] constants may look ugly, but it's not hurting anything. And it has been proven for us that using char[] in the parser is a big memory and speed improvement over String.

Now I'd like to take this opportunity to plug the awesome CharArrayMap class that I wrote. It allows you to use char arrays and sections of char arrays as keys in a Map, removing the need to convert them into Strings.
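
To illustrate the general idea of keying a map on a section of a char array, here is a rough sketch using a plain HashMap. CharRangeKey and CharRangeKeyDemo are hypothetical names; this is not the CharArrayMap API and makes no claim about how CharArrayMap works internally.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical key: the range chars[start .. start+length), compared by content. */
final class CharRangeKey {
    private final char[] chars;
    private final int start;
    private final int length;

    CharRangeKey(char[] chars, int start, int length) {
        if (start < 0 || length < 0 || start > chars.length - length) {
            throw new IndexOutOfBoundsException();
        }
        this.chars = chars;
        this.start = start;
        this.length = length;
    }

    @Override
    public int hashCode() {
        int h = 0;
        for (int i = 0; i < length; i++) {
            h = 31 * h + chars[start + i];
        }
        return h;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof CharRangeKey)) {
            return false;
        }
        CharRangeKey other = (CharRangeKey) o;
        if (length != other.length) {
            return false;
        }
        for (int i = 0; i < length; i++) {
            if (chars[start + i] != other.chars[other.start + i]) {
                return false;
            }
        }
        return true;
    }
}

class CharRangeKeyDemo {
    public static void main(String[] args) {
        Map<CharRangeKey, String> kinds = new HashMap<>();
        char[] source = "int foo".toCharArray();
        kinds.put(new CharRangeKey(source, 0, 3), "keyword");

        // Look up the same three characters found in a different buffer.
        char[] other = "print".toCharArray();
        System.out.println(kinds.get(new CharRangeKey(other, 2, 3)));  // prints keyword
        System.out.println(kinds.get(new CharRangeKey(source, 4, 3))); // prints null
    }
}
```

The sketch never builds a String for a key, but it does allocate a small key object per lookup, and the backing array must not be modified while a key that refers to it is stored in the map.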



Mike Kucera
Software Developer
IBM Eclipse CDT Team
mkucera@xxxxxxxxxx



From: Doug Schaefer <cdtdoug@xxxxxxxxx>
To: "CDT General developers list." <cdt-dev@xxxxxxxxxxx>
Date: 07/21/2009 09:26 AM
Subject: Re: [cdt-dev] Of Char[] and String




You can do whatever you'd like with your constants. But for the tokens created by the parser, using char[]'s just makes sense.

On Tue, Jul 21, 2009 at 3:58 AM, Schorn, Markus <Markus.Schorn@xxxxxxxxxxxxx> wrote:
    It is no problem to represent the constants either as Strings or as char[]. There are
    only a limited number of them.
    However, the AST needs to represent a lot of names, and for those we cannot afford the additional memory needed by
    String objects. The char[] constants are more or less a consequence of having to work with char[] here and there.
     
    Markus.


    From: cdt-dev-bounces@xxxxxxxxxxx [mailto:cdt-dev-bounces@xxxxxxxxxxx] On Behalf Of Alex Blewitt
    Sent: Monday, July 20, 2009 8:44 PM
    To: CDT General developers list.
    Subject: [cdt-dev] Of Char[] and String
    Importance: Low

    As a general observation, I'm confused by the amount of char[] usage in the CDT codebase. Is this a general consequence of C programmers working in Java, or are there underlying reasons? I happened to come across this today:

        private static final char[] EMPTY_CHAR_ARRAY = new char[0];
        private static final char[] ONE = "1".toCharArray(); //$NON-NLS-1$

    The problem with char[] is that it's generally less efficient for storage than the underlying String model, and in any case you end up with the String being backed by a similar array in the first place (which is then interned).

    Consider the following class:

    public class Test {
      public static final char[] foo = "1".toCharArray();
      // public static final char[] foo = {'1'};
      // public static final String foo = "1";
    }

    If I compile this (Mac OS X with Java 6) I get the following sizes of class file generated:

    char[] with toCharArray() = 329 bytes
    char[] with in-line array = 272 bytes
    String = 248 bytes

    What I can't understand is why we have the string "1" (which will take up space in the class's intern pool) and then take up more space than if we'd just used the string on its own.

    There's probably a reason, but one that isn't immediately obvious to me. Perhaps someone could enlighten me? It's probably all related to the fact that Token has a char[] getCharImage(), but that in itself just leads to the question 'why doesn't that return a String ...'

    Alex 

    _______________________________________________
    cdt-dev mailing list

    cdt-dev@xxxxxxxxxxx
    https://dev.eclipse.org/mailman/listinfo/cdt-dev