I couldn't find this in the API docs or by searching the web. I know the JVM keeps strings in a string pool to optimize memory usage, but I can't figure out how it stores them under the hood. Since a String is immutable, does toCharArray() give me a copy of the internal array stored in the String object in the pool? (If so, then every operation that gets the string as a char array is O(n).) Also, does charAt(i) use the string's internal array, or does it copy it to a new array and return the char at position i of the copy?

  • You should probably also tell us whether you are asking this for Java 8, Java 7, or something else (and ideally it would be for Java 8 or later). Commented Jul 21, 2018 at 13:03
  • The source is included with the JDK, you could just look at that. Commented Jul 21, 2018 at 13:43

1 Answer

  1. Through Java 8, a String was internally backed by an array of characters (char[]) encoded in UTF-16, so every char takes two bytes of memory.
  2. When we create a String with the new operator, the JVM creates a new object on the heap, even if an equal string is already in the pool. For example:

    String str1 = new String("Hello");
    

    When we assign a string literal to a variable, the JVM searches the pool for a String of equal value. If one is found, the JVM simply returns a reference to it without allocating additional memory. If not, the string is added to the pool and a reference to it is returned.

    String str2 = "Hello";
    
  3. toCharArray() allocates a new char[] and copies the string's characters into it, so it is O(n) and changes to the returned array never affect the String.

  4. charAt(int index) reads the value at the given index directly from the internal (original) array without copying it, so it is O(1).
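
If you want to check points 2 to 4 yourself, here is a minimal sketch. The class and method names (StringCopyDemo, copyIsIndependent, freshArrayEachCall) are made up for this illustration and are not part of the JDK; it only relies on the documented behavior of toCharArray(), charAt() and intern().

```java
class StringCopyDemo {
    // Hypothetical helper: true if mutating the array returned by
    // toCharArray() leaves the original String unchanged.
    static boolean copyIsIndependent(String s) {
        char[] copy = s.toCharArray();      // fresh array, O(n) copy
        if (copy.length == 0) {
            return true;                    // nothing to mutate
        }
        copy[0] = (char) (copy[0] + 1);     // change only the copy
        return s.charAt(0) != copy[0];      // the String still has the old char
    }

    // Hypothetical helper: true if two calls return two distinct arrays.
    static boolean freshArrayEachCall(String s) {
        return s.toCharArray() != s.toCharArray();
    }

    public static void main(String[] args) {
        String literal1 = "Hello";
        String literal2 = "Hello";
        String viaNew = new String("Hello");

        System.out.println(literal1 == literal2);         // true: same pooled object
        System.out.println(literal1 == viaNew);           // false: separate heap object
        System.out.println(literal1 == viaNew.intern());  // true: intern() returns the pooled one

        System.out.println(copyIsIndependent(literal1));  // true
        System.out.println(freshArrayEachCall(literal1)); // true
    }
}
```

The == comparisons are used here on purpose to compare references; for comparing string contents you would still use equals().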

Java 9 introduced a new representation called Compact Strings. A String is now backed by a byte[] plus a coder flag: strings whose characters all fit in Latin-1 are stored at one byte per character, and all other strings are stored in UTF-16 at two bytes per char. Since UTF-16 is used only when necessary, the amount of heap memory taken by strings is often much lower, which in turn reduces garbage collection overhead on the JVM.
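
To get a feel for when the compact form applies, here is a rough sketch of the decision described above. It is not the JDK's actual code: fitsInLatin1 and estimatedPayloadBytes are hypothetical helpers, and the estimate covers only the character data, ignoring object and array headers.

```java
class CompactStringSketch {
    // Hypothetical helper: true if every char fits in one byte (Latin-1 range).
    static boolean fitsInLatin1(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0xFF) {
                return false;
            }
        }
        return true;
    }

    // Assumption: one byte per char when compactable, two bytes per char otherwise.
    static int estimatedPayloadBytes(String s) {
        return fitsInLatin1(s) ? s.length() : 2 * s.length();
    }

    public static void main(String[] args) {
        String className = "java.lang.String";         // typical metadata-like string
        String hebrew = "\u05e9\u05dc\u05d5\u05dd";    // four chars outside Latin-1

        System.out.println(estimatedPayloadBytes(className)); // 16
        System.out.println(estimatedPayloadBytes(hebrew));    // 8
    }
}
```

Note that a single char above 0xFF is enough to put the whole string in the two-byte form under this model.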

Source: http://www.baeldung.com/java-string-pool


3 Comments

Good info, and a doc reference would also be nice here.
Not everyone lives in the USA and western Europe. For most of the world char[] will be used and there will be no saving.
@JonathanRosenne I don't think so. Look at the heap dump of a random Java program - it has a lot of strings, and only a small part of them represent actual text that a human would see. A lot of it is e.g. metadata about classes; and a lot of it could fit into a byte array.
