I couldn't find this in the API docs or by searching the web. I know the JVM keeps every String object it has in a string pool in order to optimize memory usage, but I can't figure out how it stores them under the hood. Since a String is immutable, does toCharArray() give me a copy of the internal array that is stored in the String object in the pool? (If so, then every operation that obtains the string's contents as a char array is O(n).) Also, does charAt(i) use the internal array of the string, or does it copy it to a new array and return the char at position i of the copy?
- You should probably also tell us whether you are asking this for Java 8, Java 7, or something else (and ideally it would be for Java 8 or later). – Tim Biegeleisen, Jul 21, 2018 at 13:03
- The source is included with the JDK, you could just look at that. – Mark Rotteveel, Jul 21, 2018 at 13:43
1 Answer
- Up to and including Java 8, Strings were internally represented as an array of characters (char[]) encoded in UTF-16, so every char uses two bytes of memory.
When we create a String with the new operator, a new object is allocated on the heap at run time, even if an equal string is already in the pool. For example:

    String str1 = new String("Hello");

When we assign a string literal to a variable, the JVM searches the pool for a String of equal value. If one is found, a reference to the pooled object is returned and no additional memory is allocated. If not, the string is added to the pool and a reference to it is returned.

    String str2 = "Hello";

toCharArray() creates a new char[] array and copies the characters of the original array into it, so it is O(n). charAt(int index) returns the value at the specified index of the internal (original) array without copying anything, so it is O(1).
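The behaviour described above can be checked with a small program. This is only an illustrative sketch: the class name StringPoolDemo and the helper replaceFirstChar are made up for this example and are not part of the JDK.

```java
public class StringPoolDemo {

    // Hypothetical helper: modifies the array returned by toCharArray()
    // and builds a new String from it. The original String is untouched
    // because toCharArray() hands back a fresh copy.
    static String replaceFirstChar(String s, char c) {
        char[] copy = s.toCharArray(); // O(n): new array is allocated and filled
        copy[0] = c;                   // changes the copy only
        return new String(copy);
    }

    public static void main(String[] args) {
        String literal1 = "Hello";             // taken from the string pool
        String literal2 = "Hello";             // same pooled object
        String heapCopy = new String("Hello"); // separate object on the heap

        System.out.println(literal1 == literal2);          // true
        System.out.println(literal1 == heapCopy);          // false
        System.out.println(literal1 == heapCopy.intern()); // true
        System.out.println(literal1.equals(heapCopy));     // true

        System.out.println(replaceFirstChar(literal1, 'J')); // Jello
        System.out.println(literal1);                        // Hello
        System.out.println(literal1.charAt(0));              // H (read in place, no copy)
    }
}
```

Note that == compares references here, which is exactly what makes the pooling visible; for comparing contents, equals() is still the right call.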
Java 9 introduced a new representation called Compact Strings. A String is now always backed by a byte[] together with a flag that records the encoding: Latin-1 (one byte per character) if every character fits, UTF-16 (two bytes per character) otherwise. Since UTF-16 is used only when necessary, strings that contain only Latin-1 characters need roughly half the heap memory for their character data, which in turn means less Garbage Collector overhead on the JVM.
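To make the space saving concrete, here is a rough sketch of the idea behind Compact Strings. It is not the JDK's actual implementation: the class CompactStringSketch and its methods fitsLatin1 and encode are hypothetical names, and the real String class makes this decision internally when the object is constructed.

```java
import java.nio.charset.StandardCharsets;

public class CompactStringSketch {

    // Hypothetical check: true if every char fits in one byte (Latin-1 range).
    static boolean fitsLatin1(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0xFF) {
                return false;
            }
        }
        return true;
    }

    // Hypothetical encoder: one byte per char when possible, two otherwise.
    // UTF_16BE is used here only because it adds no byte-order mark;
    // the byte order the JVM uses internally is an implementation detail.
    static byte[] encode(String s) {
        return fitsLatin1(s)
                ? s.getBytes(StandardCharsets.ISO_8859_1)
                : s.getBytes(StandardCharsets.UTF_16BE);
    }

    public static void main(String[] args) {
        System.out.println(encode("Hello").length);      // 5
        System.out.println(encode("He\u0142lo").length); // 10
    }
}
```

A single character outside the Latin-1 range (here U+0142) is enough to push the whole string to two bytes per character.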