Is a Java string really immutable?

Java Problem Overview


We all know that String is immutable in Java, but check the following code:

String s1 = "Hello World";  
String s2 = "Hello World";  
String s3 = s1.substring(6);  
System.out.println(s1); // Hello World  
System.out.println(s2); // Hello World  
System.out.println(s3); // World  

Field field = String.class.getDeclaredField("value");  
field.setAccessible(true);  
char[] value = (char[])field.get(s1);  
value[6] = 'J';  
value[7] = 'a';  
value[8] = 'v';  
value[9] = 'a';  
value[10] = '!';  
     
System.out.println(s1); // Hello Java!  
System.out.println(s2); // Hello Java!  
System.out.println(s3); // World  

Why does this program behave like this? And why are the values of s1 and s2 changed, but not that of s3?

Java Solutions


Solution 1 - Java

String is immutable* but this only means you cannot change it using its public API.

What you are doing here is circumventing the normal API, using reflection. The same way, you can change the values of enums, change the lookup table used in Integer autoboxing etc.

Now, the reason s1 and s2 change value, is that they both refer to the same interned string. The compiler does this (as mentioned by other answers).

The reason s3 does not change was actually a bit surprising to me, as I thought it would share the value array (it did in earlier versions of Java, before Java 7u6). However, looking at the source code of String, we can see that the value character array for a substring is actually copied (using Arrays.copyOfRange(..)). This is why it goes unchanged.
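
The effect of that copy can be illustrated on a plain char array, without any reflection. This is only a minimal sketch: the class name CopyOfRangeDemo is made up for illustration, and the code mimics the copying described above rather than reproducing the actual String source.

```java
import java.util.Arrays;

class CopyOfRangeDemo {
    // Returns the original array and the copied range as text,
    // after the original array has been overwritten.
    static String copiedRangeAfterMutation() {
        char[] original = "Hello World".toCharArray();
        // Copy indices 6..10 ("World") into a brand-new array.
        char[] copy = Arrays.copyOfRange(original, 6, original.length);
        // Overwrite the same positions in the original array.
        "Java!".getChars(0, 5, original, 6);
        // The copy has its own storage, so it still holds the old characters.
        return new String(original) + " / " + new String(copy);
    }

    public static void main(String[] args) {
        System.out.println(copiedRangeAfterMutation()); // Hello Java! / World
    }
}
```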

You can install a SecurityManager to prevent malicious code from doing such things. But keep in mind that some libraries depend on these kinds of reflection tricks (typically ORM tools, AOP libraries, etc.).

*) I initially wrote that Strings aren't really immutable, just "effectively immutable". This might be misleading in the current implementation of String, where the value array is indeed marked private final. It's still worth noting, though, that there is no way to declare an array in Java as immutable, so care must be taken not to expose it outside its class, even with the proper access modifiers.
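
To make the point about arrays concrete, here is a small sketch. The class ArrayExposure and its method names are hypothetical; it only shows that private final protects the reference, not the array contents, and that handing out a copy avoids the problem.

```java
class ArrayExposure {
    // 'final' fixes the reference, not the elements of the array.
    private final char[] value = {'a', 'b', 'c'};

    // Leaks the internal array: callers can modify it despite 'private final'.
    char[] leakyValue() {
        return value;
    }

    // Hands out a copy, so the internal array stays untouched.
    char[] safeValue() {
        return value.clone();
    }

    String text() {
        return new String(value);
    }

    public static void main(String[] args) {
        ArrayExposure e = new ArrayExposure();
        e.safeValue()[0] = 'X';       // modifies only the copy
        System.out.println(e.text()); // abc
        e.leakyValue()[0] = 'X';      // modifies the internal array
        System.out.println(e.text()); // Xbc
    }
}
```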


As this topic seems overwhelmingly popular, here's some suggested further reading: Heinz Kabutz's Reflection Madness talk from JavaZone 2009, which covers a lot of the issues in the OP, along with other reflection... well... madness.

It covers why this is sometimes useful. And why, most of the time, you should avoid it. :-)

Solution 2 - Java

In Java, if two String variables are initialized with the same literal, the same reference is assigned to both variables:

String test1 = "Hello World";
String test2 = "Hello World";
System.out.println(test1 == test2); // true

That is why the comparison returns true. The third string is created using substring(), which makes a new string instead of pointing to the same one.

When you access a string using reflection, you get the actual pointer:

Field field = String.class.getDeclaredField("value");
field.setAccessible(true);

So a change made through this reference changes every string that points to the same array, but since s3 was created as a new string by substring(), it does not change.

Solution 3 - Java

You are using reflection to circumvent the immutability of String - it's a form of "attack".

There are lots of examples like this that you can create (e.g. you can even instantiate a Void object), but that doesn't mean that String is not "immutable".

There are use cases where this type of code may be used to your advantage and be "good coding", such as clearing passwords from memory at the earliest possible moment (before GC).
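
For comparison, a common reflection-free way to get the same effect is to keep the secret in a char[] from the start and overwrite it when done. Note that this is a different technique from the reflective clearing of a String mentioned above; the class name PasswordWipe and the sample secret are made up for illustration.

```java
import java.util.Arrays;

class PasswordWipe {
    // Overwrites every character so the secret no longer sits in this array.
    static void wipe(char[] secret) {
        Arrays.fill(secret, '\0');
    }

    public static void main(String[] args) {
        char[] password = {'s', '3', 'c', 'r', '3', 't'}; // hypothetical secret
        // ... use the password here ...
        wipe(password);
        // All six slots now hold the NUL character.
        System.out.println(password[0] == '\0');
    }
}
```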

Depending on the security manager, you may not be able to execute your code.

Solution 4 - Java

You are using reflection to access the "implementation details" of the String object. Immutability is a feature of the public interface of an object.

Solution 5 - Java

Visibility modifiers and final (i.e. immutability) are not a safeguard against malicious code in Java; they are merely tools to protect against mistakes and to make the code more maintainable (one of the big selling points of the system). That is why you can access internal implementation details like the backing char array for Strings via reflection.

The second effect you see is that all Strings change while it looks like you only change s1. It is a certain property of Java String literals that they are automatically interned, i.e. cached. Two String literals with the same value will actually be the same object. When you create a String with new it will not be interned automatically and you will not see this effect.
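
A short sketch of that interning behaviour, using only reference comparisons (the class name InternDemo is made up for illustration):

```java
import java.util.Arrays;

class InternDemo {
    static boolean[] compare() {
        String a = "Hello World";
        String b = "Hello World";
        String c = new String("Hello World");
        return new boolean[] {
            a == b,          // both literals refer to the same pooled instance
            a == c,          // 'new' creates a distinct object
            a == c.intern(), // intern() returns the pooled instance
            a.equals(c)      // the contents are equal in every case
        };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(compare())); // [true, false, true, true]
    }
}
```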

#substring until recently (Java 7u6) worked in a similar way, which would have explained the behaviour in the original version of your question. It didn't create a new backing char array but reused the one from the original String; it just created a new String object that used an offset and a length to present only a part of that array. This generally worked as Strings are immutable - unless you circumvent that. This property of #substring also meant that the whole original String couldn't be garbage collected when a shorter substring created from it still existed.
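
The offset-and-length design can be sketched with a small hypothetical class. SharedSlice is not the old String source, just an illustration of the idea under that assumption: a sub-slice reuses the same backing array, so a change to the array shows up in both objects.

```java
class SharedSlice {
    private final char[] value; // backing array, shared between slices
    private final int offset;   // index of the first character of this slice
    private final int count;    // number of characters in this slice

    SharedSlice(char[] value, int offset, int count) {
        this.value = value;
        this.offset = offset;
        this.count = count;
    }

    // Creates a view on the same array instead of copying it.
    SharedSlice sub(int begin) {
        return new SharedSlice(value, offset + begin, count - begin);
    }

    @Override
    public String toString() {
        return new String(value, offset, count);
    }

    public static void main(String[] args) {
        char[] backing = "Hello World".toCharArray();
        SharedSlice whole = new SharedSlice(backing, 0, backing.length);
        SharedSlice tail = whole.sub(6);
        "Java!".getChars(0, 5, backing, 6);      // modify the shared array
        System.out.println(whole + " / " + tail); // Hello Java! / Java!
    }
}
```

The sketch also shows the memory downside mentioned above: tail keeps the whole backing array reachable even though it presents only five characters.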

As of current Java and your current version of the question there is no strange behaviour of #substring.

Solution 6 - Java

String immutability is from the interface perspective. You are using reflection to bypass the interface and directly modify the internals of the String instances.
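
The same bypass can be shown on a class of your own, which avoids depending on the internals of java.lang.String (newer JDKs restrict reflective access to them). In this sketch, Greeting is a hypothetical class whose private final char[] field stands in for the value field from the question.

```java
import java.lang.reflect.Field;

class ReflectionDemo {
    // Hypothetical stand-in for String: no public way to change 'value'.
    static final class Greeting {
        private final char[] value;

        Greeting(String text) {
            this.value = text.toCharArray();
        }

        @Override
        public String toString() {
            return new String(value);
        }
    }

    static String mutateViaReflection() {
        try {
            Greeting g = new Greeting("Hello World");
            Field field = Greeting.class.getDeclaredField("value");
            field.setAccessible(true);            // bypass the 'private' modifier
            char[] value = (char[]) field.get(g); // the very array the object holds
            "Java!".getChars(0, 5, value, 6);     // overwrite "World" in place
            return g.toString();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(mutateViaReflection()); // Hello Java!
    }
}
```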

s1 and s2 are both changed because they are both assigned to the same "intern" String instance. You can find out a bit more about that part from this article about string equality and interning. You might be surprised to find out that in your sample code, s1 == s2 returns true!

Solution 7 - Java

Which version of Java are you using? As of Java 1.7.0_06, Oracle changed the internal representation of String, in particular for substring.

Quoting from [Oracle Tunes Java's Internal String Representation][1]:

> In the new paradigm, the String offset and count fields have been removed, so substrings no longer share the underlying char [] value.

With this change, it may happen without reflection (???).

[1]: http://www.infoq.com/news/2013/12/Oracle-Tunes-Java-String "Oracle Tunes Java's Internal String Representation"

Solution 8 - Java

There are really two questions here:

  1. Are strings really immutable?
  2. Why is s3 not changed?

To point 1: Except for ROM there is no immutable memory in your computer. Nowadays even ROM is sometimes writable. There is always some code somewhere (whether it's the kernel or native code sidestepping your managed environment) that can write to your memory address. So, in "reality", no they are not absolutely immutable.

To point 2: This is because substring is probably allocating a new string instance, which is likely copying the array. It is possible to implement substring in such a way that it won't do a copy, but that doesn't mean it does. There are tradeoffs involved.

For example, should holding a reference to reallyLargeString.substring(reallyLargeString.length() - 2) cause a large amount of memory to be held alive, or only a few bytes?

That depends on how substring is implemented. A deep copy will keep less memory alive, but it will run slightly slower. A shallow copy will keep more memory alive, but it will be faster. Using a deep copy can also reduce heap fragmentation, as the string object and its buffer can be allocated in one block, as opposed to 2 separate heap allocations.

In any case, it looks like your JVM chose to use deep copies for substring calls.

Solution 9 - Java

According to the concept of pooling, String literals with the same value point to the same memory address. Therefore s1 and s2, both initialized with the literal “Hello World”, point to the same memory location (say M1).

On the other hand, s3 contains “World”, so it points to a different memory location (say M2).

What happens next is that the value of s1 is changed (through the char[] value). So the value at memory location M1, pointed to by both s1 and s2, has been changed.

As a result, the values of both s1 and s2 change.

But the value at location M2 remains unaltered, so s3 keeps its original value.

Solution 10 - Java

The reason s3 does not actually change is because in Java when you do a substring the value character array for a substring is internally copied (using Arrays.copyOfRange()).

s1 and s2 are the same because in Java they both refer to the same interned string. It's by design in Java.

Solution 11 - Java

To add to @haraldK's answer: this is a security hack that could have a serious impact on the app.

The first problem is that it modifies a constant string stored in the string pool. When a string is declared as String s = "Hello World";, it is placed into a special object pool for potential reuse. The issue is that the compiler makes every occurrence of that literal refer to the pooled instance, so once the pooled string is modified at runtime, all references in the code point to the modified version. This results in the following bug:

System.out.println("Hello World"); 

Will print:

Hello Java!

There was another issue I experienced when I was implementing a heavy computation over such risky strings. A bug occurred in roughly 1 out of 1,000,000 runs of the computation, which made the result nondeterministic. I was able to find the problem by switching off the JIT: with the JIT turned off, I always got the same result. My guess is that this String hack broke some of the JIT's optimization assumptions.

Solution 12 - Java

String is immutable, but reflection lets you bypass the access rules of the String class. In effect, you have just treated String as mutable at runtime. In the same way, reflection can reach private fields and methods of other classes.

Solution 13 - Java

[Disclaimer: this is a deliberately opinionated style of answer, as I feel a more "don't do this at home, kids" answer is warranted]

The sin is the line field.setAccessible(true);, which says to violate the public API by allowing access to a private field. That's a giant security hole, which can be locked down by configuring a security manager.

The phenomena in the question are implementation details that you would never see without using that dangerous line of code to violate the access modifiers via reflection. Clearly two (normally) immutable strings can share the same char array. Whether a substring shares the same array depends on whether it can and whether the developer thought to share it. Normally these are invisible implementation details that you should not have to know about, unless you shoot the access modifier through the head with that line of code.

It is simply not a good idea to rely upon such details which cannot be experienced without violating the access modifiers using reflection. The owner of that class only supports the normal public API and is free to make implementation changes in the future.

Having said all that, the line of code is really very useful when you have a gun held to your head forcing you to do such dangerous things. Using that back door is usually a code smell that you need to upgrade to better library code where you don't have to sin. Another common use of that dangerous line of code is to write a "voodoo framework" (ORM, injection container, ...). Many folks get religious about such frameworks (both for and against them), so I will avoid inviting a flame war by saying nothing other than that the vast majority of programmers don't have to go there.

Solution 14 - Java

String literals are stored in the string pool. On older JVMs (before Java 7) the pool lived in the permanent generation; since Java 7 it lives in the main heap, and Java 8 removed the permanent generation altogether. Either way, a String cannot be changed through its public API after it has been created. On those older JVMs, memory was divided into three areas:

  1. Young generation
  2. Old generation
  3. Permanent generation

When an object is created, it goes into the young generation; the PermGen area was reserved for, among other things, String pooling.

For more detail, see: How Garbage Collection works in Java.

Solution 15 - Java

String is immutable by nature because there is no method to modify a String object. That is the reason the StringBuilder and StringBuffer classes were introduced.
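
A brief sketch of the difference (the class name BuilderDemo is made up for illustration): String methods return a new object and leave the original alone, while StringBuilder changes its own contents.

```java
class BuilderDemo {
    static String demo() {
        String s = "Hello";
        String t = s.concat(" World"); // returns a new String; s is unchanged
        StringBuilder sb = new StringBuilder("Hello");
        sb.append(" World");           // modifies the builder in place
        return s + " | " + t + " | " + sb;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // Hello | Hello World | Hello World
    }
}
```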

Solution 16 - Java

This is a quick guide to the common String methods:


        // Character array
        char[] chr = {'O', 'K', '!'};

        // this is String class
        String str1 = new String(chr);
        
        // this is concat
        str1 = str1.concat("another string's ");
        
        // this is format
        System.out.println(String.format(str1 + " %s ", "string"));
        
        // this is equals
        System.out.println(str1.equals("another string"));

        //this is split
        for(String s: str1.split(" ")){
            System.out.println(s);
        }

        // this is length
        System.out.println(str1.length());

        // compareTo: lexicographic comparison (here the difference between the first mismatching characters)
        System.out.println(str1.compareTo("OK!another string string's"));

        // trim
        System.out.println(str1.trim());

        // intern
        System.out.println(str1.intern());

        // character at
        System.out.println(str1.charAt(5));

        // substring
        System.out.println(str1.substring(5, 12));

        // to uppercase
        System.out.println(str1.toUpperCase());

        // to lowerCase
        System.out.println(str1.toLowerCase());

        // replace
        System.out.println(str1.replace("another", "hello"));

       //   output

        // OK!another string's  string 
        // false
        // OK!another
        // string's
        // 20
        // 7
        // OK!another string's
        // OK!another string's 
        // o
        // other s
        // OK!ANOTHER STRING'S 
        // ok!another string's 
        // OK!hello string's 


Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Darshan Patel | View Question on Stackoverflow
Solution 1 - Java | Harald K | View Answer on Stackoverflow
Solution 2 - Java | Zaheer Ahmed | View Answer on Stackoverflow
Solution 3 - Java | Bohemian | View Answer on Stackoverflow
Solution 4 - Java | Ankur | View Answer on Stackoverflow
Solution 5 - Java | Hauke Ingmar Schmidt | View Answer on Stackoverflow
Solution 6 - Java | Krease | View Answer on Stackoverflow
Solution 7 - Java | manikanta | View Answer on Stackoverflow
Solution 8 - Java | Scott Wisniewski | View Answer on Stackoverflow
Solution 9 - Java | AbhijeetMishra | View Answer on Stackoverflow
Solution 10 - Java | Maurizio In denmark | View Answer on Stackoverflow
Solution 11 - Java | Andrey Chaschev | View Answer on Stackoverflow
Solution 12 - Java | SpacePrez | View Answer on Stackoverflow
Solution 13 - Java | simbo1905 | View Answer on Stackoverflow
Solution 14 - Java | Yasir Shabbir Choudhary | View Answer on Stackoverflow
Solution 15 - Java | Pratik Sherdiwala | View Answer on Stackoverflow
Solution 16 - Java | Shehan Hasintha | View Answer on Stackoverflow