How does this Java code snippet work? (String pool and reflection)

JavaStringReflectionString Pool

Java Problem Overview


Java string pool coupled with reflection can produce some unimaginable result in Java:

import java.lang.reflect.Field;

class MessingWithString {
    public static void main (String[] args) {
        String str = "Mario";
        toLuigi(str);
        System.out.println(str + " " + "Mario");
    }

    public static void toLuigi(String original) {
        try {
            Field stringValue = String.class.getDeclaredField("value");
            stringValue.setAccessible(true);
            stringValue.set(original, "Luigi".toCharArray());
        } catch (Exception ex) {
            // Ignore exceptions
        }
    }
}

Above code will print:

"Luigi Luigi" 

What happened to Mario?

Java Solutions


Solution 1 - Java

> What happened to Mario ??

You changed it, basically. Yes, with reflection you can violate the immutability of strings... and due to string interning, that means any use of "Mario" (other than in a larger string constant expression, which would have been resolved at compile-time) will end up as "Luigi" in the rest of the program.

This kinds of thing is why reflection requires security permissions...

Note that the expression str + " " + "Mario" does not perform any compile-time concatenation, due to the left-associativity of +. It's effectively (str + " ") + "Mario", which is why you still see Luigi Luigi. If you change the code to:

System.out.println(str + (" " + "Mario"));

... then you'll see Luigi Mario as the compiler will have interned " Mario" to a different string to "Mario".

Solution 2 - Java

It was set to Luigi. Strings in Java are immutable; thus, the compiler can interpret all mentions of "Mario" as references to the same String constant pool item (roughly, "memory location"). You used reflection to change that item; so all "Mario" in your code are now as if you wrote "Luigi".

Solution 3 - Java

To explain the existing answers a bit more, let's take a look at your generated byte code (Only the main() method here).

Byte Code

Now, any changes to the content's of that location will affect both the references (And any other you give too).

Solution 4 - Java

String literals are stored in the string pool and their canonical value is used. Both "Mario" literals aren't just strings with the same value, they are the same object. Manipulating one of them (using reflection) will modify "both" of them, as they are just two references to the same object.

Solution 5 - Java

You just changed the String of String constant pool Mario to Luigi which was referenced by multiple Strings, so every referencing literal Mario is now Luigi.

Field stringValue = String.class.getDeclaredField("value");

You have fetched the char[] named value field from class String

stringValue.setAccessible(true);

Make it accessible.

stringValue.set(original, "Luigi".toCharArray());

You changed original String field to Luigi. But original is Mario the String literal and literal belongs to the String pool and all are interned. Which means all the literals which has same content refers to the same memory address.

String a = "Mario";//Created in String pool
String b = "Mario";//Refers to the same Mario of String pool
a == b//TRUE
//You changed 'a' to Luigi and 'b' don't know that
//'a' has been internally changed and 
//'b' still refers to the same address.

Basically you have changed the Mario of String pool which got reflected in all the referencing fields. If you create String Object (i.e. new String("Mario")) instead of literal you will not face this behavior because than you will have two different Marios .

Solution 6 - Java

The other answers adequately explain what's going on. I just wanted to add the point that this only works if there is no security manager installed. When running code from the command line by default there is not, and you can do things like this. However in an environment where trusted code is mixed with untrusted code, such as an application server in a production environment or an applet sandbox in a browser, there would typically be a security manager present and you would not be allowed these kinds of shenanigans, so this is less of a terrible security hole as it seems.

Solution 7 - Java

Another related point: you can make use of the constant pool to improve the performance of string comparisons in some circumstances, by using the String.intern() method.

That method returns the instance of String with the same contents as the String on which it is invoked from the String constants pool, adding it it if is not yet present. In other words, after using intern(), all Strings with the same contents are guaranteed to be the same String instance as each other and as any String constants with those contents, meaning you can then use the equals operator (==) on them.

This is just an example which is not very useful on its own, but it illustrates the point:

class Key {
    Key(String keyComponent) {
        this.keyComponent = keyComponent.intern();
    }

    public boolean equals(Object o) {
        // String comparison using the equals operator allowed due to the
        // intern() in the constructor, which guarantees that all values
        // of keyComponent with the same content will refer to the same
        // instance of String:
        return (o instanceof Key) && (keyComponent == ((Key) o).keyComponent);
    }

    public int hashCode() {
        return keyComponent.hashCode();
    }

    boolean isSpecialCase() {
        // String comparison using equals operator valid due to use of
        // intern() in constructor, which guarantees that any keyComponent
        // with the same contents as the SPECIAL_CASE constant will
        // refer to the same instance of String:
        return keyComponent == SPECIAL_CASE;
    }

    private final String keyComponent;

    private static final String SPECIAL_CASE = "SpecialCase";
}

This little trick isn't worth designing your code around, but it is worth keeping in mind for the day when you notice a little more speed could be eked out of some bit of performance sensitive code by using the == operator on a string with judicious use of intern().

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionmacrogView Question on Stackoverflow
Solution 1 - JavaJon SkeetView Answer on Stackoverflow
Solution 2 - JavaAmadanView Answer on Stackoverflow
Solution 3 - JavaCodebenderView Answer on Stackoverflow
Solution 4 - JavaMureinikView Answer on Stackoverflow
Solution 5 - JavaakashView Answer on Stackoverflow
Solution 6 - JavaPepijn SchmitzView Answer on Stackoverflow
Solution 7 - JavaPepijn SchmitzView Answer on Stackoverflow