Corrupted string in C#

C# Problem Overview


I came across “CorruptedString” (Solution). Here is the program's code from the book:

var s = "Hello";
string.Intern(s);
unsafe
{
  fixed (char* c = s)
    for (int i = 0; i < s.Length; i++)
      c[i] = 'a';
}
Console.WriteLine("Hello"); // Displays: "aaaaa"

Why does this program display "aaaaa"? I understand this program as follows:

  1. The CLR reserves "Hello" in the intern pool (I imagine the intern pool as a set of strings).
  2. string.Intern(s) actually does nothing, because the CLR has already reserved the "Hello" string; it just returns the address of the reserved "Hello" string (object s has the same address).
  3. The program changes the content of the "Hello" string via a pointer.
  4. ??? The "Hello" string should now be absent from the intern pool, so looking it up should be an error! But everything is OK; the program runs successfully.

As I understand the intern pool, it is like some kind of dictionary of string to string. Or maybe I missed something?

C# Solutions


Solution 1 - C#

When you use "Hello" for the first time, it's interned into the application's global store of strings. Because you're executing in unsafe mode, the fixed statement gives you a direct pointer to the memory originally allocated for the value of string s, so by

for (int i = 0; i < s.Length; i++)
      c[i] = 'a';

you're editing the string's characters in place. The next time the runtime accesses the store of interned strings, it uses the same address in memory, which now holds the data you've just changed. That would not be possible without unsafe. string.Intern(s); plays no role here; the program behaves the same if you comment it out.

Then by

Console.WriteLine("Hello"); // Displays: "aaaaa"

.NET resolves the literal "Hello" to the address of its interned entry, and that entry exists: it is the one you've just updated to hold "aaaaa". The number of 'a' characters is determined by the length of "Hello".
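The point that every occurrence of the literal resolves to the same object can be checked without any unsafe code. The following is a minimal sketch; the class name LiteralIdentityDemo is made up for illustration, and it assumes both literals live in the same assembly.

```csharp
using System;

class LiteralIdentityDemo
{
    static void Main()
    {
        var s = "Hello";

        // Both operands refer to one and the same string object, so
        // overwriting the characters behind s also changes what the
        // literal "Hello" shows afterwards.
        Console.WriteLine(object.ReferenceEquals(s, "Hello")); // True
    }
}
```

Because s and the later "Hello" are one object, there is no second, untouched copy for Console.WriteLine to print.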

Solution 2 - C#

Even though [@Jaroslav Kadlec][1]'s answer is correct and complete, I would like to add some more information about the behaviour of the code and why string.Intern(s); is useless in this case.

About Intern Pool

.NET automatically performs string interning for all string literals; this is done using a special table that stores references to all unique strings in the application.

However, it's important to note that only strings declared explicitly as literals are interned at the compile stage.

Consider the following code:

var first = "Hello"; //Will be interned
var second = "World"; //Will be interned
var third = first + second; //Will not be interned

Of course, in some circumstances we would like to intern a string at run time; this can be done with String.Intern, after checking with String.IsInterned whether an equal string is already in the pool.
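As a rough illustration of run-time interning, the sketch below interns a concatenated string and then compares references. The class name RuntimeInternDemo and the variable names pooled and copy are made up for this example; only String.Intern, String.IsInterned and object.ReferenceEquals come from the framework.

```csharp
using System;

class RuntimeInternDemo
{
    static void Main()
    {
        var first = "Hello";
        var second = "World";
        var third = first + second; // built at run time

        // Intern adds the string to the pool if no equal string is there
        // yet, and returns the pooled reference.
        string pooled = string.Intern(third);

        // A fresh object with the same characters is not the pooled object...
        string copy = new string(third.ToCharArray());
        Console.WriteLine(object.ReferenceEquals(copy, pooled)); // False

        // ...but IsInterned and Intern both hand back the pooled reference.
        Console.WriteLine(object.ReferenceEquals(string.IsInterned(copy), pooled)); // True
        Console.WriteLine(object.ReferenceEquals(string.Intern(copy), pooled));     // True
    }
}
```

Note that String.Intern already returns the existing entry when one is present, so the String.IsInterned check is only needed when you want to test for membership without adding anything to the pool.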

So coming back to the snippet of the OP:

//...
var s = "Hello";
string.Intern(s);
//...

In this case string.Intern(s); is useless, as the string is already interned at the compile stage.

[1]: https://stackoverflow.com/users/2248454/jaroslav-kadlec

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | LmTinyToon | View Question on Stackoverflow
Solution 1 - C# | Jaroslav Kadlec | View Answer on Stackoverflow
Solution 2 - C# | Sid | View Answer on Stackoverflow