String.Replace() vs. StringBuilder.Replace()

C#.Net.Net 4.0Replace

C# Problem Overview


I have a string in which I need to replace markers with values from a dictionary. It has to be as efficient as possible. Doing a loop with a string.replace is just going to consume memory (strings are immutable, remember). Would StringBuilder.Replace() be any better since this is was designed to work with string manipulations?

I was hoping to avoid the expense of RegEx, but if that is going to be a more efficient then so be it.

Note: I don't care about code complexity, only how fast it runs and the memory it consumes.

Average stats: 255-1024 characters in length, 15-30 keys in the dictionary.

C# Solutions


Solution 1 - C#

Using RedGate Profiler using the following code

class Program
    {
        static string data = "abcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyzabcdefghijklmnopqrstuvwxyz";
        static Dictionary<string, string> values;

        static void Main(string[] args)
        {
            Console.WriteLine("Data length: " + data.Length);
            values = new Dictionary<string, string>()
            {
                { "ab", "aa" },
                { "jk", "jj" },
                { "lm", "ll" },
                { "yz", "zz" },
                { "ef", "ff" },
                { "st", "uu" },
                { "op", "pp" },
                { "x", "y" }
            };

            StringReplace(data);
            StringBuilderReplace1(data);
            StringBuilderReplace2(new StringBuilder(data, data.Length * 2));

            Console.ReadKey();
        }

        private static void StringReplace(string data)
        {
            foreach(string k in values.Keys)
            {
                data = data.Replace(k, values[k]);
            }
        }

        private static void StringBuilderReplace1(string data)
        {
            StringBuilder sb = new StringBuilder(data, data.Length * 2);
            foreach (string k in values.Keys)
            {
                sb.Replace(k, values[k]);
            }
        }

        private static void StringBuilderReplace2(StringBuilder data)
        {
            foreach (string k in values.Keys)
            {
                data.Replace(k, values[k]);
            }
        }
    }
  • String.Replace = 5.843ms
  • StringBuilder.Replace #1 = 4.059ms
  • Stringbuilder.Replace #2 = 0.461ms

String length = 1456

stringbuilder #1 creates the stringbuilder in the method while #2 does not so the performance difference will end up being the same most likely since you're just moving that work out of the method. If you start with a stringbuilder instead of a string then #2 might be the way to go instead.

As far as memory, using RedGateMemory profiler, there is nothing to worry about until you get into MANY replace operations in which stringbuilder is going to win overall.

Solution 2 - C#

This may be of help: https://docs.microsoft.com/en-us/archive/blogs/debuggingtoolbox/comparing-regex-replace-string-replace-and-stringbuilder-replace-which-has-better-performance

The short answer appears to be that String.Replace is faster, although it may have a larger impact on your memory footprint / garbage collection overhead.

Solution 3 - C#

Yes, StringBuilder will give you both gain in speed and memory (basically because it won't create an instance of a string each time you will make a manipulation with it - StringBuilder always operates with the same object). Here is an MSDN link with some details.

Solution 4 - C#

> Would stringbuilder.replace be any better [than String.Replace]

Yes, a lot better. And if you can estimate an upper bound for the new string (it looks like you can) then it will probably be fast enough.

When you create it like:

  var sb = new StringBuilder(inputString, pessimisticEstimate);

then the StringBuilder will not have to re-allocate its buffer.

Solution 5 - C#

Converting data from a String to a StringBuilder and back will take some time. If one is only performing a single replace operation, this time may not be recouped by the efficiency improvements inherent in StringBuilder. On the other hand, if one converts a string to a StringBuilder, then performs many Replace operations on it, and converts it back at the end, the StringBuilder approach is apt to be faster.

Solution 6 - C#

Rather than running 15-30 replace operations on the entire string, it might be more efficient to use something like a trie data structure to hold your dictionary. Then you can loop through your input string once to do all your searching/replacing.

Solution 7 - C#

It will depend a lot on how many of the markers are present in a given string on average.

Performance of searching for a key is likely to be similar between StringBuilder and String, but StringBuilder will win if you have to replace many markers in a single string.

If you only expect one or two markers per string on average, and your dictionary is small, I would just go for the String.Replace.

If there are many markers, you might want to define a custom syntax to identify markers - e.g. enclosing in braces with a suitable escaping rule for a literal brace. You can then implement a parsing algorithm that iterates through the characters of the string once, recognizing and replacing each marker that it finds. Or use a regex.

Solution 8 - C#

My two cents here, I just wrote couple of lines of code to test how each method performs and, as expected, result is "it depends".

For longer strings Regex seems to be performing better, for shorter strings, String.Replace it is. I can see that usage of StringBuilder.Replace is not very useful, and if wrongly used, it could be lethal in GC perspective (I tried to share one instance of StringBuilder).

Check my StringReplaceTests GitHub repo.

Solution 9 - C#

The problem with @DustinDavis' answer is that it recursively operates on the same string. Unless you're planning on doing a back-and-forth type of manipulation, you really should have separate objects for each manipulation case in this kind of test.

I decided to create my own test because I found some conflicting answers all over the Web, and I wanted to be completely sure. The program I am working on deals with a lot of text (files with tens of thousands of lines in some cases).

So here's a quick method you can copy and paste and see for yourself which is faster. You may have to create your own text file to test, but you can easily copy and paste text from anywhere and make a large enough file for yourself:

using System;
using System.Diagnostics;
using System.IO;
using System.Text;
using System.Windows;

void StringReplace_vs_StringBuilderReplace( string file, string word1, string word2 )
{
    using( FileStream fileStream = new FileStream( file, FileMode.Open, FileAccess.Read ) )
    using( StreamReader streamReader = new StreamReader( fileStream, Encoding.UTF8 ) )
    {
        string text = streamReader.ReadToEnd(),
               @string = text;
        StringBuilder @StringBuilder = new StringBuilder( text );
        int iterations = 10000;
        
        Stopwatch watch1 = new Stopwatch.StartNew();
        for( int i = 0; i < iterations; i++ )
            if( i % 2 == 0 ) @string = @string.Replace( word1, word2 );
            else @string = @string.Replace( word2, word1 );
        watch1.Stop();
        double stringMilliseconds = watch1.ElapsedMilliseconds;
            
        Stopwatch watch2 = new Stopwatch.StartNew();
        for( int i = 0; i < iterations; i++ )
            if( i % 2 == 0 ) @StringBuilder = @StringBuilder .Replace( word1, word2 );
            else @StringBuilder = @StringBuilder .Replace( word2, word1 );
        watch2.Stop();
        double StringBuilderMilliseconds = watch1.ElapsedMilliseconds;
            
        MessageBox.Show( string.Format( "string.Replace: {0}\nStringBuilder.Replace: {1}",
                                        stringMilliseconds, StringBuilderMilliseconds ) );
    }
}

I got that string.Replace() was faster by about 20% every time swapping out 8-10 letter words. Try it for yourself if you want your own empirical evidence.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionDustin DavisView Question on Stackoverflow
Solution 1 - C#Dustin DavisView Answer on Stackoverflow
Solution 2 - C#RMDView Answer on Stackoverflow
Solution 3 - C#AndreiView Answer on Stackoverflow
Solution 4 - C#Henk HoltermanView Answer on Stackoverflow
Solution 5 - C#supercatView Answer on Stackoverflow
Solution 6 - C#Matt BridgesView Answer on Stackoverflow
Solution 7 - C#JoeView Answer on Stackoverflow
Solution 8 - C#ZdeněkView Answer on Stackoverflow
Solution 9 - C#MelovizView Answer on Stackoverflow