Replace Multiple String Elements in C#

C#, String, Refactoring, Immutability

C# Problem Overview


Is there a better way of doing this...

MyString.Trim().Replace("&", "and").Replace(",", "").Replace("  ", " ")
         .Replace(" ", "-").Replace("'", "").Replace("/", "").ToLower();

I've extended the string class to keep it down to one job but is there a quicker way?

public static class StringExtension
{
    public static string clean(this string s)
    {
        return s.Replace("&", "and").Replace(",", "").Replace("  ", " ")
                .Replace(" ", "-").Replace("'", "").Replace(".", "")
                .Replace("eacute;", "é").ToLower();
    }
}

Just for fun (and to stop the arguments in the comments) I've shoved a gist up benchmarking the various examples below: https://gist.github.com/ChrisMcKee/5937656

The regex option scores terribly; the dictionary option comes out fastest; the long-winded version of the StringBuilder replace is slightly faster than the shorthand.

C# Solutions


Solution 1 - C#

Quicker: no. More efficient: yes, if you use the StringBuilder class. With your implementation, each operation generates a copy of the string, which may impair performance in some circumstances. Strings are immutable objects, so each operation just returns a modified copy.

If you expect this method to be called frequently on many strings of significant length, it might be better to "migrate" its implementation to the StringBuilder class. With it, any modification is performed directly on that instance, so you avoid unnecessary copy operations.

// Requires: using System.Text;
public static class StringExtension
{
    public static string clean(this string s)
    {
        StringBuilder sb = new StringBuilder (s);

        sb.Replace("&", "and");
        sb.Replace(",", "");
        sb.Replace("  ", " ");
        sb.Replace(" ", "-");
        sb.Replace("'", "");
        sb.Replace(".", "");
        sb.Replace("eacute;", "é");

        return sb.ToString().ToLower();
    }
}
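
To make the copy-versus-in-place point concrete, here is a minimal sketch. The class and method names are hypothetical and exist only for illustration. It checks that string.Replace hands back a new string and leaves the original alone, while StringBuilder.Replace edits its buffer and returns the same instance.

```csharp
using System;
using System.Text;

// Hypothetical demo class; not part of the original answer.
public static class ImmutabilityDemo
{
    // string.Replace leaves the original untouched and returns a new string.
    public static bool StringReplaceCreatesCopy()
    {
        string original = "fish & chips";
        string replaced = original.Replace("&", "and");
        return original == "fish & chips"
            && replaced == "fish and chips"
            && !ReferenceEquals(original, replaced);
    }

    // StringBuilder.Replace edits the buffer in place and returns the same
    // instance, so no intermediate string is allocated per call.
    public static bool BuilderReplaceMutatesInPlace()
    {
        var sb = new StringBuilder("fish & chips");
        StringBuilder returned = sb.Replace("&", "and");
        return ReferenceEquals(sb, returned)
            && sb.ToString() == "fish and chips";
    }
}
```

Both methods return true. The final ToString() call on the builder is the one point where a new string is produced.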

Solution 2 - C#

If you are simply after a pretty solution and don't need to save a few nanoseconds, how about some LINQ sugar?

var input = "test1test2test3";
var replacements = new Dictionary<string, string> { { "1", "*" }, { "2", "_" }, { "3", "&" } };

var output = replacements.Aggregate(input, (current, replacement) => current.Replace(replacement.Key, replacement.Value));

Solution 3 - C#

This will be more efficient:

public static class StringExtension
{
    public static string clean(this string s)
    {
        return new StringBuilder(s)
              .Replace("&", "and")
              .Replace(",", "")
              .Replace("  ", " ")
              .Replace(" ", "-")
              .Replace("'", "")
              .Replace(".", "")
              .Replace("eacute;", "é")
              .ToString()
              .ToLower();
    }
}

Solution 4 - C#

Maybe a little more readable?

    public static class StringExtension {

        private static Dictionary<string, string> _replacements = new Dictionary<string, string>();

        static StringExtension() {
            _replacements["&"] = "and";
            _replacements[","] = "";
            _replacements["  "] = " ";
            // etc...
        }

        public static string clean(this string s) {
            foreach (string to_replace in _replacements.Keys) {
                s = s.Replace(to_replace, _replacements[to_replace]);
            }
            return s;
        }
    }

Also add New In Town's suggestion about StringBuilder...
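
A rough sketch of what combining the two ideas could look like. One assumption to flag: the rules here are order-sensitive ("  " has to collapse to " " before " " becomes "-"), and the enumeration order of a Dictionary is not something the documentation promises, so this version keeps the rules in an array instead. The names OrderedStringExtension and CleanOrdered are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical variant; names are illustrative and not from the original answers.
public static class OrderedStringExtension
{
    // An array keeps the rules in exactly the order they are written.
    private static readonly KeyValuePair<string, string>[] Replacements =
    {
        new KeyValuePair<string, string>("&", "and"),
        new KeyValuePair<string, string>(",", ""),
        new KeyValuePair<string, string>("  ", " "),
        new KeyValuePair<string, string>(" ", "-"),
        new KeyValuePair<string, string>("'", ""),
    };

    public static string CleanOrdered(this string s)
    {
        // One builder is reused for every rule, so no intermediate strings are created.
        var sb = new StringBuilder(s);
        foreach (var rule in Replacements)
        {
            sb.Replace(rule.Key, rule.Value);
        }
        return sb.ToString().ToLower();
    }
}
```

With these rules, "Fish & Chips, Please".CleanOrdered() gives "fish-and-chips-please".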

Solution 5 - C#

One thing can still be optimized in the suggested solutions. Making many calls to Replace() forces the code to make multiple passes over the same string. With very long strings, the solutions may be slow because of CPU cache capacity misses. It may be worth considering replacing multiple strings in a single pass.

The essential content from that link:

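// Requires: using System.Collections.Generic; using System.Linq; using System.Text.RegularExpressions;
static string MultipleReplace(string text, Dictionary<string, string> replacements) {
    // Build one alternation pattern from the keys; escape them so that
    // regex metacharacters in a key are matched literally.
    return Regex.Replace(text,
                         "(" + String.Join("|", replacements.Keys.Select(k => Regex.Escape(k)).ToArray()) + ")",
                         delegate(Match m) { return replacements[m.Value]; }
                         );
}

// somewhere else in code
string temp = "Jonathan Smith is a developer";
var adict = new Dictionary<string, string>();
adict.Add("Jonathan", "David");
adict.Add("Smith", "Seruyange");
string rep = MultipleReplace(temp, adict);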
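
One consequence of the single-pass approach worth spelling out: it is not just a speed trade-off, it can change the result. In a chain of Replace() calls each rule sees the output of the previous one, whereas a single regex pass only ever matches against the original text. The sketch below illustrates this with hypothetical helper names (ReplaceSemantics, Sequential, SinglePass). It assumes the rule set is non-empty and contains no empty keys.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical comparison helper; names are illustrative only.
public static class ReplaceSemantics
{
    // Chained replacement: each rule sees the output of the previous one.
    public static string Sequential(string text, IEnumerable<KeyValuePair<string, string>> rules)
    {
        foreach (var rule in rules)
        {
            text = text.Replace(rule.Key, rule.Value);
        }
        return text;
    }

    // Single pass: every match is taken from the original text, so a
    // replacement value is never rewritten by another rule.
    // Assumes at least one rule and no empty keys.
    public static string SinglePass(string text, IDictionary<string, string> rules)
    {
        // Longer keys go first so that a key like "ab" wins over "a" in the alternation.
        string pattern = string.Join("|",
            rules.Keys.OrderByDescending(k => k.Length).Select(k => Regex.Escape(k)));
        return Regex.Replace(text, pattern, m => rules[m.Value]);
    }
}
```

With the rules cat -> dog followed by dog -> cat, Sequential turns "cat chases dog" into "cat chases cat", while SinglePass swaps the two words and returns "dog chases cat". Which behaviour is wanted depends on the rule set, so the two are not drop-in replacements for each other.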


Solution 6 - C#

Another option using LINQ is:

[TestMethod]
public void Test()
{
  var input = "it's worth a lot of money, if you can find a buyer.";
  var expected = "its worth a lot of money if you can find a buyer";
  var removeList = new string[] { ".", ",", "'" };
  var result = input;

  removeList.ToList().ForEach(o => result = result.Replace(o, string.Empty));

  Assert.AreEqual(expected, result);
}

Solution 7 - C#

I'm doing something similar, but in my case I'm doing serialization/deserialization, so I need to be able to go in both directions. I find that a string[][] works nearly identically to the dictionary, including initialization, but you can go the other direction too, returning the substitutes to their original values, which is something the dictionary really isn't set up to do.

Edit: You can use Dictionary<Key,List<Values>> to obtain the same result as string[][].
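
A minimal sketch of the string[][] idea, since the answer describes it without code. Everything here is an assumption for illustration: the class name TwoWayReplace, the method names, and the percent-style escape pairs are made up, not taken from the answer. The one design point that matters is that decoding walks the pairs in reverse order, so the escape character itself is restored last.

```csharp
using System;

// Hypothetical two-way mapper; the escape pairs are invented for this example.
public static class TwoWayReplace
{
    // Each inner array is { original, substitute }.
    // The escape character "%" is handled first so later substitutes stay unambiguous.
    private static readonly string[][] Pairs =
    {
        new[] { "%", "%25" },
        new[] { ",", "%2C" },
        new[] { "\n", "%0A" },
    };

    // Forward direction: original -> substitute, in declaration order.
    public static string Encode(string s)
    {
        foreach (var pair in Pairs)
        {
            s = s.Replace(pair[0], pair[1]);
        }
        return s;
    }

    // Reverse direction: substitute -> original, walking the pairs backwards.
    public static string Decode(string s)
    {
        for (int i = Pairs.Length - 1; i >= 0; i--)
        {
            s = s.Replace(Pairs[i][1], Pairs[i][0]);
        }
        return s;
    }
}
```

For example, Encode("50%, done") gives "50%25%2C done", and passing that to Decode returns "50%, done".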

Solution 8 - C#

A regular expression with a MatchEvaluator could also be used:

    var pattern = new Regex(@"These|words|are|placed|in|parentheses");
    var input = "The matching words in this text are being placed inside parentheses.";
    var result = pattern.Replace(input, match => $"({match.Value})");

Note:

  • Obviously, a different expression (like: \b(\w*test\w*)\b) could be used for word matching.
  • I was hoping it would be better optimized at finding the pattern and doing the replacements.
  • The advantage is the ability to process the matched elements while doing the replacements.
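
On the first note: the pattern above has no word boundaries, so short alternatives such as "in" also match inside longer words like "matching", "being" and "inside". A sketch of a whole-word variant follows. The class and method names are hypothetical, and it assumes every word passed in starts and ends with a word character so that \b behaves as expected.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper; not part of the original answer.
public static class WholeWordWrap
{
    public static string Parenthesize(string input, params string[] words)
    {
        // \b on both sides restricts each alternative to whole words.
        // Regex.Escape keeps any metacharacters in the words literal.
        string pattern = @"\b(" + string.Join("|", words.Select(w => Regex.Escape(w))) + @")\b";
        return Regex.Replace(input, pattern, m => "(" + m.Value + ")");
    }
}
```

Called with the same input and word list as above, this returns "The matching (words) (in) this text (are) being (placed) inside (parentheses)."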

Solution 9 - C#

This is essentially Paolo Tedesco's answer, but I wanted to make it re-usable.

    public class StringMultipleReplaceHelper
    {
        private readonly Dictionary<string, string> _replacements;

        public StringMultipleReplaceHelper(Dictionary<string, string> replacements)
        {
            _replacements = replacements;
        }

        public string clean(string s)
        {
            foreach (string to_replace in _replacements.Keys)
            {
                s = s.Replace(to_replace, _replacements[to_replace]);
            }
            return s;
        }
    }

One thing to note: I had to stop it being an extension method, remove the static modifiers, and remove this from clean(this string s). I'm open to suggestions on how to implement this better.
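
One possible way to keep both reusability and the extension-method syntax is to pass the replacement set in as a parameter instead of storing it in an instance. This is only a sketch under that assumption. The names StringReplaceExtensions and ReplaceAll are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical alternative; names are illustrative only.
public static class StringReplaceExtensions
{
    // Accepts any sequence of pairs, so a Dictionary, a List or an array all work.
    // Rules are applied in the order the sequence yields them.
    public static string ReplaceAll(this string s, IEnumerable<KeyValuePair<string, string>> replacements)
    {
        foreach (var pair in replacements)
        {
            s = s.Replace(pair.Key, pair.Value);
        }
        return s;
    }
}
```

Usage would look like "a.b,c".ReplaceAll(rules), where rules is a Dictionary<string, string> mapping "." and "," to the empty string; the result is "abc". If the rules depend on each other, an ordered collection such as a List is the safer choice.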

Solution 10 - C#

string input = "it's worth a lot of money, if you can find a buyer.";
for (dynamic i = 0, repl = new string[,] { { "'", "''" }, { "money", "$" }, { "find", "locate" } }; i < repl.Length / 2; i++) {
    input = input.Replace(repl[i, 0], repl[i, 1]);
}

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Chris McKee | View Question on Stackoverflow
Solution 1 - C# | user151323 | View Answer on Stackoverflow
Solution 2 - C# | TimS | View Answer on Stackoverflow
Solution 3 - C# | TheVillageIdiot | View Answer on Stackoverflow
Solution 4 - C# | Paolo Tedesco | View Answer on Stackoverflow
Solution 5 - C# | Andrej Adamenko | View Answer on Stackoverflow
Solution 6 - C# | Luiz Felipe | View Answer on Stackoverflow
Solution 7 - C# | sidDemure | View Answer on Stackoverflow
Solution 8 - C# | babakansari | View Answer on Stackoverflow
Solution 9 - C# | red_dorian | View Answer on Stackoverflow
Solution 10 - C# | user7718176 | View Answer on Stackoverflow