Replace Multiple String Elements in C#
C# Problem Overview
Is there a better way of doing this...
MyString.Trim().Replace("&", "and").Replace(",", "").Replace("  ", " ")
    .Replace(" ", "-").Replace("'", "").Replace("/", "").ToLower();
I've extended the string class to keep it down to one job but is there a quicker way?
public static class StringExtension
{
    public static string clean(this string s)
    {
        return s.Replace("&", "and").Replace(",", "").Replace("  ", " ")
            .Replace(" ", "-").Replace("'", "").Replace(".", "")
            .Replace("eacute;", "é").ToLower();
    }
}
Just for fun (and to stop the arguments in the comments) I've put up a gist benchmarking the various examples below: https://gist.github.com/ChrisMcKee/5937656
The regex option scores terribly; the dictionary option comes up the fastest; the long-winded version of the StringBuilder replace is slightly faster than the shorthand.
C# Solutions
Solution 1 - C#
Quicker - no. More efficient - yes, if you use the StringBuilder class. With your implementation, each operation generates a copy of the string, which under some circumstances may impair performance. Strings are immutable objects, so each operation just returns a modified copy.
If you expect this method to be called frequently on multiple strings of significant length, it might be better to "migrate" its implementation onto the StringBuilder class. With it, any modification is performed directly on that instance, so you spare unnecessary copy operations.
// Requires: using System.Text;
public static class StringExtension
{
    public static string clean(this string s)
    {
        StringBuilder sb = new StringBuilder(s);
        sb.Replace("&", "and");
        sb.Replace(",", "");
        sb.Replace("  ", " ");
        sb.Replace(" ", "-");
        sb.Replace("'", "");
        sb.Replace(".", "");
        sb.Replace("eacute;", "é");
        return sb.ToString().ToLower();
    }
}
Solution 2 - C#
If you are simply after a pretty solution and don't need to save a few nanoseconds, how about some LINQ sugar?
var input = "test1test2test3";
var replacements = new Dictionary<string, string> { { "1", "*" }, { "2", "_" }, { "3", "&" } };
var output = replacements.Aggregate(input, (current, replacement) => current.Replace(replacement.Key, replacement.Value));
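One caveat with the dictionary version: the enumeration order of Dictionary<TKey, TValue> is documented as undefined, so it is only safe when the replacements do not depend on each other. When order matters (for example, collapsing double spaces before turning spaces into hyphens), an array of pairs keeps the rules in the order written. A minimal sketch, assuming a hypothetical class OrderedReplaceSketch and an illustrative rule list that is not from the original answer:

```csharp
using System;
using System.Linq;

public static class OrderedReplaceSketch
{
    // Hypothetical rule list; an array is enumerated in the order it is written.
    private static readonly (string From, string To)[] Rules =
    {
        ("&", "and"),
        (",", ""),
        ("  ", " "),   // collapse double spaces first...
        (" ", "-"),    // ...then turn the remaining single spaces into hyphens
        ("'", ""),
    };

    // Same Aggregate idea as above, applied over the ordered rules.
    public static string Clean(string input) =>
        Rules.Aggregate(input, (current, rule) => current.Replace(rule.From, rule.To));
}
```

For example, OrderedReplaceSketch.Clean("Tom  &  Jerry's, Inc") returns "Tom-and-Jerrys-Inc".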
Solution 3 - C#
This will be more efficient:
public static class StringExtension
{
    public static string clean(this string s)
    {
        return new StringBuilder(s)
            .Replace("&", "and")
            .Replace(",", "")
            .Replace("  ", " ")
            .Replace(" ", "-")
            .Replace("'", "")
            .Replace(".", "")
            .Replace("eacute;", "é")
            .ToString()
            .ToLower();
    }
}
Solution 4 - C#
Maybe a little more readable?
public static class StringExtension {
    private static Dictionary<string, string> _replacements = new Dictionary<string, string>();
    static StringExtension() {
        _replacements["&"] = "and";
        _replacements[","] = "";
        _replacements["  "] = " ";
        // etc...
    }
    public static string clean(this string s) {
        foreach (string to_replace in _replacements.Keys) {
            s = s.Replace(to_replace, _replacements[to_replace]);
        }
        return s;
    }
}
Also add New In Town's suggestion about StringBuilder...
Solution 5 - C#
There is one thing that may be optimized in the suggested solutions. Having many calls to Replace() makes the code do multiple passes over the same string. With very long strings the solutions may be slow because of CPU cache capacity misses. Maybe one should consider replacing multiple strings in a single pass.
The essential content from that link:
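// Requires: using System.Collections.Generic; using System.Linq; using System.Text.RegularExpressions;
static string MultipleReplace(string text, Dictionary<string, string> replacements) {
    // Escape each key so that regex metacharacters in it are matched literally.
    return Regex.Replace(text,
        "(" + String.Join("|", replacements.Keys.Select(k => Regex.Escape(k)).ToArray()) + ")",
        delegate(Match m) { return replacements[m.Value]; }
    );
}
// somewhere else in code
string temp = "Jonathan Smith is a developer";
var adict = new Dictionary<string, string>();
adict.Add("Jonathan", "David");
adict.Add("Smith", "Seruyange");
string rep = MultipleReplace(temp, adict);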
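When every search key is a single character, the single-pass idea can also be sketched without a regex: walk the input once and append either the replacement or the original character to a StringBuilder. This is only an illustration under that assumption; the class name SinglePassSketch and the rule table are hypothetical, and multi-character keys such as a double space or "eacute;" are not handled here.

```csharp
using System.Collections.Generic;
using System.Text;

public static class SinglePassSketch
{
    // Hypothetical single-character rules; an empty string means "remove".
    private static readonly Dictionary<char, string> CharRules = new Dictionary<char, string>
    {
        { '&', "and" },
        { ',', "" },
        { '\'', "" },
        { '.', "" },
        { ' ', "-" },
    };

    public static string Clean(string input)
    {
        var sb = new StringBuilder(input.Length);
        // One pass over the input: each character is looked up exactly once.
        foreach (char c in input)
        {
            if (CharRules.TryGetValue(c, out string replacement))
                sb.Append(replacement);
            else
                sb.Append(c);
        }
        return sb.ToString();
    }
}
```

For example, SinglePassSketch.Clean("Rock & Roll, Baby.") returns "Rock-and-Roll-Baby". Whether this is actually faster than chained Replace() calls depends on the input and should be measured.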
Solution 6 - C#
Another option using LINQ is:
[TestMethod]
public void Test()
{
    var input = "it's worth a lot of money, if you can find a buyer.";
    var expected = "its worth a lot of money if you can find a buyer";
    var removeList = new string[] { ".", ",", "'" };
    var result = input;
    removeList.ToList().ForEach(o => result = result.Replace(o, string.Empty));
    Assert.AreEqual(expected, result);
}
Solution 7 - C#
I'm doing something similar, but in my case I'm doing serialization/deserialization, so I need to be able to go in both directions. I find that using a string[][] works nearly identically to the dictionary, including initialization, but you can also go the other direction, returning the substitutes to their original values, something the dictionary really isn't set up to do.
Edit: You can use Dictionary<Key,List<Values>> to obtain the same result as string[][].
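The answer gives no code, so here is a minimal sketch of what a two-way string[][] table might look like. The class name TwoWayReplaceSketch and the escape pairs are hypothetical and chosen only for illustration; the reverse direction walks the table backwards so that the pair applied first is undone last.

```csharp
public static class TwoWayReplaceSketch
{
    // Hypothetical escape pairs: { original, substitute }.
    private static readonly string[][] Pairs =
    {
        new[] { "&", "&amp;" },
        new[] { "<", "&lt;" },
        new[] { ">", "&gt;" },
    };

    // Forward direction: original -> substitute, in table order.
    public static string Encode(string s)
    {
        foreach (string[] pair in Pairs)
            s = s.Replace(pair[0], pair[1]);
        return s;
    }

    // Reverse direction: substitute -> original, in reverse table order.
    public static string Decode(string s)
    {
        for (int i = Pairs.Length - 1; i >= 0; i--)
            s = s.Replace(Pairs[i][1], Pairs[i][0]);
        return s;
    }
}
```

For example, TwoWayReplaceSketch.Encode("a < b & c") returns "a &lt; b &amp; c", and Decode turns that back into "a < b & c". A round trip like this only works if the substitutes cannot be confused with ordinary input, which is why "&" is escaped first here.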
Solution 8 - C#
A regular expression with a MatchEvaluator could also be used:
var pattern = new Regex(@"These|words|are|placed|in|parentheses");
var input = "The matching words in this text are being placed inside parentheses.";
var result = pattern.Replace(input, match => $"({match.Value})");
Note:
- Obviously a different expression (like \b(\w*test\w*)\b) could be used for word matching.
- I was hoping it would be more optimized at finding the pattern in the expression and doing the replacements.
- The advantage is the ability to process the matching elements while doing the replacements.
Solution 9 - C#
This is essentially Paolo Tedesco's answer, but I wanted to make it re-usable.
public class StringMultipleReplaceHelper
{
    private readonly Dictionary<string, string> _replacements;
    public StringMultipleReplaceHelper(Dictionary<string, string> replacements)
    {
        _replacements = replacements;
    }
    public string clean(string s)
    {
        foreach (string to_replace in _replacements.Keys)
        {
            s = s.Replace(to_replace, _replacements[to_replace]);
        }
        return s;
    }
}
One thing to note: I had to stop it being an extension, remove the static modifiers, and remove this from clean(this string s). I'm open to suggestions as to how to implement this better.
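One possible way to keep both the extension syntax and the reusability is to pass the replacements in as a parameter instead of storing them in an instance. This is only a sketch of that idea; the names StringReplaceExtensions and ReplaceAll are hypothetical and not part of the original answer.

```csharp
using System.Collections.Generic;

public static class StringReplaceExtensions
{
    // Hypothetical extension: the caller supplies the rules, so the class can stay static.
    public static string ReplaceAll(this string s, IEnumerable<KeyValuePair<string, string>> replacements)
    {
        foreach (KeyValuePair<string, string> pair in replacements)
        {
            s = s.Replace(pair.Key, pair.Value);
        }
        return s;
    }
}
```

A Dictionary<string, string> can be passed directly, for example "it's a test.".ReplaceAll(new Dictionary<string, string> { { "'", "" }, { ".", "" } }) returns "its a test". Accepting IEnumerable of pairs also allows an ordered list when the order of the replacements matters.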
Solution 10 - C#
string input = "it's worth a lot of money, if you can find a buyer.";
for (dynamic i = 0, repl = new string[,] { { "'", "''" }, { "money", "$" }, { "find", "locate" } }; i < repl.Length / 2; i++) {
    input = input.Replace(repl[i, 0], repl[i, 1]);
}