Split a string by capital letters
C# Problem Overview
> Possible Duplicate:
> Regular expression, split string by capital letter but ignore TLA
I have a string that combines several words, each of which is capitalized.
For example: SeveralWordsString
Using C#, how do I split the string into "Several Words String" in a smart way?
Thanks!
C# Solutions
Solution 1 - C#
Use this regex (I forgot which Stack Overflow answer I got it from; I'll look for it now):
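    using System.Text.RegularExpressions;

    // Extension methods must be declared in a static class.
    public static class StringExtensions
    {
        // Inserts "_" at each word boundary and lowercases the result,
        // e.g. "SeveralWordsString" becomes "several_words_string".
        public static string ToLowercaseNamingConvention(this string s, bool toLowercase)
        {
            if (!toLowercase)
                return s;

            var r = new Regex(@"
                (?<=[A-Z])(?=[A-Z][a-z]) |
                (?<=[^A-Z])(?=[A-Z]) |
                (?<=[A-Za-z])(?=[^A-Za-z])", RegexOptions.IgnorePatternWhitespace);
            return r.Replace(s, "_").ToLower();
        }
    }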
I use it in this project: http://www.ienablemuch.com/2010/12/intelligent-brownfield-mapping-system.html
[EDIT]
I found it now: https://stackoverflow.com/questions/2559759/how-do-i-convert-camelcase-into-human-readable-names-in-java
It splits "TodayILiveInTheUSAWithSimon" nicely, with no space in front of "Today":
    using System;
    using System.Text.RegularExpressions;

    namespace TestSplit
    {
        class MainClass
        {
            public static void Main(string[] args)
            {
                var r = new Regex(@"
                    (?<=[A-Z])(?=[A-Z][a-z]) |
                    (?<=[^A-Z])(?=[A-Z]) |
                    (?<=[A-Za-z])(?=[^A-Za-z])", RegexOptions.IgnorePatternWhitespace);
                string s = "TodayILiveInTheUSAWithSimon";
                // The YYY/ZZZ markers make any leading or trailing space visible.
                Console.WriteLine("YYY{0}ZZZ", r.Replace(s, " "));
            }
        }
    }
Output:
YYYToday I Live In The USA With SimonZZZ
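To make the three alternatives in that pattern easier to read, here is a minimal sketch that annotates each one inline (with `RegexOptions.IgnorePatternWhitespace`, `#` starts a comment inside the pattern). The class name `CamelCaseText` and the method name `ToWords` are made up for illustration and are not part of the original answer.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper; the class and method names are illustrative only.
public static class CamelCaseText
{
    // Same three zero-width alternatives as in the answer above.
    private static readonly Regex WordBoundary = new Regex(@"
        (?<=[A-Z])(?=[A-Z][a-z])     # inside a run of capitals, before the last one: USA|With
      | (?<=[^A-Z])(?=[A-Z])         # a non-capital followed by a capital: Today|I
      | (?<=[A-Za-z])(?=[^A-Za-z])   # a letter followed by a non-letter: Route|66",
        RegexOptions.IgnorePatternWhitespace);

    // Inserts a single space at every boundary the pattern finds.
    public static string ToWords(string s)
    {
        return WordBoundary.Replace(s, " ");
    }
}
```

Called as `CamelCaseText.ToWords("SeveralWordsString")`, this should give "Several Words String", and the acronym in "TodayILiveInTheUSAWithSimon" stays in one piece.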
Solution 2 - C#
    string[] SplitCamelCase(string source) {
        return Regex.Split(source, @"(?<!^)(?=[A-Z])");
    }
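A short usage sketch for this method, assuming it lives in a static class. The class name `CamelCaseSplitter` and the `ToSpaced` helper are hypothetical and only there to show how the pieces could be joined back into the "Several Words String" form the question asks for.

```csharp
using System.Text.RegularExpressions;

// Hypothetical wrapper class; only SplitCamelCase comes from the answer.
public static class CamelCaseSplitter
{
    // Splits before every capital letter except one at the very start.
    public static string[] SplitCamelCase(string source)
    {
        return Regex.Split(source, @"(?<!^)(?=[A-Z])");
    }

    // Joins the pieces with single spaces.
    public static string ToSpaced(string source)
    {
        return string.Join(" ", SplitCamelCase(source));
    }
}
```

Because the pattern splits before every capital, an acronym is broken into single letters: `ToSpaced("TheUSA")` yields "The U S A". If acronyms matter, the longer pattern from Solution 1 is probably the better fit.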
Solution 3 - C#
You can just loop through the characters, and add spaces where needed:
    string theString = "SeveralWordsString";
    StringBuilder builder = new StringBuilder();
    foreach (char c in theString) {
        if (Char.IsUpper(c) && builder.Length > 0) builder.Append(' ');
        builder.Append(c);
    }
    theString = builder.ToString();
Solution 4 - C#
    public static IEnumerable<string> SplitOnCapitals(string text)
    {
        Regex regex = new Regex(@"\p{Lu}\p{Ll}*");
        foreach (Match match in regex.Matches(text))
        {
            yield return match.Value;
        }
    }
This handles Unicode properly, because \p{Lu} and \p{Ll} match uppercase and lowercase letters in any script, not just A-Z and a-z.
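As a rough illustration of the Unicode point, the sketch below wraps the same pattern in a static class and joins the matches with spaces. The class name `CapitalSplitter` and the `ToSpaced` helper are assumptions added for the example, not part of the answer.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical wrapper class; only SplitOnCapitals comes from the answer.
public static class CapitalSplitter
{
    // Yields each run of one uppercase letter followed by lowercase letters.
    public static IEnumerable<string> SplitOnCapitals(string text)
    {
        foreach (Match match in Regex.Matches(text, @"\p{Lu}\p{Ll}*"))
        {
            yield return match.Value;
        }
    }

    // Joins the matched words with single spaces.
    public static string ToSpaced(string text)
    {
        return string.Join(" ", SplitOnCapitals(text));
    }
}
```

With this, `ToSpaced("ÉcoleÑandú")` should give "École Ñandú". One caveat: the method returns only what the pattern matches, so digits and any leading lowercase run are dropped. For example, `ToSpaced("route66Diner")` yields just "Diner".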
Solution 5 - C#
    string str1 = "SeveralWordsString";
    string newstring = "";
    for (int i = 0; i < str1.Length; i++)
    {
        // Skip index 0 so the result does not start with a space.
        if (i > 0 && char.IsUpper(str1[i]))
            newstring += " ";
        newstring += str1[i].ToString();
    }