Matching strings with wildcard

C#, Regex, String, Wildcard

C# Problem Overview


I would like to match strings with a wildcard (*), where the wildcard means "any". For example:

*X = string must end with X
X* = string must start with X
*X* = string must contain X

Also, some compound uses such as:

*X*YZ* = string contains X and contains YZ
X*YZ*P = string starts with X, contains YZ and ends with P.

Is there a simple algorithm to do this? I'm unsure about using regex (though it is a possibility).

To clarify: users will type patterns like the above into a filter box, which should be as simple a filter as possible. I don't want them to have to write regular expressions themselves, so something I can easily transform from the above notation would be good.

C# Solutions


Solution 1 - C#

Wildcards usually come in two kinds:

  ? - any character  (one and only one)
  * - any characters (zero or more)

so you can easily convert these rules into an appropriate regular expression:

// If you want to implement both "*" and "?"
private static String WildCardToRegular(String value) {
  return "^" + Regex.Escape(value).Replace("\\?", ".").Replace("\\*", ".*") + "$"; 
}

// If you want to implement "*" only
private static String WildCardToRegular(String value) {
  return "^" + Regex.Escape(value).Replace("\\*", ".*") + "$"; 
}

And then you can use Regex as usual:

  String test = "Some Data X";

  Boolean endsWithEx = Regex.IsMatch(test, WildCardToRegular("*X"));
  Boolean startsWithS = Regex.IsMatch(test, WildCardToRegular("S*"));
  Boolean containsD = Regex.IsMatch(test, WildCardToRegular("*D*"));

  // Starts with S, ends with X, contains "me" and "a" (in that order) 
  Boolean complex = Regex.IsMatch(test, WildCardToRegular("S*me*a*X"));
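
For a filter box, the conversion above can be wrapped in one small helper. The following is a minimal sketch, not part of the original answer: the class name WildcardFilter, the ignoreCase parameter, and the choice of RegexOptions.Singleline and \z (strict end of string) are assumptions made for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: name and options are illustrative assumptions.
public static class WildcardFilter
{
    // Turns a user-typed filter ("*" = zero or more characters,
    // "?" = exactly one character) into an anchored Regex.
    public static Regex ToRegex(string filter, bool ignoreCase = true)
    {
        // Regex.Escape makes every character literal, including "*" and "?",
        // which it rewrites as "\*" and "\?"; those two are then swapped
        // for their regex equivalents.
        string pattern = "^"
            + Regex.Escape(filter).Replace("\\?", ".").Replace("\\*", ".*")
            + "\\z"; // \z = very end of the input, no trailing newline allowed

        // Singleline lets "." (and therefore "*") also match newlines.
        RegexOptions options = RegexOptions.Singleline | RegexOptions.CultureInvariant;
        if (ignoreCase)
        {
            options |= RegexOptions.IgnoreCase;
        }
        return new Regex(pattern, options);
    }

    public static bool Matches(string text, string filter, bool ignoreCase = true)
    {
        return ToRegex(filter, ignoreCase).IsMatch(text);
    }
}
```

Because the filter is escaped first, a filter such as *.txt treats the dot literally: WildcardFilter.Matches("file.txt", "*.txt") succeeds, while "filextxt" is rejected.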

Solution 2 - C#

You could use the VB.NET Like operator:

string text = "x is not the same as X and yz not the same as YZ";
bool contains = LikeOperator.LikeString(text,"*X*YZ*", Microsoft.VisualBasic.CompareMethod.Binary);  

Use CompareMethod.Text if you want to ignore the case.

You need to add using Microsoft.VisualBasic.CompilerServices; and add a reference to the Microsoft.VisualBasic.dll.

Microsoft.VisualBasic.dll ships with the .NET Framework, so using this class does not add a third-party dependency.

Solution 3 - C#

WildcardPattern from System.Management.Automation may be an option.

var pattern = new WildcardPattern(patternString);
bool isMatch = pattern.IsMatch(stringToMatch);

The Visual Studio UI may not let you add the System.Management.Automation assembly to your project's references. In that case, add the reference manually.

Solution 4 - C#

For those using .NET Core 2.1+ or .NET 5+, you can use the FileSystemName.MatchesSimpleExpression method in the System.IO.Enumeration namespace.

string text = "X is a string with ZY in the middle and at the end is P";
bool isMatch = FileSystemName.MatchesSimpleExpression("X*ZY*P", text);

Both parameters are actually ReadOnlySpan<char>, but you can pass string arguments too. An optional third parameter, ignoreCase, turns case-insensitive matching on or off. Matching is case-insensitive by default.
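
A minimal sketch of the case-sensitivity switch, assuming .NET Core 2.1 or later. The wrapper class and method names are hypothetical; only FileSystemName.MatchesSimpleExpression comes from the framework.

```csharp
using System;
using System.IO.Enumeration;

// Hypothetical wrapper around the framework method, for illustration only.
public static class SimpleExpressionFilter
{
    // Case-insensitive match (the framework default).
    public static bool MatchesIgnoringCase(string filter, string text)
    {
        return FileSystemName.MatchesSimpleExpression(filter, text);
    }

    // Case-sensitive match: pass ignoreCase explicitly.
    public static bool MatchesExactCase(string filter, string text)
    {
        return FileSystemName.MatchesSimpleExpression(filter, text, ignoreCase: false);
    }
}
```

With the sample text above, the lower-case filter "x*zy*p" should match through MatchesIgnoringCase but not through MatchesExactCase.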

Solution 5 - C#

A wildcard * can be translated as .* or .*? regex pattern.

You might need to use a singleline mode to match newline symbols, and in this case, you can use (?s) as part of the regex pattern.

You can set it for the whole or part of the pattern:

X* => @"X(?s:.*)"
*X => @"(?s:.*)X"
*X* => @"(?s).*X.*"
*X*YZ* => @"(?s).*X.*YZ.*"
X*YZ*P => @"(?s:X.*YZ.*P)"
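
To see what singleline mode changes, here is a small sketch. The class and method names are hypothetical, and the ^ and $ anchors are an assumption added so the whole string has to match; the patterns listed above are unanchored.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class, for illustration only.
public static class SinglelineDemo
{
    // Builds an anchored regex string from a "*"-only filter.
    // When dotMatchesNewline is true, the inline (?s) flag is prepended
    // so that ".*" can also run across line breaks.
    public static string ToPattern(string filter, bool dotMatchesNewline)
    {
        string body = Regex.Escape(filter).Replace("\\*", ".*");
        return (dotMatchesNewline ? "(?s)" : "") + "^" + body + "$";
    }

    public static bool Matches(string text, string filter, bool dotMatchesNewline)
    {
        return Regex.IsMatch(text, ToPattern(filter, dotMatchesNewline));
    }
}
```

For the input "X\nYZ" and the filter X*YZ, the match fails without (?s) because "." stops at the newline, and succeeds with it.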

Solution 6 - C#

*X*YZ* = string contains X and contains YZ

@".*X.*YZ"

X*YZ*P = string starts with X, contains YZ and ends with P.

@"^X.*YZ.*P$"

Solution 7 - C#

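Note that Regex.IsMatch returns true for "XYZ" when checked against Y*, because an unanchored pattern may match anywhere in the string. To avoid this, I add the "^" anchor:

Regex.IsMatch(str1, "^" + str2.Replace("*", ".*?"));

So the full code to solve your problem is:

    // Either argument may be the wildcard pattern; the other is treated as the input.
    bool isMatchStr(string str1, string str2)
    {
        // Escape regex metacharacters, then turn each "*" into a lazy ".*?".
        string s1 = Regex.Escape(str1).Replace("\\*", ".*?");
        string s2 = Regex.Escape(str2).Replace("\\*", ".*?");
        // Only the start is anchored, so the pattern has to match a prefix of the input.
        bool r1 = Regex.IsMatch(str1, "^" + s2);
        bool r2 = Regex.IsMatch(str2, "^" + s1);
        return r1 || r2;
    }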
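
The difference the anchors make can be checked with a short sketch. The class and method names are hypothetical, and the fully anchored variant is an addition for comparison, not part of the answer above.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class comparing three anchoring choices.
public static class AnchorDemo
{
    // "*" becomes ".*"; every other character is matched literally.
    private static string Body(string filter)
    {
        return Regex.Escape(filter).Replace("\\*", ".*");
    }

    // No anchors: the pattern may match anywhere inside the text.
    public static bool Unanchored(string text, string filter)
    {
        return Regex.IsMatch(text, Body(filter));
    }

    // "^" only: the pattern has to match at the start of the text.
    public static bool StartAnchored(string text, string filter)
    {
        return Regex.IsMatch(text, "^" + Body(filter));
    }

    // "^" and "$": the pattern has to cover the whole text.
    public static bool FullyAnchored(string text, string filter)
    {
        return Regex.IsMatch(text, "^" + Body(filter) + "$");
    }
}
```

With the text "XYZ" and the filter Y*, Unanchored reports a match while StartAnchored does not. With the text "YZ!" and the filter Y, StartAnchored still matches, because only the prefix is checked, whereas FullyAnchored rejects it.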

Solution 8 - C#

public class Wildcard
{
	private readonly string _pattern;

	public Wildcard(string pattern)
	{
		_pattern = pattern;
	}

	public static bool Match(string value, string pattern)
	{
		int start = -1;
		int end = -1;
		return Match(value, pattern, ref start, ref end);
	}

	public static bool Match(string value, string pattern, char[] toLowerTable)
	{
		int start = -1;
		int end = -1;
		return Match(value, pattern, ref start, ref end, toLowerTable);
	}

	public static bool Match(string value, string pattern, ref int start, ref int end)
	{
		return new Wildcard(pattern).IsMatch(value, ref start, ref end);
	}

	public static bool Match(string value, string pattern, ref int start, ref int end, char[] toLowerTable)
	{
		return new Wildcard(pattern).IsMatch(value, ref start, ref end, toLowerTable);
	}

	public bool IsMatch(string str)
	{
		int start = -1;
		int end = -1;
		return IsMatch(str, ref start, ref end);
	}

	public bool IsMatch(string str, char[] toLowerTable)
	{
		int start = -1;
		int end = -1;
		return IsMatch(str, ref start, ref end, toLowerTable);
	}

	public bool IsMatch(string str, ref int start, ref int end)
	{
		if (_pattern.Length == 0) return false;
		int pindex = 0;
		int sindex = 0;
		int pattern_len = _pattern.Length;
		int str_len = str.Length;
		start = -1;
		while (true)
		{
			bool star = false;
			if (_pattern[pindex] == '*')
			{
				star = true;
				do
				{
					pindex++;
				}
				while (pindex < pattern_len && _pattern[pindex] == '*');
			}
			end = sindex;
			int i;
			while (true)
			{
				int si = 0;
				bool breakLoops = false;
				for (i = 0; pindex + i < pattern_len && _pattern[pindex + i] != '*'; i++)
				{
					si = sindex + i;
					if (si == str_len)
					{
						return false;
					}
					if (str[si] == _pattern[pindex + i])
					{
						continue;
					}
					if (si == str_len)
					{
						return false;
					}
					if (_pattern[pindex + i] == '?' && str[si] != '.')
					{
						continue;
					}
					breakLoops = true;
					break;
				}
				if (breakLoops)
				{
					if (!star)
					{
						return false;
					}
					sindex++;
					if (si == str_len)
					{
						return false;
					}
				}
				else
				{
					if (start == -1)
					{
						start = sindex;
					}
					if (pindex + i < pattern_len && _pattern[pindex + i] == '*')
					{
						break;
					}
					if (sindex + i == str_len)
					{
						if (end <= start)
						{
							end = str_len;
						}
						return true;
					}
					if (i != 0 && _pattern[pindex + i - 1] == '*')
					{
						return true;
					}
					if (!star)
					{
						return false;
					}
					sindex++;
				}
			}
			sindex += i;
			pindex += i;
			if (start == -1)
			{
				start = sindex;
			}
		}
	}

	public bool IsMatch(string str, ref int start, ref int end, char[] toLowerTable)
	{
		if (_pattern.Length == 0) return false;

		int pindex = 0;
		int sindex = 0;
		int pattern_len = _pattern.Length;
		int str_len = str.Length;
		start = -1;
		while (true)
		{
			bool star = false;
			if (_pattern[pindex] == '*')
			{
				star = true;
				do
				{
					pindex++;
				}
				while (pindex < pattern_len && _pattern[pindex] == '*');
			}
			end = sindex;
			int i;
			while (true)
			{
				int si = 0;
				bool breakLoops = false;

				for (i = 0; pindex + i < pattern_len && _pattern[pindex + i] != '*'; i++)
				{
					si = sindex + i;
					if (si == str_len)
					{
						return false;
					}
					char c = toLowerTable[str[si]];
					if (c == _pattern[pindex + i])
					{
						continue;
					}
					if (si == str_len)
					{
						return false;
					}
					if (_pattern[pindex + i] == '?' && c != '.')
					{
						continue;
					}
					breakLoops = true;
					break;
				}
				if (breakLoops)
				{
					if (!star)
					{
						return false;
					}
					sindex++;
					if (si == str_len)
					{
						return false;
					}
				}
				else
				{
					if (start == -1)
					{
						start = sindex;
					}
					if (pindex + i < pattern_len && _pattern[pindex + i] == '*')
					{
						break;
					}
					if (sindex + i == str_len)
					{
						if (end <= start)
						{
							end = str_len;
						}
						return true;
					}
					if (i != 0 && _pattern[pindex + i - 1] == '*')
					{
						return true;
					}
					if (!star)
					{
						return false;
					}
					sindex++;
					continue;
				}
			}
			sindex += i;
			pindex += i;
			if (start == -1)
			{
				start = sindex;
			}
		}
	}
}
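
For comparison, a much shorter regex-free matcher can be written with a single backtracking point. This is a sketch of the common greedy two-pointer technique, not a rewrite of the class above: it does not report start/end positions or use a lower-case table, and here ? matches any single character, including '.'. The class name SimpleWildcard is hypothetical.

```csharp
using System;

// Hypothetical compact matcher: "*" = zero or more characters, "?" = exactly one.
public static class SimpleWildcard
{
    public static bool IsMatch(string text, string pattern)
    {
        int t = 0;        // current position in text
        int p = 0;        // current position in pattern
        int starP = -1;   // position of the most recent '*' in pattern, -1 if none
        int starT = 0;    // text position where that '*' currently stops matching

        while (t < text.Length)
        {
            if (p < pattern.Length && pattern[p] == '*')
            {
                // Remember the star and first try matching it against nothing.
                starP = p;
                starT = t;
                p++;
            }
            else if (p < pattern.Length && (pattern[p] == '?' || pattern[p] == text[t]))
            {
                // Literal or single-character wildcard match: advance both.
                t++;
                p++;
            }
            else if (starP >= 0)
            {
                // Mismatch: let the last star absorb one more character and retry.
                starT++;
                t = starT;
                p = starP + 1;
            }
            else
            {
                return false;
            }
        }

        // Only trailing stars may remain in the pattern.
        while (p < pattern.Length && pattern[p] == '*')
        {
            p++;
        }
        return p == pattern.Length;
    }
}
```

The matcher always checks the whole string, so the filter Y* does not match "XYZ", while *X*YZ* matches "abXcdYZef".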

Solution 9 - C#

For those working with C# and Excel who know only part of a worksheet (WS) name, though the approach is not limited to Excel, here is my code using a wildcard (ddd*). In brief: the code goes through all worksheet names, and if today's abbreviated weekday (ddd) matches the first three letters of a worksheet name, that name is stored in a string for use after the loop.

using System;
using Microsoft.Office.Interop.Excel;
using System.Runtime.InteropServices;
using Range = Microsoft.Office.Interop.Excel.Range;
using System.Diagnostics;
using System.Reflection;
using System.IO;
using System.Text.RegularExpressions;

...
string weekDay = DateTime.Now.ToString("ddd"); // abbreviated weekday name; the "*" wildcard is appended in the loop below

Workbook sourceWorkbook4 = xlApp.Workbooks.Open(LrsIdWorkbook, 0, false, 5, "", "", true, XlPlatform.xlWindows, "\t", false, false, 0, true, 1, 0);
Workbook destinationWorkbook = xlApp.Workbooks.Open(masterWB, 0, false, 5, "", "", true, XlPlatform.xlWindows, "\t", false, false, 0, true, 1, 0);

            static String WildCardToRegular(String value)
            {
                return "^" + Regex.Escape(value).Replace("\\*", ".*") + "$";
            }

            string wsName = null;
            foreach (Worksheet works in sourceWorkbook4.Worksheets)
            {
                Boolean startsWithddd = Regex.IsMatch(works.Name, WildCardToRegular(weekDay + "*"));

                    if (startsWithddd == true)
                    {
                        wsName = works.Name.ToString();
                    }
            }

            Worksheet sourceWorksheet4 = (Worksheet)sourceWorkbook4.Worksheets.get_Item(wsName);

...

Solution 10 - C#

C# console application sample.

Command line:
C:/> App_Exe -Opy PythonFile.py 1 2 3

Console output:
Argument list: -Opy PythonFile.py 1 2 3
Found python filename: PythonFile.py

using System;
using System.Text.RegularExpressions;           //Regex

namespace ConsoleApp1
{
    class Program
    {
        static void Main(string[] args)
        {
            string cmdLine = String.Join(" ", args);

            bool bFileExtFlag = false;
            int argIndex = 0;
            // "*.py" as a regex: any characters, then a literal ".py" at the end of the argument.
            // Built once, outside the loop.
            Regex regex = new Regex(@"^(?s:.*)\.py$", RegexOptions.IgnoreCase);
            foreach (string s in args)
            {
                //Search for the 1st argument that matches the "*.py" pattern
                bFileExtFlag = regex.IsMatch(s);
                if (bFileExtFlag)
                    break;
                argIndex++;
            }

            Console.WriteLine("Argument list: " + cmdLine);
            if (bFileExtFlag == true)
                Console.WriteLine("Found python filename: " + args[argIndex]);
            else
                Console.WriteLine("Python file with extension <.py> not found!");
        }


    }
}

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Robinson | View Question on Stackoverflow
Solution 1 - C# | Dmitry Bychenko | View Answer on Stackoverflow
Solution 2 - C# | Tim Schmelter | View Answer on Stackoverflow
Solution 3 - C# | VirtualVDX | View Answer on Stackoverflow
Solution 4 - C# | Jamie Lester | View Answer on Stackoverflow
Solution 5 - C# | Wiktor Stribiżew | View Answer on Stackoverflow
Solution 6 - C# | Avinash Raj | View Answer on Stackoverflow
Solution 7 - C# | Pavel Khrapkin | View Answer on Stackoverflow
Solution 8 - C# | nb.duong | View Answer on Stackoverflow
Solution 9 - C# | ZIELIK | View Answer on Stackoverflow
Solution 10 - C# | geo le | View Answer on Stackoverflow