使用通配符匹配字符串

Matching strings with wildcard

我想用通配符 (*) 匹配字符串,其中通配符表示 "any"。例如:

*X = string must end with X
X* = string must start with X
*X* = string must contain X

此外,还有一些复合用途,例如:

*X*YZ* = string contains X and contains YZ
X*YZ*P = string starts with X, contains YZ and ends with P.

有没有简单的算法可以做到这一点?我不确定是否使用正则表达式(尽管有可能)。

需要说明的是,用户会将上面的内容输入过滤框(尽可能简单的过滤器),我不希望他们必须自己编写正则表达式。所以我可以很容易地从上面的符号转换出来的东西会很好。

*X*YZ* = string contains X and contains YZ

@".*X.*YZ"

X*YZ*P = string starts with X, contains YZ and ends with P.

@"^X.*YZ.*P$"

您可以使用 VB.NET Like-Operator:

string text = "x is not the same as X and yz not the same as YZ";
bool contains = LikeOperator.LikeString(text,"*X*YZ*", Microsoft.VisualBasic.CompareMethod.Binary);  

如果您想忽略大小写,请使用 CompareMethod.Text

您需要添加 using Microsoft.VisualBasic.CompilerServices;add a reference to the Microsoft.VisualBasic.dll

因为它是 .NET 框架的一部分,而且永远都是,所以使用它不是问题 class。

通配符 * 可以翻译为 .*.*? 正则表达式模式。

您可能需要使用单行模式来匹配换行符,在这种情况下,您可以使用 (?s) 作为正则表达式模式的一部分。

您可以为整个或部分模式设置:

X* = > @"X(?s:.*)"
*X = > @"(?s:.*)X"
*X* = > @"(?s).*X.*"
*X*YZ* = > @"(?s).*X.*YZ.*"
X*YZ*P = > @"(?s:X.*YZ.*P)"

通配符通常与两种类型的王牌一​​起运作:

  ? - any character  (one and only one)
  * - any characters (zero or more)

因此您可以轻松地将这些规则转换为适当的正则表达式n:

// If you want to implement both "*" and "?"
private static String WildCardToRegular(String value) {
  return "^" + Regex.Escape(value).Replace("\?", ".").Replace("\*", ".*") + "$"; 
}

// If you want to implement "*" only
private static String WildCardToRegular(String value) {
  return "^" + Regex.Escape(value).Replace("\*", ".*") + "$"; 
}

然后你可以像往常一样使用 Regex:

  String test = "Some Data X";

  Boolean endsWithEx = Regex.IsMatch(test, WildCardToRegular("*X"));
  Boolean startsWithS = Regex.IsMatch(test, WildCardToRegular("S*"));
  Boolean containsD = Regex.IsMatch(test, WildCardToRegular("*D*"));

  // Starts with S, ends with X, contains "me" and "a" (in that order) 
  Boolean complex = Regex.IsMatch(test, WildCardToRegular("S*me*a*X"));

使用 System.Management.Automation 中的 WildcardPattern 可能是一种选择。

pattern = new WildcardPattern(patternString);
pattern.IsMatch(stringToMatch);

Visual Studio UI 可能不允许您将 System.Management.Automation 程序集添加到项目的引用中。随意手动添加它,如here.

所述

有必要考虑到,当检查与 Y* 的匹配时,Regex IsMatch 给出 XYZ 为真。为了避免它,我使用“^”锚

isMatch(str1, "^" + str2.Replace("*", ".*?"));  

所以,解决您问题的完整代码是

    bool isMatchStr(string str1, string str2)
    {
        string s1 = str1.Replace("*", ".*?");
        string s2 = str2.Replace("*", ".*?");
        bool r1 = Regex.IsMatch(s1, "^" + s2);
        bool r2 = Regex.IsMatch(s2, "^" + s1);
        return r1 || r2;
    }

C# 控制台应用程序示例

Command line Sample:
C:/> App_Exe -Opy PythonFile.py 1 2 3
Console output:
Argument list: -Opy PythonFile.py 1 2 3
Found python filename: PythonFile.py

using System;
using System.Text.RegularExpressions;           //Regex

namespace ConsoleApp1
{
    class Program
    {
        static void Main(string[] args)
        {
            string cmdLine = String.Join(" ", args);

            bool bFileExtFlag = false;
            int argIndex = 0;
            Regex regex;
            foreach (string s in args)
            {
                //Search for the 1st occurrence of the "*.py" pattern
                regex = new Regex(@"(?s:.*)6py", RegexOptions.IgnoreCase);
                bFileExtFlag = regex.IsMatch(s);
                if (bFileExtFlag == true)
                    break;
                argIndex++;
            };

            Console.WriteLine("Argument list: " + cmdLine);
            if (bFileExtFlag == true)
                Console.WriteLine("Found python filename: " + args[argIndex]);
            else
                Console.WriteLine("Python file with extension <.py> not found!");
        }


    }
}
public class Wildcard
{
    private readonly string _pattern;

    public Wildcard(string pattern)
    {
        _pattern = pattern;
    }

    public static bool Match(string value, string pattern)
    {
        int start = -1;
        int end = -1;
        return Match(value, pattern, ref start, ref end);
    }

    public static bool Match(string value, string pattern, char[] toLowerTable)
    {
        int start = -1;
        int end = -1;
        return Match(value, pattern, ref start, ref end, toLowerTable);
    }

    public static bool Match(string value, string pattern, ref int start, ref int end)
    {
        return new Wildcard(pattern).IsMatch(value, ref start, ref end);
    }

    public static bool Match(string value, string pattern, ref int start, ref int end, char[] toLowerTable)
    {
        return new Wildcard(pattern).IsMatch(value, ref start, ref end, toLowerTable);
    }

    public bool IsMatch(string str)
    {
        int start = -1;
        int end = -1;
        return IsMatch(str, ref start, ref end);
    }

    public bool IsMatch(string str, char[] toLowerTable)
    {
        int start = -1;
        int end = -1;
        return IsMatch(str, ref start, ref end, toLowerTable);
    }

    public bool IsMatch(string str, ref int start, ref int end)
    {
        if (_pattern.Length == 0) return false;
        int pindex = 0;
        int sindex = 0;
        int pattern_len = _pattern.Length;
        int str_len = str.Length;
        start = -1;
        while (true)
        {
            bool star = false;
            if (_pattern[pindex] == '*')
            {
                star = true;
                do
                {
                    pindex++;
                }
                while (pindex < pattern_len && _pattern[pindex] == '*');
            }
            end = sindex;
            int i;
            while (true)
            {
                int si = 0;
                bool breakLoops = false;
                for (i = 0; pindex + i < pattern_len && _pattern[pindex + i] != '*'; i++)
                {
                    si = sindex + i;
                    if (si == str_len)
                    {
                        return false;
                    }
                    if (str[si] == _pattern[pindex + i])
                    {
                        continue;
                    }
                    if (si == str_len)
                    {
                        return false;
                    }
                    if (_pattern[pindex + i] == '?' && str[si] != '.')
                    {
                        continue;
                    }
                    breakLoops = true;
                    break;
                }
                if (breakLoops)
                {
                    if (!star)
                    {
                        return false;
                    }
                    sindex++;
                    if (si == str_len)
                    {
                        return false;
                    }
                }
                else
                {
                    if (start == -1)
                    {
                        start = sindex;
                    }
                    if (pindex + i < pattern_len && _pattern[pindex + i] == '*')
                    {
                        break;
                    }
                    if (sindex + i == str_len)
                    {
                        if (end <= start)
                        {
                            end = str_len;
                        }
                        return true;
                    }
                    if (i != 0 && _pattern[pindex + i - 1] == '*')
                    {
                        return true;
                    }
                    if (!star)
                    {
                        return false;
                    }
                    sindex++;
                }
            }
            sindex += i;
            pindex += i;
            if (start == -1)
            {
                start = sindex;
            }
        }
    }

    public bool IsMatch(string str, ref int start, ref int end, char[] toLowerTable)
    {
        if (_pattern.Length == 0) return false;

        int pindex = 0;
        int sindex = 0;
        int pattern_len = _pattern.Length;
        int str_len = str.Length;
        start = -1;
        while (true)
        {
            bool star = false;
            if (_pattern[pindex] == '*')
            {
                star = true;
                do
                {
                    pindex++;
                }
                while (pindex < pattern_len && _pattern[pindex] == '*');
            }
            end = sindex;
            int i;
            while (true)
            {
                int si = 0;
                bool breakLoops = false;

                for (i = 0; pindex + i < pattern_len && _pattern[pindex + i] != '*'; i++)
                {
                    si = sindex + i;
                    if (si == str_len)
                    {
                        return false;
                    }
                    char c = toLowerTable[str[si]];
                    if (c == _pattern[pindex + i])
                    {
                        continue;
                    }
                    if (si == str_len)
                    {
                        return false;
                    }
                    if (_pattern[pindex + i] == '?' && c != '.')
                    {
                        continue;
                    }
                    breakLoops = true;
                    break;
                }
                if (breakLoops)
                {
                    if (!star)
                    {
                        return false;
                    }
                    sindex++;
                    if (si == str_len)
                    {
                        return false;
                    }
                }
                else
                {
                    if (start == -1)
                    {
                        start = sindex;
                    }
                    if (pindex + i < pattern_len && _pattern[pindex + i] == '*')
                    {
                        break;
                    }
                    if (sindex + i == str_len)
                    {
                        if (end <= start)
                        {
                            end = str_len;
                        }
                        return true;
                    }
                    if (i != 0 && _pattern[pindex + i - 1] == '*')
                    {
                        return true;
                    }
                    if (!star)
                    {
                        return false;
                    }
                    sindex++;
                    continue;
                }
            }
            sindex += i;
            pindex += i;
            if (start == -1)
            {
                start = sindex;
            }
        }
    }
}

对于使用 .NET Core 2.1+ 或 .NET 5+ 的用户,您可以在 System.IO.Enumeration 命名空间中使用 FileSystemName.MatchesSimpleExpression 方法。

string text = "X is a string with ZY in the middle and at the end is P";
bool isMatch = FileSystemName.MatchesSimpleExpression("X*ZY*P", text);

两个参数实际上都是 ReadOnlySpan<char>,但您也可以使用字符串参数。如果你想打开 on/off 大小写匹配,还有一个重载方法。正如 Chris 在评论中提到的,它默认不区分大小写。

支持那些带有 C#+Excel 的代码(对于部分已知的 WS 名称)但不仅如此 - 这是我的带有通配符 (ddd*) 的代码。 简而言之:代码获取所有 WS 名称,如果今天的工作日 (ddd) 与 WS 名称的前 3 个字母 (bool=true) 匹配,那么它会将其转换为从循环中提取的字符串。

using System;
using Microsoft.Office.Interop.Excel;
using System.Runtime.InteropServices;
using Range = Microsoft.Office.Interop.Excel.Range;
using System.Diagnostics;
using System.Reflection;
using System.IO;
using System.Text.RegularExpressions;

...
string weekDay = DateTime.Now.ToString("ddd*");

Workbook sourceWorkbook4 = xlApp.Workbooks.Open(LrsIdWorkbook, 0, false, 5, "", "", true, XlPlatform.xlWindows, "\t", false, false, 0, true, 1, 0);
Workbook destinationWorkbook = xlApp.Workbooks.Open(masterWB, 0, false, 5, "", "", true, XlPlatform.xlWindows, "\t", false, false, 0, true, 1, 0);

            static String WildCardToRegular(String value)
            {
                return "^" + Regex.Escape(value).Replace("\*", ".*") + "$";
            }

            string wsName = null;
            foreach (Worksheet works in sourceWorkbook4.Worksheets)
            {
                Boolean startsWithddd = Regex.IsMatch(works.Name, WildCardToRegular(weekDay + "*"));

                    if (startsWithddd == true)
                    {
                        wsName = works.Name.ToString();
                    }
            }

            Worksheet sourceWorksheet4 = (Worksheet)sourceWorkbook4.Worksheets.get_Item(wsName);

...