C# Finding specific values from messageformat strings

Given a messageformat string such as str below, I want to extract the "notifications" and "name" values that are used to display the text.

var str = @"You have {notifications, plural,
          zero {no notifications}
           one {one notification}
           =42 {a universal amount of notifications}
         other {# notifications}
        }. Have a nice day, {name}!";

I have tried using a regex such as:

var matches = Regex.Matches(str, @"{(.*?)}");
//var matches = Regex.Matches(str, @"(?<=\{)[^}{]*(?=\})");
var results = matches.Cast<Match>().Select(m => m.Groups[1].Value).Distinct().ToList();

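However, the above does not account for the fact that {notifications,.. is itself enclosed in curly braces, so it picks up the unwanted inner values, which are also enclosed in curly braces.

In short, I just want to parse a string such as str above and get notifications and name back as the returned values.

A string such as var str2 = @"Hello {name}" should return just name as the value.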

EDIT

The values notifications and name will not be known in advance. I am only using them as examples of the values I need returned from the string.

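One approach is to write a method that formats the string for you, based on the input count and the singular (and optionally plural) form of the word: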

private static string FormatWord(int count, string singular)
{
    // Assume the plural is formed by appending "s"
    return FormatWord(count, singular, singular + "s");
}

private static string FormatWord(int count, string singular, string plural)
{
    return count == 0 ? "no " + plural
        : count == 1 ? "one " + singular
        : count == 42 ? "a universal number of " + plural
        : count + " " + plural;
}

In use, it might then look like:

private static void Main()
{
    var name = "User";

    while (true)
    {
        var count = GetIntFromUser("Enter notification count: ");
        Console.WriteLine($"You have {FormatWord(count, "notification")}. " + 
            $"Have a nice day, {name}");
    }
}

Note that this also uses a helper method to get a strongly-typed integer from the user:

private static int GetIntFromUser(string prompt, Func<int, bool> validator = null)
{
    int result;
    var cursorTop = Console.CursorTop;

    do
    {
        ClearSpecificLineAndWrite(cursorTop, prompt);
    } while (!int.TryParse(Console.ReadLine(), out result) ||
             !(validator?.Invoke(result) ?? true));

    return result;
}

private static void ClearSpecificLineAndWrite(int cursorTop, string message)
{
    Console.SetCursorPosition(0, cursorTop);
    Console.Write(new string(' ', Console.WindowWidth));
    Console.SetCursorPosition(0, cursorTop);
    Console.Write(message);
}

TL;DR: here is one possible solution

var str = @"You have {notifications, plural,
          zero {no notifications}
           one {one notification}
           =42 {a universal amount of notifications}
         other {# notifications}
        }. Have a nice day, {name}!";

// get matches skipping nested curly braces
var matches = 
    Regex.Matches(str, @"{((?:[^{}]|(?<counter>{)|(?<-counter>}))+(?(counter)(?!)))}");

var results = matches.Cast<Match>().Select(m => m.Groups[1].Value).Distinct()
    .Select(v => Regex.Match(v, @"^\w+").Value) // take 1st word
    .ToList();

The result (copied from the Visual Studio Locals window while debugging):

results Count = 2   System.Collections.Generic.List<string>
    [0] "notifications"
    [1] "name"

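...original answer follows...


One thing to note about the current approach in the original question:

  • . does not match newline characters, which is one of the reasons the pattern currently matches the nested values instead of the outer block (see this source)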
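
To illustrate the point about . and newlines, here is a minimal sketch that runs the question's pattern twice, once with default options and once with RegexOptions.Singleline. The class name DotVersusNewline and the Capture helper are made up for this illustration; the sample is the string from the question, written with explicit \n line breaks so the result does not depend on the source file's line endings.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class DotVersusNewline
{
    // The question's sample string, with explicit \n line breaks.
    public const string Sample =
        "You have {notifications, plural,\n" +
        "          zero {no notifications}\n" +
        "           one {one notification}\n" +
        "           =42 {a universal amount of notifications}\n" +
        "         other {# notifications}\n" +
        "        }. Have a nice day, {name}!";

    // Hypothetical helper: applies the question's pattern with the given
    // options and returns the text captured by group 1 of every match.
    public static string[] Capture(RegexOptions options) =>
        Regex.Matches(Sample, @"{(.*?)}", options)
             .Cast<Match>()
             .Select(m => m.Groups[1].Value)
             .ToArray();

    static void Main()
    {
        // Default options: '.' stops at '\n', so a match cannot span lines.
        foreach (var value in Capture(RegexOptions.None))
            Console.WriteLine("[None]       " + value);

        // Singleline: '.' also matches '\n', but the lazy quantifier still
        // stops at the first closing brace it finds.
        foreach (var value in Capture(RegexOptions.Singleline))
            Console.WriteLine("[Singleline] " + value.Replace("\n", "\\n"));
    }
}
```

With default options the outer block is skipped entirely and only the inner values plus name come back. With RegexOptions.Singleline the first match does start at {notifications, but it ends at the first closing brace, so neither option deals with the nesting.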

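If I understand your goal, this article gives a good explanation and demonstration of the related problem and its solution:

(The article addresses the main challenge pointed out in the original question: nested curly braces.)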

https://blogs.msdn.microsoft.com/timart/2013/05/14/nestedrecursive-regex-and-net-balancing-groups-detect-a-function-with-a-regex/

Based on that article, I suggest the following pattern as one possible solution:

var str = @"You have {notifications, plural,
          zero {no notifications}
           one {one notification}
           =42 {a universal amount of notifications}
         other {# notifications}
        }. Have a nice day, {name}!";

// get matches skipping nested curly braces
var matches = 
    Regex.Matches(str, @"{((?:[^{}]|(?<counter>{)|(?<-counter>}))+(?(counter)(?!)))}");
var results = matches.Cast<Match>().Select(m => m.Groups[1].Value).Distinct().ToList();

The result (copied from the Visual Studio Locals window while debugging):

results Count = 2   System.Collections.Generic.List<string>
    [0] "notifications, plural,\r\n          zero {no notifications}\r\n           one {one notification}\r\n           =42 {a universal amount of notifications}\r\n         other {# notifications}\r\n        "
    [1] "name"

(or, if you were to print those results to the console):

// Result 0 would look like:
notifications, plural,
          zero {no notifications}
           one {one notification}
           =42 {a universal amount of notifications}
         other {# notifications}


// Result 1 would look like:
name
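
Since the pattern is fairly dense, here is a sketch of the same expression spread over several lines with RegexOptions.IgnorePatternWhitespace, so each piece can carry a comment. The braces are escaped here only for readability; the class name BalancedBraces and the Contents method are illustrative names, not part of any library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

static class BalancedBraces
{
    // Same structure as the one-line pattern, with inline comments.
    static readonly Regex TopLevel = new Regex(@"
        \{                        # opening brace of a top-level block
        (                         # group 1: the contents of that block
          (?:
              [^{}]               # any character that is not a brace
            | (?<counter>\{)      # nested opening brace: push onto 'counter'
            | (?<-counter>\})     # nested closing brace: pop from 'counter'
          )+
          (?(counter)(?!))        # fail if an opening brace is still unmatched
        )
        \}                        # closing brace of the top-level block
        ", RegexOptions.IgnorePatternWhitespace);

    // Hypothetical helper: returns the distinct contents of each top-level block.
    public static List<string> Contents(string format) =>
        TopLevel.Matches(format)
                .Cast<Match>()
                .Select(m => m.Groups[1].Value)
                .Distinct()
                .ToList();

    static void Main()
    {
        var str = "You have {notifications, plural,\n" +
                  "  one {one notification}\n" +
                  "  other {# notifications}\n" +
                  "}. Have a nice day, {name}!";

        foreach (var block in Contents(str))
            Console.WriteLine("[" + block + "]");
    }
}
```

The pop alternative cannot match while the counter stack is empty, so the repetition stops at the closing brace that belongs to the top-level block, and the final \} consumes it.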

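UPDATE

Coming back to this, I realized the question only asks for single words as the result.

So the remaining step is to take the first word from each result.

(I repeat the snippet above with the additional Select statement to show the complete solution)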

var str = @"You have {notifications, plural,
          zero {no notifications}
           one {one notification}
           =42 {a universal amount of notifications}
         other {# notifications}
        }. Have a nice day, {name}!";

// get matches skipping nested curly braces
var matches = 
    Regex.Matches(str, @"{((?:[^{}]|(?<counter>{)|(?<-counter>}))+(?(counter)(?!)))}");

var results = matches.Cast<Match>().Select(m => m.Groups[1].Value).Distinct()
    .Select(v => Regex.Match(v, @"^\w+").Value) // take 1st word
    .ToList();

The result (copied from the Visual Studio Locals window while debugging):

results Count = 2   System.Collections.Generic.List<string>
    [0] "notifications"
    [1] "name"
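
For comparison, the same top-level extraction can be sketched without regex by tracking brace depth by hand. This is a different technique from the balancing-group pattern above, not a translation of it. It assumes braces are never quoted or escaped inside the format string, and the names TopLevelArguments, Extract and FirstWord are hypothetical.

```csharp
using System;
using System.Collections.Generic;

static class TopLevelArguments
{
    // Returns the first word of every top-level {...} block, without duplicates.
    // Assumption: braces are never quoted or escaped in the input.
    public static List<string> Extract(string format)
    {
        var names = new List<string>();
        int depth = 0;   // how many '{' are currently open
        int start = -1;  // index just after the current top-level '{'

        for (int i = 0; i < format.Length; i++)
        {
            char c = format[i];
            if (c == '{')
            {
                if (depth == 0) start = i + 1;
                depth++;
            }
            else if (c == '}' && depth > 0)
            {
                depth--;
                if (depth == 0)
                {
                    // A top-level block just closed: keep its first word.
                    string name = FirstWord(format.Substring(start, i - start));
                    if (name.Length > 0 && !names.Contains(name)) names.Add(name);
                }
            }
        }
        return names;
    }

    // Roughly what ^\w+ does: leading letters, digits and underscores.
    static string FirstWord(string body)
    {
        int end = 0;
        while (end < body.Length && (char.IsLetterOrDigit(body[end]) || body[end] == '_'))
            end++;
        return body.Substring(0, end);
    }

    static void Main()
    {
        var str = "You have {notifications, plural,\n" +
                  "  one {one notification}\n" +
                  "  other {# notifications}\n" +
                  "}. Have a nice day, {name}!";

        Console.WriteLine(string.Join(", ", Extract(str))); // prints "notifications, name"
    }
}
```

The loop does the same bookkeeping as the counter group in the regex: increment on an opening brace, decrement on a closing brace, and treat the moment the depth returns to zero as the end of a top-level block.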

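A bit more info

(I just found this interesting, spent some more time researching/learning, and thought it was worth including some more related information)

The conversations here and here include some opinions for and against using regex for this kind of problem.

  • I found it interesting to read these opinions and get a more well-rounded perspective

Regardless of the opinions above, the creators of .NET saw fit to implement balancing group definitions, the feature this answer uses: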