Is there a better way to write this incredibly long regex, or perform this error check?

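I'm trying to find a way to error/style-check the non-standard custom menu files of a decade-old video game, which I edit in Notepad++, and this is the best I could come up with.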

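The expression below returns any curly bracket that is not followed by an EOL character, or that is preceded by anything other than the start of the line plus 1-4 tabs. It works fine, but it seems like it could be far more elegant. Any bracket it returns is incorrect unless it is the first or last one in the file. More than four tabs would technically be okay, but is highly unlikely.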

    (?<!^\t)(?<!^\t\t)(?<!^\t\t\t)(?<!^\t\t\t\t)[{}]|[{}](?!\R)

Properly formatted:

    Menu "MenuName"
    {
        Menu "SubMenuName"
        {
            Option "OptionName" "Command"
            Option "OptionName" "Command"
        }
    }
// This is a comment line
// [ curly brackets in comment lines are made square so they don't get counted as balancing

Every curly bracket should be alone on its own line, preceded only by tabs. They should also be paired, but I have a plugin that handles that.

Improperly formatted:

    Menu "MenuName"{
        Menu "SubMenuName"
        {
            Option "OptionName" "Command"
            Option "OptionName" "Command"   }
    }Menu "That bracket would be wrong since the line should end afterwards.
    {   //this would also be wrong
// Nothing should ever follow a bracket except a End Of Line character.

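Is there a better way to implement this search/check, considering that Notepad++ uses Boost regex and does not allow variable-length lookbehinds? Perhaps also keep in mind that I learned everything I know about regex last night.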

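The expression also returns the first bracket in the file (no preceding tab) and the last one (no EOL character), but I'm fine with that particular behavior.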


The full contents of the file I use as a template (the game loads it as a loose file from the data folder, exactly as is):

//DO NOT DELETE, needs a line here for some reason.
Menu "MenuNameTEMPLATE"
{
    Title "TitleName"
    Option "OptionName" "Command"
    Divider
    LockedOption
    {
        DisplayName "OptionName"
        Command "Command"
        Icon "IconName"
        PowerReady "PowerIdentifiers"
    }
    LockedOption
    {
        DisplayName "OptionName"
        Command "Command"
        Icon "IconName"
        Badge "BadgeIdentifiers"
    }
    LockedOption
    {
        DisplayName "OptionName"
        Command "Command"
        Icon "IconName"
    }
    Menu "SubMenuName"
    {
        Title "TitleName"
        Option "OptionName" "Command"
        Option "OptionName" "Command"
    }
}

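First off, I want to say that regex is 100% the wrong tool for this. You need a custom parser that validates your file and parses it into a model you can work with.

However, given the restrictions imposed by your question, the following should do: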

^(?:[^{}]*|\t{1,4}[{}])$

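Don't bother with look-arounds, just match what you expect to find. Any line the pattern leaves unmatched is one that needs attention. See it in action here: https://regex101.com/r/nYNqHw/1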
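To make the "match what you expect" idea concrete outside the editor, here is a minimal C++ sketch that applies the same whitelist line by line with std::regex and collects the lines that do not match. The name findBadLines is made up for illustration, and the sketch assumes the whole file is already in a string and that lines end in \n or \r\n.

```cpp
#include <regex>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: returns the 1-based numbers of lines that are NOT valid.
// A valid line either contains no braces at all, or is exactly 1-4 tabs
// followed by a single brace (the same whitelist as the pattern above).
std::vector<int> findBadLines(std::string const& text) {
    // No ^ or $ here: std::regex_match must consume the whole line anyway.
    static std::regex const validLine(R"((?:[^{}]*|\t{1,4}[{}]))");

    std::vector<int> bad;
    std::istringstream in(text);
    std::string line;
    for (int lineNo = 1; std::getline(in, line); ++lineNo) {
        if (!line.empty() && line.back() == '\r')
            line.pop_back(); // tolerate Windows line endings
        if (!std::regex_match(line, validLine))
            bad.push_back(lineNo);
    }
    return bad;
}
```

Because every line is tested on its own, std::regex_match does the anchoring. A brace in column 0 is reported as well, which mirrors the behavior you said you were fine with for the first bracket in the file.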

  • Ctrl+F
  • Find what: ^\h+[{}](?=\h*\R)(*SKIP)(*FAIL)|[{}]
  • Check Wrap around
  • Check Regular expression
  • Find All in Current Document

Explanation:

^                       # beginning of line
    \h+                 # 1 or more horizontal whitespace characters; use \t{1,4} if you only want tabs
    [{}]                # opening or closing brace
    (?=                 # positive lookahead, make sure what follows is:
        \h*                 # 0 or more horizontal whitespace characters
        \R                  # any kind of line break
    )                   # end of lookahead
    (*SKIP)(*FAIL)      # discard this match and force a failure, so a well-placed brace is never reported
|                       # OR
    [{}]                # any other opening or closing brace
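The same "skip the well-placed braces, report the rest" logic can be written without a regex engine. The following is only a sketch under a few assumptions: horizontal whitespace means space or tab, a line break is \n or \r\n, and the names BracePos and findMisplacedBraces are invented for this example.

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct BracePos { std::size_t line, column; }; // both 1-based

static bool isBlank(char c) { return c == ' ' || c == '\t'; }
static bool isBreak(char c) { return c == '\n' || c == '\r'; }

// Reports every brace that is not of the form
// "1+ blanks, brace, optional blanks, line break".
std::vector<BracePos> findMisplacedBraces(std::string const& text) {
    std::vector<BracePos> out;
    std::size_t line = 1, lineStart = 0;
    for (std::size_t i = 0; i < text.size(); ++i) {
        char const c = text[i];
        if (c == '\n') { ++line; lineStart = i + 1; continue; }
        if (c != '{' && c != '}') continue;

        // Left side: one or more blanks and nothing else (the ^\h+ part).
        bool leftOk = i > lineStart;
        for (std::size_t k = lineStart; k < i && leftOk; ++k)
            leftOk = isBlank(text[k]);

        // Right side: optional blanks, then a line break (the (?=\h*\R) part).
        std::size_t k = i + 1;
        while (k < text.size() && isBlank(text[k])) ++k;
        bool const rightOk = k < text.size() && isBreak(text[k]);

        if (!(leftOk && rightOk))
            out.push_back({line, i - lineStart + 1});
    }
    return out;
}
```

As with the pattern, a brace in column 0 and a brace that ends the file without a trailing line break are both reported.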

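Since you tagged this Boost, and since others have rightly pointed out that you don't have to do this with regular expressions in some editor tool, here is a starting point for a proper parser using Boost Spirit.

We'll parse into some types: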

struct Option { std::string name, command; };
struct Divider { };

struct LockedOption {
    struct Property { std::string key, value; };
    using Properties = std::vector<Property>;
    Properties properties; // DisplayName, Command, Icon, PowerReady, Badge...?
};

struct Menu;
using MenuItem = boost::variant<Option, Divider, LockedOption,
                                boost::recursive_wrapper<Menu>>;
struct Menu {
    std::string                  id;
    boost::optional<std::string> title;
    std::vector<MenuItem>        items;
};
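To show how a recursive model like this is consumed once it has been filled in, here is a small sketch that builds without Boost. It is not the code used in this answer: it assumes C++17 and swaps in std::variant, std::optional and std::shared_ptr<Menu> for boost::variant, boost::optional and boost::recursive_wrapper<Menu>. The helper countCommands is hypothetical.

```cpp
#include <cstddef>
#include <memory>
#include <optional>
#include <string>
#include <variant>
#include <vector>

struct Option  { std::string name, command; };
struct Divider {};
struct LockedOption {
    struct Property { std::string key, value; };
    std::vector<Property> properties;
};

struct Menu;
// Assumption: a shared_ptr stands in for boost::recursive_wrapper<Menu>.
using MenuItem =
    std::variant<Option, Divider, LockedOption, std::shared_ptr<Menu>>;

struct Menu {
    std::string                id;
    std::optional<std::string> title;
    std::vector<MenuItem>      items;
};

// Hypothetical helper: counts Option and LockedOption entries,
// descending into submenus. Dividers are ignored.
std::size_t countCommands(Menu const& menu) {
    std::size_t n = 0;
    for (MenuItem const& item : menu.items) {
        if (std::holds_alternative<Option>(item) ||
            std::holds_alternative<LockedOption>(item)) {
            ++n;
        } else if (auto const* sub = std::get_if<std::shared_ptr<Menu>>(&item)) {
            if (*sub) n += countCommands(**sub);
        }
    }
    return n;
}
```

The point of the variant is that a Menu can hold a mix of item kinds, including other menus, and any later processing (validation, pretty-printing, counting) becomes a walk over that tree instead of a text search.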

Now, we can define a parser:

namespace Parser {
    using namespace boost::spirit::x3;
    rule<struct menu_rule, Menu> const menu{"menu"};

    auto const qstring = rule<void, std::string>{"quoted string"} = //
        lexeme['"' > *('"' >> char_('"') | ~char_('"')) > '"'];

    auto const option = rule<void, Option>{"option"} = //
        "Option" > qstring > qstring;

    auto property   = (!lit('}')) > lexeme[+graph] > qstring;
    auto properties = rule<void, LockedOption::Properties>{"properties"} =
        *(property > eol);

    auto const lockedoption = rule<void, LockedOption>{"lockedoption"} = //
        "LockedOption" > eol                                             //
        > '{' > eol                                                      //
        > properties                                                     //
        > '}';

    auto divider = rule<void, Divider>{"divider"} = //
        lit("Divider") >> attr(Divider{});

    auto item = rule<void, MenuItem>{"menu|option|lockedoption|divider"} =
        menu | option | lockedoption | divider;

    auto title = "Title" > qstring;

    auto menu_def =                   //
        "Menu" > qstring > eol        //
        > '{' > eol                   //
        > -(title > eol)              //
        > *((!lit('}')) > item > eol) //
        > '}';

    auto ignore = blank | "//" >> *~char_("\r\n") >> (eol|eoi);

    BOOST_SPIRIT_DEFINE(menu)

    Menu parseMenu(std::string const& text) try {
        Menu result;
        parse(begin(text), end(text), skip(ignore)[expect[menu] > *eol > eoi],
              result);
        return result;
    } catch (expectation_failure<std::string::const_iterator> const& ef) {
        throw std::runtime_error(
            "At " + std::to_string(std::distance(begin(text), ef.where())) +
            ": Expected " + ef.which() + " (Got '" +
            std::string(ef.where(), std::find(ef.where(), end(text), '\n')) +
            "')");
    }
} // namespace Parser

All that is left is to call the parseMenu function, which returns the parsed Menu. So, if we feed it the samples from your question:

for (std::string const& sample :
     {
        // ... 
     })
{
    static int i = 0;
    fmt::print("----- {}\n", ++i);
    try {
        fmt::print("Parsed: {}\n", Parser::parseMenu(sample));
    } catch (std::exception const& e) {
        std::cout << "Parse failed: " << e.what() << "\n";
    }
}

We get the following output (see the live demo below):

----- 1
Parsed: Menu "MenuName"
{
    Title ""
    Menu "SubMenuName"
{
    Title ""
    Option "OptionName" "Command"
    Option "OptionName" "Command"
}
}
----- 2
Parse failed: At 19: Expected eol (Got '{')
----- 3
Parsed: Menu "MenuNameTEMPLATE"
{
    Title "TitleName"
    Option "OptionName" "Command"
    Divider
    LockedOption
{
    DisplayName "OptionName"
    Command "Command"
    Icon "IconName"
    PowerReady "PowerIdentifiers"
}
    LockedOption
{
    DisplayName "OptionName"
    Command "Command"
    Icon "IconName"
    Badge "BadgeIdentifiers"
}
    LockedOption
{
    DisplayName "OptionName"
    Command "Command"
    Icon "IconName"
}
    Menu "SubMenuName"
{
    Title "TitleName"
    Option "OptionName" "Command"
    Option "OptionName" "Command"
}
}
----- 4
Parse failed: At 70: Expected menu|option|lockedoption|divider (Got '                    Road Rage')
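The numbers after "At" are character offsets from the start of the input, since they come from std::distance(begin(text), ef.where()). If you would rather see a line and column, a conversion along these lines should work. This is a sketch only: lineAndColumn is a made-up name, and it assumes lines are separated by \n.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>

// Converts a 0-based character offset into a 1-based (line, column) pair.
// Offsets past the end of the text are clamped to the end.
std::pair<std::size_t, std::size_t> lineAndColumn(std::string const& text,
                                                  std::size_t offset) {
    offset = std::min(offset, text.size());
    std::size_t line = 1, lineStart = 0;
    for (std::size_t i = 0; i < offset; ++i) {
        if (text[i] == '\n') { // a new line starts right after each '\n'
            ++line;
            lineStart = i + 1;
        }
    }
    return {line, offset - lineStart + 1};
}
```

For the second sample, offset 19 is the `{` that directly follows `Menu "MenuName"` on the first line, so the function yields line 1, column 20.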

Live Demo

On Compiler Explorer: https://godbolt.org/z/sW3Y5z9nq

//#define BOOST_SPIRIT_X3_DEBUG
#include <boost/fusion/adapted.hpp>
#include <boost/spirit/home/x3.hpp>
#include <iostream>
#include <map>
#include <fmt/ranges.h>

struct Option { std::string name, command; };
struct Divider { };

struct LockedOption {
#if 1
    struct Property { std::string key, value; };
    using Properties = std::vector<Property>;
#else
    using Properties = std::map<std::string, std::string>;
#endif
    Properties properties; // DisplayName, Command, Icon, PowerReady, Badge...?
};

struct Menu;
using MenuItem = boost::variant<Option, Divider, LockedOption,
                                boost::recursive_wrapper<Menu>>;
struct Menu {
    std::string                  id;
    boost::optional<std::string> title;
    std::vector<MenuItem>        items;
};

#ifdef BOOST_SPIRIT_X3_DEBUG
    [[maybe_unused]] std::ostream& operator<<(std::ostream& os, Option)       { return os << "Option";       }
    [[maybe_unused]] std::ostream& operator<<(std::ostream& os, Divider)      { return os << "Divider";      }
    [[maybe_unused]] std::ostream& operator<<(std::ostream& os, LockedOption) { return os << "LockedOption"; }
    [[maybe_unused]] std::ostream& operator<<(std::ostream& os, Menu)         { return os << "Menu";         }
#endif

struct MenuItemFormatter : fmt::formatter<std::string> {
    template <typename... Ts>
    auto format(boost::variant<Ts...> const& var, auto& ctx) const {
        return boost::apply_visitor(
            [&](auto const& el) {
                return format_to(ctx.out(), fmt::runtime("{}"), el);
            }, var);
    }

    auto format(LockedOption const& lo, auto& ctx) const {
        auto out = fmt::format_to(ctx.out(), "LockedOption\n{{\n");
        for (auto const& [k, v] : lo.properties)
            out = fmt::format_to(out, "    {} \"{}\"\n", k, v);

        return fmt::format_to(out, "}}");
    }

    auto format(Divider const&, auto& ctx) const {
        return fmt::format_to(ctx.out(), "Divider");
    }

    auto format(Option const& o, auto& ctx) const {
        return fmt::format_to(ctx.out(), "Option \"{}\" \"{}\"", o.name, o.command);
    }

    auto format(Menu const& m, auto& ctx) const {
        return fmt::format_to(
            ctx.out(), "Menu \"{}\"\n{{\n    Title \"{}\"\n    {}\n}}", m.id,
            m.title.value_or(""), fmt::join(m.items, "\n    "));
    }
};

template <> struct fmt::formatter<MenuItem>     : MenuItemFormatter{};
template <> struct fmt::formatter<LockedOption> : MenuItemFormatter{};
template <> struct fmt::formatter<Divider>      : MenuItemFormatter{};
template <> struct fmt::formatter<Option>       : MenuItemFormatter{};
template <> struct fmt::formatter<Menu>         : MenuItemFormatter{};

BOOST_FUSION_ADAPT_STRUCT(Option, name, command)
BOOST_FUSION_ADAPT_STRUCT(LockedOption, properties)
BOOST_FUSION_ADAPT_STRUCT(LockedOption::Property, key, value)
BOOST_FUSION_ADAPT_STRUCT(Menu, id, title, items)

    namespace Parser {
        using namespace boost::spirit::x3;
        rule<struct menu_rule, Menu> const menu{"menu"};

        auto const qstring = rule<void, std::string>{"quoted string"} = //
            lexeme['"' > *('"' >> char_('"') | ~char_('"')) > '"'];

        auto const option = rule<void, Option>{"option"} = //
            "Option" > qstring > qstring;

        auto property   = lexeme[+graph] > qstring;
        auto properties = rule<void, LockedOption::Properties>{"properties"} =
            *((!lit('}')) > property > eol);

        auto const lockedoption = rule<void, LockedOption>{"lockedoption"} = //
            "LockedOption" > eol                                             //
            > '{' > eol                                                      //
            > properties                                                     //
            > '}';

        auto divider = rule<void, Divider>{"divider"} = //
            lit("Divider") >> attr(Divider{});

        auto item = rule<void, MenuItem>{"menu|option|lockedoption|divider"} =
            menu | option | lockedoption | divider;

        auto title = "Title" > qstring;

        auto menu_def =                   //
            "Menu" > qstring > eol        //
            > '{' > eol                   //
            > -(title > eol)              //
            > *((!lit('}')) > item > eol) //
            > '}';

        auto ignore = blank | "//" >> *~char_("\r\n") >> (eol|eoi);

        BOOST_SPIRIT_DEFINE(menu)

        Menu parseMenu(std::string const& text) try {
            Menu result;
            parse(begin(text), end(text), skip(ignore)[expect[menu] > *eol > eoi],
                  result);
            return result;
        } catch (expectation_failure<std::string::const_iterator> const& ef) {
            throw std::runtime_error(
                "At " + std::to_string(std::distance(begin(text), ef.where())) +
                ": Expected " + ef.which() + " (Got '" +
                std::string(ef.where(), std::find(ef.where(), end(text), '\n')) +
                "')");
        }
    } // namespace Parser

int main() {
    for (std::string const& sample :
         {
             R"(    Menu "MenuName"
                        {
                            Menu "SubMenuName"
                            {
                                Option "OptionName" "Command"
                                Option "OptionName" "Command"
                            }
                        }
                    // This is a comment line
                    // // [ curly brackets in comment lines are made square so they don't get counted as balancing)",
             R"(    Menu "MenuName"{
                            Menu "SubMenuName"
                            {
                                Option "OptionName" "Command"
                                Option "OptionName" "Command"   }
                        }Menu "That bracket would be wrong since the line should end afterwards.
                        {   //this would also be wrong
                    // Nothing should ever follow a bracket except a End Of Line character.)",
             R"(//DO NOT DELETE, needs a line here for some reason.
                    Menu "MenuNameTEMPLATE"
                    {
                        Title "TitleName"
                        Option "OptionName" "Command"
                        Divider
                        LockedOption
                        {
                            DisplayName "OptionName"
                            Command "Command"
                            Icon "IconName"
                            PowerReady "PowerIdentifiers"
                        }
                        LockedOption
                        {
                            DisplayName "OptionName"
                            Command "Command"
                            Icon "IconName"
                            Badge "BadgeIdentifiers"
                        }
                        LockedOption
                        {
                            DisplayName "OptionName"
                            Command "Command"
                            Icon "IconName"
                        }
                        Menu "SubMenuName"
                        {
                            Title "TitleName"
                            Option "OptionName" "Command"
                            Option "OptionName" "Command"
                        }
             })",
             R"(Menu "Not So Good"
                {
                    Title "Uhoh"
                    Road Rage
                }
             )",
         }) //
    {
        static int i = 0;
        fmt::print("----- {}\n", ++i);
        try {
            fmt::print("Parsed: {}\n", Parser::parseMenu(sample));
        } catch (std::exception const& e) {
            std::cout << "Parse failed: " << e.what() << "\n";
        }
    }
}