Is there a way to have a character match a conjunction of character classes?
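I'm trying to write a regex that describes a single-quote-delimited string.
Inside the string, I can have either any printable (or whitespace) character that is not a single quote, or a sequence of two single quotes, which represents an "escaped" single quote.
The [[:print:]] character class (also written as \p{XPosixPrint}) fits the bill for the characters I want to allow, except that it also allows a lone single quote ('). I don't want that to happen.
So, is there a simple way to do this, such as describing a character that must match two expressions at the same time (like [[:print:]] and [^']), or do I have to create a custom character class enumerating everything I allow (or forbid)?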
/(?!')\p{Print}/ # Worst performance and kinda yuck?
/\p{Print}(?<!')/ # Better performance but yuckier?
/[^\P{Print}']/ # Best performance, but hard to parse.[1]
use experimental qw( regex_sets ); # No idea why still experimental.
/(?[ \p{Print} - ['] ])/ # Best performance and clearest.
/[^\p{Cn}\p{Co}\p{Cs}\p{Cc}']/ # Non-general solution.
# Best performance but fragile.[2]
\p{Print} is an alias for \p{XPosixPrint}.
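To tie this back to the question, here is a minimal sketch of the full quoted-string pattern built on the negated-class form. The names `$char`, `$quoted`, and `is_quoted` are made up for illustration, and the sketch assumes the whole input must be a single quoted string.

```perl
use strict;
use warnings;

# One ordinary string character: printable, but not a single quote.
my $char = qr/[^\P{Print}']/;

# Opening quote, then any mix of ordinary characters and doubled
# quotes (the escape), then the closing quote.
my $quoted = qr/'(?:$char|'')*'/;

# Hypothetical helper: true if the whole input is one quoted string.
sub is_quoted {
    my ($input) = @_;
    return $input =~ /\A$quoted\z/ ? 1 : 0;
}

for my $input ("'abc'", "'it''s'", "'it's'", "''") {
    printf "%-10s %s\n", $input, is_quoted($input) ? "match" : "no match";
}
```

Any of the other single-character forms above could be substituted for `$char` without changing the rest of the pattern.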
[1] By De Morgan's laws:
char that is (printable and not('))
= char that is (not(not(printable and not('))))
= char that is (not(not(printable) or not(not('))))
= char that is (not(not(printable) or '))
= [^\P{Print}']
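The equivalence can also be checked empirically. The following is a rough sketch that compares the two lookaround forms against the negated class for each code point in an arbitrarily chosen range (0 through 0x2FFF, an assumption made only to keep the loop short); `@forms` and `$mismatches` are illustrative names.

```perl
use strict;
use warnings;

# The three single-character forms, anchored so each must match the
# entire one-character string.
my @forms = (
    qr/\A(?!')\p{Print}\z/,
    qr/\A\p{Print}(?<!')\z/,
    qr/\A[^\P{Print}']\z/,
);

my $mismatches = 0;
for my $cp (0 .. 0x2FFF) {
    my $ch   = chr $cp;
    my @hits = map { $ch =~ $_ ? 1 : 0 } @forms;

    # Count the code point if any form disagrees with the first one.
    $mismatches++ if grep { $_ != $hits[0] } @hits;
}

print "mismatches: $mismatches\n";
```

If the derivation holds, no code point in the range should be counted as a mismatch.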
[2] \p{Print} includes all characters except unassigned, private-use, surrogate, and control characters.
/[^\p{Cn}\p{Co}\p{Cs}\p{Cc}']/
is short for
/[^\p{General_Category=Unassigned}\p{General_Category=Private_Use}\p{General_Category=Surrogate}\p{General_Category=Control}']/
or
use experimental qw( regex_sets ); # No idea why still experimental.
/(?[ !(
\p{General_Category=Unassigned}
+ \p{General_Category=Private_Use}
+ \p{General_Category=Surrogate}
+ \p{General_Category=Control}
+ [']
) ])/