regsub/regex parsing on a list of elements in Tcl

Question

I need to convert a string that contains a list of elements of the form (<>,abcd1,1,1), as shown below.

From:

test={abc([(<>,yifow3,1,1),(abc,yifow3,2,2,20140920,20151021),(<>,yifow3,3,3,20140920,20151021),(<>,yifow3,4,4)])}

To:

abc([(yifow3,1,1),(yifow3,2,2),(yifow3,3,3),(yifow3,4,4)])

I tried the regsub below to extract the list inside abc([]). The string always starts with "abc([" and ends with "])".

regsub -all {(abc\(\[)([a-z0-9\<\>\(\),]+)(\)\])} $test {\2} test2

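Then, starting from test2, I use a for loop to extract the second, third and fourth items from each element (<>,abcd1,1,1).

Is there a simple way to do this extraction with regsub/regex instead of a for loop?

The regex should extract the second, third and fourth items and ignore the first, fifth and sixth items if they exist.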

Answer 1

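OK, going strictly by your question: if you have already made sure that the string begins with abc([ and ends with ]), you can collect the elements like this: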

set test {abc([(<>,yifow3,1,1),(abc,yifow3,2,2,20140920,20151021),(<>,yifow3,3,3,20140920,20151021),(<>,yifow3,4,4)])}
set items [regexp -all -inline -- {\([^()]+\)} $test]
# (<>,yifow3,1,1) (abc,yifow3,2,2,20140920,20151021) (<>,yifow3,3,3,20140920,20151021) (<>,yifow3,4,4)

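You can then loop over each element (split it on commas, take the 2nd to 4th items, join them back together, and so on).

I don't think you can avoid a loop if you want to keep this simple. You can, however, skip some of those steps with a more elaborate (no longer simple!) regex: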
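The loop described above could look roughly like the sketch below. It assumes every element has at least four comma-separated fields and that no field contains a comma or a parenthesis; the names kept, fields and output are only illustrative.

```tcl
set test {abc([(<>,yifow3,1,1),(abc,yifow3,2,2,20140920,20151021),(<>,yifow3,3,3,20140920,20151021),(<>,yifow3,4,4)])}
set items [regexp -all -inline -- {\([^()]+\)} $test]

set kept {}
foreach item $items {
    # Drop the surrounding parentheses, then split the fields on commas.
    set fields [split [string range $item 1 end-1] ,]
    # Keep fields 2 through 4 (list indices 1 to 3) and rejoin them.
    lappend kept "([join [lrange $fields 1 3] ,])"
}

# Rebuild the abc([...]) wrapper around the shortened elements.
set output "abc(\[[join $kept ,]\])"
puts $output
```

This version works on Tcl releases older than 8.6 as well, since it only relies on foreach, split, lrange and join.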

set test {abc([(<>,yifow3,1,1),(abc,yifow3,2,2,20140920,20151021),(<>,yifow3,3,3,20140920,20151021),(<>,yifow3,4,4)])}
set items [regexp -all -inline -- {\([^,()]+((?:,[^,()]+){3})} $test]
set results [lmap {a b} $items {list [string trim $b ,]}]
# yifow3,1,1 yifow3,2,2 yifow3,3,3 yifow3,4,4

The regex \([^,()]+((?:,[^,()]+){3}) used here matches as follows:

\(                 # Literal opening parenthesis
[^,()]+            # First field: one or more characters other than ',', '(' and ')'
(                  # Start of capture group 1
  (?:,[^,()]+){3}  # A comma followed by one or more such characters,
                   # the whole thing repeated 3 times (fields 2 to 4)
)                  # End of capture group 1
I used lmap (Tcl 8.6) here, which is basically a kind of loop. You can change it slightly to get the exact string you are looking for:

set results [lmap {a b} $items {list "([string trim $b ,])"}]
set output "abc(\[[join $results ,]])"
# abc([(yifow3,1,1),(yifow3,2,2),(yifow3,3,3),(yifow3,4,4)])

Answer 2

regsub -all -expanded {
    \(                        # a literal parenthesis
    [^(,]+ ,                  # 1 or more non-(parenthesis or comma)s and comma
    ( [^,]+ , \d+ , \d+ )     # the 3 fields to keep with commas
    [^)]*                     # 0 or more non-parenthesis chars
    \)                        # a literal parenthesis
} $test {(\1)}

returns

abc([(yifow3,1,1),(yifow3,2,2),(yifow3,3,3),(yifow3,4,4)])


tcl