Replace by regular expression in Excel

I have a list in Excel like the following:

1 / 6 / 45
123
1546
123 456 
1247 /% 456 /

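I want to create a new column in which every run of consecutive non-digit characters is replaced by a single character. In Google Sheets this is easy with =REGEXREPLACE(A1&"/","\D+",","), which gives: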

1,6,45,
123,
1546,
123,456,
1247,456,

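In that formula, A1&"/" is needed so that REGEXREPLACE can handle numeric cells (it accepts only text, and the concatenation turns a number into text). It is no big deal; it just adds a comma at the end.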
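As a cross-check of the expected output (not a solution for Excel itself), the same substitution can be sketched in Python. The function name replace_non_digits is hypothetical, and the appended "/" only mirrors the A1&"/" trick; Python does not need it for type coercion.

```python
import re

def replace_non_digits(cell: object) -> str:
    # Mirror of REGEXREPLACE(A1 & "/", "\D+", ","): append "/" so every value
    # ends with a non-digit run, then collapse each such run into one comma.
    text = str(cell) + "/"
    # re.ASCII restricts \D to "anything except 0-9".
    return re.sub(r"\D+", ",", text, flags=re.ASCII)

for value in ["1 / 6 / 45", 123, 1546, "123 456 ", "1247 /% 456 /"]:
    print(replace_non_digits(value))
```

Every printed line ends with a comma, because the appended "/" always forms (or extends) a trailing non-digit run.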

How can we do this in Excel? A pure Power Query solution is strongly encouraged (not R, not Python, just M). VBA and other clickable Excel features (such as Find and Replace) are not acceptable.

If you have Excel 365:

In B1:

=LET(X,MID(A1,SEQUENCE(LEN(A1)),1),SUBSTITUTE(TRIM(CONCAT(IF(ISNUMBER(--X),X," ")))," ",","))

Or, if the runs of digits are always separated by at least one space:

=TEXTJOIN(",",,FILTERXML("<t><s>"&SUBSTITUTE(A1," ","</s><s>")&"</s></t>","//s[.*0=0]"))

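Another option, if you have access to it, is LAMBDA(): create a function that replaces all kinds of characters. Without LAMBDA() or TEXTJOIN(), I think your best bet is to start nesting SUBSTITUTE() functions.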

If you have the TEXTJOIN function available, here is another variation.

=SUBSTITUTE(TRIM(TEXTJOIN("",TRUE,IFERROR(MID(A2,ROW($A$1:INDEX(A:A,LEN(A2))),1)+0," ")))," ",",")
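This formula and the earlier LET formula follow the same three steps: turn every non-digit character into a space, let TRIM collapse the spaces, then SUBSTITUTE the remaining single spaces with commas. A minimal Python sketch of those steps, useful only for checking the logic outside Excel (digits_to_csv is a hypothetical name):

```python
def digits_to_csv(text: str) -> str:
    # Step 1: keep digits, replace every other character with a space
    # (the MID/ISNUMBER or MID/+0/IFERROR part of the formulas).
    spaced = "".join(ch if ch in "0123456789" else " " for ch in text)
    # Step 2: like Excel's TRIM, drop leading/trailing spaces and collapse
    # internal runs of spaces to a single space.
    collapsed = " ".join(spaced.split())
    # Step 3: SUBSTITUTE each remaining space with a comma.
    return collapsed.replace(" ", ",")

print(digits_to_csv("1247 /% 456 /"))
```

Unlike the REGEXREPLACE version in the question, this approach leaves no leading or trailing comma, because the trimming step removes the outer spaces before they become commas.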

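Here is a Power Query solution. It uses the List.Accumulate function to decide, character by character, whether to append a digit or a comma to the string:

Note that the code reproduces the results you showed. It can easily be modified if you want to avoid the trailing (and/or leading) comma.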

let
    Source = Excel.CurrentWorkbook(){[Name="Table5"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}}),
    #"Added Custom" = Table.AddColumn(#"Changed Type", "textToList", each List.Combine({Text.ToList([Column1]),{","}})),
    
    #"Added Custom1" = Table.AddColumn(#"Added Custom", "commaTerminators", each List.Accumulate(
      [textToList],"", (state,current) => 
            if List.Contains({"0".."9"},current)
                then state & current
            else if Text.EndsWith(state,",") 
                then state  
            else state & ",")),
        
    #"Removed Columns" = Table.RemoveColumns(#"Added Custom1",{"textToList"})
in
    #"Removed Columns"
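The List.Accumulate step in the query above is a left fold over the characters. For readers who want to trace it outside Power Query, a rough Python equivalent using functools.reduce might look like this (step and comma_terminators are hypothetical names; this is a sketch of the logic, not a replacement for the M code):

```python
from functools import reduce

DIGITS = set("0123456789")

def step(state: str, current: str) -> str:
    # Same three branches as the M lambda (state, current) => ...
    if current in DIGITS:
        return state + current      # digit: append it
    if state.endswith(","):
        return state                # already inside a separator run: skip
    return state + ","              # first non-digit of a run: add one comma

def comma_terminators(text: str) -> str:
    # The M code appends "," to the character list so that every value
    # ends with a comma; the fold starts from the empty string.
    return reduce(step, list(text) + [","], "")

print(comma_terminators("1 / 6 / 45"))
```

Because the initial state is empty, a value that starts with a non-digit gets a leading comma.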

Edit: To remove the leading/trailing commas, we wrap the result in Text.Trim, which in Power Query lets you specify the characters to trim from the start and end:

let
    Source = Excel.CurrentWorkbook(){[Name="Table5"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}}),
    #"Added Custom" = Table.AddColumn(#"Changed Type", "textToList", each List.Combine({Text.ToList([Column1]),{","}})),
    
    #"Added Custom1" = Table.AddColumn(#"Added Custom", "commaTerminators", each 
    Text.Trim(
        List.Accumulate(
            [textToList],"", (state,current) => 
                if List.Contains({"0".."9"},current)
                    then state & current
                else if Text.EndsWith(state,",") 
                    then state  
                else state & ","),
        ",")),
        
    #"Removed Columns" = Table.RemoveColumns(#"Added Custom1",{"textToList"})
in
    #"Removed Columns"

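VBA UDF

You mentioned that you don't want VBA, but it is not clear whether you meant only the "clickable" kind. Here is a user-defined function that you can use directly on the worksheet. It uses the VBScript regular expression engine, which makes it easy to extract multiple matches.

You can enter a formula such as =commaSep(cell_ref) on the worksheet to get the same results as shown in my second PQ example above.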

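Option Explicit
Function commaSep(S As String) As String
    Dim RE As Object, MC As Object, M As Object
    Dim sTemp As String

    ' Late-bound VBScript regex object; no library reference is required
    Set RE = CreateObject("vbscript.regexp")
    With RE
        .Global = True          ' find every match, not just the first
        .Pattern = "\d+"        ' one or more consecutive digits
        If .Test(S) Then
            Set MC = .Execute(S)
            sTemp = ""
            For Each M In MC
                sTemp = sTemp & "," & M.Value
            Next M
            commaSep = Mid(sTemp, 2)    ' drop the leading comma
        Else
            commaSep = "no digits"
        End If
    End With
End Function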
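The UDF takes the opposite approach to the replace-based solutions: it extracts the digit runs and joins them, so no leading or trailing comma can appear. A hedged Python sketch of that same idea, for comparison only (comma_sep is a hypothetical name echoing the VBA function):

```python
import re

def comma_sep(s: str) -> str:
    # Collect every run of ASCII digits, like .Execute with Global = True.
    matches = re.findall(r"\d+", s, flags=re.ASCII)
    # Join the runs with commas, or report that nothing matched,
    # as the VBA Else branch does.
    return ",".join(matches) if matches else "no digits"

print(comma_sep("1247 /% 456 /"))
```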

Here is another option in Power Query.

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTQVzADYhNTpVgdINfIGEKbmpjBBIByZgpQjom5gr4qWEBfKTYWAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}}),
    x1 = Table.AddColumn(#"Changed Type", "x1", each Text.ToList([Column1])),
    x2 = Table.AddColumn(x1, "x2", each List.Transform([x1], each if Text.Contains("0123456789", _) then _ else " "  )),
    x3 = Table.AddColumn(x2, "x3", each Text.Split(Text.Combine([x2])," ")),
    x4 = Table.AddColumn(x3, "x4", each List.Transform([x3], each if Text.Contains("0123456789", try Text.At(_,0) otherwise " ") then _&"," else "" )),
    x5 = Table.AddColumn(x4, "x5", each Text.Combine([x4])),
    #"Removed Columns" = Table.RemoveColumns(x5,{"x1", "x2", "x3", "x4"})
in
    #"Removed Columns"