How to remove duplicates in a text string with Power Query

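As the title says, after several steps (group by, filter, Text.Combine...), I ran into a problem removing duplicates within a single cell in Power Query. For example, the "cc_emails" column has many rows, but because of the Text.Combine step earlier in the query, each row contains duplicate emails, something like this:

"Giang.Phan@abc.com, thao.tran@abc.com, Khoa.Vu@abc.com, Vn.Offset@abc.com, Giang.Phan@abc.com, thao.tran@abc.com, Khoa.Vu@abc.com, Vn.Offset@abc.com"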

I want each email to appear only once in the list. Can anyone help with this? Expected output:

"Giang.Phan@abc.com, thao.tran@abc.com, Khoa.Vu@abc.com, Vn.Offset@abc.com"

## Update: my query from the query editor:

let
    Source = Exchange.Contents("giang.phan@abc.com"),
    Mail1 = Source{[Name="Mail"]}[Data],
    #"Reordered Columns" = Table.ReorderColumns(Mail1,{"DateTimeSent", "DateTimeReceived", "Folder Path", "Subject", "Sender", "DisplayTo", "DisplayCc", "ToRecipients", "CcRecipients", "BccRecipients", "Importance", "Categories", "IsRead", "HasAttachments", "Attachments", "Preview", "Attributes", "Body", "Id"}),
    #"Filtered Rows" = Table.SelectRows(#"Reordered Columns", each [DateTimeReceived] > #datetime(2021, 12, 29, 0, 0, 0) and [DateTimeReceived] < #datetime(2022, 1, 4, 0, 0, 0)),
    #"Expanded ToRecipients" = Table.ExpandTableColumn(#"Filtered Rows", "ToRecipients", {"Address"}, {"ToRecipients.Address"}),
    #"Expanded CcRecipients" = Table.ExpandTableColumn(#"Expanded ToRecipients", "CcRecipients", {"Address"}, {"CcRecipients.Address"}),
    #"Removed Columns" = Table.RemoveColumns(#"Expanded CcRecipients",{"BccRecipients", "Importance", "Categories", "IsRead", "HasAttachments", "Attachments", "Preview", "Attributes"}),
    #"Reordered Columns1" = Table.ReorderColumns(#"Removed Columns",{"Id", "DateTimeSent", "DateTimeReceived", "Folder Path", "Subject", "Sender", "DisplayTo", "DisplayCc", "ToRecipients.Address", "CcRecipients.Address", "Body"}),
    #"Grouped Rows" = Table.Group(#"Reordered Columns1", {"Id", "DateTimeSent", "DateTimeReceived", "Folder Path", "Subject", "DisplayTo", "DisplayCc"}, {{"cc_address", each Text.Combine([CcRecipients.Address],", "), type text}, {"to_address", each Text.Combine([ToRecipients.Address],", "), type text}}),
    #"Grouped Rows1" = Table.Group(#"Grouped Rows", {"Id", "DateTimeSent", "DateTimeReceived", "Folder Path", "DisplayTo", "DisplayCc", "cc_address", "to_address"}, {{"Last_time receive", each List.Max([DateTimeReceived]), type datetime}, {"Last_subject", each List.Max([Subject]), type nullable text}}),
    #"Removed Other Columns" = Table.SelectColumns(#"Grouped Rows1",{"Id", "DateTimeSent", "DateTimeReceived", "Folder Path", "DisplayTo", "DisplayCc", "cc_address", "to_address", "Last_time receive", "Last_subject"})
in
    #"Removed Other Columns"

You can split the text by the delimiter, keep only the distinct list values, and then recombine them into a string:

let
    Source = "Giang.Phan@abc.com, thao.tran@abc.com, Khoa.Vu@abc.com, Vn.Offset@abc.com, Giang.Phan@abc.com, thao.tran@abc.com, Khoa.Vu@abc.com, Vn.Offset@abc.com",
    #"Distinct Values" = Text.Combine(List.Distinct(Text.Split(Source, ", ")),", ")
in
    #"Distinct Values"
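
If the combined text already sits in a table column, the same split / distinct / combine idea can be applied per row. The following is only a sketch under assumptions: the sample table, the `Id` column and the helper name `DedupeEmails` are made up for illustration, and it additionally trims spaces and ignores case when comparing addresses, which you may or may not want.

```powerquery
let
    // Hypothetical sample table standing in for the real query output
    Source = #table(
        type table [Id = number, cc_emails = nullable text],
        {
            {1, "Giang.Phan@abc.com, thao.tran@abc.com, Giang.Phan@abc.com, thao.tran@abc.com"},
            {2, "Khoa.Vu@abc.com,khoa.vu@abc.com, Vn.Offset@abc.com"},
            {3, null}
        }
    ),
    // Hypothetical helper: split on commas, trim stray spaces, drop empty parts,
    // keep the first occurrence of each address (case-insensitive), rejoin with ", "
    DedupeEmails = (txt as nullable text) as nullable text =>
        if txt = null then
            null
        else
            let
                Parts = List.Transform(Text.Split(txt, ","), Text.Trim),
                NonEmpty = List.Select(Parts, each _ <> ""),
                Unique = List.Distinct(NonEmpty, Comparer.OrdinalIgnoreCase)
            in
                Text.Combine(Unique, ", "),
    // Apply the helper to every row of the cc_emails column
    #"Deduplicated cc_emails" = Table.TransformColumns(
        Source,
        {{"cc_emails", DedupeEmails, type nullable text}}
    )
in
    #"Deduplicated cc_emails"
```

Splitting on "," and trimming, rather than splitting on ", ", is meant to tolerate inconsistent spacing after the commas. Drop the `Comparer.OrdinalIgnoreCase` argument if addresses that differ only in case should be kept as separate values.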

Edit after the question was updated:

In your case, you can simply change this line:

#"Grouped Rows" = Table.Group(#"Reordered Columns1", {"Id", "DateTimeSent", "DateTimeReceived", "Folder Path", "Subject", "DisplayTo", "DisplayCc"}, {{"cc_address", each Text.Combine([CcRecipients.Address],", "), type text}, {"to_address", each Text.Combine([ToRecipients.Address],", "), type text}}),

to include the List.Distinct function, so that only the distinct address values are returned:

#"Grouped Rows" = Table.Group(#"Reordered Columns1", {"Id", "DateTimeSent", "DateTimeReceived", "Folder Path", "Subject", "DisplayTo", "DisplayCc"}, {{"cc_address", each Text.Combine(List.Distinct([CcRecipients.Address]),", "), type text}, {"to_address", each Text.Combine(List.Distinct([ToRecipients.Address]),", "), type text}}),
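
To see the effect in isolation, here is a small self-contained sketch. The sample rows are hypothetical; they assume the duplicates come from expanding both ToRecipients and CcRecipients, which pairs every To address with every Cc address so that each value repeats within a group.

```powerquery
let
    // Hypothetical rows mimicking the table after both recipient columns were expanded
    Expanded = #table(
        type table [
            Id = text,
            #"ToRecipients.Address" = nullable text,
            #"CcRecipients.Address" = nullable text
        ],
        {
            {"m1", "a@abc.com", "x@abc.com"},
            {"m1", "a@abc.com", "y@abc.com"},
            {"m1", "b@abc.com", "x@abc.com"},
            {"m1", "b@abc.com", "y@abc.com"}
        }
    ),
    // List.Distinct removes the repeated addresses before Text.Combine joins them
    Grouped = Table.Group(
        Expanded,
        {"Id"},
        {
            {"cc_address", each Text.Combine(List.Distinct([CcRecipients.Address]), ", "), type text},
            {"to_address", each Text.Combine(List.Distinct([ToRecipients.Address]), ", "), type text}
        }
    )
in
    Grouped
```

With these sample rows, the single group for "m1" should list each Cc and each To address once instead of twice.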