How to strip links with HTML contents to plain text in C#

The requirement is that this input:

"link: <http://www.google.com|www.google.com> link1: <http://www.jira.com|www.jira.com>\n\n\n"

needs to be displayed as:

"link: www.google.com link1: www.jira.com"

Is there any solution for this?

You can try a regular expression:

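// Requires: using System.Text.RegularExpressions;
Regex.Replace(
    input: "link: <http://www.google.com|www.google.com> link1: <http://www.jira.com|www.jira.com>\n\n\n",
    pattern: @"<[^|]*\|([^>]*)>",
    replacement: "$1")  // Keep only the captured label after '|'.
    .Trim()             // Drop the trailing newlines.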

Output:

link: www.google.com link1: www.jira.com

Working example: https://dotnetfiddle.net/fn99jz

Regex breakdown:

<       // Match literal '<'.
[^|]*   // Match all chars until reaching '|'.
\|      // Match literal '|' (needs escaping).
(       // Start capturing all chars matching the following expr.
  [^>]* // Match all chars until reaching '>'.
)       // Stop capturing & store the match in group 1 (referenced as '$1').
>       // Match literal '>'.
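
For reference, the pattern can be wrapped in a small, complete program. This is only a minimal sketch: the names LinkStripper and Strip are hypothetical helpers introduced for illustration, and it assumes the replacement string "$1" (the captured label) followed by Trim() to remove the trailing newlines from the sample input.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, introduced only for this illustration.
public static class LinkStripper
{
    // Matches "<target|label>" and captures the label in group 1.
    private static readonly Regex LinkPattern = new Regex(@"<[^|]*\|([^>]*)>");

    public static string Strip(string text)
    {
        // "$1" substitutes the captured label for the whole match;
        // Trim() removes leading/trailing whitespace such as "\n\n\n".
        return LinkPattern.Replace(text, "$1").Trim();
    }
}

public static class Program
{
    public static void Main()
    {
        string input = "link: <http://www.google.com|www.google.com> " +
                       "link1: <http://www.jira.com|www.jira.com>\n\n\n";

        Console.WriteLine(LinkStripper.Strip(input));
    }
}
```

Note that the pattern only matches links that contain a '|' separator; text in angle brackets without one is left untouched.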