How to strip links with HTML contents to plain text in C#

Question

The requirement is that this input:

"link: <http://www.google.com|www.google.com> link1: <http://www.jira.com|www.jira.com>\n\n\n"

needs to be displayed as:

"link: www.google.com link1: www.jira.com"

Any solution is welcome.

Answer 1

You can try a regular expression:

Regex.Replace(
    input: "link: <http://www.google.com|www.google.com> link1: <http://www.jira.com|www.jira.com>\n\n\n",
    pattern: @"<[^|]*\|([^>]*)>",
    replacement: "$1")

Output:

link: www.google.com link1: www.jira.com

Working example: https://dotnetfiddle.net/fn99jz

Regex breakdown:

<       // Match literal '<'.
[^|]*   // Match all chars until reaching '|'.
\|      // Match literal '|' (needs escaping).
(       // Start capturing all chars matching the following expr.
  [^>]* // Match all chars until reaching '>'. 
)       // Stop capturing & store the captured text in reference '$1'.
>       // Match literal '>'.
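The breakdown above can be wrapped in a small helper. This is a minimal sketch, not a definitive implementation: the class name LinkStripper and the method name Strip are hypothetical, the first character class is tightened to [^|>]* (an assumption, so that a stray "<...>" without a pipe is not swallowed into a later match), and Trim() is added on the assumption that the trailing newlines in the input should be dropped, as the expected output in the question suggests.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, named for illustration only.
public static class LinkStripper
{
    // Matches <url|label> and captures the label in group 1.
    // [^|>]* is an assumed hardening of the [^|]* used in the answer.
    private static readonly Regex LinkPattern =
        new Regex(@"<[^|>]*\|([^>]*)>", RegexOptions.Compiled);

    public static string Strip(string input)
    {
        // "$1" substitutes the captured label for the whole <url|label> match;
        // Trim() then removes leading and trailing whitespace, including "\n".
        return LinkPattern.Replace(input, "$1").Trim();
    }

    public static void Main()
    {
        string input =
            "link: <http://www.google.com|www.google.com> link1: <http://www.jira.com|www.jira.com>\n\n\n";

        // Prints: link: www.google.com link1: www.jira.com
        Console.WriteLine(Strip(input));
    }
}
```

If the surrounding whitespace has to be preserved, the Trim() call can be dropped; the replacement itself only touches the <url|label> spans.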
