How to remove all HTML tags, including '&nbsp;', from a string?
I have used CKEDITOR in one of my modules. It stores data with HTML tags, like this:
<p>Lorem Ipsum&amp;nbsp;is simply dummy text of the printing and typesetting industry.Lorem Ipsum has been the industry&amp;#39;s standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting,remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.</p>\n\n<p>&nbsp;</p>\n\n<p>TItle&nbsp;</p>\n
I tried to convert it to plain text using this regex:
str.replace(/(<([^>]+)>)/ig ,'');
But I am not getting the expected output.
I want this output:
'Lorem Ipsum is simply dummy text of the printing and typesetting industry.Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting,remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.TItle.'
Note: this regex removes all HTML tags but leaves "\n" and "&nbsp;" in place. So please help me: how do I also remove "\n" and "&nbsp;" from the string?
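The text looks to be double-escaped, sort of. First turn all the &amp; sequences into &, so that the HTML entities can be properly recognized. Then .text() will give you the plain-text version of the HTML markup.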
const input = `<p>Lorem Ipsum&amp;nbsp;is simply dummy text of the printing and typesetting industry.Lorem Ipsum has been the industry&amp;#39;s standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting,remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.</p>\n\n<p>&nbsp;</p>\n\n<p>TItle&nbsp;</p>\n`;
// Undo the extra level of escaping so that &amp;nbsp; becomes &nbsp; and so on.
const inputWithProperEntities = input.replaceAll('&amp;', '&');
// Let jQuery parse the markup and return only its text content.
console.log($(inputWithProperEntities).text());
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>
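If jQuery is not available (for example in Node, where there is no DOM), the same steps can be roughly sketched with plain string operations. Treat this as a sketch under assumptions rather than a replacement for a real HTML parser: decodeEntities and htmlToPlainText are hypothetical helper names, the entity table covers only a handful of named entities, &nbsp; is mapped to a regular space on the assumption that this is what the asker wants, and the tag-stripping regex is the simple one from the question.

```javascript
// Hypothetical, deliberately tiny entity table; extend it as needed.
const NAMED_ENTITIES = new Map([
  ['nbsp', ' '], // assumption: a regular space is wanted instead of U+00A0
  ['amp', '&'],
  ['lt', '<'],
  ['gt', '>'],
  ['quot', '"'],
  ['apos', "'"],
]);

// Decode numeric entities (&#39;, &#x27;) and the few named ones listed above.
function decodeEntities(text) {
  return text.replace(/&(#[xX][0-9a-fA-F]+|#[0-9]+|[a-zA-Z]+);/g, (match, body) => {
    if (body[0] === '#') {
      const isHex = body[1] === 'x' || body[1] === 'X';
      const code = parseInt(body.slice(isHex ? 2 : 1), isHex ? 16 : 10);
      // Leave out-of-range references untouched instead of throwing.
      return code > 0 && code <= 0x10ffff ? String.fromCodePoint(code) : match;
    }
    // Unknown named entities are left as they are.
    return NAMED_ENTITIES.has(body) ? NAMED_ENTITIES.get(body) : match;
  });
}

function htmlToPlainText(html) {
  const unescapedOnce = html.replace(/&amp;/g, '&');       // undo the double escaping
  const withoutTags = unescapedOnce.replace(/<[^>]+>/g, ''); // strip tags (regex from the question)
  return decodeEntities(withoutTags).replace(/\n/g, '');     // decode entities, drop newlines
}

const sample = '<p>Lorem Ipsum&amp;nbsp;is the industry&amp;#39;s</p>\n\n<p>TItle</p>\n';
console.log(htmlToPlainText(sample));
```

Regex-based stripping can be fooled by markup such as a ">" inside an attribute value, so for anything beyond editor output like this a DOM-based approach such as the .text() call above is the safer choice.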
\n is not an HTML tag but the representation of a newline character. If you want to remove all newline characters too, then:
const input = `<p>Lorem Ipsum&amp;nbsp;is simply dummy text of the printing and typesetting industry.Lorem Ipsum has been the industry&amp;#39;s standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting,remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.</p>\n\n<p>&nbsp;</p>\n\n<p>TItle&nbsp;</p>\n`;
// Undo the extra level of escaping so that &amp;nbsp; becomes &nbsp; and so on.
const inputWithProperEntities = input.replaceAll('&amp;', '&');
// Extract the text content, then drop every newline character.
console.log($(inputWithProperEntities).text().replaceAll('\n', ''));
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>
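One detail worth knowing: when &nbsp; is decoded by the browser, .text() returns it as the non-breaking space character U+00A0, not as a regular space. If that matters for your output, a small post-processing step could look like the sketch below. tidyText is a hypothetical helper name, and replacing U+00A0 with a regular space plus trimming the ends is an assumption about the desired result.

```javascript
// Hypothetical helper: clean up a string returned by .text() or textContent.
function tidyText(text) {
  return text
    .replace(/\u00a0/g, ' ') // decoded &nbsp; (U+00A0) -> regular space
    .replace(/\r?\n/g, '')   // drop newline characters
    .trim();                 // remove leading and trailing whitespace
}

console.log(tidyText('Lorem Ipsum\u00a0is\n\n\u00a0\n\nTItle\u00a0\n'));
```

With jQuery loaded, it would be applied as tidyText($(inputWithProperEntities).text()).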