Native JavaScript or ES6 way to encode and decode HTML entities?

Question

Is there a native way to encode or decode HTML entities using JavaScript or ES6 for Node.js? For example, < would be encoded as &lt;. There are libraries like html-entities, but it feels like JavaScript should already have something built in that handles this common need.

Answer 1

There is no native function in the JavaScript API that converts ASCII characters to their "html-entities" equivalent. Here is the beginning of a solution and an easy trick that you might like.

Answer 2

A nice function that uses ES6 to escape HTML:

const escapeHTML = str => str.replace(/[&<>'"]/g, 
  tag => ({
      '&': '&amp;',
      '<': '&lt;',
      '>': '&gt;',
      "'": '&#39;',
      '"': '&quot;'
    }[tag]));
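
One way to put such a function to work is a tagged template that escapes every interpolated value while leaving the literal markup alone. This is only a sketch: the `html` tag is a hypothetical helper, not something from the answer, and `escapeHTML` is repeated so the snippet runs on its own.

```javascript
// Same mapping as the escapeHTML function above, repeated so this runs standalone.
const escapeHTML = str => str.replace(/[&<>'"]/g,
  tag => ({
    '&': '&amp;',
    '<': '&lt;',
    '>': '&gt;',
    "'": '&#39;',
    '"': '&quot;'
  }[tag]));

// Hypothetical tagged-template helper: literal parts pass through unchanged,
// interpolated values are stringified and escaped.
const html = (strings, ...values) =>
  strings.reduce((out, part, i) => out + escapeHTML(String(values[i - 1])) + part);

const userInput = '<b>Tom & "Jerry"</b>';
console.log(html`<p>${userInput}</p>`);
// <p>&lt;b&gt;Tom &amp; &quot;Jerry&quot;&lt;/b&gt;</p>
```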

Answer 3

Roll Your Own (caveat: use HE instead for most use cases)

For pure JS without any library, you can encode and decode HTML entities like this:

// Encode every UTF-16 code unit as a decimal numeric character reference.
const encode = str => {
  const buf = [];

  for (let i = 0; i < str.length; i++) {
    buf.push(`&#${str.charCodeAt(i)};`);
  }

  return buf.join('');
};

// Decode decimal numeric references only; named entities are left untouched.
const decode = str =>
  str.replace(/&#(\d+);/g, (match, dec) => String.fromCharCode(Number(dec)));

Usage:

encode("Hello > © <") // "&#72;&#101;&#108;&#108;&#111;&#32;&#62;&#32;&#169;&#32;&#60;"
decode("Hello &gt; &copy; &#169; &lt;") // "Hello &gt; &copy; © &lt;"

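However, you can see that this approach has a couple of shortcomings:

It encodes even safe characters: H → &#72;
It can decode numeric codes (but not those in the astral plane), and it knows nothing about the full list of HTML entities / named character codes supported by browsers, such as &gt;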
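
The first shortcoming and the astral-plane limitation can be reduced with a code-point-aware variant. The following is a minimal sketch, not the answer's own code: `encodeEntities` and `decodeNumericEntities` are hypothetical names, and the choice of which characters count as "unsafe" is an assumption. Named references such as &gt; are still not decoded, so the second shortcoming remains.

```javascript
// Hypothetical helpers (not part of the original answer).
// Encode only markup-significant ASCII characters and anything above "~",
// iterating by code point so astral characters stay in one piece.
const encodeEntities = str =>
  Array.from(str, ch => {
    const cp = ch.codePointAt(0);
    return cp > 0x7e || '&<>"\''.includes(ch) ? `&#${cp};` : ch;
  }).join('');

// Decode decimal (&#169;) and hexadecimal (&#xA9;) numeric references.
// Named references such as &gt; are left untouched.
const decodeNumericEntities = str =>
  str.replace(/&#(?:x([0-9a-f]+)|(\d+));/gi, (match, hex, dec) => {
    const cp = hex !== undefined ? parseInt(hex, 16) : parseInt(dec, 10);
    // Leave out-of-range values as they are instead of throwing.
    return cp <= 0x10ffff ? String.fromCodePoint(cp) : match;
  });

console.log(encodeEntities('Hello > © <'));
// Hello &#62; &#169; &#60;
console.log(decodeNumericEntities('&#169; &#xA9; &gt;'));
// © © &gt;
```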

Use the HE Library (Html Entities)

Supports all standardized named character references
Supports Unicode
Works with ambiguous ampersands
Written by Mathias Bynens

Usage:

he.encode('foo © bar ≠ baz 𝌆 qux');
// Output : 'foo &#xA9; bar &#x2260; baz &#x1D306; qux'

he.decode('foo &copy; bar &ne; baz &#x1D306; qux');
// Output : 'foo © bar ≠ baz 𝌆 qux'

Related questions

How to convert characters to HTML entities using plain JavaScript
Encode html entities in javascript
Unescape HTML entities in Javascript?
HTML Entity Decode
What's the right way to decode a string that has special HTML entities in it?
Strip HTML from Text JavaScript

Answer 4

To unescape HTML entities, your browser is smart and will do it for you.

Way 1

_unescape(html: string): string {
   // Let the browser's HTML parser decode the entities, then read back plain text.
   const divElement = document.createElement("div");
   divElement.innerHTML = html;
   return divElement.textContent || divElement.innerText || "";
}

Way 2

_unescape(html: string): string {
     let returnText = html;
     returnText = returnText.replace(/&nbsp;/gi, " ");
     returnText = returnText.replace(/&quot;/gi, `"`);
     returnText = returnText.replace(/&lt;/gi, "<");
     returnText = returnText.replace(/&gt;/gi, ">");
     // Decode &amp; last so that "&amp;lt;" becomes "&lt;" rather than "<".
     returnText = returnText.replace(/&amp;/gi, "&");
     return returnText;
}

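You can also use the unescape method from underscore or lodash, but it ignores &nbsp; and handles only the &, <, >, ", and ' characters (that is, &amp;, &lt;, &gt;, &quot;, and &#39;).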

Answer 5

The reverse (decode) of the answer (encode) that @rasafel provided:

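const decodeEscapedHTML = (str) =>
  str.replace(
    // Match only the five entities in the map below. A pattern like /&(\D+);/
    // cannot match '&#39;' and can swallow several entities in one match.
    /&(?:amp|lt|gt|quot|#39);/g,
    (tag) =>
      ({
        '&amp;': '&',
        '&lt;': '<',
        '&gt;': '>',
        '&#39;': "'",
        '&quot;': '"',
      }[tag]),
  )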
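
A quick round-trip check can show whether a decoder really inverts the encoder from Answer 2. The sketch below is an illustration under assumptions: `ENTITIES`, `REVERSE` and `unescapeHTML` are hypothetical names, and the pattern lists the five entities explicitly so that nothing else is touched.

```javascript
// Character-to-entity map (same five characters as in Answer 2).
const ENTITIES = { '&': '&amp;', '<': '&lt;', '>': '&gt;', "'": '&#39;', '"': '&quot;' };
// Entity-to-character map, derived so the two directions cannot drift apart.
const REVERSE = Object.fromEntries(
  Object.entries(ENTITIES).map(([ch, entity]) => [entity, ch])
);

const escapeHTML = str => str.replace(/[&<>'"]/g, ch => ENTITIES[ch]);
const unescapeHTML = str =>
  str.replace(/&(?:amp|lt|gt|quot|#39);/g, entity => REVERSE[entity]);

const sample = `<a href="x">Tom & Jerry's</a>`;
const escaped = escapeHTML(sample);
console.log(escaped);
// &lt;a href=&quot;x&quot;&gt;Tom &amp; Jerry&#39;s&lt;/a&gt;
console.log(unescapeHTML(escaped) === sample); // true

// Decoding happens in a single pass, so text is never decoded twice.
console.log(unescapeHTML('&amp;lt;')); // &lt;
```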
