PHP: How to use &quot; entities in XML with DOMDocument
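I am modifying the contents of an XML file generated by another library. I'm using PHP (5.3.10) to make some DOM modifications and re-insert replacement nodes.
The XML data I'm working with contains &quot; entities before I manipulate it, and I want to preserve them per http://www.w3.org/TR/REC-xml/ once I have finished modifying it.
However, I'm having trouble with PHP changing the &quot; entities. See my example.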
$temp = 'Hello &quot;XML&quot;.';
$doc = new DOMDocument('1.0', 'utf-8');
$newelement = $doc->createElement('description', $temp);
$doc->appendChild($newelement);
echo $doc->saveXML() . PHP_EOL; // shows " instead of the &quot; entity
$node = $doc->getElementsByTagName('description')->item(0);
echo $node->nodeValue . PHP_EOL; // also shows "
Output
<?xml version="1.0" encoding="utf-8"?>
<description>Hello "XML".</description>
Hello "XML".
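Is this a PHP bug, or am I doing something wrong? I hope it isn't necessary to use createEntityReference at every character position.
Similar question: PHP XML Entity Encoding issue
Edit: Here is an example to show that saveXML() should not convert the &quot; entities, just as it does not convert &amp;, which behaves correctly. This $temp string should really be output by saveXML() exactly as it was entered, with its entities.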
$temp = 'Hello &quot;XML&quot; &amp;.';
$doc = new DOMDocument('1.0', 'utf-8');
$newelement = $doc->createElement('description', $temp);
$doc->appendChild($newelement);
echo $doc->saveXML() . PHP_EOL; // shows " instead of the &quot; entity, while &amp; is kept
$node = $doc->getElementsByTagName('description')->item(0);
echo $node->nodeValue . PHP_EOL; // also shows " &
Output
<?xml version="1.0" encoding="utf-8"?>
<description>Hello "XML" &amp;.</description>
Hello "XML" &.
The answer is that, according to the spec (skipping the mentions of CDATA), a double quote in element content doesn't actually need any escaping:
The ampersand character (&) and the left angle bracket (<) must not appear in their literal form (...) If they are needed elsewhere, they must be escaped using either numeric character references or the strings "&amp;" and "&lt;" respectively. The right angle bracket (>) may be represented using the string "&gt;" (...)
To allow attribute values to contain both single and double quotes, the apostrophe or single-quote character (') may be represented as "&apos;", and the double-quote character (") as "&quot;".
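The last sentence of the quote covers attribute values, which is the one place where the serializer does escape double quotes. A minimal sketch of that case, assuming the DOM extension (libxml) is available; the attribute name `title` and its value are made up for illustration and do not come from the question:

```php
<?php
// Hypothetical attribute name and value, chosen only for illustration.
$doc = new DOMDocument('1.0', 'utf-8');
$e = $doc->createElement('description');
$e->setAttribute('title', 'say "hi"');
$doc->appendChild($e);

// Inside a double-quoted attribute value the quote character has to be
// escaped, so the serializer writes it as an entity reference.
$xml = $doc->saveXML($e);
echo $xml . PHP_EOL;

// Reading the attribute back yields the plain character again.
$value = $e->getAttribute('title');
echo $value . PHP_EOL;
```

The serialized element should carry &quot; inside the attribute value, while getAttribute() returns the literal quotes, so the escaping is purely a serialization detail.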
You can easily verify that quotes in text content need no escaping by using createTextNode(), which performs the correct escaping:
$dom = new DOMDocument;
$e = $dom->createElement('description');
$content = 'single quote: \', double quote: ", opening tag: <, ampersand: &, closing tag: >';
$t = $dom->createTextNode($content);
$e->appendChild($t);
$dom->appendChild($e);
echo $dom->saveXML();
Output:
<?xml version="1.0"?>
<description>single quote: ', double quote: ", opening tag: &lt;, ampersand: &amp;, closing tag: &gt;</description>
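The same point can be checked from the parsing side: once a document is loaded, &quot; and a literal " in element content are indistinguishable, so there is nothing left for saveXML() to preserve. A small sketch, assuming the DOM extension is available; the variable names are illustrative only:

```php
<?php
// Two documents that differ only in how the quotes are written.
$a = new DOMDocument();
$a->loadXML('<description>Hello &quot;XML&quot;.</description>');

$b = new DOMDocument();
$b->loadXML('<description>Hello "XML".</description>');

$textA = $a->documentElement->textContent;
$textB = $b->documentElement->textContent;

// Both parse to the same text, so a consumer cannot tell them apart.
var_dump($textA === $textB);

// Serializing either element writes literal quotes in the content.
$outA = $a->saveXML($a->documentElement);
$outB = $b->saveXML($b->documentElement);
echo $outA . PHP_EOL;
var_dump($outA === $outB);
```

Any conforming XML parser downstream will read both forms as the same string, which is why the literal quotes in the output are not a bug.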