PHP preg_replace: how to avoid changing URLs
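Every "ts" that follows a consonant in my HTML content should be replaced with "s".
For example:
sanktsya => sanksya
That part works fine. However, the URLs in a and img tags are being changed as well.
How can I leave the URLs unchanged?
My code: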
$str='<p>sanktsya boshlandi.</p>
<p><img src="https://img.com/Sqstso.jpeg" /></p>';
$result = preg_replace('/([qwrtpsdfghklzxcvbnmQWRTPSDFGHKLZXCVBNM]+)[tT]s/', '${1}s', $str);
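As @nick said, this is a job for DOM. A DOM document is made up of different kinds of nodes (elements, comments, text nodes, ...). With XPath you can address nodes by type, so the replacement can be applied to text nodes only while attribute values such as src and href are left alone.
Your string looks like an XHTML fragment, so here is an example: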
$xhtml = <<<'XHTML'
<p>sanktsya boshlandi.</p>
<p><img src="https://img.com/Sqstso.jpeg" /></p>
XHTML;
// bootstrap DOM
$document = new DOMDocument();
$xpath = new DOMXPath($document);
// create a fragment node and append the XML content
$fragment = $document->createDocumentFragment();
$fragment->appendXML($xhtml);
// iterate over every descendant text node
foreach ($xpath->evaluate('.//text()', $fragment) as $textNode) {
    // modify the text content only; ${1} keeps the consonants before "ts"
    $textNode->textContent = preg_replace(
        '(([qwrtpsdfghklzxcvbnmQWRTPSDFGHKLZXCVBNM]+)[tT]s)',
        '${1}s',
        $textNode->textContent
    );
}
// save the XHTML fragment as a string
echo $document->saveXML($fragment);
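The same idea can be packaged as a small reusable function. The sketch below is one possible variant, not the code from the answer above: the function name replaceInTextNodes and the temporary <root> wrapper element are assumptions made for illustration. It parses the input with loadXML() rather than a document fragment so that malformed input can be reported, then rewrites text nodes only.

```php
<?php
/**
 * Hypothetical helper (name and signature are illustrative):
 * applies a regex replacement to text nodes only, never to attributes.
 */
function replaceInTextNodes(string $xhtml, string $pattern, string $replacement): string
{
    $document = new DOMDocument();

    // Collect parse errors internally instead of emitting warnings.
    $previous = libxml_use_internal_errors(true);
    // Assumption: wrap the fragment in a temporary root so it parses as one document.
    $loaded = $document->loadXML('<root>' . $xhtml . '</root>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (!$loaded) {
        throw new InvalidArgumentException('Input is not a well-formed XML fragment.');
    }

    $xpath = new DOMXPath($document);
    foreach ($xpath->query('//text()') as $textNode) {
        // Keep the original text if preg_replace() fails and returns null.
        $textNode->textContent = preg_replace($pattern, $replacement, $textNode->textContent)
            ?? $textNode->textContent;
    }

    // Serialize the children of the temporary root, dropping the wrapper itself.
    $output = '';
    foreach ($document->documentElement->childNodes as $child) {
        $output .= $document->saveXML($child);
    }
    return $output;
}

$input = '<p>sanktsya boshlandi.</p>' . "\n"
    . '<p><img src="https://img.com/Sqstso.jpeg" /></p>';

echo replaceInTextNodes($input, '/([qwrtpsdfghklzxcvbnm]+)ts/i', '${1}s'), PHP_EOL;
```

Because only text nodes are visited, attribute values such as src are never passed to preg_replace(). Note that saveXML() may normalize the markup slightly, for example the whitespace before /> in an empty tag.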
Use
$result = preg_replace("/<[^>]*>(*SKIP)(*FAIL)|[qwrtpsdfghklzxcvbnm]+\Kts/i", "s", $str);
See the regex proof.
Explanation
NODE                     EXPLANATION
--------------------------------------------------------------------------------
  <                        '<'
--------------------------------------------------------------------------------
  [^>]*                    any character except: '>' (0 or more times
                           (matching the most amount possible))
--------------------------------------------------------------------------------
  >                        '>'
--------------------------------------------------------------------------------
  (*SKIP)(*FAIL)           Skip current match, resume search from current
                           position
--------------------------------------------------------------------------------
  |                        OR
--------------------------------------------------------------------------------
  [qwrtpsdfghklzxcvbnm]+   any character of: 'q', 'w', 'r', 't', 'p',
                           's', 'd', 'f', 'g', 'h', 'k', 'l', 'z',
                           'x', 'c', 'v', 'b', 'n', 'm' (1 or more
                           times (matching the most amount possible))
--------------------------------------------------------------------------------
  \K                       Match reset operator, discards text from match
--------------------------------------------------------------------------------
  ts                       'ts'
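To see the pattern in action, here is a minimal sketch that runs it on the string from the question. The second input, with a '>' inside an attribute value, is a made-up example added to illustrate a limitation: <[^>]*> stops at the first '>', so the rest of that tag is treated as ordinary text.

```php
<?php
// Single quotes keep \K and the (*SKIP)(*FAIL) verbs literal for PCRE.
$pattern = '/<[^>]*>(*SKIP)(*FAIL)|[qwrtpsdfghklzxcvbnm]+\Kts/i';

// The string from the question: tags are skipped, only text is changed.
$str = '<p>sanktsya boshlandi.</p>' . "\n"
    . '<p><img src="https://img.com/Sqstso.jpeg" /></p>';
$result = preg_replace($pattern, 's', $str);
echo $result, PHP_EOL;
// <p>sanksya boshlandi.</p>
// <p><img src="https://img.com/Sqstso.jpeg" /></p>

// Hypothetical edge case: a '>' inside an attribute ends the skipped tag early,
// so the src value that follows is no longer protected.
$edgeCase = '<img alt="a > b" src="/kts.png" />';
$edgeResult = preg_replace($pattern, 's', $edgeCase);
echo $edgeResult, PHP_EOL;
// <img alt="a > b" src="/ks.png" />
```

For markup you control and that never contains '>' inside attribute values, the regex is a compact solution; for arbitrary input, working on DOM text nodes is the more robust choice.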