preg match multiline text ends with a string

Question

I have some HTML that also contains plain text, and I need to remove it.

The sample text is

<h3>title</h3>
this is some text
This some more

This more more, Continue reading.

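I have tried a million combinations and I just don't get the preg stuff. I am testing each one on the regex101 website, and none of my attempts worked.

I tried something like this (and what feels like a million other combinations)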

<h3>.*?<\/h3>.*?{$\'Continue reading.\'}

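I just don't get preg's odd syntax. I know I need to start at <h3> and end at the first occurrence of the string "Continue reading." after the <h3>, but how to match that across multiple lines is beyond me.

Could someone give an example and explain why it works? I have read about multiline and so on, but it didn't help me; I just can't see it.

Thanks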

Answer 1

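To remove the whole string (or really any string), you can use preg_replace and set the replacement argument to "". In addition, you need the s modifier/flag so that . also matches newlines (or use some other construct that matches any character).

Regex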
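The effect of the s flag can be checked with a small sketch. The shortened sample text and the variable names are assumptions made for illustration. It also includes the m flag, since that is the one usually meant by "multiline": m only changes where ^ and $ match and does not make . cross line breaks.

```php
<?php
// Shortened sample text (an assumption for illustration).
$text = "<h3>title</h3>\nthis is some text\nContinue reading.";

// Without the s flag, `.` stops at a newline, so the match cannot span lines.
$withoutS = preg_match('/<h3>.*?Continue reading\./', $text);

// With the s flag (PCRE_DOTALL), `.` also matches newlines.
$withS = preg_match('/<h3>.*?Continue reading\./s', $text);

// The m flag only affects the anchors ^ and $; `.` still stops at newlines.
$withM = preg_match('/<h3>.*?Continue reading\./m', $text);

// [\s\S] matches any character, newlines included, without any flag.
$withClass = preg_match('/<h3>[\s\S]*?Continue reading\./', $text);

var_dump($withoutS, $withS, $withM, $withClass); // 0, 1, 0, 1
```

The [\s\S] variant is one possible "match anything" construct; it behaves the same way here whether or not the s flag is set.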

/<h3>.*?Continue reading\./s
/                            : Starting delimiter
 <h3>                        : Matches a literal <h3>
     .*?                     : Non-greedy match any character 0 or more times
        Continue reading     : Matches literally the text "Continue reading"
                        \.   : Match a literal period/full-stop
                          /  : Ending delimiter
                           s : Modifier/flag to make `.` match new lines
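The "non-greedy" part of the breakdown matters once the input contains more than one block. Below is a minimal sketch, assuming hypothetical input with two <h3> teasers (this input is not from the question), that compares the lazy .*? with a greedy .* in the same pattern.

```php
<?php
// Hypothetical input with two teaser blocks and text around them.
$html = "Intro paragraph.\n"
      . "<h3>first</h3>\nteaser one\nContinue reading.\n"
      . "Middle paragraph.\n"
      . "<h3>second</h3>\nteaser two\nContinue reading.\n"
      . "Footer.";

// Lazy .*? stops at the first "Continue reading." after each <h3>,
// so each block is removed separately and the text between them survives.
$lazy = preg_replace('/<h3>.*?Continue reading\./s', '', $html);

// Greedy .* runs to the last "Continue reading." in the input,
// so the middle paragraph is removed as well.
$greedy = preg_replace('/<h3>.*Continue reading\./s', '', $html);

echo $lazy, "\n---\n", $greedy, "\n";
```

With the lazy quantifier the result still contains "Middle paragraph."; with the greedy one only the intro and the footer remain.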

Code example

In this example we use "Erased" instead of "" to make it clear what happens:

$string = "<h3>title</h3>
this is some text
This some more

This more more, Continue reading.";

$string = preg_replace('/<h3>.*?Continue reading\./s', "Erased", $string);

echo $string; // Output: Erased

Use "" instead of "Erased" to remove the text completely:

$string = preg_replace('/<h3>.*?Continue reading\./s', "", $string);
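If the start and end markers come from variables rather than being typed into the pattern, they should be escaped before being placed in the regex. The sketch below assumes a hypothetical helper named removeBetween (not part of PHP or of the answer above) and uses the built-in preg_quote for the escaping.

```php
<?php
// Hypothetical helper: removes everything from $start up to and including
// the first $end that follows it, across line breaks.
function removeBetween(string $text, string $start, string $end): string
{
    // preg_quote escapes regex metacharacters (such as the trailing ".")
    // and, given "/" as second argument, the pattern delimiter as well.
    $pattern = '/' . preg_quote($start, '/') . '.*?' . preg_quote($end, '/') . '/s';

    // preg_replace returns null on error; fall back to the unchanged text.
    return preg_replace($pattern, '', $text) ?? $text;
}

$string = "Keep me.\n<h3>title</h3>\nthis is some text\n\nThis more more, Continue reading.\nKeep me too.";

echo removeBetween($string, '<h3>', 'Continue reading.');
```

Only the span from <h3> through "Continue reading." is removed; the lines before and after it are left in place.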

php

regex

multiline

preg-match