Regex correction for text replacement in HTML document
I have the following regex:
/<(?:textarea|select)[\s\S]*?>[\s\S]*?(\{\{\{variable:(.+?)\}\}\})[\s\S]*?<\/(?:textarea|select)>|<(?:input)[\s\S]+?(value=[\s\S]+?)(\{\{\{variable:(.+?)\}\}\})[\s\S]+?>|(\{\{\{variable:(.+?)\}\}\})/im
And this (shortened) HTML document:
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Test</title>
</head>
<body>
<section id="about">
<div class="container about-container">
<div class="row">
<div class="col-md-12">
{{{block:welcome-intro}}}
</div>
</div>
</div>
</section>
<section id="services">
<div class="container">
<div class="row">
<div class="col-md-12">
<p>You are using system version: {{{variable:system_version}}}</p>
<p>Your address: {{{variable:contact-email-address}}}</p>
<form action="http://k.loc/content/view/welcome" class="default-form" enctype="multipart/form-data" method="post" accept-charset="utf-8">
<input type="hidden" name="csrfkcmstoken" value="94ee71ada809b9a79d1b723c81020c78" />
<div class="row">
<div class="col-sm-12 form-error"></div>
</div>
<div class="row"><div class="col-sm-12"><fieldset id="personalinfo"><legend>Personal information</legend><div class="row"><div class="col-sm-12">
<div class="control-label">
<label for="testinput">Name<span class="form-validation-required"> * </span></label>
</div>
<div class="hint-text">Enter at least 2 characters and a maximum of 12 characters.</div><input id="testinput" name="testinput" placeholder="Enter your name here." class="input-group width-50" type="text" value="{{{variable:system_name}}} {{{variable:system_login}}}"><div class="row"><div class="col-sm-12"><div class="form-error"></div></div></div></div></div><div class="row"><div class="col-sm-12">
<div class="control-label">
<label for="testpassword">Password</label>
</div>
<div class="hint-text">Your password must be at least 12 characters long, contain 1 special character, 1 nunber, 1 lower case character and 1 upper case character.</div><input id="testpassword" name="testpassword" placeholder="Enter your password here." class="input-group width-50" type="password"><div class="row"><div class="col-sm-12"><div class="form-error"></div></div></div></div></div></fieldset></div></div><div class="row"><div class="col-sm-12"><fieldset id="bioinfo"><legend>Biographical information</legend><div class="row"><div class="col-sm-12">
<div class="control-label">
<label for="testtextarea">Biography</label>
<span class="hint-text">A minimum of 40 characters and a maximum of 255 is allowed. This hint is displayed inline.</span>
</div>
<textarea id="testtextarea" name="testtextarea" placeholder="Please enter your biography here." class="input-group-wide width-100" rows="5" cols="80">{{{variable:system_name}}}
{{{variable:system_login}}}</textarea><div class="row"><div class="col-sm-12"><div class="form-error"></div></div></div></div></div><div class="row"><div class="col-sm-12">
<div class="control-label">
<label for="testsummernote">Interests</label>
<span class="hint-text">A minimum of 40 characters is required. This hint is displayed inline.</span>
</div>
<textarea id="testsummernote" name="testsummernote" class="wysiwyg-editor" placeholder="Please enter your interests here."><p>{{{variable:system_name}}}<br></p><p>{{{variable:system_login}}}</p><p>{{{variable:activate_url}}}<br></p></textarea></div></div></fieldset></div></div><div class="row"><div class="col-sm-12"><button name="testsubmit" id="testsubmit" type="submit" class="btn primary">Submit<i class="zmdi zmdi-arrow-forward"></i></button></div></div>
</form> </div>
</div>
</div>
</section>
</body>
</html>
Parsing the above HTML document for {{{variable:whatever}}} produces this result:
Array
(
[0] => Array
(
[0] => {{{variable:system_version}}}
[1] => {{{variable:contact-email-address}}}
[2] => <input type="hidden" name="csrfkcmstoken" value="94ee71ada809b9a79d1b723c81020c78" />
<div class="row"><div class="col-sm-12 form-error"></div></div>
<div class="row"><div class="col-sm-12"><fieldset id="personalinfo"><legend>Personal information</legend><div class="row"><div class="col-sm-12">
<div class="control-label"><label for="testinput">Name<span class="form-validation-required"> * </span></label></div>
<div class="hint-text">Enter at least 2 characters and a maximum of 12 characters.</div>
<input id="testinput" name="testinput" placeholder="Enter your name here." class="input-group width-50" type="text" value="{{{variable:system_name}}} {{{variable:system_login}}}">
[3] => <textarea id="testtextarea" name="testtextarea" placeholder="Please enter your biography here." class="input-group-wide width-100" rows="5" cols="80">{{{variable:system_name}}} {{{variable:system_login}}}</textarea>
[4] => <textarea id="testsummernote" name="testsummernote" class="wysiwyg-editor" placeholder="Please enter your interests here."><p>{{{variable:system_name}}}<br></p><p>{{{variable:system_login}}}</p><p>{{{variable:activate_url}}}<br></p></textarea>
)
)
- Indexes [0] and [1] are correct, because they do not appear inside a select/textarea/input tag.
- Indexes [3] and [4] are correct, because each is wrapped by exactly one select/textarea/input tag.
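I am still learning regex and do not yet understand all the concepts, so please forgive me if my terminology is off, but some kind of greedy matching seems to be happening. At index [2] I expected to see only <input id="testinput"...{{{variable:...}}}">.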
The end goal is to replace these placeholders with different data only if they are not inside a textarea/select/input.
Why does index [2] match so many elements, and how can this be fixed?
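Doing this with a regex is not advisable, but I'm guessing that this expression might be somewhat closer to what you have in mind; I'm not sure, though: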
<(?:textarea|select).*?>.*?(\{\{\{variable:(.*?)\}\}\}).*?<\/(?:textarea|select)>|<(?:input).+?(value=.*?)(\{\{\{variable:(.+?)\}\}\})?.*?>|(\{\{\{variable:(.*?)\}\}\})
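It can be improved further; for example, the braces do not need to be escaped (the / still has to be escaped when it is also the pattern delimiter, as in the PHP test):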
<(?:textarea|select).*?>.*?({{{variable:(.*?)}}}).*?</(?:textarea|select)>|<(?:input).+?(value=.*?)({{{variable:(.+?)}}})?.*?>|({{{variable:(.*?)}}})
Here, we try to add an optional group to the input branch, so that it can tell apart input elements that contain a variable from those that do not.
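As for why index [2] grows so large: a lazy quantifier such as [\s\S]+? only prefers the shortest match from the position where the match started. It can still run past a ">" if that is the only way to reach the rest of the pattern. The engine starts at the first <input (the hidden one), finds no placeholder inside it, and keeps expanding until it reaches the placeholder in the next input. The sketch below isolates the input branch and compares it with a variant that uses [^>] to stay inside one tag. The shortened $html string and the names $loose and $tight are made up for illustration, and [^>] assumes that no attribute value contains a literal ">".

```php
<?php
// Two inputs: the first has no placeholder, the second does.
$html = '<input type="hidden" name="token" value="abc" />'
      . '<div><input id="testinput" type="text" value="{{{variable:system_name}}}"></div>';

// Input branch of the original pattern: [\s\S]+? is lazy, but it may still cross ">".
$loose = '/<input[\s\S]+?(value=[\s\S]+?)(\{\{\{variable:(.+?)\}\}\})[\s\S]+?>/i';

// Same branch with [^>], so the match cannot leave the tag it started in.
$tight = '/<input[^>]+?(value=[^>]*?)(\{\{\{variable:(.+?)\}\}\})[^>]*>/i';

preg_match($loose, $html, $a);
preg_match($tight, $html, $b);

echo $a[0], PHP_EOL; // starts at the hidden input and runs into the second one
echo $b[0], PHP_EOL; // only the input that actually contains a placeholder
```

With the tighter character class, the hidden input no longer matches this branch at all, so the match begins at the input that really holds the placeholder.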
Test
$re = '/<(?:textarea|select).*?>.*?(\{\{\{variable:(.*?)\}\}\}).*?<\/(?:textarea|select)>|<(?:input).+?(value=.*?)(\{\{\{variable:(.+?)\}\}\})?.*?>|(\{\{\{variable:(.*?)\}\}\})/si';
$str = '<section id="services">
<div class="container">
<div class="row">
<div class="col-md-12">
<p>You are using system version: {{{variable:system_version}}}</p>
<p>Your address: {{{variable:contact-email-address}}}</p>
<form action="http://k.loc/content/view/welcome" class="default-form" enctype="multipart/form-data" method="post" accept-charset="utf-8">
<input type="hidden" name="csrfkcmstoken" value="94ee71ada809b9a79d1b723c81020c78" />
<div class="row">
<div class="col-sm-12 form-error"></div>
</div>
<div class="row"><div class="col-sm-12"><fieldset id="personalinfo"><legend>Personal information</legend><div class="row"><div class="col-sm-12">
<div class="control-label">';
preg_match_all($re, $str, $matches, PREG_SET_ORDER, 0);
var_dump($matches);
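For the stated end goal (replace placeholders only when they are not inside a textarea/select/input), one possible approach is to let the protected regions match first and hand them back unchanged, and to substitute only the placeholders matched by the last alternative. This is a minimal sketch, not a drop-in solution: the $values array and the shortened $html are hypothetical, the pattern is a simplified variant rather than the one above, and it again assumes no ">" inside attribute values.

```php
<?php
// Hypothetical replacement values; the names come from the sample document.
$values = ['system_version' => '1.2.3'];

$html = '<p>Version: {{{variable:system_version}}}</p>'
      . '<input type="text" value="{{{variable:system_name}}}">'
      . '<textarea>{{{variable:system_login}}}</textarea>';

$re = '~<(textarea|select)\b[^>]*>.*?</\1>'   // whole textarea/select element
    . '|<input\b[^>]*>'                        // a single input tag
    . '|\{\{\{variable:(.+?)\}\}\}~si';        // a free-standing placeholder

$out = preg_replace_callback($re, function (array $m) use ($values) {
    // Group 2 is only set when the third alternative matched.
    if (!isset($m[2])) {
        return $m[0];               // protected region: leave untouched
    }
    return $values[$m[2]] ?? $m[0]; // unknown name: keep the placeholder
}, $html);

echo $out, PHP_EOL;
```

Using ~ as the delimiter avoids having to escape the / in the closing tag. Because the protected alternatives consume the whole element or tag, any placeholder inside them is never seen by the third alternative.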