preg_match form and get name of field
I have several forms like this:
$string = '
Form number 1
<form class="form-search" method="post" action="/index.php">
<div class="form-group">
<input id="address_box" type="text" class="form-control" name="pn" value="" onfocus="this.select()" />
</div>
<span class="btn btn-s btn-caps"><input type="submit" value="start" /></span>
</form>
Form number 2
<form class="form-search" method="post" action="/home.php">
<div class="form-group">
<input id="address_box" type="text" class="form-control" name="y" value="" onfocus="this.select()" />
</div>
<span class="btn btn-s btn-caps"><input type="submit" value="start" /></span>
</form>
Form number 3
<form class="form-search" method="post" action="/index.php">
<div class="form-group">
<input id="address_box" type="text" class="form-control" name="x" value="" onfocus="this.select()" />
</div>
<span class="btn btn-s btn-caps"><input type="submit" value="start" /></span>
</form>
Form number 4
<form class="form-search" method="post" action="/contact.php">
<div class="form-group">
<input id="address_box" type="text" class="form-control" name="c" value="" onfocus="this.select()" />
</div>
<span class="btn btn-s btn-caps"><input type="submit" value="start" /></span>
</form>
Form number 5
<form class="form-search" method="post" action="/index.php">
<div class="form-group">
<input id="address_box" type="text" class="form-control" name="v" value="" onfocus="this.select()" />
</div>
<span class="btn btn-s btn-caps"><input type="submit" value="start" /></span>
</form>
Form number 6
<form class="form-search" method="post" action="/index.php?a=v">
<div class="form-group">
<input id="address_box" type="text" class="form-control" name="k" value="" onfocus="this.select()" />
</div>
<span class="btn btn-s btn-caps"><input type="submit" value="start" /></span>
</form>
';
I want to:
preg_match:
START = <form
WHERE action contains /index.php with nothing after it
e.g. action="/index.php" or action="http://whatever.com/index.php"
but not action="/index.php?s=w"
FIND name="[A-Za-z]{1}"
END = </form>
Repeat this for each form until the first matching form is found, then output the [A-Za-z]{1} match.
Here is my code:
$pat = '~<form[^>]+action="[^"]*/(?:index.php)"[^>]*>.*?name="([a-zA-Z]{1})".*?</form>~s';
preg_match($pat,$string,$match);
echo $match[1];
It should select the matching form (number 3) and output x,
but I get y as output (form number 2).
Any help?
Thanks.
The XPath way:
$dom = new DOMDocument;
// @ suppresses the warnings libxml raises on imperfect HTML
@$dom->loadHTML($string);
$xp = new DOMXPath($dom);
// XPath 1.0 has no ends-with(), so compare the last 10 characters of @action
$query = '//form[substring(@action, string-length(@action) - 9) = "/index.php"]'
       . '/div/input/@name[string-length(.)=1]';
$nameList = $xp->query($query);
$result = null;
foreach ($nameList as $nameNode) {
    $char = $nameNode->nodeValue;
    $ascii = ord(strtolower($char));
    // check if it is a letter with its ASCII code (a = 97, z = 122)
    if ($ascii > 96 && $ascii < 123) {
        $result = $char;
        break;
    }
}
echo $result;
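XPath is designed to target one or more elements in the DOM tree (the tree representation of an HTML document). So //elt1/elt2/elt3
defines a path (where elt1, elt2... are tags), and everything between square brackets is a condition on the current node.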
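To make the difference between a bare path and a path with a bracketed condition concrete, here is a minimal sketch. The two-form markup and the /a.php and /b.php actions are made up for illustration and are not the data from the question.

```php
<?php
// Hypothetical sample markup, not taken from the question.
$html = '<form action="/a.php"><div><input name="q" /></div></form>'
      . '<form action="/b.php"><div><input name="long" /></div></form>';

libxml_use_internal_errors(true); // keep parser warnings quiet
$dom = new DOMDocument;
$dom->loadHTML($html);
$xp = new DOMXPath($dom);

// Path only: the name attribute of every input inside a div inside a form.
$all = [];
foreach ($xp->query('//form/div/input/@name') as $attr) {
    $all[] = $attr->nodeValue;
}

// Same path, but the form step now carries a condition in square brackets.
$filtered = [];
foreach ($xp->query('//form[@action="/b.php"]/div/input/@name') as $attr) {
    $filtered[] = $attr->nodeValue;
}

echo implode(',', $all), "\n";      // q,long
echo implode(',', $filtered), "\n"; // long
```

The condition only narrows the step it is attached to; the rest of the path is evaluated from the forms that passed it.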
//          # from everywhere in the DOM tree
form        # a form tag
[           # condition for the current element (the form tag):
            # it must have an "action" attribute that ends with "/index.php".
            # In other words: the last 10 characters of the "action" attribute
            # must be "/index.php" (XPath positions are 1-based, hence the - 9)
    substring(@action, string-length(@action) - 9) = "/index.php"
]
# let's continue the path down to the name attribute of the input tag
/div/input/@name
# condition for the name attribute:
# . is the current node; it must be exactly one character long
[string-length(.)=1]
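If you would rather stay with preg_match, the following is a sketch of one possible fix, not a replacement for the XPath approach. The original pattern most likely returns y because `.*?` is allowed to run past the `</form>` of form 1: name="pn" cannot satisfy `[a-zA-Z]{1}"`, so the lazy quantifier keeps expanding until it reaches name="y" in form 2. A negative lookahead that forbids crossing `</form>` keeps the match inside one form. The function name `firstSingleLetterName` and the trimmed sample string are assumptions made for this example.

```php
<?php
// Hypothetical helper: returns the first single-letter name found inside a
// form whose action ends with /index.php, or null when there is none.
function firstSingleLetterName(string $html): ?string
{
    $pat = '~<form\b[^>]*\baction="[^"]*/index\.php"[^>]*>'  // opening tag, action ends with /index.php
         . '(?:(?!</form>).)*?'                              // any content, but never past </form>
         . '\bname="([a-zA-Z])"~s';                          // one-letter name attribute
    return preg_match($pat, $html, $match) === 1 ? $match[1] : null;
}

// Trimmed-down version of the forms from the question.
$string = '
<form method="post" action="/index.php"><input type="text" name="pn" /></form>
<form method="post" action="/home.php"><input type="text" name="y" /></form>
<form method="post" action="/index.php"><input type="text" name="x" /></form>
<form method="post" action="/index.php?a=v"><input type="text" name="k" /></form>
';

$result = firstSingleLetterName($string);
echo $result, "\n"; // x
```

The dot in index\.php is escaped here so that it only matches a literal dot. Regular expressions remain fragile on real-world HTML (attribute order, single quotes, nested markup), which is why the DOM-based approach is generally the safer choice.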