preg_match a form and get the name of a field

I have several forms like this:

$string = '

Form number 1
<form class="form-search" method="post" action="/index.php">
  <div class="form-group">
    <input id="address_box" type="text" class="form-control" name="pn" value="" onfocus="this.select()" />
  </div>
<span class="btn btn-s btn-caps"><input type="submit" value="start" /></span>
</form>
Form number 2
<form class="form-search" method="post" action="/home.php">
  <div class="form-group">
    <input id="address_box" type="text" class="form-control" name="y" value="" onfocus="this.select()" />
  </div>
<span class="btn btn-s btn-caps"><input type="submit" value="start" /></span>
</form>
Form number 3
<form class="form-search" method="post" action="/index.php">
  <div class="form-group">
    <input id="address_box" type="text" class="form-control" name="x" value="" onfocus="this.select()" />
  </div>
<span class="btn btn-s btn-caps"><input type="submit" value="start" /></span>
</form>
Form number 4
<form class="form-search" method="post" action="/contact.php">
  <div class="form-group">
    <input id="address_box" type="text" class="form-control" name="c" value="" onfocus="this.select()" />
  </div>
<span class="btn btn-s btn-caps"><input type="submit" value="start" /></span>
</form>
Form number 5
<form class="form-search" method="post" action="/index.php">
  <div class="form-group">
    <input id="address_box" type="text" class="form-control" name="v" value="" onfocus="this.select()" />
  </div>
<span class="btn btn-s btn-caps"><input type="submit" value="start" /></span>
</form>
Form number 6
<form class="form-search" method="post" action="/index.php?a=v">
  <div class="form-group">
    <input id="address_box" type="text" class="form-control" name="k" value="" onfocus="this.select()" />
  </div>
<span class="btn btn-s btn-caps"><input type="submit" value="start" /></span>
</form>
';

What I want:

Preg_match:
START = <form

WHERE action CONTAIN /index.php but nothing after it
EX: action="/index.php" or action="http://whatever.com/index.php"
    can't be action="/index.php?s=w"

FIND name="[A-Za-z]{1}"

END = </form>

Repeat this for each form until the (first) matching form is found, then output the [A-Za-z]{1} match.

Here is my code:

$pat = '~<form[^>]+action="[^"]*/(?:index.php)"[^>]*>.*?name="([a-zA-Z]{1})".*?</form>~s';
preg_match($pat,$string,$match);
echo $match[1];

It should select the matching form (number 3) and output x.

But I get y as output (form number 2).

Any help?

Thanks.
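Why the pattern returns y: with the s modifier, .*? is allowed to run past the </form> of form 1 (whose name is "pn", two letters) and pick up name="y" inside form 2. The following is only a sketch of one way to keep the match inside a single form, assuming a tempered pattern (?:(?!</form>).)*? is acceptable; the shortened $string and the $found variable are illustrative, not part of the original code.

```php
<?php
// Shortened copy of forms 1, 2, 3 and 6 from the question (illustrative).
$string = '
<form class="form-search" method="post" action="/index.php">
  <input type="text" name="pn" value="" />
</form>
<form class="form-search" method="post" action="/home.php">
  <input type="text" name="y" value="" />
</form>
<form class="form-search" method="post" action="/index.php">
  <input type="text" name="x" value="" />
</form>
<form class="form-search" method="post" action="/index.php?a=v">
  <input type="text" name="k" value="" />
</form>
';

// [^>]* keeps the action test inside the opening <form ...> tag.
// (?:(?!</form>).)*? advances one character at a time but refuses to step
// over a closing </form>, so the name must come from the same form.
$pat = '~<form\b[^>]*\baction="[^"]*/index\.php"[^>]*>(?:(?!</form>).)*?\bname="([A-Za-z])"~s';

$found = preg_match($pat, $string, $match) === 1 ? $match[1] : null;
echo $found, "\n"; // x
```

Regular expressions remain fragile on HTML (attribute order, quoting, nested markup), which is why a DOM-based approach is usually the safer choice.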

The XPath way:

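$dom = new DOMDocument;
// $html holds the markup to parse ($string in the question);
// @ silences the warnings libxml raises on imperfect HTML
@$dom->loadHTML($html);

$xp = new DOMXPath($dom);

// "/index.php" is 10 characters long and XPath positions start at 1,
// so the last 10 characters begin at string-length(@action) - 9
$query = '//form[substring(@action, string-length(@action) - 9) = "/index.php"]'
       . '/div/input/@name[string-length(.)=1]';

$nameList = $xp->query($query);

$result = null; // stays null when no form matches

foreach($nameList as $nameNode) {
    $char = $nameNode->nodeValue;
    $ascii = ord(strtolower($char));
    // check if it is a letter with its ascii code (a = 97, z = 122)
    if ($ascii < 123 && $ascii > 96) {
        $result = $char;
        break;
    }
}

echo $result;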

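XPath is designed to target one or more elements in a DOM tree (the tree representation of an HTML document). So //elt1/elt2/elt3 defines a path (where elt1, elt2... are tags), and everything between square brackets is a condition on the current node.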
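As a small illustration of a path with and without a bracketed condition, here is a sketch on a hypothetical two-form document (the markup, the /a.php and /b.php actions, and the $all and $filtered variables are made up for the example):

```php
<?php
// Hypothetical two-form document, used only to illustrate paths and predicates.
$html = '<html><body>'
      . '<form action="/a.php"><div><input name="q" /></div></form>'
      . '<form action="/b.php"><div><input name="r" /></div></form>'
      . '</body></html>';

$dom = new DOMDocument;
$dom->loadHTML($html);
$xp = new DOMXPath($dom);

// Path only: every name attribute reachable through form/div/input.
$all = [];
foreach ($xp->query('//form/div/input/@name') as $attr) {
    $all[] = $attr->nodeValue;
}

// Path plus condition: only forms whose action attribute equals "/b.php".
$filtered = [];
foreach ($xp->query('//form[@action="/b.php"]/div/input/@name') as $attr) {
    $filtered[] = $attr->nodeValue;
}

echo implode(',', $all), "\n";      // q,r
echo implode(',', $filtered), "\n"; // r
```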

//    # from everywhere in the DOM tree
form  # a form tag
[     # condition for the current element (the form tag):
      # it must have an "action" attribute that ends with "/index.php".
      # In other words: the last 10 characters of the "action" attribute
      # must be "/index.php" (positions start at 1, hence the offset of 9)
  substring(@action, string-length(@action) - 9) = "/index.php"
]

      # let's continue the path down to the name attribute of the input tag
/div/input/@name
      # condition for the name attribute:
      # . is the current node, it must be exactly one character long
[string-length(.)=1]
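To check the "ends with" condition end to end, here is a self-contained sketch on a shortened, hypothetical copy of the forms. It assumes the last ten characters are taken with an offset of string-length(@action) - 9, since XPath string positions are 1-based. The third form uses an absolute URL on purpose so that the offset is actually exercised, and the letter test uses preg_match rather than ASCII codes.

```php
<?php
// Shortened, hypothetical variant of the forms from the question.
$html = '<html><body>'
      . '<form method="post" action="/index.php"><div><input type="text" name="pn" /></div></form>'
      . '<form method="post" action="/home.php"><div><input type="text" name="y" /></div></form>'
      . '<form method="post" action="http://whatever.com/index.php"><div><input type="text" name="x" /></div></form>'
      . '<form method="post" action="/index.php?a=v"><div><input type="text" name="k" /></div></form>'
      . '</body></html>';

$dom = new DOMDocument;
$dom->loadHTML($html);
$xp = new DOMXPath($dom);

// Keep forms whose action ends with "/index.php" (10 characters),
// then keep name attributes that are exactly one character long.
$query = '//form[substring(@action, string-length(@action) - 9) = "/index.php"]'
       . '/div/input/@name[string-length(.) = 1]';

$result = null;
foreach ($xp->query($query) as $nameNode) {
    // Accept the first one-character name that is an ASCII letter.
    if (preg_match('/^[A-Za-z]$/', $nameNode->nodeValue) === 1) {
        $result = $nameNode->nodeValue;
        break;
    }
}

echo $result, "\n"; // x
```

The first form is rejected because its name has two characters, the second and fourth because their action does not end with "/index.php".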