Get a substring up to the current match inside preg_replace_callback

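I have simplified this question because it was getting long. Basically, I want to get the substring of $subject that runs from the start of $subject up to the match the callback function is currently running for. Here is some example input (JavaScript):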

$subject = "var myUrl = '<a href=\"http://google.co.uk\">click me</a>';";

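I am using a URL-matching regular expression in preg_replace_callback, so it will match http://google.co.uk. I want to get the substring of $subject up to where that match begins: var myUrl = '<a href=" should be included in the substring. How can I do this?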

$subject = "var myUrl = '<a href=\"http://google.co.uk\">click me</a>';";
preg_replace_callback("MY URL MATCHING PATTERN", function($matches) use ($subject) {
  // Get length of $subject up to the current match
  $length = ?; // this is the bit I can't work out
  // Get substring
  $before = substr($subject, 0, $length);
  // Work out whether or not to escape the single quotes
  $quotes = array();
  preg_match_all("/'/", $before, $quotes);
  $quotecount = count($quotes[0]); // $quotes[0] holds the full-pattern matches
  $escape = ($quotecount % 2 == 0 ? "" : "\\");
  // Return the binary value
  return "javascript:parent.query(".$escape."'".textToBinary($matches[0]).$escape."')";
}, $subject);

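- First, I would suggest using DOM functions such as PHP's DOMDocument or DOMXPath.

- Second, it would be better to revise your regular expression. (\S is the culprit)

- Third, a quick fix for your problem is: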

return "javascript:open('".str_replace("'", "\'", $matches[0])."')";
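
As a rough illustration of the first suggestion above, here is a minimal sketch that reads href values with DOMDocument instead of a regular expression. It assumes you can isolate the HTML fragment from the surrounding JavaScript string first; the function name extractHrefs is hypothetical and not part of the question.

```php
<?php
// Hypothetical helper: collect the href of every <a> element in an HTML fragment.
function extractHrefs(string $html): array
{
    $doc = new DOMDocument();
    // Collect libxml parse warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $hrefs = [];
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        $hrefs[] = $anchor->getAttribute('href');
    }
    return $hrefs;
}
```

Calling extractHrefs('<a href="http://google.co.uk">click me</a>') would give an array containing the single URL. This only covers locating the URLs; the quote-escaping question still depends on where the fragment sits in the JavaScript source.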

Update:

$subject = "var myUrl = '<a href=\"http://google.co.uk\">click me</a>';";

$pattern = "@(https?://([-\w\.]+)+(:\d+)?(/([-\w/_\.]*(\?\S+)?)?)?)@";
$result = preg_replace_callback($pattern, function($matches) use ($subject) {
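  // Position of the first occurrence of the matched URL in $subject
  // (if the same URL appears more than once, this is not necessarily the current match)
  $pos = strpos($subject, $matches[0]);
  // Everything before the match
  $str = substr($subject, 0, $pos);
  // Escape the quotes only if a single quote appears before the match
  $escape = (strpos($str, "'") === false) ? "'" : "\'";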
  return "javascript:parent.query({$escape}".textToBinary($matches[0])."{$escape})";
}, $subject);
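
If you are on PHP 7.4 or later, another option is to let preg_replace_callback report the offset itself by passing PREG_OFFSET_CAPTURE as its flags argument. The sketch below is only one way to do it: rewriteUrls is a hypothetical wrapper, textToBinary is a stand-in for the helper the question does not show, and the quote-parity check follows the logic in the question.

```php
<?php
// Hypothetical stand-in for the asker's textToBinary(), which is not shown in the question.
function textToBinary(string $text): string
{
    $bits = '';
    foreach (str_split($text) as $char) {
        $bits .= str_pad(decbin(ord($char)), 8, '0', STR_PAD_LEFT);
    }
    return $bits;
}

// Hypothetical wrapper: rewrite every URL matched by $pattern in $subject.
function rewriteUrls(string $subject, string $pattern): ?string
{
    return preg_replace_callback(
        $pattern,
        function (array $matches) use ($subject): string {
            // With PREG_OFFSET_CAPTURE each entry is [matched text, byte offset in $subject].
            [$url, $offset] = $matches[0];
            // Everything from the start of $subject up to the current match.
            $before = substr($subject, 0, $offset);
            // An odd number of single quotes means the match sits inside a quoted string.
            $escape = (substr_count($before, "'") % 2 === 0) ? '' : '\\';
            return "javascript:parent.query({$escape}'" . textToBinary($url) . "{$escape}')";
        },
        $subject,
        -1,      // no limit on the number of replacements
        $count,  // receives the number of replacements made
        PREG_OFFSET_CAPTURE
    );
}
```

The offsets are byte offsets, which is what substr expects, so this also works when the same URL appears several times in $subject. On older PHP versions the flags argument is not available, so a preg_match_all call with PREG_OFFSET_CAPTURE would be needed instead.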