Catching __('<string>') inside single and double quotes

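I use a function __() to translate strings, and I added an interface that automatically finds all of these translation calls in all files. This is (supposed to be) done with the following regular expression: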

<?php
$pattern = <<<'LOD'
`
  __\(
    (?<quote>               # GET THE QUOTE
    (?<simplequote>')       # catch the opening simple quote
    |
    (?<doublequote>")       # catch the opening double quote
    )
    (?<param1>              # the string will be saved in param1
      (?(?=\k{simplequote}) # if condition "simplequote" is ok
        (\\'|"|[^'"])+      # allow escaped simple quotes or anything else
        |                   #
        (\\"|'|[^'"])+      # allow escaped double quotes or anything else
      )
    )
    \k{quote}             # find the closing quote
    (?:,.*){0,1}          # catch any type of 2nd parameter
  \)
  # modifiers:
  #  x to allow comments :)
  #  m for multiline,
  #  s for dotall
  #  U for ungreedy
`smUx
LOD;
 $files = array('/path/to/file1',);
 foreach($files as $filepath)
 {
   $content = file_get_contents($filepath);
   if (preg_match_all($pattern, $content, $matches))
   {
     foreach($matches['param1'] as $found)
     {
       // do things
     }
   }
 }

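The regular expression fails on some strings that contain escaped single quotes (\'). In fact, it seems that whether the string is single-quoted or double-quoted, the condition is evaluated as false, so the "else" branch is always used.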

<?php
// content of '/path/to/file1'
echo __('simple quoted: I don\'t "see" what is wrong'); // does not work.
echo __("double quoted: I don't \"see\" what is wrong");// works.

For file 1, I expect both strings to be found, but only the double-quoted one is.

Edit: added more PHP code to make testing easier.

Use the regex below and take the string you want from group index 2.

__\((['"])((?:\\\1|(?!\1).)*)\1\)

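Explanation:

  • __\( matches the literal characters __(.

  • (['"]) captures the double or single quote that follows.

  • (?:\\\1|(?!\1).)* matches an escaped double or single quote (which quote depends on the character captured in group 1), or (|) any character that is not the one in the capture group, (?!\1)., zero or more times.

  • \1 refers to the character captured by the first capture group.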
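
As a rough sketch of how this pattern might be used from PHP, the snippet below runs it over the two sample calls from the question and collects group 2. The ~ delimiter, the s modifier, the inlined $content and the $found variable are assumptions made for illustration; they are not part of the answer above.

```php
<?php
// Sketch: the backreference-based pattern, kept in a nowdoc so that the
// backslashes reach PCRE unchanged. The "~" delimiter is an arbitrary choice.
$pattern = <<<'REGEX'
~__\((['"])((?:\\\1|(?!\1).)*)\1\)~s
REGEX;

// Hypothetical sample content, mirroring the two calls from the question.
$content = <<<'SAMPLE'
echo __('simple quoted: I don\'t "see" what is wrong');
echo __("double quoted: I don't \"see\" what is wrong");
SAMPLE;

$found = array();
if (preg_match_all($pattern, $content, $matches)) {
    // Group 1 holds the quote character, group 2 the raw text between the quotes.
    $found = $matches[2];
}

foreach ($found as $string) {
    echo $string, PHP_EOL;
}
```

Note that, as written, the pattern expects the closing parenthesis right after the closing quote, so a call with a second argument would not be matched by it.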

Avinash Raj's solution is more elegant and probably more efficient (so I accepted it), but I just found my mistake, so I am posting my solution here:

<?php
$pattern = <<<'LOD'
`
  __\(
    (?<quote>               # GET THE QUOTE
    (?<simplequote>')       # catch the opening simple quote
    |
    (?<doublequote>")       # catch the opening double quote
    )
    (?<param1>              # the string will be saved in param1
      (?(simplequote)       # if condition "simplequote" 
        (\\'|[^'])+         # allow escaped simple quotes or anything else
        |                   #
        (\\"|[^"])+         # allow escaped double quotes or anything else
      )
    )
    \k{quote}               # find the closing quote
    (?:,.*){0,1}            # catch any type of 2nd parameter
  \)
  # modifiers:
  #  x to allow comments :)
  #  m for multiline,
  #  s for dotall
  #  U for ungreedy
`smUx
LOD;
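
To check the corrected pattern against the two sample calls from the question, something like the following could be used. It is only a sketch: the sample content is inlined instead of being read from a file, and the $content and $found names are illustrative. The difference from the first attempt is the condition. (?=\k{simplequote}) is a lookahead asserting that the next character is the captured single quote, which is generally not the case right after the opening quote, whereas (?(simplequote) ...) only tests whether the simplequote group took part in the match.

```php
<?php
// Sketch: the corrected pattern, where the condition tests whether the
// named group "simplequote" participated in the match.
$pattern = <<<'LOD'
`
  __\(
    (?<quote>
      (?<simplequote>')   # opening single quote
      |
      (?<doublequote>")   # opening double quote
    )
    (?<param1>
      (?(simplequote)     # was the opening quote a single quote
        (\\'|[^'])+       # yes: escaped single quotes or anything but a single quote
        |
        (\\"|[^"])+       # no: escaped double quotes or anything but a double quote
      )
    )
    \k{quote}             # closing quote, same character as the opening one
    (?:,.*){0,1}          # optional second parameter
  \)
`smUx
LOD;

// Hypothetical sample content, mirroring '/path/to/file1' from the question.
$content = <<<'SAMPLE'
echo __('simple quoted: I don\'t "see" what is wrong');
echo __("double quoted: I don't \"see\" what is wrong");
SAMPLE;

$found = array();
if (preg_match_all($pattern, $content, $matches)) {
    // The named group "param1" holds the raw text between the quotes.
    $found = $matches['param1'];
}

foreach ($found as $string) {
    echo $string, PHP_EOL;
}
```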