Catching __('<string>') inside single and double quotes
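I use a function __() to translate strings, and I added an interface that automatically finds all of these translations in all files. This is (supposed to be) done with the following regex: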
<?php
$pattern = <<<'LOD'
`
  __\(
    (?<quote>                     # GET THE QUOTE
      (?<simplequote>')           # catch the opening simple quote
      |
      (?<doublequote>")           # catch the opening double quote
    )
    (?<param1>                    # the string will be saved in param1
      (?(?=\k{simplequote})       # if condition "simplequote" is ok
        (\\'|"|[^'"])+            # allow escaped simple quotes or anything else
        |                         #
        (\\"|'|[^'"])+            # allow escaped double quotes or anything else
      )
    )
    \k{quote}                     # find the closing quote
    (?:,.*){0,1}                  # catch any type of 2nd parameter
  \)
  # modifiers:
  #  x to allow comments :)
  #  m for multiline,
  #  s for dotall
  #  U for ungreedy
`smUx
LOD;
$files = array('/path/to/file1',);
foreach ($files as $filepath)
{
    $content = file_get_contents($filepath);
    if (preg_match_all($pattern, $content, $matches))
    {
        foreach ($matches['param1'] as $found)
        {
            // do things
        }
    }
}
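The regex does not work for some double-quoted strings that contain escaped single quotes (\'). In fact, whether the string is single- or double-quoted, the condition is evaluated as false, so the "else" branch is always used.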
<?php
// content of '/path/to/file1'
echo __('simple quoted: I don\'t "see" what is wrong'); // does not work.
echo __("double quoted: I don't \"see\" what is wrong"); // works.
For file1, I expect both strings to be found, but only the double-quoted one is.
Edit: added more PHP code to make testing easier.
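Use the regex below and take the string you want from group index 2.
__\((['"])((?:\\\1|(?!\1).)*)\1\)
Explanation:
__\( matches the literal characters __(.
(['"]) captures the opening single or double quote.
(?:\\\1|(?!\1).)* matches either an escaped quote (the same kind of quote as the one held in group index 1) or, via |, any character that is not the captured quote ((?!\1).), zero or more times.
\1 refers to the character inside the first capturing group.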
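To make the explanation concrete, here is a minimal sketch that runs this regex over the two sample lines from the question. The `~` delimiter, the `s` modifier, and the inline `$content` string are assumptions made for this example; they are not part of the answer itself.

```php
<?php
// Sample input copied from the question's /path/to/file1.
$content = <<<'SRC'
echo __('simple quoted: I don\'t "see" what is wrong');
echo __("double quoted: I don't \"see\" what is wrong");
SRC;

// Group 1 holds the opening quote, group 2 the string body.
// A nowdoc keeps the backslashes exactly as typed, so \\\1 reaches the
// regex engine as "literal backslash followed by backreference 1".
$pattern = <<<'RE'
~__\((['"])((?:\\\1|(?!\1).)*)\1\)~s
RE;

preg_match_all($pattern, $content, $matches);

// Print each captured string body, escape sequences left untouched.
foreach ($matches[2] as $found) {
    echo $found, PHP_EOL;
}
```

Both strings should be captured with their escape sequences intact. Note that, as written, this regex expects the closing parenthesis right after the closing quote, so a call that passes a second parameter would need an extra optional part such as the (?:,.*){0,1} used in the question.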
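Avinash Raj's solution is more elegant and probably more efficient (which is why I accepted it), but I have just found my mistake, so I am posting my own solution here: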
<?php
$pattern = <<<'LOD'
`
  __\(
    (?<quote>                     # GET THE QUOTE
      (?<simplequote>')           # catch the opening simple quote
      |
      (?<doublequote>")           # catch the opening double quote
    )
    (?<param1>                    # the string will be saved in param1
      (?(simplequote)             # if condition "simplequote"
        (\\'|[^'])+               # allow escaped simple quotes or anything else
        |                         #
        (\\"|[^"])+               # allow escaped double quotes or anything else
      )
    )
    \k{quote}                     # find the closing quote
    (?:,.*){0,1}                  # catch any type of 2nd parameter
  \)
  # modifiers:
  #  x to allow comments :)
  #  m for multiline,
  #  s for dotall
  #  U for ungreedy
`smUx
LOD;
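The change is in the condition. As far as I understand PCRE conditionals, (?(simplequote)...) tests whether the named group took part in the match, whereas (?(?=\k{simplequote})...) is a lookahead that asks whether the text at the current position repeats the captured quote, which is not the case right after the opening quote. Below is a small sketch that runs the corrected pattern on the sample lines from the question. The inline `$content` string is an assumption made for this example, and the escaped-quote alternatives are written as `\\'` and `\\"` so that the regex sees a literal backslash followed by a quote.

```php
<?php
// Sample input copied from the question's /path/to/file1.
$content = <<<'SRC'
echo __('simple quoted: I don\'t "see" what is wrong');
echo __("double quoted: I don't \"see\" what is wrong");
SRC;

$pattern = <<<'LOD'
`
  __\(
    (?<quote>
      (?<simplequote>')       # opening single quote
      |
      (?<doublequote>")       # opening double quote
    )
    (?<param1>
      (?(simplequote)         # true only if the simplequote group matched
        (\\'|[^'])+           # escaped single quote, or anything but '
        |
        (\\"|[^"])+           # escaped double quote, or anything but "
      )
    )
    \k{quote}                 # closing quote, same kind as the opening one
    (?:,.*){0,1}              # optional 2nd parameter
  \)
`smUx
LOD;

preg_match_all($pattern, $content, $matches);

// Print each captured string body, escape sequences left untouched.
foreach ($matches['param1'] as $found) {
    echo $found, PHP_EOL;
}
```

With this condition, the single-quoted branch no longer rejects the double quotes around "see", so both sample strings should end up in $matches['param1'].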