Why is the Perl regex matching 2 patterns instead of 1 in this case?
I am trying to match this repeating pattern in a JSON file:
{
"date":1568381400,
"open":301.7799987792969,
"high":302.1700134277344,
"low":300.67999267578125,
"close":301.0899963378906,
"volume":61426700,
"adjclose":301.0899963378906
}
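Note: the above is a formatted version. The actual JSON is all on one line (optional whitespace removed).
There are a bunch of these objects, separated by commas with no spaces. I am using the following code: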
while ( $Page =~ /{"date":(.+?),.+?"high":(.+?),"low":(.+?),"close":(.+?),"volume":(.+?),.+?"adjclose":(.+?)}/g )
The regex returns two instances of the pattern on each such call. For example, $& returns:
at84: matched:{"date":1623182400,"open":91.48999786376953,"high":92.37999725341797,"low":90.77999877929688,"close":92,"volume":15404,"adjclose":92},{"date":1623072600,"open":89.80999755859375,"high":91.3499984741211,"low":89.80999755859375,"close":90.75,"volume":36200,"adjclose":90.75}
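It consistently matches exactly 2 patterns at a time, never just 1.
I tried adding a "?" at the end of the pattern, which did nothing.
I suppose I could change the loop to index on the commas or the {} blocks, but that would add a layer of mess I would like to avoid.
Does anyone have any suggestions?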
Your regex requires something that is not in the input:
"date":(.+?),.+?"high":(.+?),"low":(.+?),"close":(.+?),"volume":(.+?),.+?"adjclose":(.+?)
                                                                       ↑
                                                                       │
                       "+" requires characters but there are none ─────┘
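Your input has no characters between the comma after "volume" and "adjclose". Because .+? must consume at least one character, the regex keeps consuming input until it reaches the "adjclose" of the next object, and only there can the match finish. That is why each match spans two objects.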
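The effect described above can be reproduced with a short script. This is only a sketch: the two-object string is taken from the question's sample output, and the variable names other than $Page are made up for illustration. The leading brace is escaped here, which does not change what the pattern matches.

```perl
use strict;
use warnings;

# Two objects in the same single-line layout as the question's data.
my $Page =
    '{"date":1623182400,"open":91.48999786376953,"high":92.37999725341797,"low":90.77999877929688,"close":92,"volume":15404,"adjclose":92},'
  . '{"date":1623072600,"open":89.80999755859375,"high":91.3499984741211,"low":89.80999755859375,"close":90.75,"volume":36200,"adjclose":90.75}';

# The pattern from the question: ",.+?" after "volume" needs at least one
# character between the comma and "adjclose".
my $re = qr/\{"date":(.+?),.+?"high":(.+?),"low":(.+?),"close":(.+?),"volume":(.+?),.+?"adjclose":(.+?)\}/;

my @matches;
while ( $Page =~ /$re/g ) {
    push @matches, { text => $&, volume => $5, adjclose => $6 };
}

# Count how many objects ended up inside the first match.
my $objects_in_first = () = $matches[0]{text} =~ /"date"/g;

printf "matches: %d\n", scalar @matches;
printf "objects inside first match: %d\n", $objects_in_first;
printf "volume=%s adjclose=%s\n", $matches[0]{volume}, $matches[0]{adjclose};
```

The single match pairs the volume of the first object with the adjclose of the second one, so the captures are wrong as well as the match length.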
Change:
"volume":(.+?),.+?"adjclose":(.+?)
To:
"volume":(.+?),.*?"adjclose":(.+?)
I would change every (.+?) to (.*?).
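As a quick check of the suggested change, the sketch below applies only the minimal fix (the one .+? after "volume" becomes .*?) to the two objects from the question. The hash keys and the @rows name are illustrative choices, not part of the original code.

```perl
use strict;
use warnings;

# Two objects in the same single-line layout as the question's data.
my $Page =
    '{"date":1623182400,"open":91.48999786376953,"high":92.37999725341797,"low":90.77999877929688,"close":92,"volume":15404,"adjclose":92},'
  . '{"date":1623072600,"open":89.80999755859375,"high":91.3499984741211,"low":89.80999755859375,"close":90.75,"volume":36200,"adjclose":90.75}';

# ",.*?" allows zero characters between the comma and "adjclose".
my $re = qr/\{"date":(.+?),.+?"high":(.+?),"low":(.+?),"close":(.+?),"volume":(.+?),.*?"adjclose":(.+?)\}/;

my @rows;
while ( $Page =~ /$re/g ) {
    push @rows, {
        date   => $1, high   => $2, low      => $3,
        close  => $4, volume => $5, adjclose => $6,
    };
}

# One line per matched object.
printf "date=%s high=%s volume=%s adjclose=%s\n",
    @{$_}{qw(date high volume adjclose)} for @rows;
```

With this pattern each pass through the loop covers a single object, so the six captures all come from the same record.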
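The OP's question refers to JSON, so perhaps the input data could be processed with a JSON module (use JSON).
The question does not provide enough information about the input data;
please check whether the following code snippet fits your problem.
Note: the subset of input data posted in the question is very limited.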
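use strict;
use warnings;
use feature 'say';
use Data::Dumper;

my %data;

# Slurp everything after __DATA__ into one string.
my $input = do { local $/; <DATA> };

# Split "at84: matched:{...},{...}" into the symbol and the block of objects.
my($symbol,$block) = $input =~ /(.+?): matched:(.*)/;

# Walk the {...} objects one at a time.
for ( $block =~ /\{(.*?)\}/g ) {
    # Turn the "key":number pairs into a hash.
    my %day = $_ =~ /"(.+?)":([\d\.]+)/g;
    # Convert epoch seconds to a readable timestamp in the local time zone.
    $day{date} = localtime($day{date});
    push @{$data{$symbol}}, \%day;
}

say Dumper(\%data);
exit 0;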
__DATA__
at84: matched:{"date":1623182400,"open":91.48999786376953,"high":92.37999725341797,"low":90.77999877929688,"close":92,"volume":15404,"adjclose":92},{"date":1623072600,"open":89.80999755859375,"high":91.3499984741211,"low":89.80999755859375,"close":90.75,"volume":36200,"adjclose":90.75}
Output
$VAR1 = {
          'at84' => [
                      {
                        'low' => '90.77999877929688',
                        'volume' => '15404',
                        'date' => 'Tue Jun 8 16:00:00 2021',
                        'high' => '92.37999725341797',
                        'adjclose' => '92',
                        'open' => '91.48999786376953',
                        'close' => '92'
                      },
                      {
                        'adjclose' => '90.75',
                        'open' => '89.80999755859375',
                        'close' => '90.75',
                        'low' => '89.80999755859375',
                        'volume' => '36200',
                        'date' => 'Mon Jun 7 09:30:00 2021',
                        'high' => '91.3499984741211'
                      }
                    ]
        };
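Since the answer mentions a JSON module but the snippet itself still relies on regexes, here is a rough sketch of the decoding route using JSON::PP, which has shipped with Perl since 5.14. It assumes that wrapping the comma-separated objects in [ ] yields valid JSON; the real file presumably has more structure around these objects, in which case you would decode the whole file and walk down to this array instead. The names $block and $days are illustrative.

```perl
use strict;
use warnings;
use JSON::PP;    # exports decode_json by default

# Assumption: this is the comma-separated run of objects from the question.
my $block =
    '{"date":1623182400,"open":91.48999786376953,"high":92.37999725341797,"low":90.77999877929688,"close":92,"volume":15404,"adjclose":92},'
  . '{"date":1623072600,"open":89.80999755859375,"high":91.3499984741211,"low":89.80999755859375,"close":90.75,"volume":36200,"adjclose":90.75}';

# Wrap the objects in [ ] so the text is a JSON array, then decode it
# into a reference to an array of hash references.
my $days = decode_json("[$block]");

for my $day (@$days) {
    # gmtime in scalar context gives a readable UTC timestamp.
    printf "%s close=%s volume=%d\n",
        scalar gmtime( $day->{date} ), $day->{close}, $day->{volume};
}
```

A parser does not depend on the key order or on which fields sit next to each other, so it avoids the class of problem the original regex ran into.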