Why is perl regex matching 2 patterns instead of 1 in this case?

I am trying to match this repeating pattern in a JSON file:

{ 
    "date":1568381400,
    "open":301.7799987792969,
    "high":302.1700134277344,
    "low":300.67999267578125,
    "close":301.0899963378906,
    "volume":61426700,
    "adjclose":301.0899963378906
}

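Note: the above is a formatted version. The actual JSON is all on one line, with the optional whitespace removed.

There are a bunch of these, separated by commas with no spaces. I use the following code: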

while ( $Page =~ /{"date":(.+?),.+?"high":(.+?),"low":(.+?),"close":(.+?),"volume":(.+?),.+?"adjclose":(.+?)}/g )

The regex returns two instances of the pattern on each pass through the loop. For example, $& returns:

at84: matched:{"date":1623182400,"open":91.48999786376953,"high":92.37999725341797,"low":90.77999877929688,"close":92,"volume":15404,"adjclose":92},{"date":1623072600,"open":89.80999755859375,"high":91.3499984741211,"low":89.80999755859375,"close":90.75,"volume":36200,"adjclose":90.75}

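Every match spans exactly two records, never one.

I tried adding a "?" at the end of the pattern, but it makes no difference.

I suppose I could change the loop to index off the commas or the {} blocks, but that adds a layer of mess I would like to avoid.

Does anyone have any suggestions?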

Your regex requires something that isn't there:

"date":(.+?),.+?"high":(.+?),"low":(.+?),"close":(.+?),"volume":(.+?),.+?"adjclose":(.+?)
                                                                       ↑
                                                                       │
                       "+" requires characters but there are none ─────┘

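Your input has no characters between the comma after "volume" and "adjclose", so the regex has to keep consuming input until the next place where the rest of the pattern can match, which is the "adjclose" of the following record.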
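To see the effect in isolation, here is a minimal sketch on a cut-down input. The string in $text is a hypothetical reduction of the question's data, keeping only the two fields around the problem spot; the variable names are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical cut-down input: two records, only "volume" and "adjclose".
my $text = '{"volume":1,"adjclose":2},{"volume":3,"adjclose":4}';

my ( $matched, $volume, $adjclose );

# Same shape as the tail of the question's pattern: ",.+?" needs at least
# one character between the comma and "adjclose".
if ( $text =~ /"volume":(.+?),.+?"adjclose":(.+?)\}/ ) {
    ( $matched, $volume, $adjclose ) = ( $&, $1, $2 );
}

print "matched:  $matched\n";
print "volume=$volume adjclose=$adjclose\n";
```

The match runs from the first "volume" to the end of the second record, and the captures are mixed: volume is 1 (first record) while adjclose is 4 (second record).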

Change:

"volume":(.+?),.+?"adjclose":(.+?)

To:

"volume":(.+?),.*?"adjclose":(.+?)

I would change every (.+?) to (.*?).
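As a rough check of the fix, the sketch below runs the question's loop over the two sample records with only the two filler parts relaxed from .+? to .*?. The captures are left as (.+?) here because every field in the sample has a value. The braces are escaped for portability across Perl versions, and the names $page and @rows are assumptions for illustration.

```perl
use strict;
use warnings;

# The two sample records from the question, on one line with no spaces.
my $page = '{"date":1623182400,"open":91.48999786376953,"high":92.37999725341797,'
         . '"low":90.77999877929688,"close":92,"volume":15404,"adjclose":92},'
         . '{"date":1623072600,"open":89.80999755859375,"high":91.3499984741211,'
         . '"low":89.80999755859375,"close":90.75,"volume":36200,"adjclose":90.75}';

my @rows;

# ",.*?" allows the filler before "high" and before "adjclose" to be empty.
while ( $page =~ /\{"date":(.+?),.*?"high":(.+?),"low":(.+?),"close":(.+?),"volume":(.+?),.*?"adjclose":(.+?)\}/g ) {
    push @rows, { date => $1, high => $2, low => $3, close => $4, volume => $5, adjclose => $6 };
}

printf "records matched: %d\n", scalar @rows;
print "$_->{date}: volume=$_->{volume} adjclose=$_->{adjclose}\n" for @rows;
```

With this change the loop body runs once per record, and each adjclose value comes from the same record as its date.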

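The OP refers to JSON format, so perhaps the input data could be processed with the JSON module. The question does not provide enough information about the input data.

Please check whether the following code snippet fits your problem.

Note: the input data posted in the question is a very limited subset.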

use strict;
use warnings;
use feature 'say';

use Data::Dumper;

my %data;
my $input = do { local $/; <DATA> };

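# Split the line into the symbol label and the comma-separated record list.
my($symbol,$block) = $input =~ /(.+?): matched:(.*)/;

# Walk over each {...} record and turn its "key":number pairs into a hash.
for ( $block =~ /\{(.*?)\}/g ) {
    my %day = $_ =~ /"(.+?)":([\d.]+)/g;
    $day{date} = localtime($day{date});   # epoch seconds -> local time string
    push @{$data{$symbol}}, \%day;
}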

say Dumper(\%data);

exit 0;

__DATA__
at84: matched:{"date":1623182400,"open":91.48999786376953,"high":92.37999725341797,"low":90.77999877929688,"close":92,"volume":15404,"adjclose":92},{"date":1623072600,"open":89.80999755859375,"high":91.3499984741211,"low":89.80999755859375,"close":90.75,"volume":36200,"adjclose":90.75}

Output (the date strings depend on the local time zone, and the hash key order may vary between runs)

$VAR1 = {
          'at84' => [
                      {
                        'low' => '90.77999877929688',
                        'volume' => '15404',
                        'date' => 'Tue Jun  8 16:00:00 2021',
                        'high' => '92.37999725341797',
                        'adjclose' => '92',
                        'open' => '91.48999786376953',
                        'close' => '92'
                      },
                      {
                        'adjclose' => '90.75',
                        'open' => '89.80999755859375',
                        'close' => '90.75',
                        'low' => '89.80999755859375',
                        'volume' => '36200',
                        'date' => 'Mon Jun  7 09:30:00 2021',
                        'high' => '91.3499984741211'
                      }
                    ]
        };
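
The JSON module suggestion above can be sketched with JSON::PP, which ships with Perl. This assumes the records form a bare comma-separated list of objects, so that wrapping them in [ ] gives a valid JSON array; if the real file is already a complete JSON document, it would be decoded as a whole instead. The variable names are illustrative only.

```perl
use strict;
use warnings;
use JSON::PP;    # core module, exports decode_json

# Assumption: a bare comma-separated list of objects, as in the sample data.
my $block = '{"date":1623182400,"open":91.48999786376953,"high":92.37999725341797,'
          . '"low":90.77999877929688,"close":92,"volume":15404,"adjclose":92},'
          . '{"date":1623072600,"open":89.80999755859375,"high":91.3499984741211,'
          . '"low":89.80999755859375,"close":90.75,"volume":36200,"adjclose":90.75}';

# Wrap the list in brackets so it parses as one JSON array of hashes.
my $records = decode_json( '[' . $block . ']' );

for my $r (@$records) {
    # localtime in scalar context turns epoch seconds into a readable string.
    printf "%s high=%s low=%s close=%s volume=%s adjclose=%s\n",
        scalar localtime( $r->{date} ),
        @{$r}{qw(high low close volume adjclose)};
}
```

Letting a JSON parser do the work avoids the quantifier problem entirely, since field order and optional whitespace no longer matter.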