Parsing JavaScript using Perl: getting "undef" from JE->parse

I have JavaScript code, and I need to parse its variables into Perl hashes. Is there an existing module for this? I tried JE's parse() and JavaScript::HashRef::Decode, but neither worked.

Expected behavior:

use Data::Dumper;
use SomeModule::ParseJSVariables qw/decode_js/;

my $str = qq/
var data = {
    'abc': 1,
    'def' : 2
    'xyz' : { 'foo' : 'bar' }
}
/;

my $res = decode_js($str);
warn Dumper $res; #

# expected result: 
# { 
#   name => 'data', 
#   value => {
#     'abc' => 1,
#     'def' => 2
#     'xyz' => { 'foo' => 'bar' }
#     }
# }


use JE;
my $j = new JE;
my $parsed = $j->parse($str);
warn Dumper $parsed; # undef :(

If there is no existing module, I would be grateful if someone could suggest a suitable regular expression or parsing approach.

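Update, to clarify: I have about a thousand lines of JavaScript code, and I only need the contents of variables that are explicitly assigned in the global scope, e.g. var x = { 'foo' : 'bar' }. The rest of the code does not need to be parsed.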

My environment:

$ perl --version

This is perl 5, version 22, subversion 1 (v5.22.1) built for x86_64-linux-gnu-thread-multi

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 16.04.5 LTS
Release:    16.04
Codename:   xenial

$ uname -r
4.19.24-041924-generic

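Well, I have good news and bad news, and they're the same news :). Your JS has a syntax error, so JE returns undef, per the JE docs. Specifically, a comma is missing at the end of the 'def' line. The following test works for me: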

Code:

use Data::Dumper;

# Note: use q[ ] instead of qq/ /.  q instead of qq so Perl doesn't interpolate
# into the contents, and [ ] instead of / / so that JS comments can appear
# in the block.
my $str = q[
var data = {
    'abc': 1,
    'def' : 2,   // <==== There was a comma missing here!
    'xyz' : { 'foo' : 'bar' }
}
];

use JE;
my $j = new JE;
my $parsed = $j->parse($str);
warn Dumper $parsed;

Output: too large to show here :). But it does include what you want!
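A syntax error like this is easier to spot if the error text is printed when parse returns undef. The sketch below assumes, as the JE documentation describes, that a failed parse leaves a message in $@. The helper name parse_or_explain is hypothetical and not part of JE.

```perl
use strict;
use warnings;
use JE;

# Hypothetical helper: returns (parse_tree, undef) on success,
# or (undef, error_text) when the source does not parse.
sub parse_or_explain {
    my ($src) = @_;
    my $j      = JE->new;
    my $parsed = $j->parse($src);
    return ( $parsed, undef ) if defined $parsed;
    # Assumption (per the JE docs): $@ holds the syntax error message.
    return ( undef, "$@" );
}

# Same mistake as in the question: no comma after the 'def' entry.
my ( $tree, $err ) = parse_or_explain(q[var data = { 'abc': 1, 'def': 2 'xyz': 3 }]);
warn "JS did not parse: $err\n" unless defined $tree;
```

With the comma restored, the first return value should be the parsed code object and the second undef.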

Extracting the output

This will be a challenge. Hopefully the following will get you started.

Code:

use Data::Dumper::Compact 'ddc';   # <== for briefer output
use JE;

# Note: use q[ ] instead of qq/ /.  q instead of qq so Perl doesn't interpolate
# into the contents, and [ ] instead of / / so that JS comments can appear
# in the block.
my $str = q[
var data = {
    'abc': 1,
    'def' : 2,   // <==== There was a comma missing here!
    'xyz' : { 'foo' : 'bar' }
}
];

my $j = new JE;
my $parsed = $j->parse($str);
print ddc $parsed->{tree};     # <== {tree} holds the parsed source

Output (annotated):

bless( [
  [
    0,
    118,
  ],
  "statements",
  bless( [
    [
      1,
      118,
    ],
    "var",
    [
      "data",
      bless( [
        [
          12,
          117,
        ],
        "hash",    <== here's where your hash starts
        "abc",     <== 'abc': 1
        1,
        "def",     <== 'def': 2
        2,
        "xyz",     <== 'xyz': nested hash
        bless( [
          [
            98,
            115,
          ],
          "hash",
          "foo",
          "sbar",
        ], 'JE::Code::Expression' ),
      ], 'JE::Code::Expression' ),
    ],
  ], 'JE::Code::Statement' ),
], 'JE::Code::Statement' )
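One possible way to turn a tree of this shape into plain Perl data is a small recursive walk. This is only a sketch, and it rests on assumptions read off the dump above rather than on any documented JE interface: each node is an array of [positions], a type name, then its contents; a "hash" node holds alternating keys and values; and string literals carry a leading "s" (as in "sbar"). The function names node_to_perl and extract_vars are hypothetical. The demo tree is built by hand so the example runs without JE.

```perl
use strict;
use warnings;
use Scalar::Util 'reftype';
use Data::Dumper;

# Convert one value node into plain Perl data.
sub node_to_perl {
    my ($node) = @_;
    if ( ref $node && reftype($node) eq 'ARRAY' ) {
        my ( undef, $type, @rest ) = @$node;    # skip [start, end] positions
        die "unsupported node type: $type\n" unless $type eq 'hash';
        my %h;
        while (@rest) {
            my ( $key, $value ) = splice @rest, 0, 2;
            $h{$key} = node_to_perl($value);
        }
        return \%h;
    }
    # Assumption: a leading "s" marks a string literal; numbers are stored bare.
    return $node =~ /\As(.*)\z/s ? $1 : $node;
}

# Collect top-level "var NAME = { ... }" declarations into a hash ref.
sub extract_vars {
    my ($tree) = @_;
    my %vars;
    my ( undef, $type, @stmts ) = @$tree;
    return \%vars unless $type eq 'statements';
    for my $stmt (@stmts) {
        next unless ref $stmt && reftype($stmt) eq 'ARRAY';
        next unless ( $stmt->[1] // '' ) eq 'var';
        for my $decl ( @{$stmt}[ 2 .. $#$stmt ] ) {
            my ( $name, $init ) = @$decl;
            next unless defined $init;
            # Skip initializers this sketch cannot convert (calls, arithmetic, ...).
            my $value = eval { node_to_perl($init) };
            $vars{$name} = $value unless $@;
        }
    }
    return \%vars;
}

# Hand-built copy of the dumped tree, without the blessing.
my $demo_tree = [ [ 0, 118 ], 'statements',
    [ [ 1, 118 ], 'var',
        [ 'data',
            [ [ 12, 117 ], 'hash',
                'abc', 1,
                'def', 2,
                'xyz', [ [ 98, 115 ], 'hash', 'foo', 'sbar' ],
            ],
        ],
    ],
];

print Dumper( extract_vars($demo_tree) );
```

With a real parse, the argument would be $parsed->{tree} instead of the hand-built structure. Because this relies on JE internals, it may break with other JE versions or with initializers other than object literals.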

I found the simplest solution :)

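The key idea is to execute the JavaScript code in a new context using JavaScript::V8 or JavaScript::Any, with a mock console.log function bound to a Perl callback that captures the value.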

my $str = qq/
var data = {
    'abc': 1,
    'def' : 2,
    'xyz' : { 'foo' : 'bar' }
};
/;

use Data::Dumper;
use JavaScript::V8;

sub extract_js_glob_var {
    my ( $code, $var_name ) = @_;
    my $res;
    my $context = JavaScript::V8::Context->new();
    $context->eval($code);    # evaluate the code passed in, not the global $str
    # Bind a Perl callback that captures whatever JS passes to it.
    $context->bind( console_log => sub { $res = $_[0] } );
    $context->eval('console_log('.$var_name.')');
    undef $context;
    return $res;
}

warn Dumper extract_js_glob_var($str, 'data');  # 'data.xyz' is also supported ;)