Parsing JSON block by block

I have a JSON file containing a list of customers and dates.

The file looks like this:

{
"Customers": [
{
  "Customer": "Customer Name Here",
  "Company": "Super Coffee",
  "First Name": "First Name Here",
  "Main Phone": "777-777-7777",
  "Fax": "777-777-7777",
  "Bill to 1": "Billing Address One",
  "Bill to 2": "Billing Address Two",
  "Bill to 3": "Billing Address Three",
  "Ship to 1": "Shipping Address One",
  "Ship to 2": "Shipping Address Two",
  "Ship to 3": "Shipping Address Three",
  "Customer Type": "Dealer/Retail"
},
{
  "Customer": "Customer Name Here",
  "Company": "Turtle Mountain Welding",
  "First Name": "First Name Here",
  "Main Phone": "777-777-7777",
  "Fax": "777-777-7777",
  "Bill to 1": "Billing Address One",
  "Bill to 2": "Billing Address Two",
  "Bill to 3": "Billing Address Three",
  "Ship to 1": "Shipping Address One",
  "Ship to 2": "Shipping Address Two",
  "Ship to 3": "Shipping Address Three",
  "Customer Type": "Dealer/Retail"
},
{
  "Customer": "Customer Name Here",
  "Company": "Mountain Equipment Coop",
  "First Name": "First Name Here",
  "Main Phone": "777-777-7777",
  "Fax": "777-777-7777",
  "Bill to 1": "Billing Address One",
  "Bill to 2": "Billing Address Two",
  "Bill to 3": "Billing Address Three",
  "Ship to 1": "Shipping Address One",
  "Ship to 2": "Shipping Address Two",
  "Ship to 3": "Shipping Address Three",
  "Customer Type": "Dealer/Retail"
},
{
  "Customer": "Customer Name Here",
  "Company": "Best Soup Inc.",
  "First Name": "First Name Here",
  "Main Phone": "777-777-7777",
  "Fax": "777-777-7777",
  "Bill to 1": "Billing Address One",
  "Bill to 2": "Billing Address Two",
  "Bill to 3": "Billing Address Three",
  "Ship to 1": "Shipping Address One",
  "Ship to 2": "Shipping Address Two",
  "Ship to 3": "Shipping Address Three",
  "Customer Type": "Dealer/Retail"
}
]
}

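I need to be able to extract data from the file block by block rather than line by line.

I'm used to parsing files line by line to get data, but with JSON I need to read it block by block somehow (or, more precisely, object by object?). For each customer, I need to read what is inside the braces. That way I can write a script that extracts the data I need and builds a CSV file from it.

For example: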

# Pseudocode: jsonblock$i stands for the i-th customer object
i=1
for file in *.json; do
     customername=$(jsonblock$i:customername);
     customerAddress=$(jsonblock$i:customeraddress);
     # etc...
     i=$((i+1))
done

I understand how this is done when reading a file line by line, but how can I read each JSON block as if it were a single line?

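Using Perl and the JSON module, you can parse a series of JSON objects incrementally, one at a time. However, you need to change the input so that it is no longer a single JSON document but a sequence of JSON objects that are not separated by commas.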

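#!/usr/bin/perl
use strict;
use warnings;
use feature qw(say);
use JSON;

my $json = JSON->new;
while (my $line = <>) {
    # Void context: append the line to the parser's buffer without parsing
    $json->incr_parse($line);
    # Scalar context: extract one complete object at a time, if one is ready
    while (my $obj = $json->incr_parse) {
        # Naive CSV: values in sorted-key order, with no quoting or escaping
        say join ",", map { $obj->{$_} } sort keys %$obj;
    }
}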

With customers.json (which is no longer valid JSON):

{ 
    "some key" : "some value"
} {
    "other key" : "other value"
}

To run:

$ perl demo.pl < customers.json
some value
other value
$ perl demo.pl < customers.json > customer.csv
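
For readers who would rather not use Perl, the same idea can be sketched in Python with the standard json module, whose JSONDecoder.raw_decode method returns one value plus the position where it ended. This is only a sketch: the name iter_json_objects is made up here, and unlike the incremental Perl parser it assumes the whole input is already in memory as a string.

```python
import json

def iter_json_objects(text):
    """Yield each top-level JSON value from text holding several values back to back."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so skip it here
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos == len(text):
            break
        # Returns the decoded value and the index just past it
        obj, pos = decoder.raw_decode(text, pos)
        yield obj

# Same shape as the modified customers.json: objects with no commas between them
sample = '{ "some key" : "some value" } {\n "other key" : "other value"\n}'
for obj in iter_json_objects(sample):
    print(",".join(str(obj[key]) for key in sorted(obj)))
```

Run against the two-object sample, this prints some value and other value on separate lines.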

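Given the JSON above (modified, because the data as originally posted was invalid), the following script parses it and prints the "Company" value of each block: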

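#!/usr/bin/env perl

use JSON;
use IO::All;
use v5.16;    # enables strict and say

# Slurp the whole file into $data (IO::All overloads "<" for this)
my $data < io 'Our_Customers.json';
# decode_json returns a hash ref; "Customers" holds an array ref of hash refs
my $customers_list = decode_json($data)->{"Customers"};

for my $customer (@$customers_list) {
    say $customer->{"Company"};
}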

Output:

Super Coffee
Turtle Mountain Welding
Mountain Equipment Coop
Best Soup Inc.

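The script uses IO::All and JSON to read and parse (decode_json) the file.

In this example, the JSON data is simply mapped to a Perl data structure (an array of hashes) that corresponds exactly to the JSON data. We can then access each array element (i.e. each hash in the array) and look up the data in that hash by key name. Perl has very flexible facilities for handling and accessing data, which makes working with JSON quite pleasant.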

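The keys of each data block come from the corresponding part of the JSON file. If we shift an element off the array, it will be a hash, and we can list that element's keys like so:

say for keys %{ shift @$customers_list };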

Customer Type
First Name
Bill to 2
Main Phone
...

The value for each key is accessed with the $element->{"key"} syntax you saw in the for loop.
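
The same mapping exists in most languages. As a rough illustration in Python (not part of the script above), json.loads turns the document into a dict whose "Customers" entry is a list of dicts, so each customer is reached by iterating the list and each field by key. The helper name company_names and the shortened two-customer sample are assumptions made for the example.

```python
import json

# Shortened, hypothetical version of the customer file from the question
doc = '''{"Customers": [
  {"Company": "Super Coffee", "Main Phone": "777-777-7777"},
  {"Company": "Turtle Mountain Welding", "Main Phone": "777-777-7777"}
]}'''

def company_names(json_text):
    """Return the "Company" value of every object in the "Customers" array."""
    customers = json.loads(json_text)["Customers"]  # a list of dicts
    return [customer["Company"] for customer in customers]

print(company_names(doc))
```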


It's a good idea to validate JSON data before posting it to SO; JSON Lint and similar services can help.
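
If an online service is not convenient, a local check is easy to sketch. The Python snippet below uses only the standard json module and reports where parsing fails. check_json is a hypothetical helper name, and the exact wording of the message depends on the Python version.

```python
import json

def check_json(text):
    """Return None if text is valid JSON, else a short message locating the error."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        # lineno and colno are 1-based positions of the first problem found
        return f"line {err.lineno}, column {err.colno}: {err.msg}"
    return None

print(check_json('{"Company": "Super Coffee"}'))   # valid input
print(check_json('{"Company": "Super Coffee",}'))  # trailing comma is not valid JSON
```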

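If all you want is to print the JSON data in CSV format, then you are asking the wrong question. You should parse the whole JSON document and process the Customers array item by item.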
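
As a rough sketch of that approach, here in Python with only the standard library: load the whole document, then hand the Customers array to a CSV writer row by row. The function name customers_to_csv and the one-customer sample are invented for illustration. Python's csv module quotes only when necessary, so fields that merely contain spaces are left unquoted.

```python
import csv
import io
import json

def customers_to_csv(json_text, columns):
    """Return CSV text with one header row and one row per customer."""
    customers = json.loads(json_text)["Customers"]
    out = io.StringIO()
    # extrasaction="ignore" drops keys not listed in columns;
    # restval="" fills in columns that a customer lacks
    writer = csv.DictWriter(out, fieldnames=columns, extrasaction="ignore",
                            restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(customers)
    return out.getvalue()

# Hypothetical one-customer sample in the same shape as the question's file
doc = ('{"Customers": [{"Company": "Best Soup Inc.", '
       '"Fax": "777-777-7777", "Customer Type": "Dealer/Retail"}]}')
print(customers_to_csv(doc, ["Company", "Customer Type", "Fax"]), end="")
```

With the sample above this yields the header line Company,Customer Type,Fax followed by Best Soup Inc.,Dealer/Retail,777-777-7777.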

Using Perl's JSON and Text::CSV modules, that looks like this:

use strict;
use warnings;

use JSON 'from_json';
use Text::CSV ();

my @columns = (
  'Bill to 1',  'Bill to 2',     'Bill to 3', 'Company',
  'Customer',   'Customer Type', 'Fax',       'First Name',
  'Main Phone', 'Ship to 1',     'Ship to 2', 'Ship to 3',
);

my $out_fh = \*STDOUT;
my $json_file = 'customers.json';

my $data = do {
  open my $fh, '<', $json_file or die qq{Unable to open "$json_file" for input: $!};
  local $/;
  from_json(<$fh>);
};
my $customers = $data->{Customers};

my $csv = Text::CSV->new({ eol => $/ });
$csv->print($out_fh, \@columns);

for my $customer ( @$customers ) {
  $csv->print($out_fh, [ @{$customer}{@columns} ]);
}

Output:

"Bill to 1","Bill to 2","Bill to 3",Company,Customer,"Customer Type",Fax,"First Name","Main Phone","Ship to 1","Ship to 2","Ship to 3"
"Billing Address One","Billing Address Two","Billing Address Three","Super Coffee","Customer Name Here",Dealer/Retail,777-777-7777,"First Name Here",777-777-7777,"Shipping Address One","Shipping Address Two","Shipping Address Three"
"Billing Address One","Billing Address Two","Billing Address Three","Turtle Mountain Welding","Customer Name Here",Dealer/Retail,777-777-7777,"First Name Here",777-777-7777,"Shipping Address One","Shipping Address Two","Shipping Address Three"
"Billing Address One","Billing Address Two","Billing Address Three","Mountain Equipment Coop","Customer Name Here",Dealer/Retail,777-777-7777,"First Name Here",777-777-7777,"Shipping Address One","Shipping Address Two","Shipping Address Three"
"Billing Address One","Billing Address Two","Billing Address Three","Best Soup Inc.","Customer Name Here",Dealer/Retail,777-777-7777,"First Name Here",777-777-7777,"Shipping Address One","Shipping Address Two","Shipping Address Three"