Parsing JSON block by block
I have a JSON file containing a list of customers and dates.
The file looks like this:
{
"Customers": [
{
"Customer": "Customer Name Here",
"Company": "Super Coffee",
"First Name": "First Name Here",
"Main Phone": "777-777-7777",
"Fax": "777-777-7777",
"Bill to 1": "Billing Address One",
"Bill to 2": "Billing Address Two",
"Bill to 3": "Billing Address Three",
"Ship to 1": "Shipping Address One",
"Ship to 2": "Shipping Address Two",
"Ship to 3": "Shipping Address Three",
"Customer Type": "Dealer/Retail"
},
{
"Customer": "Customer Name Here",
"Company": "Turtle Mountain Welding",
"First Name": "First Name Here",
"Main Phone": "777-777-7777",
"Fax": "777-777-7777",
"Bill to 1": "Billing Address One",
"Bill to 2": "Billing Address Two",
"Bill to 3": "Billing Address Three",
"Ship to 1": "Shipping Address One",
"Ship to 2": "Shipping Address Two",
"Ship to 3": "Shipping Address Three",
"Customer Type": "Dealer/Retail"
},
{
"Customer": "Customer Name Here",
"Company": "Mountain Equipment Coop",
"First Name": "First Name Here",
"Main Phone": "777-777-7777",
"Fax": "777-777-7777",
"Bill to 1": "Billing Address One",
"Bill to 2": "Billing Address Two",
"Bill to 3": "Billing Address Three",
"Ship to 1": "Shipping Address One",
"Ship to 2": "Shipping Address Two",
"Ship to 3": "Shipping Address Three",
"Customer Type": "Dealer/Retail"
},
{
"Customer": "Customer Name Here",
"Company": "Best Soup Inc.",
"First Name": "First Name Here",
"Main Phone": "777-777-7777",
"Fax": "777-777-7777",
"Bill to 1": "Billing Address One",
"Bill to 2": "Billing Address Two",
"Bill to 3": "Billing Address Three",
"Ship to 1": "Shipping Address One",
"Ship to 2": "Shipping Address Two",
"Ship to 3": "Shipping Address Three",
"Customer Type": "Dealer/Retail"
}
]
}
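I need to be able to extract data from the file block by block rather than line by line.
I'm used to parsing files line by line to get data, but with JSON I need to read it block by block somehow (or, more accurately, object by object?). I need to read the contents inside the braces for each customer. That way I can write a script to extract the data I need and build a CSV file from it.
For example: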
i="1"
for file in *.json; do
customername=$(jsonblock$i:customername);
customerAddress=$(jsonblock$i:customeraddress);
etc...
i=$[i+1]
done
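I understand how this is done when reading a file line by line, but how can I read each JSON block as if it were a single line?
Using Perl and the JSON library, you can incrementally parse each item in a list of JSON objects, but you would need to modify the file so that it is no longer a single JSON document and is instead a sequence of JSON objects that are not separated by commas.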
#!/usr/bin/perl
use strict;
use warnings;
use feature qw(say);
use JSON;
my $json = JSON->new;
# Feed the input to the incremental parser one line at a time.
while (<>) {
    my $obj_or_undef = eval { $json->incr_parse( $_ ); };
    # Wait until it has found a whole object
    if (ref $obj_or_undef) {
        # Print the object's values as one comma-separated line, ordered by key
        say join ",", map {$obj_or_undef->{$_}} sort keys %$obj_or_undef;
    }
}
For customers.json (which is no longer valid JSON):
{
"some key" : "some value"
} {
"other key" : "other value"
}
To run:
$ perl demo.pl < customers.json
some value
other value
$ perl demo.pl < customers.json > customer.csv
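The same incremental parser can also extract every complete object from a chunk of text in a single call. The following is a minimal sketch, assuming the core JSON::PP module (which offers the same incr_parse interface as JSON) and a hypothetical inline string standing in for customers.json:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use JSON::PP;

# Hypothetical input: two JSON objects back to back, with no enclosing array.
my $text = '{ "some key" : "some value" } { "other key" : "other value" }';

my $json = JSON::PP->new;

# In list context, incr_parse returns every complete object it can extract.
my @objects = $json->incr_parse($text);

# Build one output line per object: its values, ordered by key name.
my @rows = map {
    my $obj = $_;
    join ",", map { $obj->{$_} } sort keys %$obj;
} @objects;

print "$_\n" for @rows;
```

Calling incr_parse in list context avoids the line-by-line loop when the whole input already fits in memory; the scalar-context loop above remains the better fit for streaming input.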
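With the JSON above (modified because the data as originally posted was invalid), the following script parses the file and prints the "Company" part of each block: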
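#!/usr/bin/env perl
use JSON;
use IO::All;
use v5.16;
# Slurp the whole file into $data (IO::All overloads "<" for this).
my $data < io 'Our_Customers.json';
# Decode the document and take the array stored under "Customers".
my $customers_list = decode_json($data)->{"Customers"};
for my $customer (@$customers_list) {
    say $customer->{"Company"};
}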
Output:
Super Coffee
Turtle Mountain Welding
Mountain Equipment Coop
Best Soup Inc.
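The script uses IO::All and JSON to read and parse (decode_json) the file.
In this example, the JSON data is simply mapped to a Perl data structure (an array of hashes) that corresponds exactly to the JSON data. We can then access each array element (i.e. each hash in the array) and look up the data in that hash by key name. Perl has very flexible data handling and access features, which makes working with JSON data quite pleasant.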
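To make the mapping concrete, here is a small sketch of how the decoded structure can be inspected. It assumes the core JSON::PP module in place of JSON and a hypothetical, shortened copy of the customer data embedded in the script:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use JSON::PP;

# Hypothetical, shortened version of the customer file.
my $text = <<'JSON';
{
  "Customers": [
    { "Company": "Super Coffee",            "Customer Type": "Dealer/Retail" },
    { "Company": "Turtle Mountain Welding", "Customer Type": "Dealer/Retail" }
  ]
}
JSON

my $data = decode_json($text);

# The top level is a hash reference; "Customers" holds an array reference.
my $customers = $data->{Customers};
print ref($data), " / ", ref($customers), "\n";

# Each array element is itself a hash reference, so it can be indexed directly.
my $count  = scalar @$customers;
my $second = $customers->[1]{Company};
print "$count customers, second company: $second\n";
```

Because the structure mirrors the JSON one-to-one, a "block" in the file is simply one element of the array, and no special block-wise reading is needed.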
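The keys of each data block come from the equivalent part of the JSON file. If we shift an element off the array, it will be a hash, and we can see the element's keys and values, like so: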
say for keys %{ shift @$customers_list };
Customer Type
First Name
Bill to 2
Main Phone
...
Use the $element->{"key"} syntax you saw in the for loop to access the value of each key.
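Putting keys and values together, a short sketch of walking one customer hash could look like this. It assumes the core JSON::PP module and a hypothetical three-field customer record; the keys are sorted because Perl does not guarantee hash key order:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use JSON::PP;

# Hypothetical single customer record with only three fields.
my $customer = decode_json(
    '{"Company":"Best Soup Inc.","Fax":"777-777-7777","Customer Type":"Dealer/Retail"}'
);

# Collect "key = value" lines; sorting makes the order repeatable.
my @lines;
for my $key (sort keys %$customer) {
    push @lines, "$key = $customer->{$key}";
}

print "$_\n" for @lines;
```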
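It's best to validate JSON data before posting it to SO; a service like JSON Lint can help.
If you just want to print the JSON data in CSV format, then you're asking the wrong question. You should parse the whole JSON document and process the Customers array item by item.
Using Perl's JSON and Text::CSV modules, that would look like this: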
use strict;
use warnings;
use JSON 'from_json';
use Text::CSV ();
# Column order for the CSV output; the names match the JSON keys.
my @columns = (
    'Bill to 1', 'Bill to 2', 'Bill to 3', 'Company',
    'Customer', 'Customer Type', 'Fax', 'First Name',
    'Main Phone', 'Ship to 1', 'Ship to 2', 'Ship to 3',
);
my $out_fh = \*STDOUT;
my $json_file = 'customers.json';
# Slurp the whole file and decode it in one go.
my $data = do {
    open my $fh, '<', $json_file or die qq{Unable to open "$json_file" for input: $!};
    local $/;
    from_json(<$fh>);
};
my $customers = $data->{Customers};
my $csv = Text::CSV->new({ eol => $/ });
# Header row first, then one row per customer.
$csv->print($out_fh, \@columns);
for my $customer ( @$customers ) {
    # A hash slice pulls the values out in column order.
    $csv->print($out_fh, [ @{$customer}{@columns} ]);
}
Output:
"Bill to 1","Bill to 2","Bill to 3",Company,Customer,"Customer Type",Fax,"First Name","Main Phone","Ship to 1","Ship to 2","Ship to 3"
"Billing Address One","Billing Address Two","Billing Address Three","Super Coffee","Customer Name Here",Dealer/Retail,777-777-7777,"First Name Here",777-777-7777,"Shipping Address One","Shipping Address Two","Shipping Address Three"
"Billing Address One","Billing Address Two","Billing Address Three","Turtle Mountain Welding","Customer Name Here",Dealer/Retail,777-777-7777,"First Name Here",777-777-7777,"Shipping Address One","Shipping Address Two","Shipping Address Three"
"Billing Address One","Billing Address Two","Billing Address Three","Mountain Equipment Coop","Customer Name Here",Dealer/Retail,777-777-7777,"First Name Here",777-777-7777,"Shipping Address One","Shipping Address Two","Shipping Address Three"
"Billing Address One","Billing Address Two","Billing Address Three","Best Soup Inc.","Customer Name Here",Dealer/Retail,777-777-7777,"First Name Here",777-777-7777,"Shipping Address One","Shipping Address Two","Shipping Address Three"