Talend: extracting JSON with a strange format
I have a problem with Talend: I need to extract data from a very unusual JSON format, which looks like this:
{"results":[{"id":0,"series":[{"name":"table1","columns":["column1","column2","column3","column4"],"values":[["Value1","Value2","Value3","Value4"],["Value1","Value2","Value3","Value4"],["Value1","Value2","Value3","Value4"],["Value1","Value2","Value3","Value4"],["Value1","Value2","Value3","Value4"]]}]}]}
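In effect, each "series" object contains a "columns" array holding the column names and a "values" array holding the rows, one inner array per row.
The desired output is a table/CSV/JSON in a more conventional format, that is, fields paired with their values.
Does anyone know how I can do this? So far I have tried extracting the various JSON fields, but the output looks like this: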
Columns
Column1
Column2
Column3
Column4
values
["Value1","Value2","Value3","Value4"]
["Value1","Value2","Value3","Value4"]
["Value1","Value2","Value3","Value4"]
["Value1","Value2","Value3","Value4"]
(For this one, I think I may have to extract another JSON field.)
Thanks, everyone.
PS: I added Talend to the post.
You didn't specify a language, so I assume any language is fair game? This PHP script
<?php
// Sample input, copied from the question.
$js=<<<'JS'
{
    "results": [{
        "id": 0,
        "series": [{
            "name": "table1",
            "columns": ["column1", "column2", "column3", "column4"],
            "values": [
                ["Value1", "Value2", "Value3", "Value4"],
                ["Value1", "Value2", "Value3", "Value4"],
                ["Value1", "Value2", "Value3", "Value4"],
                ["Value1", "Value2", "Value3", "Value4"],
                ["Value1", "Value2", "Value3", "Value4"]
            ]
        }]
    }]
}
JS;
// Decode to nested associative arrays.
$data=json_decode($js,true);
$extracted=array();
foreach($data['results'] as $result){
    foreach($result['series'] as $serie){
        // Each entry of "values" is one row.
        foreach($serie['values'] as $values){
            $extract=[];
            foreach($values as $valueKey=>$value){
                // The position in the row selects the matching column name.
                $extract[$serie["columns"][$valueKey]]=$value;
            }
            $extracted[]=$extract;
        }
    }
}
echo json_encode($extracted,JSON_PRETTY_PRINT);
outputs:
[
    {
        "column1": "Value1",
        "column2": "Value2",
        "column3": "Value3",
        "column4": "Value4"
    },
    {
        "column1": "Value1",
        "column2": "Value2",
        "column3": "Value3",
        "column4": "Value4"
    },
    {
        "column1": "Value1",
        "column2": "Value2",
        "column3": "Value3",
        "column4": "Value4"
    },
    {
        "column1": "Value1",
        "column2": "Value2",
        "column3": "Value3",
        "column4": "Value4"
    },
    {
        "column1": "Value1",
        "column2": "Value2",
        "column3": "Value3",
        "column4": "Value4"
    }
]
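The pairing of column names with row values can also be written more compactly with PHP's built-in array_combine. The following is only a sketch: the function name seriesToRecords is hypothetical (it does not come from the question or the script above), and the shortened sample input is an assumption made for brevity.

```php
<?php
// Hypothetical helper name; an illustrative variant, not the original script.
function seriesToRecords(array $data): array
{
    $records = [];
    foreach ($data['results'] ?? [] as $result) {
        foreach ($result['series'] ?? [] as $serie) {
            $columns = $serie['columns'] ?? [];
            foreach ($serie['values'] ?? [] as $row) {
                // array_combine needs both arrays to have the same length,
                // so mismatched rows are rejected explicitly.
                if (count($row) !== count($columns)) {
                    throw new UnexpectedValueException('Row length does not match column count');
                }
                // Column names become keys, row entries become values.
                $records[] = array_combine($columns, $row);
            }
        }
    }
    return $records;
}

// Shortened sample in the same shape as the JSON from the question.
$json = '{"results":[{"id":0,"series":[{"name":"table1","columns":["column1","column2"],"values":[["Value1","Value2"],["Value3","Value4"]]}]}]}';
echo json_encode(seriesToRecords(json_decode($json, true)), JSON_PRETTY_PRINT), PHP_EOL;
```

The explicit length check makes a malformed row fail loudly instead of producing a record with missing or shifted fields.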
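Here is a solution that produces the result as a CSV file.
I used tFixedFlowInput_1 and tFixedFlowInput_3 as inputs holding the JSON from your example.
tExtractJSONFields_1 extracts the individual column names from the columns array; these are then denormalized into a single line and written to a file.
tExtractJSONFields_2 extracts the values as arrays. For each of those arrays, tExtractJSONFields_3 extracts the individual values, and each set of values is denormalized into one CSV row, which is written to the same file in append mode.
The final result looks like this: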
column1,column2,column3,column4
Value1,Value2,Value3,Value4
Value1,Value2,Value3,Value4
Value1,Value2,Value3,Value4
Value1,Value2,Value3,Value4
Value1,Value2,Value3,Value4
I used a comma as the separator; it can be changed in tDenormalize_1 and tDenormalize_2.
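For readers without the Talend job at hand, the same header-then-rows layout can be sketched in PHP, the language used in the other answer. This is an illustration under assumptions, not a translation of the job: the function name serieToCsv and its $delimiter parameter are hypothetical, and fputcsv adds quoting for fields containing the delimiter, which the job as described may not do.

```php
<?php
// Hypothetical helper; mirrors the header-then-rows layout of the job's output.
function serieToCsv(array $serie, string $delimiter = ','): string
{
    $stream = fopen('php://memory', 'w+');
    // Header line first, built from the column names.
    // Enclosure and escape are passed explicitly rather than relying on defaults.
    fputcsv($stream, $serie['columns'], $delimiter, '"', '\\');
    // Then one CSV line per row of values.
    foreach ($serie['values'] as $row) {
        fputcsv($stream, $row, $delimiter, '"', '\\');
    }
    rewind($stream);
    $csv = stream_get_contents($stream);
    fclose($stream);
    return $csv;
}

// One "series" entry in the shape used in the question (shortened to two rows).
$serie = [
    'name' => 'table1',
    'columns' => ['column1', 'column2', 'column3', 'column4'],
    'values' => [
        ['Value1', 'Value2', 'Value3', 'Value4'],
        ['Value1', 'Value2', 'Value3', 'Value4'],
    ],
];
echo serieToCsv($serie);
```

Passing a different second argument, for example serieToCsv($serie, ';'), plays the role of changing the separator in the two denormalize components.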