Relate elements in table form from a JSON file with jq
I'm new to jq, and I have the following code to get a list of the values of every element named Abc:
["Abc"], ( .. | objects | select(has("Abc")) | [.["Abc"]] ) | @tsv
This is the output I currently get:
"Abc"
"4"
"2"
"1"
"9"
"3"
"2"
"4"
"9"
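I'd like to add 4 columns on the left: a counter in the first column running from 1 to the number of "Abc" elements (if possible), followed by the page, row and column that each Abc value corresponds to.
To clarify, the current output is shown above, and the structure of the JSON file is shown below.
The input JSON file is as follows: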
{
  "document": {
    "page": [
      {
        "@index": "0",
        "image": {
          "Abc": "4"
        }
      },
      {
        "@index": "1",
        "row": [
          {
            "column": [
              {
                "text": {
                  "Abc": "2"
                }
              }
            ]
          },
          {
            "column": [
              {
                "text": {
                  "Abc": "1"
                }
              },
              {
                "text": {
                  "Abc": "9"
                }
              }
            ]
          },
          {
            "column": [
              {
                "text": {
                  "Abc": "3"
                }
              }
            ]
          }
        ]
      },
      {
        "@index": "2",
        "row": [
          {
            "column": [
              {
                "text": {
                  "Abc": "2"
                }
              }
            ]
          },
          {
            "column": [
              {
                "text": {
                  "Abc": "4"
                }
              },
              {
                "text": {
                  "Abc": "9"
                }
              }
            ]
          }
        ]
      }
    ]
  }
}
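I hope someone can help me. Thanks in advance.
The irregularity of the input data makes the requirements a little opaque, but the following produces the desired output: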
["counter", "page", "row", "column", "Abc"],
(foreach (.document.page[] | objects) as $page ({page: -1, counter: 0};
   .page += 1
   | if ($page | (has("image") and (.image|has("Abc"))))
     then
       .counter += 1
       | .out = [.counter, .page, null, null, ($page|.image.Abc)]
     else foreach ($page | .row[]?) as $row (.row=-1;
       .row += 1
       | foreach ($row | .column[]) as $column (.column=-1;
           .column += 1
           | foreach ($column | .text | objects) as $x (.;
               .counter += 1
               | .out = [.counter, .page, .row, .column, $x["Abc"]]
               ; . )
           ; . )
       ; . )
     end
   ; .out )
)
| @tsv
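Output
Specifically, with the -r command-line option, the given input produces the following tab-separated output (null values appear as empty fields):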
counter	page	row	column	Abc
1	0			4
2	1	0	0	2
3	1	1	0	1
4	1	1	1	9
5	1	2	0	3
6	2	0	0	2
7	2	1	0	4
8	2	1	1	9
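The following solution uses paths and has several advantages: it is brief and simple, and it can easily be adapted to handle data in different formats.
For clarity, we first define a function that adds the row numbers: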
# add a sequential id, starting at 1
def tsvRows(s):
  foreach s as $s (0; .+1; [.] + $s)
  | @tsv;

(["counter", "page", "row", "column", "Abc"] | @tsv),
tsvRows(paths as $p
  | select($p[-1] == "Abc")
  | getpath($p) as $v
  | $p
  | .[2] as $page
  | (if .[3] == "row" then .[4] else null end) as $row
  | (if .[5] == "column" then .[6] else null end) as $column
  | [$page, $row, $column, $v] )
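To see why positions 2, 4 and 6 of each path give the page, row and column, it may help to spell out the same idea outside jq. The following is only a rough Python cross-check, not part of the answer above: the names `iter_paths`, `at`, `abc_rows` and `SAMPLE` are made up for illustration, and `SAMPLE` is the question's input written compactly. Every path that ends in "Abc" looks like `document, page, <i>, image, Abc` or `document, page, <i>, row, <j>, column, <k>, text, Abc`, so the indices can be read off by position.

```python
def iter_paths(node, prefix=()):
    """Yield (path, value) for every scalar leaf, walking dicts and lists depth-first."""
    if isinstance(node, dict):
        for key, child in node.items():
            yield from iter_paths(child, prefix + (key,))
    elif isinstance(node, list):
        for index, child in enumerate(node):
            yield from iter_paths(child, prefix + (index,))
    else:
        yield prefix, node


def at(path, i):
    """Return path[i], or None when the path is too short (jq yields null there)."""
    return path[i] if i < len(path) else None


def abc_rows(doc, key="Abc"):
    """Build [counter, page, row, column, value] rows for every leaf named `key`."""
    rows = []
    for path, value in iter_paths(doc):
        if not path or path[-1] != key:
            continue
        page = at(path, 2)
        row = at(path, 4) if at(path, 3) == "row" else None
        column = at(path, 6) if at(path, 5) == "column" else None
        rows.append([len(rows) + 1, page, row, column, value])
    return rows


# The question's input document, written compactly.
SAMPLE = {
    "document": {"page": [
        {"@index": "0", "image": {"Abc": "4"}},
        {"@index": "1", "row": [
            {"column": [{"text": {"Abc": "2"}}]},
            {"column": [{"text": {"Abc": "1"}}, {"text": {"Abc": "9"}}]},
            {"column": [{"text": {"Abc": "3"}}]},
        ]},
        {"@index": "2", "row": [
            {"column": [{"text": {"Abc": "2"}}]},
            {"column": [{"text": {"Abc": "4"}}, {"text": {"Abc": "9"}}]},
        ]},
    ]}
}

if __name__ == "__main__":
    print("\t".join(["counter", "page", "row", "column", "Abc"]))
    for r in abc_rows(SAMPLE):
        # Render None as an empty field, as @tsv does for null.
        print("\t".join("" if v is None else str(v) for v in r))
```

Because the page, row and column come from fixed positions in the path, adapting this to a different layout should mostly be a matter of changing which positions are inspected, which is the same reason the paths-based jq filter is easy to adapt.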