jq - group json objects by field value and output grouped values in one line
I have JSON that contains metrics, timestamps, and values from AWS CloudWatch.
{
"Messages": [],
"MetricDataResults": [
{
"Timestamps": [
"2021-07-07T13:26:00Z"
],
"StatusCode": "Complete",
"Values": [
0.0
],
"Id": "m19",
"Label": "CPUSurplusCreditsCharged"
},
{
"Timestamps": [
"2021-07-07T13:28:00Z",
"2021-07-07T13:27:00Z",
"2021-07-07T13:26:00Z",
"2021-07-07T13:25:00Z",
"2021-07-07T13:24:00Z",
"2021-07-07T13:23:00Z"
],
"StatusCode": "Complete",
"Values": [
12.750425014167137,
13.033116114731422,
12.70812153130781,
12.975,
15.441924032067199,
12.916451392476791
],
"Id": "m20",
"Label": "CPUUtilization"
},
{
"Timestamps": [
"2021-07-07T13:29:00Z",
"2021-07-07T13:28:00Z",
"2021-07-07T13:27:00Z",
"2021-07-07T13:26:00Z",
"2021-07-07T13:25:00Z",
"2021-07-07T13:24:00Z",
"2021-07-07T13:23:00Z"
],
"StatusCode": "Complete",
"Values": [
0.7,
0.6999533364442371,
0.6998833527745376,
0.6999416715273727,
0.7,
0.7001166861143524,
0.6998950157476379
],
"Id": "m21",
"Label": "NetworkReceiveThroughput"
}
]
}
I use the jq command to put these values into array variables
and print the results as follows.
jq -r '.MetricDataResults[] | "\(.Label) \(.Timestamps) \(.Values)"' test.json | while read Label timestamp value
do
Label=`echo $Label | sed 's/\"//g; s/\[//g; s/\]//g; s/,/ /g'`
timestamp=`echo $timestamp | sed 's/\"//g; s/\[//g; s/\]//g; s/,/ /g'`
value=`echo $value | sed 's/\"//g; s/\[//g; s/\]//g; s/,/ /g'`
arr_timestamp=($timestamp)
arr_value=($value)
echo $Label
echo ${arr_timestamp[@]}
echo ${arr_value[@]}
done
Evictions
2021-07-07T10:51:00Z 2021-07-07T10:50:00Z 2021-07-07T10:49:00Z 2021-07-07T10:48:00Z 2021-07-07T10:47:00Z 2021-07-07T10:46:00Z 2021-07-07T10:45:00Z
0 0 0 0 0 0 0
CPUUtilization
2021-07-07T10:50:00Z 2021-07-07T10:49:00Z 2021-07-07T10:48:00Z 2021-07-07T10:47:00Z 2021-07-07T10:46:00Z 2021-07-07T10:45:00Z
1.5333333333333332 1.4666666666666666 1.5833333333333333 1.5333333333333332 1.4916666666666665 1.4916666666666665
IsMaster
2021-07-07T10:51:00Z 2021-07-07T10:50:00Z 2021-07-07T10:49:00Z 2021-07-07T10:48:00Z 2021-07-07T10:47:00Z 2021-07-07T10:46:00Z 2021-07-07T10:45:00Z
1 1 1 1 1 1 1
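When the timestamp arrays have different lengths for each metric,
I want to show only the values that share the same timestamp, combined into a single string.
For example: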
"2021-07-07T10:51:00Z Evictions = 0\nIsMaster = 1"
"2021-07-07T10:50:00Z Evictions = 0\nCPUUtilization = 1.5333333333333332\n IsMaster = 1"
...
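I'm stuck and can't come up with a good approach.
If there is a good way to do this, please let me know.
I don't have much time, so any help would be appreciated.
Edit:
What I mean is grouping by timestamp, like this: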
{
"MetricDataResults": [
{
"Timestamps": "2021-07-07T13:28:00Z",
"Label" : [
"CPUUtilization",
"NetworkReceiveThroughput"
],
"Values" : [
12.750425014167137,
0.7
]
},
{
"Timestamps": "2021-07-07T13:27:00Z",
"Label" : [
"CPUUtilization",
"NetworkReceiveThroughput"
],
"Values" : [
13.033116114731422,
0.6999533364442371
]
},
{
"Timestamps": "2021-07-07T13:26:00Z",
"Label" : [
"CPUUtilization",
"NetworkReceiveThroughput",
"CPUSurplusCreditsCharged"
],
"Values" : [
12.70812153130781,
0.6998833527745376,
0.0
]
}
]
}
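You can achieve your goal with jq alone. No further processing in the shell script is necessary.
The following shell script gives you two options:
- output as text
- output as JSON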
#!/bin/bash
INPUT='
{
"Messages": [],
"MetricDataResults": [
{
"Timestamps": [
"2021-07-07T13:26:00Z"
],
"StatusCode": "Complete",
"Values": [
0.0
],
"Id": "m19",
"Label": "CPUSurplusCreditsCharged"
},
{
"Timestamps": [
"2021-07-07T13:28:00Z",
"2021-07-07T13:27:00Z",
"2021-07-07T13:26:00Z",
"2021-07-07T13:25:00Z",
"2021-07-07T13:24:00Z",
"2021-07-07T13:23:00Z"
],
"StatusCode": "Complete",
"Values": [
12.750425014167137,
13.033116114731422,
12.70812153130781,
12.975,
15.441924032067199,
12.916451392476791
],
"Id": "m20",
"Label": "CPUUtilization"
},
{
"Timestamps": [
"2021-07-07T13:29:00Z",
"2021-07-07T13:28:00Z",
"2021-07-07T13:27:00Z",
"2021-07-07T13:26:00Z",
"2021-07-07T13:25:00Z",
"2021-07-07T13:24:00Z",
"2021-07-07T13:23:00Z"
],
"StatusCode": "Complete",
"Values": [
0.7,
0.6999533364442371,
0.6998833527745376,
0.6999416715273727,
0.7,
0.7001166861143524,
0.6998950157476379
],
"Id": "m21",
"Label": "NetworkReceiveThroughput"
}
]
}
'
# output as plain text
jq -r '
.MetricDataResults
| map(.Values as $values | .Timestamps as $timestamps
| {Label} +
foreach range(.Timestamps | length) as $idx
(null; {"Timestamp": $timestamps[$idx], "Value": $values[$idx]}; .))
| group_by(.Timestamp)[]
| [.[0].Timestamp]
+ map("\(.Label)=\(.Value)")
| join("\n") + "\n"
' <<< "$INPUT"
# output as json
jq -r '
.MetricDataResults
|= (map(.Values as $values | .Timestamps as $timestamps
| {Id, Label, StatusCode} +
foreach range(.Timestamps | length) as $idx
(null; {"Timestamp": $timestamps[$idx], "Value": $values[$idx]}; .))
| group_by(.Timestamp)
| map({Timestamp: .[0].Timestamp,
Events: del(.[].Timestamp)}))
' <<< "$INPUT"
The first jq command in the shell script produces:
2021-07-07T13:23:00Z
CPUUtilization=12.916451392476791
NetworkReceiveThroughput=0.6998950157476379
2021-07-07T13:24:00Z
CPUUtilization=15.441924032067199
NetworkReceiveThroughput=0.7001166861143524
2021-07-07T13:25:00Z
CPUUtilization=12.975
NetworkReceiveThroughput=0.7
2021-07-07T13:26:00Z
CPUSurplusCreditsCharged=0
CPUUtilization=12.70812153130781
NetworkReceiveThroughput=0.6999416715273727
2021-07-07T13:27:00Z
CPUUtilization=13.033116114731422
NetworkReceiveThroughput=0.6998833527745376
2021-07-07T13:28:00Z
CPUUtilization=12.750425014167137
NetworkReceiveThroughput=0.6999533364442371
2021-07-07T13:29:00Z
NetworkReceiveThroughput=0.7
The second jq command in the shell script produces:
{
"Messages": [],
"MetricDataResults": [
{
"Timestamp": "2021-07-07T13:23:00Z",
"Events": [
{
"Id": "m20",
"Label": "CPUUtilization",
"StatusCode": "Complete",
"Value": 12.916451392476791
},
{
"Id": "m21",
"Label": "NetworkReceiveThroughput",
"StatusCode": "Complete",
"Value": 0.6998950157476379
}
]
},
{
"Timestamp": "2021-07-07T13:24:00Z",
"Events": [
{
"Id": "m20",
"Label": "CPUUtilization",
"StatusCode": "Complete",
"Value": 15.441924032067199
},
{
"Id": "m21",
"Label": "NetworkReceiveThroughput",
"StatusCode": "Complete",
"Value": 0.7001166861143524
}
]
},
{
"Timestamp": "2021-07-07T13:25:00Z",
"Events": [
{
"Id": "m20",
"Label": "CPUUtilization",
"StatusCode": "Complete",
"Value": 12.975
},
{
"Id": "m21",
"Label": "NetworkReceiveThroughput",
"StatusCode": "Complete",
"Value": 0.7
}
]
},
{
"Timestamp": "2021-07-07T13:26:00Z",
"Events": [
{
"Id": "m19",
"Label": "CPUSurplusCreditsCharged",
"StatusCode": "Complete",
"Value": 0
},
{
"Id": "m20",
"Label": "CPUUtilization",
"StatusCode": "Complete",
"Value": 12.70812153130781
},
{
"Id": "m21",
"Label": "NetworkReceiveThroughput",
"StatusCode": "Complete",
"Value": 0.6999416715273727
}
]
},
{
"Timestamp": "2021-07-07T13:27:00Z",
"Events": [
{
"Id": "m20",
"Label": "CPUUtilization",
"StatusCode": "Complete",
"Value": 13.033116114731422
},
{
"Id": "m21",
"Label": "NetworkReceiveThroughput",
"StatusCode": "Complete",
"Value": 0.6998833527745376
}
]
},
{
"Timestamp": "2021-07-07T13:28:00Z",
"Events": [
{
"Id": "m20",
"Label": "CPUUtilization",
"StatusCode": "Complete",
"Value": 12.750425014167137
},
{
"Id": "m21",
"Label": "NetworkReceiveThroughput",
"StatusCode": "Complete",
"Value": 0.6999533364442371
}
]
},
{
"Timestamp": "2021-07-07T13:29:00Z",
"Events": [
{
"Id": "m21",
"Label": "NetworkReceiveThroughput",
"StatusCode": "Complete",
"Value": 0.7
}
]
}
]
}
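Here is a simple, idiomatic solution for the text-output case; it works with any version of jq from 1.3 onward. Note in particular that it does not rely on foreach, whose use here is needlessly complicated: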
< input.json jq -r '
.MetricDataResults
| map(.Values as $values
| .Timestamps as $timestamps
| {Label} +
(range(0; .Timestamps|length) as $idx
| {Timestamp: $timestamps[$idx],
Value: $values[$idx]} ))
| group_by(.Timestamp)[]
| .[0].Timestamp, (.[]|"\(.Label)=\(.Value)"), ""
'