Filter keys with specific JSON values in JSON-LD files
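I have a gzip-compressed file (.gz) that, once decompressed, contains one JSON object per line. Below is a sample JSON line. I am trying to use jq to extract only specific fields into a CSV file. I want to extract these fields only for records where the value of the type key is dissertation.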
{
"id": "https://openalex.org/W2777209504",
"doi": "https://doi.org/10.24026/1818-1384.1(42).2013.77470",
"display_name": "Hyperandrogenism as a factor of reproductive losses",
"title": "Hyperandrogenism as a factor of reproductive losses",
"publication_year": 2013,
"publication_date": "2013-03-27",
"ids": {
"openalex": "https://openalex.org/W2777209504",
"doi": "https://doi.org/10.24026/1818-1384.1(42).2013.77470",
"mag": 2777209504
},
"type": "journal-article",
"counts_by_year": [
{
"year": 2019,
"cited_by_count": 1
}
],
"cited_by_api_url": "https://api.openalex.org/works?filter=cites:W2777209504",
"updated_date": "2021-11-03",
"created_date": "2018-01-05",
"abstract_inverted_index": {}
}
I tried the following two commands, but neither worked:
gzcat -c sample.gz | jq -rc '[.doi,.title, .publication_year, .publication_date, .type] | select(.type |contains("dissertation")) | @csv'>target.csv
gzcat -c sample.gz | jq -rc '[.doi,.title, .publication_year, .publication_date, .type] | select(.type=="dissertation") | @csv'>target.csv
Both produce this output:
jq: error (at <stdin>:108753): Cannot index string with string "title"
I have tried every way I could think of to filter my JSON-LD file, without success. Any pointers would be a great help.
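In both of your attempts, select is either formulated incorrectly or placed in the wrong position, depending on your point of view. Because the array is built first, select receives an array, which has no type key to test, so the filter has to run on the original object before the array is constructed. This will work: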
select(.type == "dissertation")
| [.doi,.title, .publication_year, .publication_date, .type]
| @csv
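For completeness, here is a minimal end-to-end sketch of that filter inside a full pipeline. Several things in it are assumptions rather than part of the original question: the sample records and DOIs are made up, gzip -dc stands in for gzcat as a more portable way to decompress, and the temporary directory is only there to keep the example self-contained. The objects filter is an extra guard, added because the reported error mentions indexing a string, which suggests that at least one input value may not be a JSON object.

```shell
#!/bin/sh
# Build a tiny gzip-compressed sample with one JSON value per line.
# The records below are hypothetical and exist only for illustration.
workdir=$(mktemp -d)
cat > "$workdir/sample.jsonl" <<'EOF'
{"doi":"https://doi.org/10.0000/example-1","title":"First thesis","publication_year":2013,"publication_date":"2013-03-27","type":"dissertation"}
{"doi":"https://doi.org/10.0000/example-2","title":"Some article","publication_year":2014,"publication_date":"2014-01-01","type":"journal-article"}
"a stray string, not an object"
EOF
gzip -c "$workdir/sample.jsonl" > "$workdir/sample.gz"

# 1. objects  : skip any input value that is not a JSON object
# 2. select   : keep only records whose type is exactly "dissertation"
# 3. [...]    : collect the wanted fields into an array
# 4. @csv     : render that array as one CSV row (-r prints it unquoted)
gzip -dc "$workdir/sample.gz" \
  | jq -r 'objects
           | select(.type == "dissertation")
           | [.doi, .title, .publication_year, .publication_date, .type]
           | @csv' > "$workdir/target.csv"

cat "$workdir/target.csv"
```

With this sample, only the first record should end up in target.csv; the journal article and the stray string are skipped. If every line in your file is known to be an object, the objects step can be dropped.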