Loading JSONL file as JSON objects

I want to load a JSONL file into Python as JSON objects. Is there an easy way to do this?

splitlines takes care of this for you, so in general the following code will work:

import json

result = [json.loads(jline) for jline in jsonl_content.splitlines()]
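For a self-contained illustration of the line above, here is a minimal sketch that assumes jsonl_content is a string already in memory (the two sample records are made up for this example). It also skips blank lines, since json.loads raises an error on an empty string and JSONL files often end with a trailing newline or two.

```python
import json

# Hypothetical in-memory JSONL content; note the extra blank line at the end.
jsonl_content = '{"id": 1, "name": "a"}\n{"id": 2, "name": "b"}\n\n'

# Parse each non-empty line as its own JSON document.
result = [
    json.loads(jline)
    for jline in jsonl_content.splitlines()
    if jline.strip()
]

print(result)
```

Each element of result is an ordinary dict, so result[0]["name"] works as usual.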

If the content comes from a response object, this becomes:

result = [json.loads(jline) for jline in response.read().splitlines()]

Here are the full steps, including the file handling, for newbies like me.

Assuming you have a .jsonl file like:

{"reviewerID": "A2IBPI20UZIR0U", "asin": "1384719342", "reviewerName": "cassandra tu \"Yeah, well, that's just like, u...", "helpful": [0, 0], "reviewText": "Not much to write about here, but it does exactly what it's supposed to. filters out the pop sounds. now my recordings are much more crisp. it is one of the lowest prices pop filters on amazon so might as well buy it, they honestly work the same despite their pricing,", "overall": 5.0, "summary": "good", "unixReviewTime": 1393545600, "reviewTime": "02 28, 2014"}
{"reviewerID": "A14VAT5EAX3D9S", "asin": "1384719342", "reviewerName": "Jake", "helpful": [13, 14], "reviewText": "The product does exactly as it should and is quite affordable.I did not realized it was double screened until it arrived, so it was even better than I had expected.As an added bonus, one of the screens carries a small hint of the smell of an old grape candy I used to buy, so for reminiscent's sake, I cannot stop putting the pop filter next to my nose and smelling it after recording. :DIf you needed a pop filter, this will work just as well as the expensive ones, and it may even come with a pleasing aroma like mine did!Buy this product! :]", "overall": 5.0, "summary": "Jake", "unixReviewTime": 1363392000, "reviewTime": "03 16, 2013"}

This code should work:

import json

with open('./data/my_filename.jsonl', 'r') as json_file:
    json_list = list(json_file)

for json_str in json_list:
    result = json.loads(json_str)
    print(f"result: {result}")
    print(isinstance(result, dict))

About .jsonl files:
http://jsonlines.org/

With pandas, setting the lines parameter of read_json to True should do the trick.

import pandas as pd

# file_path is the path to your .jsonl file; the result is a DataFrame with one row per line
jsonObj = pd.read_json(path_or_buf=file_path, lines=True)

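You can add more keys, but this should work. Say each line has the format below. j_line is then a dictionary, so each element can be accessed the way you would access any dictionary entry. I have also shown how to access a nested object.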

{"key1":"value", "key2":{"prop_1": "value"}}

import json

with open("foo.jsonl") as f1:
    for line in f1:
        j_line = json.loads(line)              # one dict per line
        key_1 = j_line['key1']                 # top-level value
        prop_1 = j_line['key2']['prop_1']      # nested value
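To make the nested access concrete, here is a small sketch that parses the sample line shown earlier from a string instead of a file. The prop_2 lookup is an addition for illustration only: it assumes you may meet records where a key is absent, and uses dict.get so that a missing key yields None rather than a KeyError.

```python
import json

# The sample line from above, held in a string for the sake of the example.
line = '{"key1":"value", "key2":{"prop_1": "value"}}'
j_line = json.loads(line)

key_1 = j_line["key1"]             # top-level key
prop_1 = j_line["key2"]["prop_1"]  # nested key

# Hypothetical missing key: .get returns None instead of raising KeyError.
prop_2 = j_line.get("key2", {}).get("prop_2")

print(key_1, prop_1, prop_2)
```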

A quick and easy native solution that does not use any split() function:

import json
with open('/path/to/file.jsonl') as f:
    data = [json.loads(line) for line in f]
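If the file is large or may contain blank or malformed lines, the same idea can be wrapped in a generator. This is only a sketch under those assumptions: iter_jsonl is a hypothetical helper name, and the demo writes a throwaway file to a temporary directory so the example runs on its own.

```python
import json
import os
import tempfile


def iter_jsonl(path):
    """Yield one parsed object per non-blank line of a JSONL file."""
    with open(path, encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            try:
                yield json.loads(line)
            except json.JSONDecodeError as exc:
                # Report which line was bad instead of a bare decode error.
                raise ValueError(f"{path}:{line_no}: invalid JSON: {exc}") from exc


# Demo: write a small file with a blank line in the middle, then read it back.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo.jsonl")
    with open(demo_path, "w", encoding="utf-8") as f:
        f.write('{"a": 1}\n\n{"a": 2}\n')
    records = list(iter_jsonl(demo_path))

print(records)
```

Because it yields records one at a time, you can loop over iter_jsonl(path) directly without holding the whole file in memory; wrapping it in list() as above gives the same result as the list comprehension.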