ValueError: Expected object or value when reading json.gzip to DataFrame

I want to read the Electronics json.gzip file from the list of available Amazon datasets: http://jmcauley.ucsd.edu/data/amazon/qa/

Sample of the JSON:

{'questionType': 'yes/no', 'asin': 'B00004U9JP', 'answerTime': 'Jun 27, 2014', 'unixTime': 1403852400, 'question': 'I have a 9 year old Badger 1 that needs replacing, will this Badger 1 install just like the original one?', 'answerType': '?', 'answer': 'I replaced my old one with this without a hitch.'}
{'questionType': 'open-ended', 'asin': 'B00004U9JP', 'answerTime': 'Apr 28, 2014', 'unixTime': 1398668400, 'question': 'model number', 'answer': 'This may help InSinkErator Model BADGER-1: Badger 1 1/3 HP Garbage Disposal PRODUCT DETAILS - Bellacor Number:309641 / UPC:050375000419 Brand SKU:500181'}
{'questionType': 'yes/no', 'asin': 'B00004U9JP', 'answerTime': 'Aug 25, 2014', 'unixTime': 1408950000, 'question': 'can I replace Badger 1 1/3 with a Badger 5 1/2 - with same connections?', 'answerType': '?', 'answer': 'Plumbing connections will vary with different models. Usually the larger higher amp draw wil not affect the wiring, the disposals are designed to a basic standard setup common to all brands. They want you to buy their brand or version or model. As long as the disposal is UL listed, United Laboratories, they will setup and bolt up the same.'}
{'questionType': 'yes/no', 'asin': 'B00004U9JP', 'answerTime': 'Nov 3, 2014', 'unixTime': 1415001600, 'question': 'Does this come with power cord and dishwasher hook up?', 'answerType': '?', 'answer': 'It does not come with a power cord. It does come with the dishwasher hookup.'}
{'questionType': 'open-ended', 'asin': 'B00004U9JP', 'answerTime': 'Jun 21, 2014', 'unixTime': 1403334000, 'question': 'loud noise inside when turned on. sounds like blades are loose', 'answer': 'Check if you dropped something inside.Usually my wife put lemons inside make a lot of noise and I will have to get them out using my hands or mechanical fingers .'}
{'questionType': 'open-ended', 'asin': 'B00004U9JP', 'answerTime': 'Jul 13, 2013', 'unixTime': 1373698800, 'question': 'where is the reset button located', 'answer': 'on the bottom'}

My current code uses pd.read_json with the lines and orient parameters specified, but changing these parameters doesn't seem to help.

electronics_url = 'http://jmcauley.ucsd.edu/data/amazon/qa/qa_Electronics.json.gz'
electronics_df = pd.read_json(electronics_url, orient='split', lines=True, compression='gzip')

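I get ValueError: Expected object or value. I tried every possible value of the orient parameter, but none of them helped. I also tried opening the file from a local buffer, unfortunately without success.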

What is wrong?

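The contents of the archive are not valid JSON. Each line of the file looks like a Python dictionary (note the single-quoted keys and strings), so pd.read_json cannot parse it. You can use this snippet: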

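import ast
import gzip
import urllib.request  # a bare "import urllib" does not reliably expose urllib.request

import pandas as pd

data = []
url = 'http://jmcauley.ucsd.edu/data/amazon/qa/icdm/QA_Baby.json.gz'

# Stream the archive and parse each line as a Python literal, not as JSON
with urllib.request.urlopen(url) as r:
    for qa in gzip.open(r):
        data.append(ast.literal_eval(qa.decode('utf-8')))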
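
To see why the lines are parsed with ast.literal_eval rather than a JSON parser, here is a minimal sketch on a single record. The line is abridged from the sample in the question; the variable names are only illustrative.

```python
import ast
import json

# One record in the style of the question's sample: single-quoted keys and strings
line = "{'questionType': 'open-ended', 'asin': 'B00004U9JP', 'question': 'where is the reset button located', 'answer': 'on the bottom'}"

# JSON requires double-quoted strings, so a strict JSON parser rejects the line
try:
    json.loads(line)
    parsed_as_json = True
except json.JSONDecodeError:
    parsed_as_json = False

# The same text is a valid Python dict literal, which literal_eval evaluates safely
record = ast.literal_eval(line)
```

The JSON parser fails on the first single quote, which is consistent with the error pd.read_json reports, while ast.literal_eval returns an ordinary dict.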

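Once data is populated, use pd.json_normalize to flatten the list of dicts (the record path ['questions', 'answers'] follows the nesting of this particular file):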

answers = pd.json_normalize(data, ['questions', 'answers'])
print(answers)

# Output
                                              answerText      answererID          answerTime helpful answerType answerScore
0      Yes, the locks will keep adults out too.  My h...  A2WQX54BDMJTKY    November 6, 2013  [1, 1]        NaN         NaN
1      Yes if you install it correctly.  a lot of fol...  A3VRA4069D8C7L    November 6, 2013  [0, 0]        NaN         NaN
2      It probably will...  it's pretty good and much...   A3JEFPEUXUS0I    November 6, 2013  [0, 0]        NaN         NaN
3      The size of the locking mechanism. I bought th...  A1OCJ9L2PQJBUD    January 12, 2015  [0, 0]        NaN         NaN
4        The locking mechanism unlocks with the magnet .  A2KGWT9ZN4M1PO    January 14, 2015  [0, 0]        NaN         NaN
...                                                  ...             ...                 ...     ...        ...         ...
82029  I feel it would work fine for the 4 year old. ...  A2BIFRN88PPMGT  September 17, 2014  [1, 1]          Y      0.9828
82030  In my opinion, the pillow was slightly bigger ...   AHM5QX41VSV6B  September 17, 2014  [0, 0]          ?      0.9411
82031  Our 2yo is a belly sleeper too. At first she w...   AKW750RUMWK17     August 28, 2014  [1, 1]        NaN         NaN
82032  Hi. Yes, the pillow will settle with use for s...  A1XQAY39M2KOL0     August 27, 2014  [0, 0]        NaN         NaN
82033  I would recommend contacting the company to se...  A1ZCGIRS68DM9J     August 28, 2014  [0, 0]        NaN         NaN

[82034 rows x 6 columns]
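
The Electronics sample in the question looks flat (one question/answer pair per line, no nested lists), so for that file a plain pd.DataFrame may be enough and no record path is needed. The following is a sketch under that assumption; read_python_dict_lines is a hypothetical helper name, and the in-memory archive only stands in for the real download.

```python
import ast
import gzip
import io

import pandas as pd


def read_python_dict_lines(raw_stream):
    """Parse a gzip stream holding one Python dict literal per line into a DataFrame."""
    records = []
    with gzip.open(raw_stream, mode="rt", encoding="utf-8") as lines:
        for line in lines:
            line = line.strip()
            if line:  # skip blank lines
                records.append(ast.literal_eval(line))
    return pd.DataFrame(records)


# Small in-memory stand-in for the downloaded archive (two abridged sample records)
sample = (
    "{'questionType': 'yes/no', 'asin': 'B00004U9JP', 'answerType': '?', "
    "'answer': 'It does not come with a power cord.'}\n"
    "{'questionType': 'open-ended', 'asin': 'B00004U9JP', 'answer': 'on the bottom'}\n"
)
demo_df = read_python_dict_lines(io.BytesIO(gzip.compress(sample.encode("utf-8"))))
```

For the real file, the same helper could be given the response object from urllib.request.urlopen(electronics_url) in place of the BytesIO buffer. Keys that are missing from some records, such as answerType in open-ended questions, come out as NaN in the resulting frame.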