Python Reddit praw/psraw: ValueError "No JSON object could be decoded"

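I am trying to fetch the submissions and comments of a subreddit and write them to txt files. There is one file per post holding that post's comments, and another file that lists the relevant information for each post. However, I hit the error below after 7250 results, and I need to get 36k+ results.

I am also using praw 4.6, because psraw stopped working after I updated to 5.0.

Error message: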

Traceback (most recent call last):
  File "C:/Users/PycharmProjects/untitled/subreddit psraw.py", line 12, in <module>
    for submission in psraw.submission_search(reddit, subreddit='R', limit=40000):
  File "C:\Python27\lib\site-packages\psraw\base.py", line 71, in endpoint_func
    data = requests.get(url).json()['data']
  File "C:\Python27\lib\site-packages\requests\models.py", line 894, in json
    return complexjson.loads(self.text, **kwargs)
  File "C:\Python27\lib\json\__init__.py", line 339, in loads
    return _default_decoder.decode(s)
  File "C:\Python27\lib\json\decoder.py", line 364, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "C:\Python27\lib\json\decoder.py", line 382, in raw_decode
    raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded

My code:

import praw, datetime, os, psraw

reddit = praw.Reddit('bot1')

subreddit = reddit.subreddit('R')

count = 0
try:
    for submission in psraw.submission_search(reddit, subreddit='R', limit=40000):
        count_coment = 0

        # get comments: one txt file per submission, one line per comment
        for comment in submission.comments:
            subid = submission.id
            comid = comment.id
            comauthor = comment.author
            com_body = comment.body.encode('utf-8').replace("\n", " ")
            comscore = comment.score
            com_date = datetime.datetime.utcfromtimestamp(comment.created_utc)
            string_com = '"{0}", "{1}", "{2}", "{3}", "{4}"\n'
            formatted_string_com = string_com.format(comid, comauthor, com_body, com_date, comscore)
            indexFile_comment = open('C:/Users/PycharmProjects/untitled/reddit_output_diabetes/' + subid + '.txt', 'a+')
            indexFile_comment.write(formatted_string_com)
            indexFile_comment.close()
            count_coment += 1
        print 'comment count: ', count_coment

        # get index: one line per submission in a shared index file
        date = datetime.datetime.utcfromtimestamp(submission.created_utc)
        _id = submission.id
        title = submission.title.encode('utf-8')
        text = submission.selftext.encode('utf-8').replace("\n", " ")
        author = submission.author
        score = submission.score
        string = '"{0}", "{1}", "{2}", "{3}", "{4}", "{5}"\n'

        formatted_string = string.format(_id, title, text, author, date, score)
        count += 1
        indexFile = open('C:/Users/PycharmProjects/untitled/reddit_output/' + 'index.txt', 'a+')
        indexFile.write(formatted_string)

        print ("Successfully writing in file")
        print count
        indexFile.close()
    print count
except ValueError:
    # any ValueError ends the whole loop silently
    pass

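This may be an error while parsing a particular comment. You can skip that comment and continue with the next one by handling the error with try/except.

Put the code inside: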

try:

.......put code here...

except ValueError:
   pass

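It should be (with the try/except inside the loop, so that continue moves on to the next comment):

for comment in submission.comments:
    try:

        .......put code here...

    except ValueError:
        continue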
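
The skip-and-continue pattern described above can be sketched in a self-contained way. This is not the asker's script, only a stand-in: parse_all and raw_items are hypothetical names made up for illustration, and the sample list assumes that one entry is not valid JSON (similar to a response body that is not JSON). The try/except sits inside the loop, so one bad item is skipped instead of ending the whole run.

```python
import json


def parse_all(raw_items):
    """Decode each JSON string, skipping entries that cannot be decoded."""
    parsed = []
    skipped = 0
    for raw in raw_items:
        try:
            parsed.append(json.loads(raw))
        except ValueError:
            # The json module raises ValueError for undecodable input
            # (on Python 3, JSONDecodeError is a subclass of ValueError).
            skipped += 1
            continue
    return parsed, skipped


# Hypothetical sample data: the second entry stands in for a non-JSON body.
raw_items = ['{"id": "a1"}', '<html>error</html>', '{"id": "b2"}']
parsed, skipped = parse_all(raw_items)
print("parsed: {0}, skipped: {1}".format(len(parsed), skipped))
```

One caveat, going by the traceback in the question: there the ValueError is raised while the for statement is fetching the next submission from psraw.submission_search, not inside the loop body, so a try/except placed inside the body may not catch that particular error.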