Ruby JSON extractor failing, possibly due to overly large JSON

I'm writing a script to extract all the comments from a Reddit thread as JSON:

 require "rubygems"
 require "json"
 require "net/http"
 require "uri"
 require 'open-uri'
 require 'neatjson'

 #The URL.
 url = ("https://www.reddit.com/r/AskReddit/comments/46n0zc.json")

 #Sets up the JSON reader.
 result = JSON.parse(open(url).read)
 children = result["data"]["children"]

 #Prints the jsons.
 children.each do |child|
   puts "Author:       " + child["data"]["author"]
   puts "Body:         " + child["data"]["body"]
   puts "ID:           " + child["data"]["id"]
   puts "Upvotes:      " + child["data"]["ups"].to_s
   puts ""
 end

For some reason it gives me an error. The error isn't in the part that prints the JSON, though; it's in the reader:

   005----extractallredditpostcomments.rb:17:in `[]': no implicit conversion of String into Integer (TypeError)
           from 005----extractallredditpostcomments.rb:17:in `<main>'

For some reason,

 children = result["data"]["children"]

isn't working, which is strange, because it has worked fine.

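What I'd like to know is: could this be caused by the size of the JSON? If you actually go to the link (https://www.reddit.com/r/AskReddit/comments/46n0zc.json) you'll see that the file is huge. Because the page is so large, I had a hard time finding the tags I need. It took me hours, and I'm still not sure I have the right tags, which could also be causing the error. I'm not sure what is going wrong here.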

Oh, and one last thing: I tried simplifying the program by removing the printer:

 #Sets up the JSON reader.
 result = JSON.parse(open(url).read)
 children = result["data"]["children"]

 puts children

 #Prints the jsons.
 #children.each do |child|
 #  puts "Author:       " + child["data"]["author"]
 #  puts "Body:         " + child["data"]["body"]
 #  puts "ID:           " + child["data"]["id"]
 #  puts "Upvotes:      " + child["data"]["ups"].to_s
 #  puts ""
 #end

It still fails:

005----extractallredditpostcomments.rb:13:in `[]': no implicit conversion of String into Integer (TypeError)
        from 005----extractallredditpostcomments.rb:13:in `<main>'

A quick look at the returned JSON shows that it is a JSON array of two JSON objects rather than a single JSON object. It looks something like this:

[
    {
        "data": {
            "after": null,
            "before": null,
            "children": [
                {
                    "data": {
                        "approved_by": null,
                        "archived": false,
      ...
      },
      "kind": "Listing"
    },
    {
        "data": {
            "after": null,
            "before": null,
            "children": [
                {
                    "data": {
                        "approved_by": null,
                        "archived": false,
                        "author": "finkledinkle7",
                        "author_flair_css_class": null,
                        "author_flair_text": null,
                        "banned_by": null,
                        "body": "My mother was really sick in 2008.  I was turning 25 with a younger brother and sister.\n\nLost both of my grandparents on mom's side to cancer a few years prior.  Mom had to watch as her parents slowly passed away.  It destroyed her not having her mother around as t ...
   }
]

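This means that the line children = result["data"]["children"] in your program will not work, because it treats result as a JSON object. JSON.parse returns a Ruby Array here, and Array#[] expects an Integer index, so passing the String "data" raises the "no implicit conversion of String into Integer" TypeError you are seeing. The size of the JSON is not the problem. It looks like you should use children = result[1]["data"]["children"] instead.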
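
To make the indexing concrete, here is a minimal sketch that runs offline. It assumes the response keeps the two-listing shape shown above (thread first, comments second). SAMPLE is a hypothetical, heavily trimmed stand-in for the real payload, and extract_comments is a helper name made up for this illustration. The ids, the first author, and the ups value are invented.

```ruby
require "json"

# Hypothetical, trimmed stand-in for the Reddit response: a two-element
# array of Listing objects. Only the author "finkledinkle7" and the start
# of the body come from the excerpt above; the other values are made up.
SAMPLE = <<~EOS
  [
    {"kind": "Listing", "data": {"children": [
      {"data": {"id": "post1", "author": "some_poster"}}
    ]}},
    {"kind": "Listing", "data": {"children": [
      {"data": {"id": "c1", "author": "finkledinkle7",
                "body": "My mother was really sick in 2008.", "ups": 10}}
    ]}}
  ]
EOS

# Returns the "data" hash of each entry in the second listing.
def extract_comments(json_text)
  result = JSON.parse(json_text)
  # Fail with a clear message if the top level is not the expected array.
  raise TypeError, "expected a JSON array of listings" unless result.is_a?(Array)
  result[1]["data"]["children"].map { |child| child["data"] }
end

comments = extract_comments(SAMPLE)
comments.each do |c|
  puts "Author:       #{c["author"]}"
  puts "Body:         #{c["body"]}"
  puts "ID:           #{c["id"]}"
  puts "Upvotes:      #{c["ups"]}"
  puts ""
end
```

Printing result.class right after JSON.parse is a quick way to check whether the top level is an Array or a Hash before digging through a large payload for the right keys.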