Ruby JSON extractor failing, possibly due to overly large JSON
I'm creating a script to extract all the comments from a Reddit thread as JSON:
require "rubygems"
require "json"
require "net/http"
require "uri"
require 'open-uri'
require 'neatjson'
#The URL.
url = ("https://www.reddit.com/r/AskReddit/comments/46n0zc.json")
#Sets up the JSON reader.
result = JSON.parse(open(url).read)
children = result["data"]["children"]
#Prints the jsons.
children.each do |child|
puts "Author: " + child["data"]["author"]
puts "Body: " + child["data"]["body"]
puts "ID: " + child["data"]["id"]
puts "Upvotes: " + child["data"]["ups"].to_s
puts ""
end
For some reason, it gives me an error. However, the error isn't in the actual JSON printer, but in the reader:
005----extractallredditpostcomments.rb:17:in `[]': no implicit conversion of String into Integer (TypeError)
from 005----extractallredditpostcomments.rb:17:in `<main>'
For some reason,
children = result["data"]["children"]
isn't working, which is strange, because it was working fine.
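What I'm wondering is: could this be caused by the size of the JSON? If you actually go to the link (https://www.reddit.com/r/AskReddit/comments/46n0zc.json) you'll see that the file is huge. Because the page is so large, I had a hard time finding the keys I need. It took me several hours, and I'm still not sure I have the right keys, which could also be causing the error. I'm not sure what's going wrong here.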
Oh, and one last thing: I tried simplifying the program by removing the printer:
#Sets up the JSON reader.
result = JSON.parse(open(url).read)
children = result["data"]["children"]
puts children
#Prints the jsons.
#children.each do |child|
# puts "Author: " + child["data"]["author"]
# puts "Body: " + child["data"]["body"]
# puts "ID: " + child["data"]["id"]
# puts "Upvotes: " + child["data"]["ups"].to_s
# puts ""
#end
It still fails:
005----extractallredditpostcomments.rb:13:in `[]': no implicit conversion of String into Integer (TypeError)
from 005----extractallredditpostcomments.rb:13:in `<main>'
A quick look at the returned JSON shows that it is a JSON array of two JSON objects rather than a single JSON object. It looks a bit like this:
[
{
"data": {
"after": null,
"before": null,
"children": [
{
"data": {
"approved_by": null,
"archived": false,
...
},
"kind": "Listing"
},
{
"data": {
"after": null,
"before": null,
"children": [
{
"data": {
"approved_by": null,
"archived": false,
"author": "finkledinkle7",
"author_flair_css_class": null,
"author_flair_text": null,
"banned_by": null,
"body": "My mother was really sick in 2008. I was turning 25 with a younger brother and sister.\n\nLost both of my grandparents on mom's side to cancer a few years prior. Mom had to watch as her parents slowly passed away. It destroyed her not having her mother around as t ...
}
]
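This means the line children = result["data"]["children"] in your program won't work, because it treats result as a JSON object. Indexing a Ruby Array with a String key is what raises "no implicit conversion of String into Integer", so the size of the JSON is not the problem. It looks like you should use children = result[1]["data"]["children"] instead.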
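To make the fix concrete, here is a minimal sketch that runs against a tiny stand-in for the response instead of the live URL. The helper name each_comment and the sample data are hypothetical, not from the original post. It also assumes, based on how Reddit listings usually look, that comments have "kind" equal to "t1", that "more" placeholders carry no body, and that "replies" is either an empty string or another listing; check these against the real response before relying on them.

```ruby
require "json"

# Hypothetical helper (not from the original post): walks one comment
# listing and yields each comment's data hash, descending into replies.
def each_comment(listing, &block)
  listing["data"]["children"].each do |child|
    # Assumption: real comments have kind "t1"; skip "more" placeholders.
    next unless child["kind"] == "t1"
    data = child["data"]
    block.call(data)
    replies = data["replies"]
    # Assumption: "replies" is "" when empty, otherwise a nested listing.
    each_comment(replies, &block) if replies.is_a?(Hash)
  end
end

# Tiny stand-in for the real response: an array of two listings,
# the first holding the post and the second holding the comments.
sample = <<~JSON
  [
    {"kind": "Listing", "data": {"children": [
      {"kind": "t3", "data": {"title": "Example post"}}
    ]}},
    {"kind": "Listing", "data": {"children": [
      {"kind": "t1", "data": {"author": "alice", "body": "Top-level", "id": "c1", "ups": 3,
        "replies": {"kind": "Listing", "data": {"children": [
          {"kind": "t1", "data": {"author": "bob", "body": "Nested", "id": "c2", "ups": 1, "replies": ""}}
        ]}}}},
      {"kind": "more", "data": {"children": ["c3"]}}
    ]}}
  ]
JSON

result = JSON.parse(sample)
puts result.class # prints Array, which is why result["data"] raises TypeError

comments = []
each_comment(result[1]) { |c| comments << c }

comments.each do |c|
  puts "Author: #{c["author"]}"
  puts "Body: #{c["body"]}"
  puts "ID: #{c["id"]}"
  puts "Upvotes: #{c["ups"]}"
  puts ""
end
```

With the real thread, result would come from parsing the downloaded body as in the question. Note that on Ruby 3.0 and later, open-uri no longer patches Kernel#open, so the download would be written as URI.open(url).read. String interpolation is used in the printing loop so that a nil author or body does not raise the way "Author: " + nil would.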