JSON parsing error in script text (Ruby)
I am trying to parse JSON data from a script text that contains stores. It is inside the page http://www.buildbase.co.uk/storefinder . The script text I am working with is http://pastebin.com/embed_js/3cnewiSh . My code is as follows:
stores_url = "http://www.buildbase.co.uk/storefinder"
mechanize = Mechanize.new
stores_page = mechanize.get(stores_url)
stores_script_txt = stores_page.search("//script[contains(text(), 'storeLocator.initialize(')]")[0].text
stores_jsons = stores_script_txt.split("storeLocator.initialize( $.parseJSON('{\\"all\\":")[-1].split(",\\"selected\\":0}') ,\tfalse);\n });")[0]
puts stores_jsons
stores_result = JSON.parse(stores_jsons)
The error JSON.parse gives me is:
from /home/private/.rvm/gems/ruby-2.1.5/gems/json-1.8.3/lib/json/common.rb:155:in `parse'
from /home/private/.rvm/gems/ruby-2.1.5/gems/json-1.8.3/lib/json/common.rb:155:in `parse'
from (irb):240
from /home/private/.rvm/rubies/ruby-2.1.5/bin/irb:11:in `<main>'
I don't know what is going wrong, because the JSON string looks valid to me.
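There are a couple of problems. First, the text you are getting is not properly formatted, because it uses \" instead of plain quotes, and so on.
Second, it contains HTML tags, and those tags include quotes that break the quoting of the actual JSON. I grabbed a snippet that simply strips the tags.
I don't know how much of the data you need, but this code does work. I'm also not sure how robust it is (for example, I simply replace every \" with ").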
require 'mechanize'
require 'json'

stores_url = "http://www.buildbase.co.uk/storefinder"
mechanize = Mechanize.new
stores_page = mechanize.get(stores_url)
# Find the inline script that calls storeLocator.initialize
stores_script_txt = stores_page.search("//script[contains(text(), 'storeLocator.initialize(')]")[0].text
# Keep only the array that follows the "all" key
stores_jsons = stores_script_txt.split("storeLocator.initialize( $.parseJSON('{\\"all\\":")[-1].split(",\\"selected\\":0}') ,\tfalse);\n });")[0]
# Turn \" into ", strip HTML tags, collapse repeated newlines, trim leading and trailing newlines
stores_jsons = stores_jsons.gsub('\"', '"').gsub(/<\/?[^>]*>/, '').gsub(/\n\n+/, "\n").gsub(/^\n|\n$/, '')
stores_result = JSON.parse(stores_jsons)
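To see the effect of the cleanup without hitting the site, here is a minimal offline sketch. The sample string, the store name and address, and the helper names `valid_json?` and `clean_store_json` are all made up for illustration; only the two substitutions are taken from the code above. It shows that the text is rejected while the quotes are still written as \", and is accepted once they are unescaped and the tags are removed.

```ruby
require 'json'

# Hypothetical sample shaped like the scraped script text: every quote arrives
# as \" and one value contains HTML markup. It is not taken from the real page.
raw = '[{\"name\":\"Store A\",\"address\":\"<b>1 High Street</b>\"}]'

# Returns true when JSON.parse accepts the text, false when it raises.
def valid_json?(text)
  JSON.parse(text)
  true
rescue JSON::ParserError
  false
end

# The same two substitutions as above: unescape the quotes, then drop the tags.
def clean_store_json(text)
  text.gsub('\"', '"').gsub(/<\/?[^>]*>/, '')
end

cleaned = clean_store_json(raw)
stores = JSON.parse(cleaned)

puts valid_json?(raw)         # the backslashes make the parser give up
puts valid_json?(cleaned)     # the cleaned text parses
puts stores.first['address']  # the address with the <b> tags removed
```

Wrapping the parse in a small predicate like this may also help when trying out other cleanup rules, since the robustness caveat above still applies: a value that legitimately contains a backslash followed by a quote would be altered by the first substitution.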