Extracting text from a JSON string in R with double quotes
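I am trying to extract the contents of the body string from a JSON object. The problem is that I cannot get the pattern past the commas inside the value, so I never capture all of the text. In this case I need the text "There is a typo error in documentation regarding a link to a librarys function, which is quite irritating while browsing the documentation!". My code is below. Can anyone suggest how to make the match stop at the closing double quote followed by a comma (",) rather than at the first comma, and how to get the greedy quantifier to reach that point? This is the expression I have been using:
body <- str_extract(json_file, 'body[^,]*\\s*')
Thanks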
library(stringr)
json_file<- '{ "_id" : ObjectId( "539163d7bd350003" ), "login" : "vui", "id" : 369607, "avatar_url" : "https://avatars.mashupsusercontent.com/u/369607?", "gravatar_id" : "df8897ffebe16c5b0cd690925c63e190", "body":"There is a typo error in documentation regarding a link to a librarys function, which is quite irritating while browsing the documentation!","url" : "https://api.mashups.com/users/vui", "html_url" : "https://mashups.com/vui", "followers_url" : "https://api.mashups.com/users/vui/followers", "following_url" : "https://api.mashups.com/users/vui/following{/other_user}", "gists_url" : "https://api.mashups.com/users/vui/gists{/gist_id}", "starred_url" : "https://api.mashups.com/users/vui/starred{/owner}{/repo}", "subscriptions_url" : "https://api.mashups.com/users/vui/subscriptions", "organizations_url" : "https://api.mashups.com/users/vui/orgs", "repos_url" : "https://api.mashups.com/users/vui/repos", "events_url" : "https://api.mashups.com/users/vui/events{/privacy}", "received_events_url" : "https://api.mashups.com/users/vui/received_events", "type" : "User", "site_admin" : false, "org" : "amurath"}'
body <- str_extract(json_file, 'body[^,]*\\s*')
body
Your input is a malformed example, so I am not confident this will work on the rest of your data. However, a way to get what you want on this data is:
gsub('.*?body.*?:\"(.*?)\",\"\\w+\"\\s*:.*', "\\1", json_file)
[1] "For the spout users such as myself, /reload is not an option
because it crashes the spout framework. Also, reloads take a while. I
was wondering if you could implement a command to reload the
\"configurations\".\r\nThat is all, have a Happy New
Year!\r\n-asleeponduty"
Edit: I made a small modification. The new version works on both your old example and your new one.
gsub('.*?body.*?:\"(.*?)\",\"\\w+\"\\s*:.*', "\\1", json_file)
[1] "There is a typo error in documentation regarding a link to a
librarys function, which is quite irritating while browsing the
documentation!"
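On the malformed input: the `ObjectId( "..." )` wrapper is what keeps the string from being valid JSON. A rough sketch of an alternative to the regex, assuming the jsonlite package is installed and that `ObjectId(...)` is the only non-JSON construct in the data, is to unwrap it and then parse the result. The shortened `sample` string and the names `fixed` and `parsed` are made up for illustration.

```r
library(jsonlite)

# Hypothetical, shortened sample in the same shape as json_file
sample <- '{ "_id" : ObjectId( "539163d7bd350003" ), "login" : "vui", "body":"There is a typo, which is irritating!", "site_admin" : false }'

# Replace ObjectId( "..." ) with just the quoted string so the text becomes valid JSON
fixed <- gsub('ObjectId\\(\\s*("[^"]*")\\s*\\)', '\\1', sample)

# Parse the repaired text and read the field by name
parsed <- fromJSON(fixed)
body <- parsed$body
print(body)
```

Parsing avoids having to guess where the value ends, so commas and escaped quotes inside the body are handled by the parser instead of the pattern.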
This worked for me:
library(stringr)
body <- str_extract(json_file, 'body":"[^"]*')
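Note that `str_extract` returns the whole match, so the result above still starts with `body":"`. A minimal sketch of one way to keep only the text, assuming the value contains no escaped double quotes, is to use a capture group with `str_match`. The shortened `sample` string and the variable names are hypothetical.

```r
library(stringr)

# Hypothetical, shortened sample in the same shape as json_file
sample <- '{ "login" : "vui", "body":"There is a typo, which is irritating!","url" : "https://api.mashups.com/users/vui" }'

# str_extract returns the full match, including the key prefix
with_prefix <- str_extract(sample, 'body":"[^"]*')

# str_match returns a matrix: column 1 is the full match,
# column 2 is the first capture group (the value between the quotes)
body_only <- str_match(sample, '"body"\\s*:\\s*"([^"]*)"')[, 2]

print(with_prefix)
print(body_only)
```

Because `[^"]*` stops at the first double quote, this would truncate a body that contains `\"`, as in the second sample quoted earlier in the thread.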