Error processing tweet JSON in an R function: missing value where TRUE/FALSE needed

I'm using a function that takes a raw tweet JSON file as input and outputs retweet cascades. Here is part of the function:

if (api_version == 2) {
  parse_tweet <- function(tweet, keep_text = FALSE) {
    tryCatch({
      json_tweet <- jsonlite::fromJSON(tweet)
      id <- json_tweet$data$id
      magnitude <- zero_if_null(json_tweet$includes$users$public_metrics$followers_count)
      user_id <- json_tweet$data$author_id
      retweet_id <- NA

      if (keep_text) text <- json_tweet$data$text

      # if this tweet is a retweet, get the original tweet's information
      if (!is.null(json_tweet$data$referenced_tweets) && json_tweet$data$referenced_tweets$type == 'retweeted') {
        retweet_id <- json_tweet$data$referenced_tweets$id
        cat("retweet_id: ", retweet_id, "\n")
        if (keep_text) text <- NA
      }
    },
    .... # warning for error processing json
    )
  }
}

This is the error:

Error processing json: Error in if (!is.null(json_tweet$data$referenced_tweets) && json_tweet$data$referenced_tweets$type == : missing value where TRUE/FALSE needed

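I checked my JSON file to find the path to the tweet type (e.g. "retweeted", "quoted", or "replied_to"): json_tweet -> data -> referenced_tweets -> type. But I don't know why the function returns a missing value and an empty retweet ID.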

Here is a small portion of the data (I can't even upload the first line of the JSON file, because it exceeds Stack Overflow's character limit):

{"data": [{"referenced_tweets": [{"type": "retweeted", "id": "1253739069273710594"}], "entities": {"mentions": [{"start": 3, "end": 16, "username": "warriors_mom", "id": "75184478"}, {"start": 18, "end": 24, "username": "AC360", "id": "227837742"}], "annotations": [{"start": 25, "end": 39, "probability": 0.7096, "type": "Person", "normalized_text": "President Trump"}], "urls": [{"start": 98, "end": 121, "url": "", "images": [{"url": "", "width": 144, "height": 144}, {"url": "", "width": 144, "height": 144}], "status": 200, "title": "Ultraviolet Irradiation of Blood: \u201cThe Cure That Time Forgot\u201d?", "description": "Ultraviolet blood irradiation (UBI) was extensively used in the 1940s and 1950s to treat many diseases including septicemia, pneumonia, tuberculosis, arthritis, asthma and even poliomyelitis. The early studies were carried out by several physicians in ...", "unwound_url": ""}]}, "public_metrics": {"retweet_count": 3, "reply_count": 0, "like_count": 0, "quote_count": 0}, "possibly_sensitive": false, "reply_settings": "everyone", "lang": "en", "id": "1253834847258370048", "context_annotations": [{"domain": {"id": "3", "name": "TV Shows", "description": "Television shows from around the world"}, "entity": {"id": "10000271509", "name": "Anderson Cooper 360", "description": "Anderson Cooper goes beyond the headlines with in-depth reporting and investigations."}}, {"domain": {"id": "4", "name": "TV Episodes", "description": "Television show episodes"}, "entity": {"id": "1249271407508242432", "name": "Anderson Cooper 360", "description": "Anderson Cooper goes beyond the headlines with in-depth reporting and investigations. 
Through nightly \"Keeping Them Honest\" reports, Anderson keeps his commitment to holding those in power accountable."}}, {"domain": {"id": "4", "name": "TV Episodes", "description": "Television show episodes"}, "entity": {"id": "1249277031881138178", "name": "Anderson Cooper 360", "description": "Anderson Cooper goes beyond the headlines with in-depth reporting and investigations. Through nightly \"Keeping Them Honest\" reports, Anderson keeps his commitment to holding those in power accountable"}}, {"domain": {"id": "4", "name": "TV Episodes", "description": "Television show episodes"}, "entity": {"id": "1250891078401552385", "name": "Anderson Cooper 360", "description": "Anderson Cooper goes beyond the headlines with in-depth reporting and investigations. Through nightly \"Keeping Them Honest\" reports, Anderson keeps his commitment to holding those in power accountable."}}, {"domain": {"id": "10", "name": "Person", "description": "Named people in the world like Nelson Mandela"}, "entity": {"id": "799022225751871488", "name": "Donald Trump", "description": "45th US President, Donald Trump"}}, {"domain": {"id": "29", "name": "Events [Entity Service]", "description": "Entity Service related Events domain"}, "entity": {"id": "1249271407508242432", "name": "Anderson Cooper 360", "description": "Anderson Cooper goes beyond the headlines with in-depth reporting and investigations. Through nightly \"Keeping Them Honest\" reports, Anderson keeps his commitment to holding those in power accountable. And, of course, there's the RidicuList, a tongue-in-cheek commentary on the day's news that may leave viewers (and Anderson) laughing. 
Joining him are guests that frequently include political and legal analysts."}}, {"domain": {"id": "29", "name": "Events [Entity Service]", "description": "Entity Service related Events domain"}, "entity": {"id": "1249277031881138178", "name": "Anderson Cooper 360", "description": "Anderson Cooper goes beyond the headlines with in-depth reporting and investigations. Through nightly \"Keeping Them Honest\" reports, Anderson keeps his commitment to holding those in power accountable. And, of course, there's the RidicuList, a tongue-in-cheek commentary on the day's news that may leave viewers (and Anderson) laughing. Joining him are guests that frequently include political and legal analysts."}}, {"domain": {"id": "29", "name": "Events [Entity Service]", "description": "Entity Service related Events domain"}, "entity": {"id": "1250891078401552385", "name": "Anderson Cooper 360", "description": "Anderson Cooper goes beyond the headlines with in-depth reporting and investigations. Through nightly \"Keeping Them Honest\" reports, Anderson keeps his commitment to holding those in power accountable. And, of course, there's the RidicuList, a tongue-in-cheek commentary on the day's news that may leave viewers (and Anderson) laughing. 
Joining him are guests that frequently include political and legal analysts."}}, {"domain": {"id": "35", "name": "Politician", "description": "Politicians in the world, like Joe Biden"}, "entity": {"id": "799022225751871488", "name": "Donald Trump", "description": "45th US President, Donald Trump"}}], "created_at": "2020-04-24T23:54:57.000Z", "author_id": "1890848160", "text": "RT @warriors_mom: @AC360 President Trump was referring to this well-documented medical treatment: ", "source": "Twitter for iPhone", "conversation_id": "1253834847258370048"}, {"referenced_tweets": [{"type": "retweeted", "id": "1253452455540666371"}], "entities": {"mentions": [{"start": 3, "end": 16, "username": "warriors_mom", "id": "75184478"}], "annotations": [{"start": 24, "end": 27, "probability": 0.691, "type": "Place", "normalized_text": "U.S."}]}, "public_metrics": {"retweet_count": 5, "reply_count": 0, "like_count": 0, "quote_count": 0}, "possibly_sensitive": false, "reply_settings": "everyone", "lang": "en", "id": "1253828982413410307", "context_annotations": [{"domain": {"id": "123", "name": "Ongoing News Story", "description": "Ongoing News Stories like 'Brexit'"}, "entity": {"id": "1220701888179359745", "name": "COVID-19"}}], "created_at": "2020-04-24T23:31:39.000Z", "author_id": "863857568", "text": "RT @warriors_mom: Major U.S. credit-card issuers begin lowering customer spending limits as coronavirus pandemic shutdowns leave millions j\u2026", "source": "Twitter for iPhone", "conversation_id": "1253828982413410307"}, {"referenced_tweets": [{"type": "retweeted", "id": "1253815956662620163"}],"entities":.... }}

I found some similar questions, but none of the answers helped me. Can someone help me solve this problem?

Thanks!

Looking at the JSON, referenced_tweets is an array (its value is enclosed in square brackets: "referenced_tweets": [{"type": "retweeted", "id": "1253739069273710594"}]).

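So the cause of the error is that json_tweet$data$referenced_tweets$type does not exist: type is a property of each element of the array, not of the array itself. In R the lookup returns NULL, the comparison with 'retweeted' then has length zero, && turns that into NA, and if (NA) fails with "missing value where TRUE/FALSE needed".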
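To make that chain of events concrete, here is a minimal sketch. The two-tweet payload is made up for illustration and only mimics the shape of the sample data; it assumes jsonlite's default simplification, under which an array of tweets becomes a data frame and referenced_tweets becomes a list column.

```r
library(jsonlite)

# Hypothetical payload: same shape as the sample data, but only two tiny tweets
raw <- '{"data": [
  {"id": "1", "referenced_tweets": [{"type": "retweeted", "id": "10"}]},
  {"id": "2", "referenced_tweets": [{"type": "quoted", "id": "20"}]}
]}'
json_tweet <- fromJSON(raw)

# One list element per tweet, each holding a small data frame (type, id)
refs <- json_tweet$data$referenced_tweets

# The list itself has no element named "type", so the lookup gives NULL
type <- refs$type

# Comparing NULL with a string gives a zero-length logical vector
cmp <- type == "retweeted"

# && turns the zero-length value into NA, which if () cannot use
cond <- !is.null(refs) && cmp

str(refs)
print(cond)
```

Running str(json_tweet$data) on your real file should show the same list-of-data-frames structure, if the file has the shape of the sample you posted.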

So you need to iterate over the array. Something like this, based on your original code:

# if this tweet is a retweet, get the original tweet's information
if (!is.null(json_tweet$data$referenced_tweets)) {
  for (i in seq_along(json_tweet$data$referenced_tweets)) {
    referenced_tweet <- json_tweet$data$referenced_tweets[[i]]
    # %in% yields FALSE for a tweet without references and copes with several references
    if ('retweeted' %in% referenced_tweet$type) {
      retweet_id <- referenced_tweet$id[referenced_tweet$type == 'retweeted']
      cat("retweet_id: ", retweet_id, "\n")
      if (keep_text) text <- NA
    }
  }
}

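You may not need the if (!is.null(...)) check, because seq_along(NULL) returns an empty sequence and the loop body never runs, but you may want to keep it for readability.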
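If you would rather end up with one retweet ID per tweet than print them inside a loop, the same idea can be written with vapply. This is only a sketch under the same assumptions as above: get_retweet_id is a hypothetical helper, not part of your function or of jsonlite, and the three-tweet payload is invented for illustration.

```r
library(jsonlite)

# Hypothetical helper: id of the retweeted tweet for one tweet, or NA if there is none
get_retweet_id <- function(refs) {
  # Tweets without referenced_tweets have no data frame in the list column
  if (!is.data.frame(refs)) return(NA_character_)
  ids <- refs$id[refs$type == "retweeted"]
  if (length(ids) == 0) NA_character_ else ids[1]
}

# Made-up payload: a retweet, a quote tweet, and an original tweet
raw <- '{"data": [
  {"id": "1", "referenced_tweets": [{"type": "retweeted", "id": "10"}]},
  {"id": "2", "referenced_tweets": [{"type": "quoted", "id": "20"}]},
  {"id": "3"}
]}'
json_tweet <- fromJSON(raw)

# One entry per tweet, in the same order as json_tweet$data$id
retweet_ids <- vapply(json_tweet$data$referenced_tweets,
                      get_retweet_id, character(1))
print(retweet_ids)
```

Because the result has the same length and order as json_tweet$data$id, it can be stored next to the tweet IDs without any extra bookkeeping.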