R - Warning: "argument is not an atomic vector" when attempting to remove whitespace
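I am in the final stages of tidying data before analysis and have run into an issue I can't really understand when removing whitespace from the data table. See the full code below for a description of the steps.
I started from the following page (How to remove all whitespace from a string?) and tried troubleshooting through other pages that discuss errors/warnings about atomic vectors, but with no luck.
At step 6 I get the following warning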
In stri_replace_all_fixed(allData, " ", "") :
argument is not an atomic vector; coercing
and at step 7 the following warnings
> #Change sold and taxed columns from character to numerical
> allData$SoldAmount <- as.numeric(allData$SoldAmount)
Warning message:
NAs introduced by coercion
> allData$Tax <- as.numeric(allData$Tax)
Warning message:
NAs introduced by coercion
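Both step 6 and step 7 appear to run, but the result ends up as NA in both columns (see image)
[Image: result after whitespace is removed]
The full code is listed below. I would appreciate some advice on how to get steps 6 and 7 to give me columns that are numeric and free of whitespace.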
#Step 1: Load needed library
library(tidyverse)
library(rvest)
library(jsonlite)
library(stringi)
#Step 2: Access the URL
url <- "https://www.forsvarsbygg.no/ListApi/ListContent/78635/SoldEstates/0/10/"
#Step 3: Direct JSON as format of data in URL
data <- jsonlite::fromJSON(url, flatten = TRUE)
#Step 4: Access all items in API
totalItems <- data$TotalNumberOfItems
#Step 5: Summarize all data from API
allData <- paste0('https://www.forsvarsbygg.no/ListApi/ListContent/78635/SoldEstates/0/', totalItems, '/') %>%
  jsonlite::fromJSON(., flatten = TRUE) %>%
  .[1] %>%
  as.data.frame() %>%
  rename_with(~str_replace(., "ListItems.", ""), everything())
#Step 6: removing columns not needed
allData <- allData[, -c(1,4,8,9,11,12,13,14,15)]
#Step 6: remove whitespace in all columns
stri_replace_all_fixed(allData, " ", "")
#Step 7: Change sold and taxed columns from character to numerical
allData$SoldAmount <- as.numeric(allData$SoldAmount)
allData$Tax <- as.numeric(allData$Tax)
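You call stri_replace_all_fixed(allData, " ", "") but ignore/discard its output. The warning itself appears because allData is a data frame (a list of columns), not an atomic vector, so stringi coerces it. Since the result is never assigned, the columns still contain spaces when as.numeric() runs, which is why step 7 gives NA.
Save the result somewhere, for example by applying the replacement column by column: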
#Step 6: remove whitespace in all columns
allData[] <- lapply(allData, gsub, pattern = " ", replacement = "")
#Step 7: Change sold and taxed columns from character to numerical
allData$SoldAmount <- as.numeric(allData$SoldAmount)
allData$Tax <- as.numeric(allData$Tax)
head(allData)
# County Municipality Tax SoldAmount Type Date
# 1 Akershus FROGN 2400000 2550000 Bolig 2004
# 2 Akershus FROGN 2225000 2100000 Bolig 2004
# 3 Akershus SKI 7600000 18000000 Næringstomt 2006
# 4 Østfold SARPSBORG 3000000 3815000 Tomt 2004
# 5 Østfold RYGGE 10000000 16000000 Næringseiendom 2006
# 6 Vestfold LARVIK 61950 61950 Tomt 2013
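The allData[] <- form in step 6 is doing some quiet work, so here is a minimal sketch of what it changes. It uses a made-up data frame called toy (a stand-in for illustration, not the real API data) and only base R:

```r
# Toy stand-in for allData; the values are invented for illustration.
toy <- data.frame(
  Municipality = c("FROGN", "VESTRE TOTEN"),
  Tax = c(" 2 400 000", " 61 950"),
  stringsAsFactors = FALSE
)

# A data frame is a list of columns, not an atomic vector.
is.atomic(toy)   # FALSE
is.list(toy)     # TRUE

# Calling a function without assigning its result leaves toy unchanged.
invisible(lapply(toy, gsub, pattern = " ", replacement = ""))
untouched <- toy$Tax   # still contains the spaces

# Plain assignment of lapply() output gives a list, not a data frame ...
as_list <- lapply(toy, gsub, pattern = " ", replacement = "")
class(as_list)   # "list"

# ... whereas toy[] <- replaces the columns and keeps the data frame.
toy[] <- lapply(toy, gsub, pattern = " ", replacement = "")
class(toy)       # "data.frame"
tax_numeric <- as.numeric(toy$Tax)
```

Note that this also turns "VESTRE TOTEN" into "VESTRETOTEN", which is probably not what you want for the text columns.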
Alternatively, do it in a single step, on only the columns you need:
# allData <- paste0(...) %>% ...
allData <- allData[, -c(1,4,8,9,11,12,13,14,15)]
allData[c("Tax", "SoldAmount")] <- lapply(allData[c("Tax", "SoldAmount")], function(z) as.numeric(gsub(" ", "", z)))
head(allData)
# County Municipality Tax SoldAmount Type Date
# 1 Akershus FROGN 2400000 2550000 Bolig 2004
# 2 Akershus FROGN 2225000 2100000 Bolig 2004
# 3 Akershus SKI 7600000 18000000 Næringstomt 2006
# 4 Østfold SARPSBORG 3000000 3815000 Tomt 2004
# 5 Østfold RYGGE 10000000 16000000 Næringseiendom 2006
# 6 Vestfold LARVIK 61950 61950 Tomt 2013
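Because as.numeric() turns anything it cannot parse into NA, it may be worth checking what was lost during the conversion. The following is only a sketch: to_number is a hypothetical helper name, and it assumes the separators are ordinary whitespace (if the source uses other characters, such as non-breaking spaces, the pattern would need adjusting).

```r
# Hypothetical helper: strip whitespace, convert to numeric, and warn about
# values that were non-empty text but still failed to parse.
to_number <- function(x) {
  cleaned <- gsub("\\s+", "", x)
  out <- suppressWarnings(as.numeric(cleaned))
  # Flag only genuine parse failures, not values that were NA or empty.
  bad <- is.na(out) & !is.na(x) & nzchar(cleaned)
  if (any(bad)) {
    warning("could not parse: ", paste(unique(x[bad]), collapse = ", "))
  }
  out
}

# Made-up example values in the same format as the Tax column.
example_values <- c(" 2 400 000", " 61 950", "", NA)
converted <- to_number(example_values)
```

It could then replace the anonymous function above, e.g. lapply(allData[c("Tax", "SoldAmount")], to_number).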
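Being specific about replacing in only those two columns matters, because several of the other columns have values containing spaces, and I don't know whether you intend to squash all of them: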
str(sapply(allData, function(z) unique(grep(" ", z, value = TRUE)), simplify = FALSE))
# List of 6
# $ County : chr [1:2] "Møre og Romsdal" "Sogn- og fjordane"
# $ Municipality: chr [1:4] "EVJE OG HORNNES" "VESTRE TOTEN" "ØSTRE TOTEN" "NORDRE LAND"
# $ Tax : chr [1:414] " 2 400 000" " 2 225 000" " 7 600 000" " 3 000 000" ...
# $ SoldAmount : chr [1:538] " 2 550 000" " 2 100 000" " 18 000 000" " 3 815 000" ...
# $ Type : chr "Annen kategori"
# $ Date : chr(0)