New column with paste0 in R
I am looking for a function that lets me add a new column holding a value called ID next to each string, i.e.:
I have a list of words with their IDs:
car = 9112
red = 9512
employee = 6117
sky = 2324
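words <- c("car", "sky", "red", "employee", "domestic")
match <- c("car", "red", "domestic", "employee", "sky")
I compare against values read from an Excel file: when a value equals one of the words in my vector, the word is replaced with its ID; otherwise the original word is kept.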
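library(stringr)  # provides str_replace_all()
# L4 is presumably the data frame read from the Excel file
x10 <- c(words)                # string
words.corpus <- c(L4$`match`)  # pattern
idwords.corpus <- c(L4$`ID`)   # replacement
# backslashes must be doubled inside R string literals
words.corpus <- paste0("\\A", idwords.corpus, "\\z|\\A", words.corpus, "\\z")
vect.corpus <- idwords.corpus
names(vect.corpus) <- words.corpus
data15 <- str_replace_all(x10, vect.corpus)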
Result:
data15:
" 9112", "2324", "9512", "6117", "employee"
What I am looking for is to add a new column with the ID, rather than replacing the word with the ID:
words     ID
car       9112
red       9512
employee  6117
sky       2324
domestic  domestic
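I would use data.table for a fast lookup based on the fixed word values. It is not 100% clear what you are asking, but it sounds like you want to replace the word with its index value when there is a match, and keep the word as it is when there is no match. This code will do that: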
library("data.table")
# associate your IDs with fixed word matches in a lookup table
ids <- data.table(
  word = c("car", "red", "employee", "sky"),
  ID = c(9112, 9512, 6117, 2324)
)
setkey(ids, word)
# this is what you would read in
data <- data.table(
  word = c("car", "sky", "red", "employee", "domestic", "sky")
)
setkey(data, word)
# keyed join: every row of data, with its ID where the word matched
data <- ids[data]
# replace NAs from no match with word (the ID column becomes character)
data[, ID := ifelse(is.na(ID), word, ID)]
data
##        word       ID
## 1:      car     9112
## 2: domestic domestic
## 3: employee     6117
## 4:      red     9512
## 5:      sky     2324
## 6:      sky     2324
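Here "domestic" was not matched, so it stays as the word in the ID column. I also repeated "sky" to show how this applies to every instance of a word.
If you want to preserve the original sort order, you can create an index variable before the merge and then re-sort the output by that index variable.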
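The index-variable idea can be sketched as follows. This is only an illustration built on the same sample data: the helper column name `idx` and the output name `result` are assumptions introduced here, not part of the original code.

```r
library(data.table)

# Lookup table of known words and their IDs (values from the question)
ids <- data.table(
  word = c("car", "red", "employee", "sky"),
  ID   = c(9112, 9512, 6117, 2324)
)

# Input words in the order they were read
data <- data.table(
  word = c("car", "sky", "red", "employee", "domestic", "sky")
)

# Record each row's original position before keying (hypothetical helper column)
data[, idx := .I]

setkey(ids, word)
setkey(data, word)   # this re-sorts data alphabetically by word

# Keyed join: one output row per row of data, ID is NA where nothing matched
result <- ids[data]

# Fall back to the word itself where no ID matched; the column becomes character
result[, ID := ifelse(is.na(ID), word, as.character(ID))]

# Restore the original order, then drop the helper column
setorder(result, idx)
result[, idx := NULL]
result
```

With this sample input, the rows should come back as car, sky, red, employee, domestic, sky rather than in alphabetical order. Joining with `ids[data, on = "word"]` instead of setting keys is another option that avoids re-sorting `data` in the first place.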