Conditionally split single cells
I have this data.frame, and I want to identify which cells in sample1$domain contain "www", replace them with "", and strsplit the corresponding sample1$suffix. The data looks like this:
       domain         suffix
1        wbx2            com
2      redhat            com
3   something            com
4     gstatic            com
5         www googleapis.com
6 smartfilter            com
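I have managed to solve this as shown below, but it changes the position of the row (I want it to stay at position 5), and given that it will run on millions of cases, I don't think it is the most efficient approach: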
library("stringr")
sample1$domain <- ifelse(sample1$domain == "www", "", sample1$domain)
sample1[sample1$domain == "", c("domain", "suffix")] <- sample1[sample1$domain == "", c("suffix", "domain")]
y <- sample1$domain[sample1$suffix == ""]
z <- as.data.frame(unlist(str_split_fixed(y, "[.]", 2)))
colnames(z) <- c("domain", "suffix")
sample1 <- rbind(sample1, z)
sample1 <- subset(sample1, sample1$suffix != "")
rownames(sample1) <- NULL
sample1
#       domain suffix
#1        wbx2    com
#2      redhat    com
#3   something    com
#4     gstatic    com
#5 smartfilter    com
#6  googleapis    com
Data
sample1 <- structure(list(domain = c("wbx2", "redhat", "something",
"gstatic", "www", "smartfilter"), suffix = c("com", "com", "com",
"com", "googleapis.com", "com")), .Names = c("domain", "suffix"
), row.names = c(NA, 6L), class = "data.frame")
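We can create a logical index for the rows where domain is "www". We then use that index to replace the site name first and the site suffix last, so every row keeps its original position: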
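# TRUE for the rows whose domain is "www"
ind <- sample1$domain == "www"
# Domain: everything before the last dot of the suffix (backslashes must be doubled in R strings)
sample1$domain[ind] <- sub("^(.*)\\..*", "\\1", sample1$suffix[ind])
# Suffix: everything after the last dot; done second because the line above still needs the full suffix
sample1$suffix[ind] <- sub(".*\\.(.*)", "\\1", sample1$suffix[ind])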
sample1
#        domain suffix
# 1        wbx2    com
# 2      redhat    com
# 3   something    com
# 4     gstatic    com
# 5  googleapis    com
# 6 smartfilter    com
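Since this is meant to run on millions of rows, the same index-based replacement could be wrapped in a small function so that it can be checked on the sample before it is applied to the full data. The following is only a sketch: the name fix_www is hypothetical, and it assumes that both columns are character vectors (not factors) and that the split should happen at the last dot of the suffix.

```r
# Sample data from the question, rebuilt with data.frame() for brevity
sample1 <- data.frame(
  domain = c("wbx2", "redhat", "something", "gstatic", "www", "smartfilter"),
  suffix = c("com", "com", "com", "com", "googleapis.com", "com"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: for rows whose domain is "www", split the suffix at its
# last dot and write both pieces back into the same row (no rbind, no reordering)
fix_www <- function(df) {
  ind <- df$domain == "www"
  full <- df$suffix[ind]                        # keep the unsplit value for both steps
  df$domain[ind] <- sub("\\.[^.]*$", "", full)  # drop the last dot and what follows it
  df$suffix[ind] <- sub("^.*\\.", "", full)     # drop everything up to the last dot
  df
}

fixed <- fix_www(sample1)
```

Here fixed has the same six rows in the same order, with row 5 holding "googleapis" and "com". Only the "www" rows are passed through sub(), so the cost grows with the number of affected rows rather than with the size of the whole data.frame. A suffix that contains no dot at all is left unchanged in both columns under this sketch, which may or may not be the behaviour wanted.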