Expand number range to the individual numbers with NAs
This question is based on an earlier question.
Suppose one of my ranges has an NA as its end:
df <- data.frame(start = c(10, 20), end = c(15,NA), label = c('ex1','ex2'))
When I run the following code:
df[, seq(.SD[['start']], .SD[['end']]), by = label]
I get the following error:
Error in `[.data.frame`(df, , seq(.SD[["start"]], .SD[["end"]]), by = label) :
unused argument (by = label)
How can I get something like this?
label V1
1: ex1 10
2: ex1 11
3: ex1 12
4: ex1 13
5: ex1 14
6: ex1 15
7: ex2 20
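You can use fcoalesce to replace the NA values in end with the corresponding start values, and then create a sequence from start to end for each label. The error itself occurs because df is a plain data.frame, and the by argument is only understood by data.table's `[`, which is why df is converted with setDT first.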
library(data.table)
setDT(df)
df <- df[!(is.na(start) & is.na(end))]
df[, end := fcoalesce(end, start)]
df[, seq(start, end), by = label]
# label V1
#1: ex1 10
#2: ex1 11
#3: ex1 12
#4: ex1 13
#5: ex1 14
#6: ex1 15
#7: ex2 20
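If data.table is not available, the same idea (fill the missing end with start, then expand each range) can be sketched in base R. This is only an illustration under the assumption that every remaining row has a non-NA start; the names end_filled, ranges and out are made up for the example and are not part of the answer above.

```r
# Same example data as in the question
df <- data.frame(start = c(10, 20), end = c(15, NA), label = c("ex1", "ex2"))

# Drop rows where both start and end are missing
df <- df[!(is.na(df$start) & is.na(df$end)), ]

# Replace a missing end with the start value (base R stand-in for fcoalesce)
end_filled <- ifelse(is.na(df$end), df$start, df$end)

# Build one sequence per row, then repeat each label once per generated number
ranges <- Map(seq, df$start, end_filled)
out <- data.frame(
  label = rep(df$label, lengths(ranges)),
  V1    = unlist(ranges)
)

print(out)
```

Map() pairs each start with its filled end, so a row whose end was NA simply yields a one-element sequence.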
Or using dplyr:
library(dplyr)
df %>%
filter(!(is.na(start) & is.na(end))) %>%
mutate(end = coalesce(end, start)) %>%
group_by(label) %>%
summarise(num = seq(start, end), .groups = 'drop')
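On recent dplyr versions (1.1.0 and later), returning more than one row per group from summarise() is deprecated and triggers a warning. A possible variant, assuming such a version is installed, uses reframe() instead; the result name out is only for illustration.

```r
library(dplyr)

# Same example data as in the question
df <- data.frame(start = c(10, 20), end = c(15, NA), label = c("ex1", "ex2"))

out <- df %>%
  filter(!(is.na(start) & is.na(end))) %>%   # drop rows with no usable bound
  mutate(end = coalesce(end, start)) %>%     # missing end falls back to start
  group_by(label) %>%
  reframe(num = seq(start, end))             # may return several rows per group

print(out)
```

reframe() always returns an ungrouped result, so the .groups argument is not needed here.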