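Finding all columns in data frame of a certain value by row
For each row of a data frame, I am trying to find the first column that contains a specific number and the last column that contains the same value. See the sample data and desired output below, where the number is 4.
Sample data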
ID WZ_1 WZ_2 WZ_3 WZ_4
1 5 4 4 3
2 4 4 3 3
3 4 4 4 4
Sample output
ID First Last
1 WZ_2 WZ_3
2 WZ_1 WZ_2
3 WZ_1 WZ_4
library(data.table)
# dummy data
# use setDT(df) if yours isn't a data.table already
df <- data.table(id = 1:3
, a = c(4,4,0)
, b = c(0,4,0)
, c = c(4,0,4)
); df
id a b c
1: 1 4 0 4
2: 2 4 4 0
3: 3 0 0 4
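# find 1st & last column with target value
# .SDcols restricts the search to the value columns, so an id equal to 4 is never matched
df[, .(id
, first = apply(.SD, 1, \(i) names(i)[min(which(i == 4))])
, last = apply(.SD, 1, \(i) names(i)[max(which(i == 4))])
)
, .SDcols = c("a", "b", "c")
]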
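Here is a tidyverse option: I pivot to long format, then filter to keep only the values equal to 4, and of those only the first and last occurrence per ID. I then create a new column marking each as the first or the last value and pivot back to wide format.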
library(tidyverse)
df %>%
  pivot_longer(-ID) %>%
  group_by(ID) %>%
  filter(value == 4) %>%
  filter(row_number() == 1 | row_number() == n()) %>%
  mutate(col = c("First", "Last")) %>%
  pivot_wider(names_from = "col", values_from = "name") %>%
  select(-value)
Output
     ID First Last
  <int> <chr> <chr>
1 1 WZ_2 WZ_3
2 2 WZ_1 WZ_2
3 3 WZ_1 WZ_4
Data
df <- structure(list(ID = 1:3, WZ_1 = c(5L, 4L, 4L), WZ_2 = c(4L, 4L,
4L), WZ_3 = c(4L, 3L, 4L), WZ_4 = c(3L, 3L, 4L)), class = "data.frame", row.names = c(NA,
-3L))
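The pipeline above assigns c("First", "Last") to every group, so it relies on each ID having at least two matching columns; a row containing the value only once would make the mutate() step fail. A minimal sketch of a variant that tolerates a single match is shown here, using summarise() with first() and last(). The fourth row and the name `target` are additions for illustration only and are not part of the original data. Rows with no match at all are simply dropped from the result.

```r
library(dplyr)
library(tidyr)

# Original sample data plus one hypothetical row (ID 4) with a single match
df <- data.frame(ID   = 1:4,
                 WZ_1 = c(5L, 4L, 4L, 3L),
                 WZ_2 = c(4L, 4L, 4L, 4L),
                 WZ_3 = c(4L, 3L, 4L, 3L),
                 WZ_4 = c(3L, 3L, 4L, 3L))

target <- 4  # value to search for (illustrative name)

res <- df %>%
  pivot_longer(-ID) %>%              # one row per ID/column pair, in column order
  filter(value == target) %>%        # keep only the matching cells
  group_by(ID) %>%
  summarise(First = first(name),     # first matching column per ID
            Last  = last(name))      # last matching column per ID

res
```

For ID 4, First and Last both resolve to the same column, since there is only one match.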
With max.col:
data.frame(ID = df$ID,
First = names(df)[max.col(df == 4, ties.method = "first")],
Last = names(df)[max.col(df == 4, ties.method = "last")])
ID First Last
1 1 WZ_2 WZ_3
2 2 WZ_1 WZ_2
3 3 WZ_1 WZ_4
Data
df <- read.table(header= T, text= "ID WZ_1 WZ_2 WZ_3 WZ_4
1 5 4 4 3
2 4 4 3 3
3 4 4 4 4 ")
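One caveat with the max.col approach: for a row that does not contain the value at all, every entry of the logical matrix is tied at FALSE, so max.col still returns a column index and the result would name a column that does not actually match. The sketch below is one possible way to guard against that in base R, returning NA for such rows and leaving the ID column out of the search. The fourth row and the names `target`, `hit` and `has_hit` are assumptions added for illustration.

```r
# Original sample data plus one hypothetical row (ID 4) without any 4
df <- read.table(header = TRUE, text = "ID WZ_1 WZ_2 WZ_3 WZ_4
1 5 4 4 3
2 4 4 3 3
3 4 4 4 4
4 1 2 3 5")

target <- 4                       # value to search for (illustrative name)
hit <- df[-1] == target           # logical matrix over the value columns only
cols <- colnames(hit)
has_hit <- unname(rowSums(hit) > 0)  # TRUE where the row contains the target

res <- data.frame(
  ID    = df$ID,
  # only trust max.col where the row really contains the target
  First = ifelse(has_hit, cols[max.col(hit, ties.method = "first")], NA),
  Last  = ifelse(has_hit, cols[max.col(hit, ties.method = "last")], NA)
)

res
```

The first three rows give the same answer as before; the added row yields NA in both columns instead of a misleading column name.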