Finding all columns in data frame of a certain value by row

Question

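For each row of a data frame, I am trying to find the first column that contains a specific number and the last column that contains the same value. See the example data and desired output below for the case where the number is 4.

Example data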

ID WZ_1 WZ_2 WZ_3 WZ_4
1  5    4    4    3 
2  4    4    3    3
3  4    4    4    4

Example output

ID First Last 
1  WZ_2  WZ_3
2  WZ_1  WZ_2
3  WZ_1  WZ_4

Answer 1

library(data.table)

# dummy data
# use setDT(df) if yours isn't a datatable already
df <- data.table(id = 1:3
                 , a = c(4,4,0)
                 , b = c(0,4,0)
                 , c = c(4,0,4)
                 ); df
   id a b c
1:  1 4 0 4
2:  2 4 4 0
3:  3 0 0 4

# find 1st & last column with target value
# .SDcols drops id so a matching id value is never reported as a column hit
df[, .(id
       , first = apply(.SD, 1, \(i) names(i)[min(which(i==4))])
       , last = apply(.SD, 1, \(i) names(i)[max(which(i==4))])
       )
   , .SDcols = !"id"
   ]
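
As an alternative to the row-wise apply() call, the same lookup can be done by reshaping to long format and grouping. This is only a minimal sketch, assuming the question's layout (an ID column followed by the WZ_* columns); the names dt, long and res are illustrative and not part of the original answer.

```r
library(data.table)

# Example data from the question
dt <- data.table(ID = 1:3,
                 WZ_1 = c(5L, 4L, 4L),
                 WZ_2 = c(4L, 4L, 4L),
                 WZ_3 = c(4L, 3L, 4L),
                 WZ_4 = c(3L, 3L, 4L))

# One row per (ID, column) pair; keep column names as character
long <- melt(dt, id.vars = "ID",
             variable.name = "column", variable.factor = FALSE)

# Keep only the hits, then take the first and last column name per ID.
# melt() stacks the columns in their original order, so within each ID
# the rows are already ordered WZ_1, WZ_2, ...
res <- long[value == 4,
            .(First = column[1L], Last = column[.N]),
            keyby = ID]
res
```

IDs with no matching value are simply absent from res, which may or may not be what is wanted.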

Answer 2

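Here is a tidyverse option: I pivot to long format, then filter to keep only the values equal to 4, and of those only the first and last occurrence per ID. I then create a new column that marks each remaining row as the first or the last value, and pivot back to wide format.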

library(tidyverse)

df %>% 
  pivot_longer(-ID) %>% 
  group_by(ID) %>% 
  filter(value == 4) %>% 
  filter(row_number()==1 | row_number()==n()) %>% 
  mutate(col = c("First", "Last")) %>% 
  pivot_wider(names_from = "col", values_from = "name") %>% 
  select(-value)

Output

     ID First Last 
  <int> <chr> <chr>
1     1 WZ_2  WZ_3 
2     2 WZ_1  WZ_2 
3     3 WZ_1  WZ_4

Data

df <- structure(list(ID = 1:3, WZ_1 = c(5L, 4L, 4L), WZ_2 = c(4L, 4L, 
4L), WZ_3 = c(4L, 3L, 4L), WZ_4 = c(3L, 3L, 4L)), class = "data.frame", row.names = c(NA, 
-3L))
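
One thing to watch with the pipeline above: mutate(col = c("First", "Last")) needs exactly two rows per ID after filtering, so an ID with only a single 4 would raise an error. A possible variation, sketched here under the assumption that a single hit should be reported as both First and Last, uses summarise() instead. The fourth row of df2 is hypothetical data added only to show that case.

```r
library(dplyr)
library(tidyr)

# Question data plus one hypothetical row (ID 4) with a single 4
df2 <- data.frame(ID = 1:4,
                  WZ_1 = c(5L, 4L, 4L, 1L),
                  WZ_2 = c(4L, 4L, 4L, 4L),
                  WZ_3 = c(4L, 3L, 4L, 2L),
                  WZ_4 = c(3L, 3L, 4L, 2L))

res <- df2 %>%
  pivot_longer(-ID) %>%          # one row per (ID, column) pair
  filter(value == 4) %>%         # keep only the hits
  group_by(ID) %>%
  summarise(First = first(name), # first matching column per ID
            Last  = last(name))  # last matching column per ID
res
```

IDs without any 4 are dropped by the filter() step, so they do not appear in the result.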

Answer 3

With max.col:

# drop the ID column (df[-1]) so an ID equal to 4 is never counted as a hit
data.frame(ID = df$ID,
           First = names(df)[-1][max.col(df[-1] == 4, ties.method = "first")],
           Last = names(df)[-1][max.col(df[-1] == 4, ties.method = "last")])

  ID First Last
1  1  WZ_2 WZ_3
2  2  WZ_1 WZ_2
3  3  WZ_1 WZ_4

Data

df <- read.table(header= T, text= "ID WZ_1 WZ_2 WZ_3 WZ_4
1  5    4    4    3 
2  4    4    3    3
3  4    4    4    4 ")
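
A caveat with max.col: for a row that contains no match, every entry of the logical matrix is FALSE, so the tie-breaking rule still returns a column name even though nothing matched. The following sketch wraps the idea in a hypothetical helper, first_last() (not from any package), which leaves out the ID column and returns NA for such rows. The fourth row of df2 is made-up data used only to illustrate the no-match case.

```r
# Question data plus one hypothetical row (ID 4) that contains no 4
df2 <- data.frame(ID = 1:4,
                  WZ_1 = c(5L, 4L, 4L, 1L),
                  WZ_2 = c(4L, 4L, 4L, 2L),
                  WZ_3 = c(4L, 3L, 4L, 3L),
                  WZ_4 = c(3L, 3L, 4L, 5L))

# Hypothetical helper: first and last column equal to `target`, per row
first_last <- function(df, target, id_col = "ID") {
  value_cols <- setdiff(names(df), id_col)      # search only non-ID columns
  hit <- as.matrix(df[value_cols]) == target    # logical matrix of matches
  first <- value_cols[max.col(hit, ties.method = "first")]
  last  <- value_cols[max.col(hit, ties.method = "last")]
  no_hit <- rowSums(hit) == 0                   # rows without any match
  first[no_hit] <- NA
  last[no_hit]  <- NA
  data.frame(ID = df[[id_col]], First = first, Last = last)
}

res <- first_last(df2, 4)
res
```

This sketch does not treat missing values specially; data containing NA would need an explicit decision on how they should count.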

Tags: row, r, data-manipulation, dataframe