How to extract non-empty values from each column of a dataframe and make a list?

I have a dataset like the one below, and I want to extract the non-empty cells from each column while keeping the Date information.

df <- structure(list(Date = as.Date(c("6/25/2020", "6/26/2020", "6/27/2020"),
      format = "%m/%d/%Y"), 
      A = c("", "2", "1"), B = c("3", "", ""), C = c("3", "2", "")),
      class = "data.frame", row.names = c("1", "2", "3"))

This is the result I am looking for:

Date       Company Number
2020-06-26    A      2
2020-06-27    A      1
2020-06-25    B      3
2020-06-25    C      3
2020-06-26    C      2

You can use pivot_longer together with values_drop_na = TRUE:

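library(tidyverse)
df %>% 
  # turn "" into NA per character column; na_if() on a whole data frame fails in dplyr >= 1.1.0
  mutate(across(where(is.character), ~ na_if(.x, ""))) %>% 
  pivot_longer(-Date, values_drop_na = TRUE, names_to = "Company", values_to  = "Number")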

  Date       Company Number
  <date>     <chr>   <chr> 
1 2020-06-25 B       3     
2 2020-06-25 C       3     
3 2020-06-26 A       2     
4 2020-06-26 C       2     
5 2020-06-27 A       1   

You can also use pivot_longer and handle the empty cells with filter:

df %>% 
  pivot_longer(-Date, names_to = "Company", values_to  = "Number") %>% 
  filter(Number != "")
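
Note that pivot_longer returns rows in Date order, whereas the result requested in the question is grouped by Company. A minimal sketch of one way to match that layout, assuming it is acceptable to convert Number to integer and to sort with arrange (the name res and the rebuilt df are only for illustration):

```r
library(dplyr)
library(tidyr)

# Same data as in the question, rebuilt here so the block runs on its own
df <- data.frame(
  Date = as.Date(c("6/25/2020", "6/26/2020", "6/27/2020"), format = "%m/%d/%Y"),
  A = c("", "2", "1"),
  B = c("3", "", ""),
  C = c("3", "2", ""),
  stringsAsFactors = FALSE
)

res <- df %>%
  pivot_longer(-Date, names_to = "Company", values_to = "Number") %>%
  filter(Number != "") %>%                 # drop the empty cells
  mutate(Number = as.integer(Number)) %>%  # values were stored as text
  arrange(Company, Date)                   # Company first, then Date
```

With this ordering the rows come out as A, A, B, C, C, which is the layout shown in the question.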

Another possible solution:

library(tidyverse)

df %>% 
  pivot_longer(A:C, names_to = "Company", values_to = "Number",
    values_transform = list(Number = \(x) ifelse(x == "", NA, as.numeric(x))),
    values_drop_na = TRUE)

#> # A tibble: 5 × 3
#>   Date       Company Number
#>   <date>     <chr>    <dbl>
#> 1 2020-06-25 B            3
#> 2 2020-06-25 C            3
#> 3 2020-06-26 A            2
#> 4 2020-06-26 C            2
#> 5 2020-06-27 A            1

Using reshape from base R:
out <- transform(na.omit(reshape(type.convert(df, as.is = TRUE),
   idvar = 'Date', varying = list(2:4), v.names = 'Number', 
  direction = "long", timevar = "Company")), Company = names(df)[-1][Company])
row.names(out) <- NULL