How to remove NA and move the non-NA values to a new column?

Question

After using the spread function, I want to copy the non-NA values into new columns. Is there a way to do that?

Data

Serial_ID   Repair_type    Col1        Col2         Coln+1
ID_1            Warranty    NA         02.02.2013   NA
ID_1            Normal      NA         15.10.2011   12.01.2012
ID_2            Warranty    01-01-2013 NA           NA
ID_2            Normal      NA         NA           18.12.2014
ID_n            Normal      NA         23.01.2014   NA

Desired result

Serial_ID   Repair_type    ColX (new)  ColX2 (new)   Col1      Col2         
ID_1            Warranty   02.02.2013 
ID_1            Normal     15.10.2011  12.01.2012
ID_2            Warranty   01-01-2013 
ID_2            Normal     18.12.2014
ID_n            Normal     23.01.2014

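Please see the data and the result in the picture below:

I hope this makes it clearer. Thanks in advance.

Long data before the spread

Data: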

COMM_VIN    Si_DocDate  COMM_Kind   Cost
V1  2017-01-01  Normal  100
V1  2017-03-02  Warranty    200
V2  2015-04-04  Warranty    50
V2  2017-05-22  Warranty    100
V3  2004-05-22  Normal  150
V3  2016-06-01  Normal  250

For each COMM_VIN, I want to move the site-visit dates into columns according to COMM_Kind.

Result:

COMM_VIN    COMM_Kind   Col_ne1 Col_nen Cost(sum)
V1  Normal  2017-01-01      100
V1  Warranty    2015-04-04  2017-03-02  250
V2  Normal  2004-05-22  2016-06-01  400
V2  Warranty    2017-05-22      50

Sorry, I don't know how to add a table. Please see the attached picture:

Answer 1

I think you need the coalesce() function from the dplyr package. I can't read your data, but here is an example with dummy data:

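library(dplyr)

# tibble() replaces the deprecated data_frame()
df <- tibble(
  c1 = c(NA, "hey", NA),
  c2 = c(NA, NA, "ho"),
  c3 = c("go", NA, NA)
)

# coalesce() returns the first non-NA value in each row
df %>% mutate(colx = coalesce(c1, c2, c3))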

This produces:

# A tibble: 3 x 4
  c1    c2    c3    colx 
  <chr> <chr> <chr> <chr>
1 NA    NA    go    go   
2 hey   NA    NA    hey  
3 NA    ho    NA    ho

Answer 2

This is actually easier to do from the long data, before spreading:

library(dplyr)
library(tidyr)

dd %>%
  gather("key", "value", -Serial_ID, -Repair_type) %>%  # reverse engineer the original long data
  filter(!is.na(value)) %>%  # if the original had NAs, you'll need this line to remove them
  group_by(Serial_ID, Repair_type) %>%
  mutate(key = paste0("colx", row_number())) %>%  # replace key with the minimal number of keys
  spread(key, value)  # spread again

Result:

# A tibble: 5 x 4
# Groups:   Serial_ID, Repair_type [5]
  Serial_ID Repair_type colx1       colx2      
  <chr>     <chr>       <chr>      <chr>     
1 ID_1      Normal      15.10.2011 12.01.2012
2 ID_1      Warranty    02.02.2013 NA        
3 ID_2      Normal      18.12.2014 NA        
4 ID_2      Warranty    01-01-2013 NA        
5 ID_n      Normal      23.01.2014 NA
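As a hedged aside, the same reshape can be sketched with the newer tidyr verbs pivot_longer() and pivot_wider(), which supersede gather() and spread(). The data frame below is rebuilt by hand from the question, and the name Col3 standing in for "Coln+1" is an assumption made only for illustration. Row order may differ from the spread() output above.

```r
library(dplyr)
library(tidyr)

# Wide data retyped from the question; Col3 is a hypothetical name for "Coln+1".
dd <- tibble(
  Serial_ID   = c("ID_1", "ID_1", "ID_2", "ID_2", "ID_n"),
  Repair_type = c("Warranty", "Normal", "Warranty", "Normal", "Normal"),
  Col1 = c(NA, NA, "01-01-2013", NA, NA),
  Col2 = c("02.02.2013", "15.10.2011", NA, NA, "23.01.2014"),
  Col3 = c(NA, "12.01.2012", NA, "18.12.2014", NA)
)

res <- dd %>%
  pivot_longer(-c(Serial_ID, Repair_type),
               names_to = "key", values_to = "value",
               values_drop_na = TRUE) %>%          # drop NA cells while going long
  group_by(Serial_ID, Repair_type) %>%
  mutate(key = paste0("colx", row_number())) %>%   # renumber the surviving values per group
  ungroup() %>%
  pivot_wider(names_from = key, values_from = value)

print(res)
```

Here values_drop_na = TRUE plays the role of the filter(!is.na(value)) step.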

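If you really want to avoid all NAs, even at the end of a row, you will need to replace the NAs with empty strings. I don't recommend doing that, though.

Here is the same solution applied to the long data you provided: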
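For completeness, a minimal sketch of that replacement, assuming the colx columns are character vectors. The small res tibble is a hypothetical stand-in for the reshaped result shown above.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for the reshaped result (two rows only).
res <- tibble(
  Serial_ID   = c("ID_1", "ID_1"),
  Repair_type = c("Normal", "Warranty"),
  colx1 = c("15.10.2011", "02.02.2013"),
  colx2 = c("12.01.2012", NA)
)

# Replace NA with "" in every column whose name starts with "colx".
blank <- res %>%
  mutate(across(starts_with("colx"), ~ replace_na(.x, "")))

print(blank)
```

The empty strings only make sense for display: once they are in place, is.na() no longer identifies the missing cells, and the columns cannot be converted to dates without handling "" first.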

dd %>% group_by(COMM_VIN,COMM_Kind) %>% 
    dplyr::mutate(Cost=sum(Cost),key=paste0("colx",row_number())) %>% 
    spread(key,Si_DocDate)

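You'll notice that I created the summed Cost column before the spread, to avoid ending up with multiple rows for the same COMM_VIN/COMM_Kind combination.

Result: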

# A tibble: 4 x 5
# Groups:   COMM_VIN, COMM_Kind [4]
  COMM_VIN COMM_Kind  Cost colx1      colx2     
  <fct>    <fct>     <int> <fct>      <fct>     
1 V1       Normal      100 2017-01-01 NA        
2 V1       Warranty    200 2017-03-02 NA        
3 V2       Warranty    150 2015-04-04 2017-05-22
4 V3       Normal      400 2004-05-22 2016-06-01
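A self-contained sketch of the same idea on the long data, using pivot_wider() in place of spread(). The name dd_long is hypothetical, and the dates are kept as character strings rather than factors, which is an assumption about how the data was read in.

```r
library(dplyr)
library(tidyr)

# Long data retyped from the question.
dd_long <- tibble(
  COMM_VIN   = c("V1", "V1", "V2", "V2", "V3", "V3"),
  Si_DocDate = c("2017-01-01", "2017-03-02", "2015-04-04",
                 "2017-05-22", "2004-05-22", "2016-06-01"),
  COMM_Kind  = c("Normal", "Warranty", "Warranty", "Warranty", "Normal", "Normal"),
  Cost       = c(100, 200, 50, 100, 150, 250)
)

wide <- dd_long %>%
  group_by(COMM_VIN, COMM_Kind) %>%
  mutate(Cost = sum(Cost),                       # total cost per VIN/kind, computed before widening
         key  = paste0("colx", row_number())) %>% # one key per visit within the group
  ungroup() %>%
  pivot_wider(names_from = key, values_from = Si_DocDate)

print(wide)
```

Because Cost is already constant within each group, pivot_wider() treats it as an identifier column and returns one row per COMM_VIN/COMM_Kind pair.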
