Merge and fill different length data in R

Question

I'm working in R and need to merge data of different lengths.

Consider these two datasets:

> means2012
 # A tibble: 232 x 2
   exporter    eci
   <fct>     <dbl>
 1 ABW       0.235
 2 AFG      -0.850
 3 AGO      -1.40 
 4 AIA       1.34 
 5 ALB      -0.480
 6 AND       1.22 
 7 ANS       0.662
 8 ARE       0.289
 9 ARG       0.176
 10 ARM       0.490
 # ... with 222 more rows

> means2013
 # A tibble: 234 x 2
    exporter     eci
    <fct>      <dbl>
  1 ABW       0.534 
  2 AFG      -0.834 
  3 AGO      -1.26  
  4 AIA       1.47  
  5 ALB      -0.498 
  6 AND       1.13  
  7 ANS       0.616 
  8 ARE       0.267 
  9 ARG       0.127 
 10 ARM       0.0616
 # ... with 224 more rows


> str(means2012)
Classes ‘tbl_df’, ‘tbl’ and 'data.frame':   232 obs. of  2 variables:
 $ exporter: Factor w/ 242 levels "ABW","AFG","AGO",..: 1 2 3 4 5 6 7 9 10 11 ...
 $ eci     : num  0.235 -0.85 -1.404 1.337 -0.48 ...
> str(means2013)
Classes ‘tbl_df’, ‘tbl’ and 'data.frame':   234 obs. of  2 variables:
 $ exporter: Factor w/ 242 levels "ABW","AFG","AGO",..: 1 2 3 4 5 6 7 9 10 11 ...
 $ eci     : num  0.534 -0.834 -1.263 1.471 -0.498 ...

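Note that the 2 tibbles have different lengths (232 and 234 rows). The "exporter" column identifies the country.

Is there a way to merge the two tibbles, matching on the factor (exporter) and filling the missing entries with NA?

It doesn't matter whether the result is a tibble, a data frame, or some other type.

Like this: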

tibble 1
a 5
b 10
c 15
d 25

tibble 2
a 7
c 23
d 20

merged one:
a 5  7 
b 10 na
c 15 23
d 25 20

Answer 1

Use merge with the argument all set to TRUE:

tibble1 <- read.table(text="
x y
a 5
b 10
c 15
d 25",header=TRUE,stringsAsFactors=FALSE)

tibble2 <- read.table(text="
x z
a 7
c 23
d 20",header=TRUE,stringsAsFactors=FALSE)


merge(tibble1,tibble2,all=TRUE)

  x  y  z
1 a  5  7
2 b 10 NA
3 c 15 23
4 d 25 20

Or use dplyr::full_join(tibble1, tibble2), which gives the same result.
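To make the dplyr variant concrete, here is a minimal sketch on the same toy data. Passing by = "x" explicitly is my addition; without it, full_join picks the shared column on its own and prints a message saying so.

```r
library(dplyr)

# Same toy data as above, built directly instead of via read.table
tibble1 <- data.frame(x = c("a", "b", "c", "d"), y = c(5, 10, 15, 25),
                      stringsAsFactors = FALSE)
tibble2 <- data.frame(x = c("a", "c", "d"), z = c(7, 23, 20),
                      stringsAsFactors = FALSE)

# Keep every key from both tables; cells with no match become NA
joined <- full_join(tibble1, tibble2, by = "x")
print(joined)
```

One difference worth knowing: merge sorts the result by the key column by default, while full_join keeps the row order of the first table and appends unmatched rows from the second.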

Answer 2

You can rename the eci columns and then join the two tibbles; you get NA wherever a country has no value in the other year.

library(tidyverse)

# Give each year's eci column its own name, then join on the country code
means2012 %>% 
  rename(eci2012 = eci) %>% 
  full_join(means2013 %>% 
              rename(eci2013 = eci),
            by = "exporter")

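But a tidier approach is to add a year column, leave the eci column as it is, and bind the rows together. This gives a long table with one row per country and year, so a country missing from one year simply has no row for it rather than an NA.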

means2012 %>% 
  mutate(year = 2012) %>% 
  bind_rows(means2013 %>% 
              mutate(year = 2013))
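If you later need the side-by-side layout from the question, the long table can be spread back out. The sketch below is only an illustration: it uses toy stand-ins for means2012 and means2013 (the values from the question's small example, with exporter as character rather than factor), and the pivot_wider step with names_prefix = "eci" is my own suggestion, not something the original data requires.

```r
library(dplyr)
library(tidyr)

# Toy stand-ins for the real tibbles (hypothetical values)
means2012 <- tibble(exporter = c("a", "b", "c", "d"), eci = c(5, 10, 15, 25))
means2013 <- tibble(exporter = c("a", "c", "d"), eci = c(7, 23, 20))

# Long format: one row per exporter and year
long <- bind_rows(
  mutate(means2012, year = 2012),
  mutate(means2013, year = 2013)
)

# Wide format: one column per year; exporters absent from a year get NA
wide <- pivot_wider(long, names_from = year, values_from = eci,
                    names_prefix = "eci")
print(wide)
```

Here exporter "b" ends up with NA in eci2013, which matches the merged layout asked for in the question.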
