How to fill in a column with values from another dataframe in R if the dataframe length is different?

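I tried to find a solution in similar threads but did not find a suitable one. This may be due to the search terms I used. If I missed something, please accept my apologies.

I have two data frames, un_1 and ets_1. They already have the same column structure. They differ in the range of years they cover (un_1 = 1990:2016; ets_1 = 2005:2017), and some of the countries differ as well.

What I want to do is create a merged dataset, energy, filled with the data from both datasets. That is, the contents of un_1$UNemissions should be filled into energy, and the same goes for ets_1$ETSemissions. The column structure of energy will be the same as in the other two data frames.

Here is an excerpt of the data: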

un_1

 country iso2 year     sector  UNemissions ETSemissions
 Austria   AT 1990 1 - Energy  14025035.91          NaN
 Austria   AT 1991 1 - Energy  14791166.44          NaN
 Austria   AT 1992 1 - Energy  11581927.50          NaN
 Austria   AT 1993 1 - Energy  11623249.52          NaN
 Austria   AT 1994 1 - Energy  11915320.45          NaN
 Austria   AT 1995 1 - Energy  13044941.45          NaN
 Austria   AT 1996 1 - Energy  14048250.53          NaN
 Austria   AT 1997 1 - Energy  14077003.80          NaN
 Austria   AT 1998 1 - Energy  13106015.03          NaN
 Austria   AT 1999 1 - Energy  12548768.26          NaN
 Austria   AT 2000 1 - Energy  12263794.33          NaN
 Austria   AT 2001 1 - Energy  13770416.74          NaN
 Austria   AT 2002 1 - Energy  13380097.16          NaN
 Austria   AT 2003 1 - Energy  15965271.93          NaN
 Austria   AT 2004 1 - Energy  15899196.33          NaN
 Austria   AT 2005 1 - Energy  16194772.33          NaN
 Austria   AT 2006 1 - Energy  15039192.77          NaN
 Austria   AT 2007 1 - Energy  13757091.05          NaN
 Austria   AT 2008 1 - Energy  13582006.99          NaN
 Austria   AT 2009 1 - Energy  12526267.29          NaN
 Austria   AT 2010 1 - Energy  13852187.50          NaN
 Austria   AT 2011 1 - Energy  13666544.68          NaN
 Austria   AT 2012 1 - Energy  12256272.25          NaN
 Austria   AT 2013 1 - Energy  11224625.46          NaN
 Austria   AT 2014 1 - Energy   9499544.19          NaN
 Austria   AT 2015 1 - Energy  10623550.19          NaN
 Austria   AT 2016 1 - Energy  10448925.88          NaN
 Belgium   BE 1990 1 - Energy  29859360.87          NaN
 Belgium   BE 1991 1 - Energy  30491531.89          NaN
 Belgium   BE 1992 1 - Energy  29289874.38          NaN
 Belgium   BE 1993 1 - Energy  28769050.88          NaN
 Belgium   BE 1994 1 - Energy  29867955.59          NaN
 Belgium   BE 1995 1 - Energy  29386218.06          NaN
 Belgium   BE 1996 1 - Energy  28658131.35          NaN
 Belgium   BE 1997 1 - Energy  27609157.78          NaN
 Belgium   BE 1998 1 - Energy  30340887.77          NaN
 Belgium   BE 1999 1 - Energy  26555203.53          NaN
 Belgium   BE 2000 1 - Energy  28425730.95          NaN
 Belgium   BE 2001 1 - Energy  26382223.52          NaN
 Belgium   BE 2002 1 - Energy  27819402.95          NaN
 Belgium   BE 2003 1 - Energy  28954615.63          NaN
 Belgium   BE 2004 1 - Energy  29442709.72          NaN
 Belgium   BE 2005 1 - Energy  29246990.16          NaN
 Belgium   BE 2006 1 - Energy  28136794.10          NaN
 Belgium   BE 2007 1 - Energy  27435553.32          NaN
 Belgium   BE 2008 1 - Energy  25344134.83          NaN
 Belgium   BE 2009 1 - Energy  25744709.35          NaN
 Belgium   BE 2010 1 - Energy  26341043.76          NaN
 Belgium   BE 2011 1 - Energy  22921875.41          NaN
 Belgium   BE 2012 1 - Energy  22809482.09          NaN
 Belgium   BE 2013 1 - Energy  21242431.53          NaN
 Belgium   BE 2014 1 - Energy  20375966.00          NaN
 Belgium   BE 2015 1 - Energy  21091059.19          NaN
 Belgium   BE 2016 1 - Energy  19792162.61          NaN 

ets_1

 country iso2 year     sector UNemissions ETSemissions
 Austria   AT 2005 1 - Energy         NaN     16539659
 Austria   AT 2006 1 - Energy         NaN     15275065
 Austria   AT 2007 1 - Energy         NaN     14124646
 Austria   AT 2008 1 - Energy         NaN     14572511
 Austria   AT 2009 1 - Energy         NaN     12767555
 Austria   AT 2010 1 - Energy         NaN     15506112
 Austria   AT 2011 1 - Energy         NaN     15131551
 Austria   AT 2012 1 - Energy         NaN     13121434
 Austria   AT 2013 1 - Energy         NaN      8074514
 Austria   AT 2014 1 - Energy         NaN      6426135
 Austria   AT 2015 1 - Energy         NaN      7514263
 Austria   AT 2016 1 - Energy         NaN      7142937
 Austria   AT 2017 1 - Energy         NaN      7795277
 Belgium   BE 2005 1 - Energy         NaN     25460856
 Belgium   BE 2006 1 - Energy         NaN     24099282
 Belgium   BE 2007 1 - Energy         NaN     23706084
 Belgium   BE 2008 1 - Energy         NaN     23166180
 Belgium   BE 2009 1 - Energy         NaN     21185552
 Belgium   BE 2010 1 - Energy         NaN     22073616
 Belgium   BE 2011 1 - Energy         NaN     18950876
 Belgium   BE 2012 1 - Energy         NaN     17463388
 Belgium   BE 2013 1 - Energy         NaN     16728267
 Belgium   BE 2014 1 - Energy         NaN     15230243
 Belgium   BE 2015 1 - Energy         NaN     16053800
 Belgium   BE 2016 1 - Energy         NaN     15027777
 Belgium   BE 2017 1 - Energy         NaN     15093036

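I tried energy <- merge(un_1, ets_1), but this only creates a new data frame with 6 columns and zero observations.

I also tried rbind, but this only appends the data from one data frame to the bottom of the other.

Both emissions columns, un_1$UNemissions and ets_1$ETSemissions, are numeric.

What energy should look like (example for one country):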

 country iso2 year     sector  UNemissions ETSemissions
 Austria   AT 1990 1 - Energy  14025035.91          NaN
 Austria   AT 1991 1 - Energy  14791166.44          NaN
 Austria   AT 1992 1 - Energy  11581927.50          NaN
 Austria   AT 1993 1 - Energy  11623249.52          NaN
 Austria   AT 1994 1 - Energy  11915320.45          NaN
 Austria   AT 1995 1 - Energy  13044941.45          NaN
 Austria   AT 1996 1 - Energy  14048250.53          NaN
 Austria   AT 1997 1 - Energy  14077003.80          NaN
 Austria   AT 1998 1 - Energy  13106015.03          NaN
 Austria   AT 1999 1 - Energy  12548768.26          NaN
 Austria   AT 2000 1 - Energy  12263794.33          NaN
 Austria   AT 2001 1 - Energy  13770416.74          NaN
 Austria   AT 2002 1 - Energy  13380097.16          NaN
 Austria   AT 2003 1 - Energy  15965271.93          NaN
 Austria   AT 2004 1 - Energy  15899196.33          NaN
 Austria   AT 2005 1 - Energy  16194772.33     16539659
 Austria   AT 2006 1 - Energy  15039192.77     15275065
 Austria   AT 2007 1 - Energy  13757091.05     14124646
 Austria   AT 2008 1 - Energy  13582006.99     14572511
 Austria   AT 2009 1 - Energy  12526267.29     12767555
 Austria   AT 2010 1 - Energy  13852187.50     15506112
 Austria   AT 2011 1 - Energy  13666544.68     15131551
 Austria   AT 2012 1 - Energy  12256272.25     13121434
 Austria   AT 2013 1 - Energy  11224625.46      8074514
 Austria   AT 2014 1 - Energy   9499544.19      6426135
 Austria   AT 2015 1 - Energy  10623550.19      7514263
 Austria   AT 2016 1 - Energy  10448925.88      7142937
 Austria   AT 2017 1 - Energy          NaN      7795277

Thank you very much for your help!

Best,

Konstantin

I would use the dplyr package for this:

library(dplyr)

energy <- un_1 %>% 
  select(-ETSemissions) %>% 
  full_join(ets_1 %>% 
              select(-UNemissions),
            by = c("country", "iso2", "year", "sector"))

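In the above, we take un_1 and drop the empty column ETSemissions with select. Next, we combine it with ets_1 using full_join, but before doing so we remove UNemissions from ets_1 with select. A full join keeps every country/year combination that appears in either data frame, so the years that exist in only one of them (1990 to 2004, and 2017) are retained, with NA in the column that has no data.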
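To make the behaviour of the full join concrete, here is a minimal sketch on a toy subset of the data. The two small data frames below are hypothetical stand-ins built from three Austrian rows of the question's tables, and the `keys` vector is a name introduced only for this illustration. Note that the gaps come back as NA rather than NaN.

```r
library(dplyr)

# Hypothetical toy subset of the question's data (Austria only)
un_1 <- data.frame(
  country = "Austria", iso2 = "AT", year = 2015:2016, sector = "1 - Energy",
  UNemissions = c(10623550.19, 10448925.88), ETSemissions = NaN,
  stringsAsFactors = FALSE
)
ets_1 <- data.frame(
  country = "Austria", iso2 = "AT", year = 2016:2017, sector = "1 - Energy",
  UNemissions = NaN, ETSemissions = c(7142937, 7795277),
  stringsAsFactors = FALSE
)

# Columns that identify an observation in both data frames
keys <- c("country", "iso2", "year", "sector")

energy <- un_1 %>%
  select(-ETSemissions) %>%                                 # drop the empty column
  full_join(select(ets_1, -UNemissions), by = keys) %>%     # keep rows from both sides
  arrange(country, year)

print(energy)
```

On this toy input the result has one row per year from 2015 to 2017: 2015 has only UNemissions, 2016 has both values, and 2017 has only ETSemissions.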

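Consider a full outer join with merge, using its by and all arguments on the indicator fields (country, iso2, year, sector). Then fill in the missing entries of the emissions columns. All of this can be handled inside a within() context: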

# MERGE ON INDICATORS (FULL OUTER JOIN)
merge_df <- merge(un_1, ets_1, by=c("country", "iso2", "year", "sector"), 
                  all=TRUE, suffixes=c("", "_"))

final_df <- within(merge_df, {
                   # FILL IN MISSINGS WITH UNDERSCORE COLS
                   UNemissions <- ifelse(is.na(UNemissions), UNemissions_, UNemissions)
                   ETSemissions <- ifelse(is.na(ETSemissions), ETSemissions_, ETSemissions)

                   # REMOVE FILL-IN UNDERSCORE COLS
                   UNemissions_ <- NULL
                   ETSemissions_ <- NULL
                 })
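As a possible simplification of the same idea, the sketch below drops the empty column from each side before merging, so no suffixes or fill-in step are needed. It also shows why the plain merge(un_1, ets_1) from the question returns nothing: without by, merge joins on all six shared columns, including the two emissions columns, whose values never agree. The toy data frames and the `keys` vector are hypothetical and exist only for this illustration.

```r
# Hypothetical toy subset of the question's data (Austria only)
un_1 <- data.frame(
  country = "Austria", iso2 = "AT", year = 2015:2016, sector = "1 - Energy",
  UNemissions = c(10623550.19, 10448925.88), ETSemissions = NaN,
  stringsAsFactors = FALSE
)
ets_1 <- data.frame(
  country = "Austria", iso2 = "AT", year = 2016:2017, sector = "1 - Energy",
  UNemissions = NaN, ETSemissions = c(7142937, 7795277),
  stringsAsFactors = FALSE
)

# Default merge joins on every shared column, emissions included,
# so no row of un_1 ever matches a row of ets_1
no_match <- merge(un_1, ets_1)
print(nrow(no_match))

# Columns that identify an observation in both data frames
keys <- c("country", "iso2", "year", "sector")

# Full outer join on the key columns only, keeping one emissions column per side
energy <- merge(un_1[, c(keys, "UNemissions")],
                ets_1[, c(keys, "ETSemissions")],
                by = keys, all = TRUE)

print(energy)
```

Whether to drop the empty columns first or to fill them in afterwards is mostly a matter of taste. Dropping them first only works when each emissions column is populated in exactly one of the two data frames, as it is here.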
