How to fill in a column with values from another data frame in R if the data frames have different lengths?
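I tried to find a solution in similar topics but did not find anything suitable. This may be due to the search terms I used. If I have missed something, please accept my apologies.
I have two data frames, un_1 and ets_1. They already have the same structure in terms of columns. The difference between them is that they cover different year ranges (un_1 = 1990:2016; ets_1 = 2005:2017) and that some countries differ as well.
What I would like to do is create a merged dataset, energy, that is filled with the data from both datasets. That is, the content of un_1$UNemissions should be filled into energy, and likewise for ets_1$ETSemissions. The column structure of energy will be the same as in the other two data frames.
Here is an extract of the data: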
un_1
country iso2 year sector UNemissions ETSemissions
Austria AT 1990 1 - Energy 14025035.91 NaN
Austria AT 1991 1 - Energy 14791166.44 NaN
Austria AT 1992 1 - Energy 11581927.50 NaN
Austria AT 1993 1 - Energy 11623249.52 NaN
Austria AT 1994 1 - Energy 11915320.45 NaN
Austria AT 1995 1 - Energy 13044941.45 NaN
Austria AT 1996 1 - Energy 14048250.53 NaN
Austria AT 1997 1 - Energy 14077003.80 NaN
Austria AT 1998 1 - Energy 13106015.03 NaN
Austria AT 1999 1 - Energy 12548768.26 NaN
Austria AT 2000 1 - Energy 12263794.33 NaN
Austria AT 2001 1 - Energy 13770416.74 NaN
Austria AT 2002 1 - Energy 13380097.16 NaN
Austria AT 2003 1 - Energy 15965271.93 NaN
Austria AT 2004 1 - Energy 15899196.33 NaN
Austria AT 2005 1 - Energy 16194772.33 NaN
Austria AT 2006 1 - Energy 15039192.77 NaN
Austria AT 2007 1 - Energy 13757091.05 NaN
Austria AT 2008 1 - Energy 13582006.99 NaN
Austria AT 2009 1 - Energy 12526267.29 NaN
Austria AT 2010 1 - Energy 13852187.50 NaN
Austria AT 2011 1 - Energy 13666544.68 NaN
Austria AT 2012 1 - Energy 12256272.25 NaN
Austria AT 2013 1 - Energy 11224625.46 NaN
Austria AT 2014 1 - Energy 9499544.19 NaN
Austria AT 2015 1 - Energy 10623550.19 NaN
Austria AT 2016 1 - Energy 10448925.88 NaN
Belgium BE 1990 1 - Energy 29859360.87 NaN
Belgium BE 1991 1 - Energy 30491531.89 NaN
Belgium BE 1992 1 - Energy 29289874.38 NaN
Belgium BE 1993 1 - Energy 28769050.88 NaN
Belgium BE 1994 1 - Energy 29867955.59 NaN
Belgium BE 1995 1 - Energy 29386218.06 NaN
Belgium BE 1996 1 - Energy 28658131.35 NaN
Belgium BE 1997 1 - Energy 27609157.78 NaN
Belgium BE 1998 1 - Energy 30340887.77 NaN
Belgium BE 1999 1 - Energy 26555203.53 NaN
Belgium BE 2000 1 - Energy 28425730.95 NaN
Belgium BE 2001 1 - Energy 26382223.52 NaN
Belgium BE 2002 1 - Energy 27819402.95 NaN
Belgium BE 2003 1 - Energy 28954615.63 NaN
Belgium BE 2004 1 - Energy 29442709.72 NaN
Belgium BE 2005 1 - Energy 29246990.16 NaN
Belgium BE 2006 1 - Energy 28136794.10 NaN
Belgium BE 2007 1 - Energy 27435553.32 NaN
Belgium BE 2008 1 - Energy 25344134.83 NaN
Belgium BE 2009 1 - Energy 25744709.35 NaN
Belgium BE 2010 1 - Energy 26341043.76 NaN
Belgium BE 2011 1 - Energy 22921875.41 NaN
Belgium BE 2012 1 - Energy 22809482.09 NaN
Belgium BE 2013 1 - Energy 21242431.53 NaN
Belgium BE 2014 1 - Energy 20375966.00 NaN
Belgium BE 2015 1 - Energy 21091059.19 NaN
Belgium BE 2016 1 - Energy 19792162.61 NaN
ets_1
country iso2 year sector UNemissions ETSemissions
Austria AT 2005 1 - Energy NaN 16539659
Austria AT 2006 1 - Energy NaN 15275065
Austria AT 2007 1 - Energy NaN 14124646
Austria AT 2008 1 - Energy NaN 14572511
Austria AT 2009 1 - Energy NaN 12767555
Austria AT 2010 1 - Energy NaN 15506112
Austria AT 2011 1 - Energy NaN 15131551
Austria AT 2012 1 - Energy NaN 13121434
Austria AT 2013 1 - Energy NaN 8074514
Austria AT 2014 1 - Energy NaN 6426135
Austria AT 2015 1 - Energy NaN 7514263
Austria AT 2016 1 - Energy NaN 7142937
Austria AT 2017 1 - Energy NaN 7795277
Belgium BE 2005 1 - Energy NaN 25460856
Belgium BE 2006 1 - Energy NaN 24099282
Belgium BE 2007 1 - Energy NaN 23706084
Belgium BE 2008 1 - Energy NaN 23166180
Belgium BE 2009 1 - Energy NaN 21185552
Belgium BE 2010 1 - Energy NaN 22073616
Belgium BE 2011 1 - Energy NaN 18950876
Belgium BE 2012 1 - Energy NaN 17463388
Belgium BE 2013 1 - Energy NaN 16728267
Belgium BE 2014 1 - Energy NaN 15230243
Belgium BE 2015 1 - Energy NaN 16053800
Belgium BE 2016 1 - Energy NaN 15027777
Belgium BE 2017 1 - Energy NaN 15093036
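I tried energy <- merge(un_1, ets_1), but that only creates a new data frame with 6 columns and zero observations.
I also tried rbind, but that only appends the rows of one data frame to the bottom of the other.
Both emissions columns, un_1$UNemissions and ets_1$ETSemissions, are numeric.
What energy should look like (example for one country):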
country iso2 year sector UNemissions ETSemissions
Austria AT 1990 1 - Energy 14025035.91 NaN
Austria AT 1991 1 - Energy 14791166.44 NaN
Austria AT 1992 1 - Energy 11581927.50 NaN
Austria AT 1993 1 - Energy 11623249.52 NaN
Austria AT 1994 1 - Energy 11915320.45 NaN
Austria AT 1995 1 - Energy 13044941.45 NaN
Austria AT 1996 1 - Energy 14048250.53 NaN
Austria AT 1997 1 - Energy 14077003.80 NaN
Austria AT 1998 1 - Energy 13106015.03 NaN
Austria AT 1999 1 - Energy 12548768.26 NaN
Austria AT 2000 1 - Energy 12263794.33 NaN
Austria AT 2001 1 - Energy 13770416.74 NaN
Austria AT 2002 1 - Energy 13380097.16 NaN
Austria AT 2003 1 - Energy 15965271.93 NaN
Austria AT 2004 1 - Energy 15899196.33 NaN
Austria AT 2005 1 - Energy 16194772.33 16539659
Austria AT 2006 1 - Energy 15039192.77 15275065
Austria AT 2007 1 - Energy 13757091.05 14124646
Austria AT 2008 1 - Energy 13582006.99 14572511
Austria AT 2009 1 - Energy 12526267.29 12767555
Austria AT 2010 1 - Energy 13852187.50 15506112
Austria AT 2011 1 - Energy 13666544.68 15131551
Austria AT 2012 1 - Energy 12256272.25 13121434
Austria AT 2013 1 - Energy 11224625.46 8074514
Austria AT 2014 1 - Energy 9499544.19 6426135
Austria AT 2015 1 - Energy 10623550.19 7514263
Austria AT 2016 1 - Energy 10448925.88 7142937
Austria AT 2017 1 - Energy NaN 7795277
Thank you very much for your help!
Best,
Konstantin
I would use the dplyr package for this:
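library(dplyr)
# Drop the empty column from each side, then full-join on the shared key columns
energy <- un_1 %>%
  select(-ETSemissions) %>%
  full_join(ets_1 %>% select(-UNemissions),
            by = c("country", "iso2", "year", "sector"))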
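In the above, we take un_1 and drop its empty ETSemissions column with select. Next, we use full_join to combine it with ets_1, but before doing so we remove UNemissions from ets_1 with select. Rows that exist in only one of the two data frames end up with NA in the other emissions column.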
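To see why dropping the empty columns matters, here is a small base R sketch. It assumes two toy data frames, un_toy and ets_toy (made-up names holding a few of the Austria rows from the question), and it uses merge rather than dplyr, so take it as an illustration of the idea and not as the code from the answer. With no by argument, merge joins on every shared column, including both emissions columns, so no row can match.

```r
# Toy stand-ins for un_1 and ets_1 (values copied from the Austria rows above).
un_toy <- data.frame(
  country = "Austria", iso2 = "AT", year = 2004:2006, sector = "1 - Energy",
  UNemissions = c(15899196.33, 16194772.33, 15039192.77),
  ETSemissions = NaN
)
ets_toy <- data.frame(
  country = "Austria", iso2 = "AT", year = 2005:2007, sector = "1 - Energy",
  UNemissions = NaN,
  ETSemissions = c(16539659, 15275065, 14124646)
)

# Default merge(): all six shared columns act as join keys, so nothing matches.
naive <- merge(un_toy, ets_toy)
nrow(naive)  # 0

# Keep only the populated emissions column on each side,
# then do a full outer join on the four identifying columns.
keys <- c("country", "iso2", "year", "sector")
joined <- merge(un_toy[, c(keys, "UNemissions")],
                ets_toy[, c(keys, "ETSemissions")],
                by = keys, all = TRUE)
joined
```

The joined result has one row per year from 2004 to 2007, with NA wherever only one of the two sources covers that year.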
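Consider a full outer join using the by and all arguments of merge on the indicator fields. Then fill in the missing entries of the emissions columns. All of this can be handled in a within() context: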
# MERGE ON INDICATORS (FULL OUTER JOIN)
merge_df <- merge(un_1, ets_1, by=c("country", "iso2", "year", "sector"),
                  all=TRUE, suffixes=c("", "_"))

final_df <- within(merge_df, {
  # FILL IN MISSINGS WITH UNDERSCORE COLS
  UNemissions <- ifelse(is.na(UNemissions), UNemissions_, UNemissions)
  ETSemissions <- ifelse(is.na(ETSemissions), ETSemissions_, ETSemissions)

  # REMOVE FILL-IN UNDERSCORE COLS
  UNemissions_ <- NULL
  ETSemissions_ <- NULL
})
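For anyone who wants to check the merge-and-fill steps on something small first, here is a sketch that wraps them in a function and runs it on toy data. The name combine_emissions and the un_toy / ets_toy data frames are made up for illustration (a few Austria rows from the question); the body follows the merge and within() calls above.

```r
# Hypothetical helper wrapping the merge-and-fill steps described above.
combine_emissions <- function(un, ets) {
  keys <- c("country", "iso2", "year", "sector")
  # Full outer join; columns coming from `ets` get a trailing underscore.
  merged <- merge(un, ets, by = keys, all = TRUE, suffixes = c("", "_"))
  within(merged, {
    # Where the left-hand value is missing, take the underscore column instead.
    UNemissions  <- ifelse(is.na(UNemissions),  UNemissions_,  UNemissions)
    ETSemissions <- ifelse(is.na(ETSemissions), ETSemissions_, ETSemissions)
    # Drop the helper columns.
    UNemissions_  <- NULL
    ETSemissions_ <- NULL
  })
}

# Toy stand-ins for un_1 and ets_1 (values copied from the Austria rows above).
un_toy <- data.frame(
  country = "Austria", iso2 = "AT", year = 2004:2006, sector = "1 - Energy",
  UNemissions = c(15899196.33, 16194772.33, 15039192.77),
  ETSemissions = NaN
)
ets_toy <- data.frame(
  country = "Austria", iso2 = "AT", year = 2005:2007, sector = "1 - Energy",
  UNemissions = NaN,
  ETSemissions = c(16539659, 15275065, 14124646)
)

energy_toy <- combine_emissions(un_toy, ets_toy)
energy_toy[, c("year", "UNemissions", "ETSemissions")]
```

Years covered by both sources (2005 and 2006 here) get both values, while 2004 has only UNemissions and 2007 only ETSemissions. The gaps may show up as NA or NaN depending on which side they came from; is.na() treats both as missing.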