Add a column to a DF when two variables match a second DF
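I would like to add another column to DF1 that returns the population of the country in question in the year it was observed. That is, when both country and year match DF2, the population is added to a column in DF1. I have previously used merge to match on only one variable; is there a way to do this with two variables?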
DF1:
eventid |iyear | imonth| iday | CountryTxt
1.97000e+1 |1970 | 7| 2 | Albania
1.97000e+11| 1970| 0| 0 | United Kingdom
1.97001e+11| 1984| 1| 0 | Somalia
1.97001e+11| 1990| 1| 0 | France
1.97001e+11| 1991| 1| 0 | New Zealand
DF2:
Country.Name|Code|Year|Population
Aruba |ABW |1960| 123
Afganistan |AFG |1970| 456
Albania |ALB |1970| 1000
France |FRA |1990| 5000
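This is well within the capabilities of merge(). Note that the key words in the quote from ?merge below ("specifications", "columns") are all plural, i.e. the function can work on multiple matching columns: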
by, by.x, by.y: specifications of the columns used for merging. See
‘Details’.
...
By default the data frames are merged on the columns with names
they both have, but separate specifications of the columns can be
given by ‘by.x’ and ‘by.y’. The rows in the two data frames that
match on the specified columns are extracted
merge(df1,df2,
by.x=c("iyear","CountryTxt"),
by.y=c("Year","Country.Name"))
iyear CountryTxt eventid imonth iday Code Population
1 1970 Albania 1.97000e+01 7 2 ALB 1000
2 1990 France 1.97001e+11 1 0 FRA 5000
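By default merge() keeps only the rows that match in both data frames, which is why only two rows appear above. If the goal is to keep every row of DF1 and get NA where DF2 has no match, the all.x argument can be used. The following is a minimal sketch: the sample data is rebuilt inline with data.frame() so the block runs on its own, and the name merged is only illustrative.

```r
# Sample data, rebuilt inline so this block is self-contained
df1 <- data.frame(
  eventid    = c(1.97000e+1, 1.97000e+11, 1.97001e+11, 1.97001e+11, 1.97001e+11),
  iyear      = c(1970, 1970, 1984, 1990, 1991),
  imonth     = c(7, 0, 1, 1, 1),
  iday       = c(2, 0, 0, 0, 0),
  CountryTxt = c("Albania", "United Kingdom", "Somalia", "France", "New Zealand"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  Country.Name = c("Aruba", "Afganistan", "Albania", "France"),
  Code         = c("ABW", "AFG", "ALB", "FRA"),
  Year         = c(1960, 1970, 1970, 1990),
  Population   = c(123, 456, 1000, 5000),
  stringsAsFactors = FALSE
)

# Left join on two key columns: every row of df1 is kept,
# and rows without a match in df2 get NA for Code and Population
merged <- merge(df1, df2,
                by.x = c("iyear", "CountryTxt"),
                by.y = c("Year", "Country.Name"),
                all.x = TRUE)
print(merged)
```

With all.x = TRUE the result has as many rows as df1 (assuming each year/country pair occurs at most once in df2); all.y = TRUE or all = TRUE would give the right and full outer variants.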
Data setup
df1 <- read.table(header=TRUE,sep="|", strip.white=TRUE, text="
eventid |iyear | imonth| iday | CountryTxt
1.97000e+1 |1970 | 7| 2 | Albania
1.97000e+11| 1970| 0| 0 | United Kingdom
1.97001e+11| 1984| 1| 0 | Somalia
1.97001e+11| 1990| 1| 0 | France
1.97001e+11| 1991| 1| 0 | New Zealand
")
df2 <- read.table(header=TRUE,sep="|", strip.white=TRUE, text="
Country.Name|Code|Year|Population
Aruba |ABW |1960| 123
Afganistan |AFG |1970| 456
Albania |ALB |1970| 1000
France |FRA |1990| 5000
")
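If only the Population column is wanted and the original row order of df1 should be preserved (merge() sorts the result by the key columns by default), one alternative is to build a combined key and look it up with match(). This is a different technique from the merge() call in the answer, shown only as a sketch. It assumes each year/country pair occurs at most once in df2, and the names key1 and key2 are illustrative.

```r
# Same sample data as in the data setup, repeated so the block is self-contained
df1 <- read.table(header = TRUE, sep = "|", strip.white = TRUE, text = "
eventid    |iyear | imonth| iday | CountryTxt
1.97000e+1 |1970  |      7|    2 | Albania
1.97000e+11| 1970 |      0|    0 | United Kingdom
1.97001e+11| 1984 |      1|    0 | Somalia
1.97001e+11| 1990 |      1|    0 | France
1.97001e+11| 1991 |      1|    0 | New Zealand
")
df2 <- read.table(header = TRUE, sep = "|", strip.white = TRUE, text = "
Country.Name|Code|Year|Population
Aruba       |ABW |1960|  123
Afganistan  |AFG |1970|  456
Albania     |ALB |1970| 1000
France      |FRA |1990| 5000
")

# Build one key per row from the two matching variables (year + country)
key1 <- paste(df1$iyear, df1$CountryTxt, sep = "_")
key2 <- paste(df2$Year, df2$Country.Name, sep = "_")

# match() returns, for each df1 key, the row index in df2 (NA if absent);
# indexing Population with it adds the column without reordering df1
df1$Population <- df2$Population[match(key1, key2)]
print(df1)
```

Rows of df1 with no counterpart in df2 end up with NA in Population, and no other columns from df2 are carried over.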