pandas: if intersection then update dataframe
I have two dataframes:
countries:
Country or Area Name ISO-2 ISO-3
0 Afghanistan AF AFG
1 Philippines PH PHL
2 Albania AL ALB
3 Norway NO NOR
4 American Samoa AS ASM
contracts:
Country Name Jurisdiction Signature year
0 Yemen KY;NO;CA;NO 1999.0
1 Yemen BM;TC;YE 2007.0
2 Congo, CD;CD 2015.0
3 Philippines PH 2009.0
4 Philippines PH;PH 2007.0
5 Philippines PH 2001.0
6 Philippines PH;PH 1997.0
7 Bolivia, Plurinational State of BO;BO 2006.0
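I would like to:
- check whether the Jurisdiction column of contracts contains at least one two-letter code from the ISO-2 column of countries.
I have tried several ways of testing for an intersection, but none of them worked. My last attempt was: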
i1 = pd.Index(contracts['Jurisdiction of Incorporation'].str.split(';'))
i2 = pd.Index(countries['ISO-2'])
print i1, i2
i1.intersection(i2)
This gives me TypeError: unhashable type: 'list'
- if at least one code is present, update the contracts dataframe with a new column that contains only boolean values
contracts['new column'] = np.where("piece of code that will actually work", 1, 0)
So the expected output is
Country Name Jurisdiction Signature year new column
0 Yemen KY;NO;CA;NO 1999.0 1
1 Yemen BM;TC;YE 2007.0 0
2 Congo, CD;CD 2015.0 0
3 Philippines PH 2009.0 1
4 Philippines PH;PH 2007.0 1
5 Philippines PH 2001.0 1
6 Philippines PH;PH 1997.0 1
7 Bolivia, Plurinational State of BO;BO 2006.0 0
How can I do this?
It's a bit verbose, but try this:
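# Collect the known ISO-2 codes in a set for fast membership tests
occuring_iso_2_codes = set(countries['ISO-2'])
# 1 if any ';'-separated code in Jurisdiction is a known ISO-2 code, else 0
contracts['new column'] = contracts.Jurisdiction.apply(
    lambda s: int(bool(set(s.split(';')).intersection(occuring_iso_2_codes))))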
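To check this against the sample data from the question, here is a minimal self-contained sketch. The dataframes are rebuilt by hand from the tables above, and the names `known_codes` and `has_known_code` are made up for illustration. Treating non-string values (such as NaN) as "no match" is an assumption on my part, since the question does not say how missing jurisdictions should be handled.

```python
import pandas as pd

# Sample data rebuilt by hand from the tables in the question
countries = pd.DataFrame({
    "Country or Area Name": ["Afghanistan", "Philippines", "Albania",
                             "Norway", "American Samoa"],
    "ISO-2": ["AF", "PH", "AL", "NO", "AS"],
    "ISO-3": ["AFG", "PHL", "ALB", "NOR", "ASM"],
})
contracts = pd.DataFrame({
    "Country Name": ["Yemen", "Yemen", "Congo,", "Philippines", "Philippines",
                     "Philippines", "Philippines",
                     "Bolivia, Plurinational State of"],
    "Jurisdiction": ["KY;NO;CA;NO", "BM;TC;YE", "CD;CD", "PH", "PH;PH",
                     "PH", "PH;PH", "BO;BO"],
    "Signature year": [1999.0, 2007.0, 2015.0, 2009.0, 2007.0,
                       2001.0, 1997.0, 2006.0],
})

# Hypothetical helper names; a set gives fast membership tests
known_codes = set(countries["ISO-2"])


def has_known_code(jurisdiction):
    """Return 1 if any ';'-separated code is a known ISO-2 code, else 0."""
    # Assumption: missing or non-string values count as "no match"
    if not isinstance(jurisdiction, str):
        return 0
    # isdisjoint is True when no code is shared, so negate it
    return int(not known_codes.isdisjoint(jurisdiction.split(";")))


contracts["new column"] = contracts["Jurisdiction"].apply(has_known_code)
print(contracts[["Jurisdiction", "new column"]])
```

On this data the new column comes out as 1, 0, 0, 1, 1, 1, 1, 0, which matches the expected output in the question. Splitting each string inside the function avoids building an index of lists, which is what triggered the unhashable type error in the original attempt.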