Subtract strings from a column and keep the original index in Pandas

Question

I have a df with two columns:

country      amount
USA          34 
USA          21
China        5
France       7
Italy        9
USA          1
Spain        10
Ireland      12

I want to create 3 continent-based variables (USA, China, and Europe) for further calculations using the 'amount' column.

For USA and China, I did it like this:

    usa = df.loc[df['country']=='USA']['country']
    china = df.loc[df['country']=='China (Mainland)']['country']

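For Europe I am stuck, because I need all the European countries in the column while keeping their index (and therefore their respective amounts).

Is it possible to subtract USA and China from ['country'], get the rest (the European countries), and store them in a variable 'europe'?

The end goal is to get, for example, the sum of the amounts for all European countries. Unfortunately, there is no other 'marker' that identifies them as European countries.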

Answer 1

Try `isin` with a negated mask:

    EU = df.loc[~df['country'].isin(['USA', 'China (Mainland)'])]['country']
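Since the end goal is a sum over 'amount', the same mask can select that column directly. The following is a minimal sketch, assuming the labels are exactly those shown in the sample table ('USA' and 'China'); if the real data uses 'China (Mainland)', as in the snippets above, the list passed to `isin` needs to be adjusted accordingly.

```python
import pandas as pd

# Sample data rebuilt from the table in the question (labels are an assumption)
df = pd.DataFrame({
    "country": ["USA", "USA", "China", "France", "Italy", "USA", "Spain", "Ireland"],
    "amount": [34, 21, 5, 7, 9, 1, 10, 12],
})

# Boolean mask: True for every row that is neither USA nor China
is_europe = ~df["country"].isin(["USA", "China"])

# Select 'amount' for those rows; the original index labels are preserved
europe_amount = df.loc[is_europe, "amount"]

print(europe_amount.index.tolist())  # [3, 4, 6, 7]
print(europe_amount.sum())           # 38
```

Passing the mask and the column name to `.loc` in one step avoids chaining two indexing operations.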

Answer 2

You can select all the countries that are neither USA nor China.

To do that, you can use the following:

    europe = df.loc[(df['country']!='China (Mainland)') & (df['country']!='USA')]['country']

Answer 3

USA and China are not continents :)

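    # Default every row to Europe, then overwrite the two exceptions.
    # .loc with a mask avoids chained assignment, which can silently fail.
    df['continent'] = 'Europe'
    df.loc[df['country']=='USA', 'continent'] = 'USA'
    df.loc[df['country']=='China', 'continent'] = 'China (Mainland)'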
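With a 'continent' column in place, the per-group totals can be computed in one pass. This is a sketch under the assumption that the data matches the sample table in the question; the group label 'China' is chosen here for illustration.

```python
import pandas as pd

# Sample data rebuilt from the table in the question (labels are an assumption)
df = pd.DataFrame({
    "country": ["USA", "USA", "China", "France", "Italy", "USA", "Spain", "Ireland"],
    "amount": [34, 21, 5, 7, 9, 1, 10, 12],
})

# Default every row to Europe, then overwrite the two exceptions via .loc
df["continent"] = "Europe"
df.loc[df["country"] == "USA", "continent"] = "USA"
df.loc[df["country"] == "China", "continent"] = "China"

# Sum 'amount' per group; the result is a Series indexed by group label
totals = df.groupby("continent")["amount"].sum()

print(totals["Europe"])  # 38
print(totals["USA"])     # 56
print(totals["China"])   # 5
```

The original row index of `df` is untouched, so the new column can still be used to filter individual rows later.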
