How to join two tables with same column names but with different data using pandas?

Question

Suppose I have

df1,

col1 | col2 | col3 | col4 |
A    |  131 | 666  | 777  |
B    |  123 | 345  | 435  |
C    | 1424 | 3214 | 2314 |

df2,

col1 | col2 | col3 | col4 |
A    |  10  | 1    | 0    |
B    |  20  | 14   | 68   |
C    |  23  | 43   | 4    |

The final result I want to achieve is:

col1 | col2           | col3         | col4      |
A    |  131 (10%)     | 666 (1%)     | 777       |
B    |  123 (20%)     | 345 (14%)    | 435 (68%) |
C    |  1424 (23%)    | 3214 (43%)   | 2314 (4%) |

P.S. The numbers are random.

Answer 1

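You can convert both DataFrames to strings, replace '0' in df2 with missing values, and wrap df2's values in ' (' and '%)'. The missing values stay missing, so filling them with an empty string means no suffix is appended for zeros. Finally, add the result to the first DataFrame: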

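import numpy as np

# '0' in df2 becomes NaN, so its ' (...%)' suffix is NaN and fillna('') blanks it
df = ((df1.set_index('col1').astype(str) +
      (' (' + df2.set_index('col1').astype(str).replace('0', np.nan) + '%)').fillna(''))
      .reset_index())
print(df)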
  col1        col2        col3       col4
0    A   131 (10%)    666 (1%)        777
1    B   123 (20%)   345 (14%)  435 (68%)
2    C  1424 (23%)  3214 (43%)  2314 (4%)
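
For anyone who wants to run this end to end, here is a self-contained sketch of the same approach that rebuilds the sample frames from the question. The intermediate names `left`, `right` and `suffix` are illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
df1 = pd.DataFrame({
    "col1": ["A", "B", "C"],
    "col2": [131, 123, 1424],
    "col3": [666, 345, 3214],
    "col4": [777, 435, 2314],
})
df2 = pd.DataFrame({
    "col1": ["A", "B", "C"],
    "col2": [10, 20, 23],
    "col3": [1, 14, 43],
    "col4": [0, 68, 4],
})

# Index both frames by col1 so the addition aligns on row labels
left = df1.set_index("col1").astype(str)
# Zeros become NaN so they produce no suffix after fillna("")
right = df2.set_index("col1").astype(str).replace("0", np.nan)
suffix = (" (" + right + "%)").fillna("")

df = (left + suffix).reset_index()
print(df)
```

Setting col1 as the index on both sides matters: the string addition then aligns by label rather than by row position, so the two tables do not need to be in the same order.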

Another idea is to test for '0' values and blank out their suffix with DataFrame.mask:

df11 = df1.set_index('col1').astype(str)
df22 = df2.set_index('col1').astype(str)

df = (df11 + (' (' + df22 + '%)').mask(df22.eq('0'), '')).reset_index()
      
print (df)
  col1        col2        col3       col4
0    A   131 (10%)    666 (1%)        777
1    B   123 (20%)   345 (14%)  435 (68%)
2    C  1424 (23%)  3214 (43%)  2314 (4%)
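
One caveat with comparing against the string '0': if df2 held floats, a zero would be rendered as '0.0' and the comparison would miss it. Below is a sketch that tests the numeric values before converting to strings, assuming df2's data columns are numeric. The names `pct` and `suffix` are illustrative, and the frames are rebuilt from the question's sample data.

```python
import pandas as pd

# Sample data copied from the question
df1 = pd.DataFrame({
    "col1": ["A", "B", "C"],
    "col2": [131, 123, 1424],
    "col3": [666, 345, 3214],
    "col4": [777, 435, 2314],
})
df2 = pd.DataFrame({
    "col1": ["A", "B", "C"],
    "col2": [10, 20, 23],
    "col3": [1, 14, 43],
    "col4": [0, 68, 4],
})

pct = df2.set_index("col1")  # keep the numeric values for the zero test
# Build every suffix, then blank out the cells where the number is 0
suffix = (" (" + pct.astype(str) + "%)").mask(pct.eq(0), "")

df = (df1.set_index("col1").astype(str) + suffix).reset_index()
print(df)
```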

Answer 2

Or use applymap:

>>> (df1.set_index('col1').astype(str).add(df2.set_index('col1')
                      .applymap(lambda x: f' ({x}%)' if x else ''))
                      .reset_index())
  col1        col2        col3       col4
0    A   131 (10%)    666 (1%)        777
1    B   123 (20%)   345 (14%)  435 (68%)
2    C  1424 (23%)  3214 (43%)  2314 (4%)
>>>

This code appends the string from df2 only when the value is not 0. It uses set_index so that both frames align on col1, and applymap to format each value.
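
Note that DataFrame.applymap was deprecated in pandas 2.1 in favour of DataFrame.map. The following is a minimal sketch that should work on either side of that change, using the sample frames from the question. The helper `fmt_pct` and the `hasattr` fallback are illustrative additions, not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question
df1 = pd.DataFrame({
    "col1": ["A", "B", "C"],
    "col2": [131, 123, 1424],
    "col3": [666, 345, 3214],
    "col4": [777, 435, 2314],
})
df2 = pd.DataFrame({
    "col1": ["A", "B", "C"],
    "col2": [10, 20, 23],
    "col3": [1, 14, 43],
    "col4": [0, 68, 4],
})

def fmt_pct(x):
    """Hypothetical helper: format a non-zero number as ' (x%)', zero as ''."""
    return f" ({x}%)" if x else ""

pct = df2.set_index("col1")
# Prefer DataFrame.map (pandas >= 2.1); fall back to applymap on older versions
elementwise = pct.map if hasattr(pct, "map") else pct.applymap

df = df1.set_index("col1").astype(str).add(elementwise(fmt_pct)).reset_index()
print(df)
```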

python

join

dataframe

pandas