Python string concatenation following a sequence

Question

What is the most Pythonic way to build the following string concatenation?

We have an initial data frame in which some of the columns are:

origin
dest_1_country
dest_1_city
dest_2_country
dest_2_city
dest_3_country
dest_3_city
dest_4_country
dest_4_city

We want to create an additional column containing the full route for each row of the data frame, which can be generated by:

df['full_route'] = df['origin'].fillna("") + df['dest_1_country'].fillna("") + df['dest_1_city'].fillna("") + df['dest_2_country'].fillna("") + df['dest_2_city'].fillna("") + df['dest_3_country'].fillna("") + df['dest_3_city'].fillna("") + df['dest_4_country'].fillna("") + df['dest_4_city'].fillna("")

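Clearly this is not the most Pythonic way to get the desired result, because it is very cumbersome. What if I had 100 cities in the df?

What is the best way to achieve this in Python?

Note: the data frame also contains other columns that are unrelated to the route and should not be included in the concatenation.

Many thanks!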

Answer 1

If you have this data frame:

  origin dest_1_country dest_1_city dest_2_country dest_2_city
0      a              b           c              d           e
1      f              g           h              i           j

then you can do:

df["full_route"] = df.sum(axis=1)  # df.fillna("").sum(axis=1) if you have NaNs
print(df)

This concatenates all columns:

  origin dest_1_country dest_1_city dest_2_country dest_2_city full_route
0      a              b           c              d           e      abcde
1      f              g           h              i           j      fghij

Edit: if you want to concatenate "origin" with every "*city"/"*country" column, you can do:

df["full_route"] = df["origin"].fillna("") + df.filter(
    regex=r"country$|city$"
).fillna("").sum(axis=1)
print(df)
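As a hedged variation on the filter approach above, here is a minimal sketch that builds the route column names explicitly, so the concatenation order does not depend on how the columns happen to be ordered in the data frame, and unrelated columns are skipped. The sample data, the "price" column, and the helper name route_columns are assumptions made up for illustration; they are not from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: one missing city and one unrelated column.
df = pd.DataFrame(
    {
        "origin": ["a", "f"],
        "dest_1_country": ["b", "g"],
        "dest_1_city": ["c", np.nan],
        "dest_2_country": ["d", "i"],
        "dest_2_city": ["e", "j"],
        "price": [100, 200],  # not part of the route
    }
)


def route_columns(n_dests):
    """Return the route column names in travel order (hypothetical helper)."""
    cols = ["origin"]
    for i in range(1, n_dests + 1):
        cols += [f"dest_{i}_country", f"dest_{i}_city"]
    return cols


cols = route_columns(2)

# Replace NaN with "" and join each row's values into a single string.
df["full_route"] = df[cols].fillna("").agg("".join, axis=1)
print(df["full_route"].tolist())
```

With 100 destinations only the argument to route_columns changes. If a separator is wanted, "".join can be swapped for something like "-".join, although empty values would then leave doubled separators unless they are filtered out first.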
