Problem merging two columns in a DataFrame in Python

Question

I am trying to merge two columns of a DataFrame in Python. The original DataFrame looks like this:

    type    id      details                details2
0   hotel   df9466  #2 in the rank of 288       nan
1   hotel   gt9444  #48 in the rank of 340      nan
2   hotel   dfa887  #12 in the rank of 7414     nan
3   hotel   fgfd81  nan                     #1 in rank of 8792
4   hotel   fsf887  nan                     #70 in rank of 245

My expected result looks like this:

    type    id          details                
0   hotel   df9466  #2 in the rank of 288       
1   hotel   gt9444  #48 in the rank of 340      
2   hotel   dfa887  #12 in the rank of 7414     
3   hotel   fgfd81  #1 in the rank of 8792
4   hotel   fsf887  #70 in the rank of 245

In my code, I tried to merge them with:

df_hotel["details"] = (df_hotel["details"] + df_hotel["details2"])

However, it failed: the result has NaN in every row of the "details" column.

Answer 1

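Try:

Use replace() to turn the string 'nan' into an actual NaN, then fillna() to fill those NaN values in details from details2. If the missing values are already real NaN, you can skip the replace() step and run fillna() directly.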

df_hotel = df_hotel.replace('nan', float('NaN'), regex=True)
df_hotel["details"] = df_hotel["details"].fillna(df_hotel.pop("details2"))

Output of df_hotel:

    type    id          details
0   hotel   df9466      #2 in rank of 288
1   hotel   gt9444      #48 in rank of 340
2   hotel   dfa887      #12 in rank of 7414
3   hotel   fgfd81      #1 in rank of 8792
4   hotel   fsf887      #70 in rank of 245
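
For anyone who wants to run this end to end, here is a minimal self-contained sketch. The sample frame is rebuilt from the question, and it assumes the missing entries are the literal string 'nan'. Note that I swapped in an exact-match replace() instead of regex=True, so that only cells equal to 'nan' are converted; adjust that if your placeholders look different.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question. Assumption: missing entries are
# stored as the literal string 'nan' rather than as real NaN values.
df_hotel = pd.DataFrame({
    "type": ["hotel"] * 5,
    "id": ["df9466", "gt9444", "dfa887", "fgfd81", "fsf887"],
    "details": ["#2 in rank of 288", "#48 in rank of 340",
                "#12 in rank of 7414", "nan", "nan"],
    "details2": ["nan", "nan", "nan",
                 "#1 in rank of 8792", "#70 in rank of 245"],
})

# Convert the placeholder strings into real missing values (exact match only).
df_hotel = df_hotel.replace("nan", np.nan)

# pop() removes details2 from the frame and returns it as a Series;
# fillna() then uses it to fill the gaps in details, aligned on the index.
df_hotel["details"] = df_hotel["details"].fillna(df_hotel.pop("details2"))

print(df_hotel)
```

After this runs, details2 is gone from the frame and details holds whichever of the two values was present in each row.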

Answer 2

NaN plus anything is NaN, which is why the plain + leaves every row empty. Instead, we can use Series.add with fill_value set to the empty string:

df_hotel['details'] = (
    df_hotel["details"].add(df_hotel["details2"], fill_value='')
)

Or we can call Series.fillna on both Series and then concatenate them with +:

df_hotel["details"] = (df_hotel["details"].fillna('') +
                       df_hotel["details2"].fillna(''))

df_hotel:

    type      id              details            details2
0  hotel  df9466    #2 in rank of 288                 NaN
1  hotel  gt9444   #48 in rank of 340                 NaN
2  hotel  dfa887  #12 in rank of 7414                 NaN
3  hotel  fgfd81   #1 in rank of 8792  #1 in rank of 8792
4  hotel  fsf887   #70 in rank of 245  #70 in rank of 245

We can pop details2 if we also want to remove it from the DataFrame:

df_hotel['details'] = (
    df_hotel["details"].add(df_hotel.pop("details2"), fill_value='')
)

or

df_hotel["details"] = (df_hotel["details"].fillna('') +
                       df_hotel.pop("details2").fillna(''))

df_hotel:

    type      id              details
0  hotel  df9466    #2 in rank of 288
1  hotel  gt9444   #48 in rank of 340
2  hotel  dfa887  #12 in rank of 7414
3  hotel  fgfd81   #1 in rank of 8792
4  hotel  fsf887   #70 in rank of 245
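
To see the NaN propagation and both fixes side by side, here is a small sketch on two stand-alone Series. The variable names and the two-row sample are mine, and dtype=object is forced on the assumption that the columns hold plain Python strings mixed with real NaN.

```python
import numpy as np
import pandas as pd

# Two small Series mirroring the question's columns, with real NaN values.
# dtype=object is an assumption: plain Python strings mixed with NaN.
details = pd.Series(["#2 in rank of 288", np.nan], dtype=object)
details2 = pd.Series([np.nan, "#1 in rank of 8792"], dtype=object)

# Plain + propagates NaN: every row has a NaN on one side,
# so every result is NaN.
plain = details + details2

# Filling with '' first turns each row into "text" + "" or "" + "text".
filled = details.fillna("") + details2.fillna("")

# Series.add with fill_value substitutes '' when only one side is missing.
added = details.add(details2, fill_value="")

print(plain.isna().all())
print(filled.tolist())
print(added.tolist())
```

One difference worth knowing: if both columns are missing in the same row, the fillna version yields an empty string, while Series.add with fill_value still yields NaN for that row.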

python

merge

pandas