Using a for loop combined with a nested if statement to create a new pandas DataFrame based on 3 columns of a different DataFrame in Python

Question

trv_last = []
for i in range(0,len(trv)):
    if (trv_name_split.iloc[i,3] != None):
        trv_last = trv_name_split.iloc[i,3]
    elif (trv_name_split.iloc[i,2] != None):
        trv_last = trv_name_split.iloc[i,2]
    else: 
        trv_last = trv_name_split.iloc[i,1]
        
trv_last

This returns 'Campo', which is the value at the last index in my range:

     0         1     2        3
  1 John    Doe     None    None
  2 Jane    K.      Martin  None
  :   :      :       :       :
972 Jino    Campo   None    None

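As you can see, all the names were originally in one column, and I used str.split() to separate them. Because some names have a first, middle, and last name, I ended up with 4 columns. I am only interested in the last name.

My goal is to create a new DF containing only the last names. The logic is: if the 4th column is not None, it is the last name; if it is None, move back to the 3rd column, and if that is also None, the 2nd column is the last name.

Thanks for looking, and thanks for your help!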

Answer 1

Iterating over a pandas DataFrame row by row is not a good idea. That is why apply exists. Best practice is to use apply and assign.

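import pandas as pd

def build_last_name(row):
    # Integer column labels need bracket indexing; row.3 is a syntax error.
    # pd.notna treats both None and NaN as missing.
    if pd.notna(row[3]):
        return row[3]
    if pd.notna(row[2]):
        return row[2]
    return row[1]

# axis=1 calls the function once per row
last_names = trv_name_split.apply(build_last_name, axis=1)
trv_name_split = trv_name_split.assign(last_name=last_names)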

Getting familiar with apply will save you a lot of headaches. Here's the docs.
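For anyone who wants to try the apply approach end to end, here is a minimal sketch. The sample names are hypothetical (modeled on the rows in the question, plus one four-part name so that column 3 is populated), and `names` and `name_split` are illustrative stand-ins for the asker's data, not the original variables.

```python
import pandas as pd

# Hypothetical sample data modeled on the rows in the question,
# plus one four-part name so that column 3 is populated.
names = pd.Series(["John Doe", "Jane K. Martin", "Mary Ann Van Dyke", "Jino Campo"])

# expand=True puts each piece in its own column (labeled 0, 1, 2, 3)
# and pads shorter names with missing values.
name_split = names.str.split(expand=True)

def build_last_name(row):
    # Check the rightmost columns first and fall back toward column 1.
    for col in (3, 2):
        if pd.notna(row[col]):
            return row[col]
    return row[1]

# axis=1 passes each row to the function as a Series.
last_names = name_split.apply(build_last_name, axis=1)
name_split = name_split.assign(last_name=last_names)
print(name_split["last_name"].tolist())
```

The new last_name column sits alongside the split columns, so the original pieces remain available if they are needed later.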

Answer 2

Found the answer to my own question: the loop has to append to the list instead of reassigning it.

trv_last = []
for i in range(0, len(trv)):
    # Append on each iteration; plain assignment overwrote the list,
    # which is why only the last row's name survived.
    if trv_name_split.iloc[i, 3] is not None:
        trv_last.append(trv_name_split.iloc[i, 3])
    elif trv_name_split.iloc[i, 2] is not None:
        trv_last.append(trv_name_split.iloc[i, 2])
    else:
        trv_last.append(trv_name_split.iloc[i, 1])

trv_last
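If the end goal is a new DataFrame rather than a list, one possible loop-free alternative is to forward-fill along each row and read the final column. This is a sketch of a different technique, not the method used above; the sample names and the variable names `names`, `name_split`, and `last_name_df` are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical sample data modeled on the rows in the question.
names = pd.Series(["John Doe", "Jane K. Martin", "Mary Ann Van Dyke", "Jino Campo"])
name_split = names.str.split(expand=True)

# Forward-fill along each row so the last non-missing piece is carried
# into the final column, then select that column.
last_names = name_split.ffill(axis=1).iloc[:, -1]

# Wrap the result in a one-column DataFrame.
last_name_df = last_names.to_frame(name="last_name")
print(last_name_df)
```

Because the fill runs across columns, this works for any number of split columns without listing them one by one.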

python

for-loop

data-manipulation

nested-if

pandas