Stuck with a For Loop
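I started coding not long ago and jumped into the Titanic exercise from Kaggle. I'm trying to replace the NaN values in some passengers' ages with an age I consider appropriate for their prefix (Mr., Ms., Master, ...).
I tried a for loop, but it doesn't seem to work: it gives everyone with a NaN in Age the same value, regardless of their prefix. What am I doing wrong, and how can I fix it?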
import math
for i in range(len(database)):
    if math.isnan(database['Age'][i]) == True and database['Prefix'][i] == ' Capt.' or database['Prefix'][i] == ' Col.':
        database['Age'] = 65.0
    elif math.isnan(database['Age'][i]) == True and database['Prefix'][i] == ' Sir.' or database['Prefix'][i] == ' Major.' or database['Prefix'][i] == ' Rev.' or database['Prefix'][i] == ' Lady.' or database['Prefix'][i] == ' Dr.':
        database['Age'] = 47.5
    elif math.isnan(database['Age'][i]) == True and database['Prefix'][i] == ' Don.' or database['Prefix'][i] == ' Jonkheer.' or database['Prefix'][i] == ' Mrs.' or database['Prefix'][i] == ' the Countess.':
        database['Age'] = 36.5
    elif math.isnan(database['Age'][i]) == True and database['Prefix'][i] == ' Mr.' or database['Prefix'][i] == ' Ms.':
        database['Age'] = 29.0
    elif math.isnan(database['Age'][i]) == True and database['Prefix'][i] == ' Mme.' or database['Prefix'][i] == ' Mlle.':
        database['Age'] = 24.0
    elif math.isnan(database['Age'][i]) == True and database['Prefix'][i] == ' Miss.':
        database['Age'] = 21.0
    elif math.isnan(database['Age'][i]) == True and database['Prefix'][i] == ' Master.':
        database['Age'] = 3.5
This is before the for loop:
Titanic1.py
This is after the for loop:
Titanic2.py
Thanks a lot!!
There are a few things in your code that can be improved.
First, we move the condition shared by all the if/elif branches into a single outer if:
import math
for i in range(len(database)):
    if math.isnan(database['Age'][i]) == True:
        if database['Prefix'][i] == ' Capt.' or database['Prefix'][i] == ' Col.':
            database['Age'] = 65.0
        elif database['Prefix'][i] == ' Sir.' or database['Prefix'][i] == ' Major.' or database['Prefix'][i] == ' Rev.' or database['Prefix'][i] == ' Lady.' or database['Prefix'][i] == ' Dr.':
            database['Age'] = 47.5
        elif database['Prefix'][i] == ' Don.' or database['Prefix'][i] == ' Jonkheer.' or database['Prefix'][i] == ' Mrs.' or database['Prefix'][i] == ' the Countess.':
            database['Age'] = 36.5
        elif database['Prefix'][i] == ' Mr.' or database['Prefix'][i] == ' Ms.':
            database['Age'] = 29.0
        elif database['Prefix'][i] == ' Mme.' or database['Prefix'][i] == ' Mlle.':
            database['Age'] = 24.0
        elif database['Prefix'][i] == ' Miss.':
            database['Age'] = 21.0
        elif database['Prefix'][i] == ' Master.':
            database['Age'] = 3.5
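A side note on why nesting the NaN check matters: in Python, `and` binds more tightly than `or`, so a condition written as `A and B or C` is evaluated as `(A and B) or C`. In the original loop, that means a row whose prefix is ' Col.' matches the first branch even when its age is not NaN. The minimal sketch below shows the difference; the values of `age` and `prefix` are made up for illustration and are not taken from the dataset.

```python
import math

# Hypothetical passenger: age is known (not NaN), prefix is ' Col.'.
age = 40.0
prefix = ' Col.'

# Flat condition as in the question: parsed as (A and B) or C.
flat = math.isnan(age) and prefix == ' Capt.' or prefix == ' Col.'

# Intended condition: A and (B or C).
grouped = math.isnan(age) and (prefix == ' Capt.' or prefix == ' Col.')

print(flat)     # True: the branch fires even though age is not NaN
print(grouped)  # False: the NaN check now guards both prefixes
```

Hoisting the NaN check into its own outer `if`, as done above, has the same effect as adding the parentheses.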
Next, we get rid of all the repeated database["Prefix"][i] lookups by saving the value in a variable, and we use the in operator to avoid the long chains of prefix == "something" or prefix == "something else".
for i in range(len(database)):
    if math.isnan(database['Age'][i]) == True:
        prefix = database['Prefix'][i]
        if prefix in (' Capt.', ' Col.'):
            database['Age'] = 65.0
        elif prefix in (' Sir.', ' Major.', ' Rev.', ' Lady.', ' Dr.'):
            database['Age'] = 47.5
        elif prefix in (' Don.', ' Jonkheer.', ' Mrs.', ' the Countess.'):
            database['Age'] = 36.5
        elif prefix in (' Mr.', ' Ms.'):
            database['Age'] = 29.0
        elif prefix in (' Mme.', ' Mlle.'):
            database['Age'] = 24.0
        elif prefix == ' Miss.':
            database['Age'] = 21.0
        elif prefix == ' Master.':
            database['Age'] = 3.5
Then, note that you are modifying database["Age"] (the whole column) rather than database["Age"][i] (the cell for row i), so we fix that as well.
for i in range(len(database)):
    if math.isnan(database['Age'][i]) == True:
        prefix = database['Prefix'][i]
        if prefix in (' Capt.', ' Col.'): age = 65.0
        elif prefix in (' Sir.', ' Major.', ' Rev.', ' Lady.', ' Dr.'): age = 47.5
        elif prefix in (' Don.', ' Jonkheer.', ' Mrs.', ' the Countess.'): age = 36.5
        elif prefix in (' Mr.', ' Ms.'): age = 29.0
        elif prefix in (' Mme.', ' Mlle.'): age = 24.0
        elif prefix == ' Miss.': age = 21.0
        elif prefix == ' Master.': age = 3.5
        else: continue  # unknown prefix: leave the NaN in place
        # Same cell as database['Age'][i], but .loc avoids chained assignment,
        # which pandas warns about and may not apply to the original frame.
        database.loc[i, 'Age'] = age
Finally, if you like, you can build a dictionary that maps prefixes to ages and use it to avoid the many ifs and elifs.
import math

# Define how an age is matched with some prefixes.
ages_and_prefixes = ((65.0, ("Capt", "Col")),
                     (47.5, ("Sir", "Major", "Rev", "Lady", "Dr")),
                     (36.5, ("Don", "Jonkheer", "Mrs", "the Countess")),
                     (29.0, ("Mr", "Ms")),
                     (24.0, ("Mme", "Mlle")),
                     (21.0, ("Miss",)),
                     (3.5, ("Master",))
                     )

# Flatten the (age, prefixes) pairs into a prefix -> age dictionary.
prefix_to_age_dict = {}
for age, prefixes in ages_and_prefixes:
    for prefix in prefixes:
        prefix_to_age_dict[prefix] = age

# The replacement step in the database is now much simpler.
for i in range(len(database)):
    if math.isnan(database['Age'][i]):
        # Stored prefixes look like ' Mr.', so strip the leading space and
        # trailing period to match the dictionary keys.
        prefix = database['Prefix'][i].strip(" .")
        # .loc writes one cell without chained assignment.
        database.loc[i, 'Age'] = prefix_to_age_dict[prefix]
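As a compact variation on the same idea, the lookup table can also be built with a dictionary comprehension and queried with `dict.get`, which returns a default instead of raising `KeyError` when a prefix is missing. This is only a sketch: the helper name `age_for_prefix` and the `None` fallback are assumptions made for illustration, and only a subset of the age groups is shown.

```python
# Age groups copied from the tuple above (subset shown for brevity).
ages_and_prefixes = (
    (65.0, ("Capt", "Col")),
    (29.0, ("Mr", "Ms")),
    (21.0, ("Miss",)),
    (3.5, ("Master",)),
)

# Flatten the (age, prefixes) pairs into a prefix -> age dictionary.
prefix_to_age = {
    prefix: age
    for age, prefixes in ages_and_prefixes
    for prefix in prefixes
}


def age_for_prefix(raw_prefix, default=None):
    """Hypothetical helper: normalise ' Mr.' to 'Mr' and look it up."""
    return prefix_to_age.get(raw_prefix.strip(" ."), default)


print(age_for_prefix(" Col."))     # 65.0
print(age_for_prefix(" Master."))  # 3.5
print(age_for_prefix(" Xyz."))     # None (unknown prefix)
```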
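You can't assign a value like database['Age'] = 65.0. That replaces the 'Age' value of every record with 65.0. As a result, for all remaining values of i, math.isnan(database['Age'][i]) == True will always be False, because the Age column no longer contains any NaN.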
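To make the difference concrete, here is a small sketch comparing whole-column assignment with single-cell assignment. The three-row frame and the names `demo`, `overwritten`, and `patched` are invented for illustration.

```python
import numpy as np
import pandas as pd

# Toy data, invented for this example.
demo = pd.DataFrame({
    "Prefix": [" Capt.", " Miss.", " Mr."],
    "Age": [np.nan, np.nan, 30.0],
})

# Whole-column assignment: the scalar is broadcast to every row.
overwritten = demo.copy()
overwritten["Age"] = 65.0
print(overwritten["Age"].tolist())  # [65.0, 65.0, 65.0]

# Single-cell assignment: only row 0 changes.
patched = demo.copy()
patched.at[0, "Age"] = 65.0
print(patched["Age"].isna().tolist())  # [False, True, False]
```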
I suggest using df.iterrows() to iterate over the DataFrame and df.at to assign a value to a specific cell of a record. The code below should work.
import math

for i, row in database.iterrows():
    # Only fill rows whose age is missing.
    if math.isnan(row['Age']):
        prefix = row['Prefix']
        if prefix in (' Capt.', ' Col.'):
            age = 65.0
        elif prefix in (' Sir.', ' Major.', ' Rev.', ' Lady.', ' Dr.'):
            age = 47.5
        elif prefix in (' Don.', ' Jonkheer.', ' Mrs.', ' the Countess.'):
            age = 36.5
        elif prefix in (' Mr.', ' Ms.'):
            age = 29.0
        elif prefix in (' Mme.', ' Mlle.'):
            age = 24.0
        elif prefix == ' Miss.':
            age = 21.0
        elif prefix == ' Master.':
            age = 3.5
        else:
            continue  # unknown prefix: leave the NaN in place
        # .at writes a single cell, addressed by row label and column name.
        database.at[i, 'Age'] = age
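One practical difference from `range(len(database))` is that `iterrows()` yields the actual index labels, which is what `.at` expects. The sketch below illustrates this on a frame whose index does not start at 0, as can happen after filtering rows. The frame `demo` and the shortened `ages` dictionary are invented for illustration.

```python
import math

import numpy as np
import pandas as pd

# Toy frame with a non-default index (e.g. what remains after filtering).
demo = pd.DataFrame(
    {"Prefix": [" Miss.", " Master."], "Age": [np.nan, np.nan]},
    index=[10, 20],
)
ages = {" Miss.": 21.0, " Master.": 3.5}

for label, row in demo.iterrows():
    if math.isnan(row["Age"]):
        # label is 10, then 20: the same labels .at uses to find the row.
        demo.at[label, "Age"] = ages[row["Prefix"]]

print(demo["Age"].tolist())  # [21.0, 3.5]
```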
Here is a version using map, which is easier to read than the longer for loop:
prefix_dict = {" Capt.": 65.0,
               " Col.": 65.0,
               " Sir.": 47.5,
               " Major.": 47.5,
               " Rev.": 47.5,
               " Lady.": 47.5,
               " Dr.": 47.5,
               " Don.": 36.5,
               " Jonkheer.": 36.5,
               " Mrs.": 36.5,
               " the Countess.": 36.5,
               " Mr.": 29.0,
               " Ms.": 29.0,
               " Mme.": 24.0,
               " Mlle.": 24.0,
               " Miss.": 21.0,
               " Master.": 3.5
               }
database.loc[database["Age"].isna(), 'Age'] = database.loc[database["Age"].isna(), 'Prefix'].map(lambda x: prefix_dict[x])
.isna() selects only the rows where Age is NaN, while .map(lambda x: prefix_dict[x]) takes each value in the Prefix column and returns the associated value from the dictionary.
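A closely related variant, sketched here on invented data, passes the dictionary directly to `Series.map` and combines it with `fillna`. Unlike the lambda, `map` with a dict yields NaN for prefixes missing from the dictionary rather than raising `KeyError`, so rows with unknown prefixes simply stay unfilled. The frame `demo` and the shortened `prefix_dict` are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Shortened lookup table, for illustration only.
prefix_dict = {" Mr.": 29.0, " Miss.": 21.0, " Master.": 3.5}

demo = pd.DataFrame({
    "Prefix": [" Mr.", " Miss.", " Master.", " Unknown."],
    "Age": [np.nan, 19.0, np.nan, np.nan],
})

# Map every prefix to its default age, then use it only where Age is NaN.
demo["Age"] = demo["Age"].fillna(demo["Prefix"].map(prefix_dict))

print(demo["Age"].tolist())  # [29.0, 19.0, 3.5, nan]
```

Existing ages (19.0 in the second row) are left untouched, because `fillna` only writes where the original value is missing.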