Issue with TypeError when looping through columns in a list of data frames

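I have a list of data frames `dataframes`, a list of names `keeplist`, and a dictionary `HydroCap`.

I am trying to loop through the columns of each data frame, selected by the column names in `keeplist`, and apply a where function inside the column loop so that any value greater than the dictionary value for that column's key is replaced with that dictionary value. The problem is that I run into TypeError: '>=' not supported between instances of 'str' and 'int', and I am not sure how to fix it.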

keeplist = ['BOUND','GCOUL','CHIEF','ROCKY','WANAP','PRIRA','LGRAN','LMONU','ICEHA','MCNAR','DALLE']
HydroCap = {'BOUND':55000,'GCOUL':280000,'CHIEF':219000,'ROCKY':220000,'WANAP':161000,'PRIRA':162000,'LGRAN':130000,'LMONU':130000,'ICEHA':106000,'MCNAR':232000,'DALLE':375000}
for i in dataframes:
  for c in i[keeplist]:
    c = np.where(c >= HydroCap[c], HydroCap[c], c)

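Any nudge in the right direction would be greatly appreciated. I think the problem is that it expects an index value such as HydroCap[1] instead of HydroCap[c], but that is just a hunch.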

First 7 columns of dataframes[0]:
      Week  Month  Day  Year         BOUND          GCOUL          CHIEF  \
0        1      8    5  1979  44999.896673  161241.036388  166497.578098   
1        2      8   12  1979  15309.259762   58219.122747   63413.204052   
2        3      8   19  1979  15316.965781   56072.024363   60606.956215   
3        4      8   26  1979  14371.269016   58574.003087   63311.569888 
import pandas as pd
import numpy as np

# Since I don't have all of the dataframes, I just use the sample you shared
df = pd.read_csv('dataframe.tsv', sep = "\t")

# Note, I've changed some values so you can see something actually happens
keeplist = ['BOUND','GCOUL','CHIEF']
HydroCap = {'BOUND':5500,'GCOUL':280000,'CHIEF':21900}

# The inside of the loop has been changed to accomplish the actual goal
# First, there are now two variables inside the loop: col, and c
# col is the column
# c represents a single element in that column at a time

# The code operates over a column at a time,
# using a list comprehension to cycle over each element
# and replace the full column with the new values at once
for col in df[keeplist]:
    df[col] = [np.where(c >= HydroCap[col], HydroCap[col], c) for c in df[col]]

This produces:

df
   Week  Month  Day  Year   BOUND          GCOUL    CHIEF
0     1      8    5  1979  5500.0  161241.036388  21900.0
1     2      8   12  1979  5500.0   58219.122747  21900.0
2     3      8   19  1979  5500.0   56072.024363  21900.0
3     4      8   26  1979  5500.0   58574.003087  21900.0

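To replace elements in a data frame, you need to either replace a whole column at once or reassign the value of a cell specified by its row and column coordinates. Reassigning the c variable in your original code does not change anything in the data frame, even if c had held the cell value you had in mind rather than the column name.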
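For completeness, here is a minimal sketch of the same idea applied to the whole list of data frames, with the comparison done on an entire column at once instead of element by element. The contents of `dataframes` below are a made-up stand-in (I don't have your real data), and the caps are the altered values from above. It also shows where the TypeError comes from: iterating over a data frame yields the column labels, which are strings, so `c >= HydroCap[c]` ends up comparing a str with an int.

```python
import numpy as np
import pandas as pd

keeplist = ['BOUND', 'GCOUL', 'CHIEF']
HydroCap = {'BOUND': 5500, 'GCOUL': 280000, 'CHIEF': 21900}

# Hypothetical stand-in for the real list of data frames (values are illustrative)
dataframes = [
    pd.DataFrame({
        'Week': [1, 2],
        'BOUND': [44999.896673, 1530.0],
        'GCOUL': [161241.036388, 300000.0],
        'CHIEF': [166497.578098, 63413.204052],
    })
]

# Iterating over a DataFrame yields column labels (strings), not the values
labels = [c for c in dataframes[0][keeplist]]

for df in dataframes:
    for col in keeplist:
        # Compare the whole column against its cap and assign the result back
        df[col] = np.where(df[col] >= HydroCap[col], HydroCap[col], df[col])
```

As far as I can tell, `df[col].clip(upper=HydroCap[col])` should give the same result for this kind of upper cap, so that may be worth trying if you prefer to stay within pandas.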