How to update a df using a for loop and arrays in Python?

Question

Suppose I create the following df:

import pandas as pd

#column names
column_names = ["Time", "Currency", "Volatility expected", "Event", "Actual", "Forecast", "Previous"]

#create a dataframe including the column names
df = pd.DataFrame(columns=column_names)

Then I create the following array, whose values should be added to my df as cell values:

rows = ["2:00", "GBP", "", "Construction Output (MoM) (Jan)", "1.1%", "0.5%", "2.0%",
        "2:00", "GBP", "", "U.K. Construction Output (YoY) (Jan)", "9.9%", "9.2%", "7.4%"]

How can I use a for loop to update my df so that it ends up looking like this:

|Time   |Currency  |Volatility expected    |Event                               |Actual   |Forecast   |Previous  |
------------------------------------------------------------------------------------------------------------------
|02:00  |GBP       |                       |Construction Output (MoM) (Jan)     |1.1%     |0.5%       |2.0%      |
|04:00  |GBP       |                       |U.K. Construction Output (YoY) (Jan)|9.9%     |9.2%       |7.4%      |

I tried:

column_name_location = 0
for row in rows:
    df.at['0', df[column_name_location]] = row
    column_name_location += 1

print(df)

But I got:

KeyError: 0

Could I get some advice here?

Answer 1

If rows is a flat list of items, you can convert it to a numpy array to reshape it first.

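Assuming rows is actually a list of sub-lists, where each sub-list is one row, you can create a pd.Series from each row, using the dataframe's column names as the Series index, and then append them all with df.append. Note that df.append returns a new dataframe instead of modifying df in place, and that DataFrame.append was removed in pandas 2.0, so this only works on older versions:

df = df.append([pd.Series(r, index=df.columns) for r in rows])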
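On pandas 2.0 and later, where DataFrame.append no longer exists, the same idea can be sketched with pd.concat. This is only an illustration, not part of the original answer: it assumes rows is already a list of 7-item sub-lists, and the name new_rows is introduced here for the example.

```python
import pandas as pd

column_names = ["Time", "Currency", "Volatility expected", "Event",
                "Actual", "Forecast", "Previous"]
df = pd.DataFrame(columns=column_names)

# Assumption: rows is a list of sub-lists, one sub-list per row.
rows = [["2:00", "GBP", "", "Construction Output (MoM) (Jan)", "1.1%", "0.5%", "2.0%"],
        ["2:00", "GBP", "", "U.K. Construction Output (YoY) (Jan)", "9.9%", "9.2%", "7.4%"]]

# Build a dataframe from the new rows, then concatenate it onto df.
# new_rows is a hypothetical helper name used only in this sketch.
new_rows = pd.DataFrame(rows, columns=df.columns)
df = pd.concat([df, new_rows], ignore_index=True)

print(df)
```

As with df.append, pd.concat returns a new dataframe, so the result has to be assigned back to df.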

If rows is actually just a flat list, you first need to convert it to a numpy array and reshape it into rows of 7 values:

import numpy as np

rows = np.array(rows).reshape(-1, 7).tolist()
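Put together with the data from the question, the reshape step might look like the sketch below. The names flat and reshaped are introduced here for illustration, and the column count is taken from column_names rather than hard-coded.

```python
import numpy as np
import pandas as pd

column_names = ["Time", "Currency", "Volatility expected", "Event",
                "Actual", "Forecast", "Previous"]

# The flat list of 14 values from the question.
flat = ["2:00", "GBP", "", "Construction Output (MoM) (Jan)", "1.1%", "0.5%", "2.0%",
        "2:00", "GBP", "", "U.K. Construction Output (YoY) (Jan)", "9.9%", "9.2%", "7.4%"]

# -1 lets numpy infer the number of rows from the number of columns.
reshaped = np.array(flat).reshape(-1, len(column_names)).tolist()

df = pd.DataFrame(reshaped, columns=column_names)
print(df)
```

reshape raises a ValueError if the length of the flat list is not a multiple of the number of columns, which is a useful check that no value is missing.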

Answer 2

It looks like you created a list with 14 items. You could instead make it a list of 2 items, where each item is a list of 7 values.

rows = [["2:00", "GBP", "", "Construction Output (MoM) (Jan)", "1.1%", "0.5%", "2.0%"],
       ["2:00", "GBP", "", "U.K. Construction Output (YoY) (Jan)", "9.9%", "9.2%", "7.4%"]]

With this, we can create the dataframe directly, as shown below:

df = pd.DataFrame(rows, columns=column_names)
print(df)

This outputs 2 rows:

   Time Currency Volatility expected                                 Event Actual Forecast Previous
0  2:00      GBP                           Construction Output (MoM) (Jan)   1.1%     0.5%     2.0%
1  2:00      GBP                      U.K. Construction Output (YoY) (Jan)   9.9%     9.2%     7.4%
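If a for loop is specifically wanted, one possible sketch is to walk the flat list in chunks of 7 and add each chunk as a new row with df.loc. The attempt in the question fails because df[column_name_location] evaluates df[0], which looks up a column labelled 0; no such column exists, hence KeyError: 0. The names n_cols and start below are introduced only for this example.

```python
import pandas as pd

column_names = ["Time", "Currency", "Volatility expected", "Event",
                "Actual", "Forecast", "Previous"]
df = pd.DataFrame(columns=column_names)

# The flat list of 14 values from the question.
rows = ["2:00", "GBP", "", "Construction Output (MoM) (Jan)", "1.1%", "0.5%", "2.0%",
        "2:00", "GBP", "", "U.K. Construction Output (YoY) (Jan)", "9.9%", "9.2%", "7.4%"]

n_cols = len(column_names)

# Step through the flat list one row (7 values) at a time.
for start in range(0, len(rows), n_cols):
    # len(df) is the next unused integer label, so this adds a new row.
    df.loc[len(df)] = rows[start:start + n_cols]

print(df)
```

Growing a dataframe one row at a time is generally slower than building it in a single call, so for larger inputs the direct pd.DataFrame(rows, columns=column_names) construction is usually preferable.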

python-3.x

pandas

dataframe

for-loop

debugging