How to convert DataFrame.append() to pandas.concat()?

In pandas 1.4.0, append() has been deprecated, and the docs recommend using concat() instead.

FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.

The code block in question:

def generate_features(data, num_samples, mask):
    """
    The main function for generating features to train or evaluate on.
    Returns a pd.DataFrame()
    """
    logger.debug("Generating features, number of samples: %s", num_samples)
    features = pd.DataFrame()

    for count in range(num_samples):
        row, col = get_pixel_within_mask(data, mask)
        input_vars = get_pixel_data(data, row, col)
        features = features.append(input_vars)
        print_progress(count, num_samples)

    return features

These are the two options I tried, neither of which worked:

features = pd.concat([features],[input_vars])

pd.concat([features],[input_vars])

This is the line that is deprecated and triggers the warning:

features = features.append(input_vars)

This "appends" to the blank df and, by using concat, avoids errors in the future. Note that both frames go into a single list:

features = pd.concat([features, input_vars])

However, without access to the actual data and data structures, this is hard to test and reproduce.
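Since the real data is not available, here is a minimal sketch with made-up data that shows the same replacement running end to end. The helper `get_pixel_data_stub` and its column names are hypothetical stand-ins for the question's `get_pixel_data`; the sketch assumes that function returns a one-row DataFrame.

```python
import pandas as pd


def get_pixel_data_stub(i):
    # Hypothetical stand-in for get_pixel_data(): returns a one-row DataFrame.
    # The column names and values are invented for illustration only.
    return pd.DataFrame({"row": [i], "col": [i * 2], "value": [i / 10]})


features = pd.DataFrame()
for count in range(3):
    input_vars = get_pixel_data_stub(count)
    # pd.concat takes ONE list (or other iterable) of frames as its first
    # argument, so both frames go inside the same pair of brackets.
    features = pd.concat([features, input_vars])

print(features.shape)
```

If `get_pixel_data` returns something other than a DataFrame (a dict or a Series, for example), it would need to be converted to a DataFrame first, so treat this as a starting point rather than a drop-in fix.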

You can store the DataFrames generated in the loop in a list, and concatenate them together with features once the loop has finished.

In other words, replace the loop:

for count in range(num_samples):
    # .... code to produce `input_vars`
    features = features.append(input_vars)        # remove this `DataFrame.append`

with the one below:

tmp = [features]                                  # initialize list
for count in range(num_samples):
    # .... code to produce `input_vars`
    tmp.append(input_vars)                        # append to the list, (not DF)
features = pd.concat(tmp)                         # concatenate after loop

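You can of course concatenate inside the loop, but concatenating only once is more efficient, because each call to pd.concat builds a new DataFrame and copies all the rows accumulated so far.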
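A self-contained sketch of the list-then-concat pattern, using invented data. `make_sample` is a hypothetical helper standing in for whatever produces `input_vars`. The sketch also shows `ignore_index=True`, an existing `pd.concat` parameter that may be useful here: without it, each one-row frame keeps its own index label, so the result contains repeated labels.

```python
import pandas as pd


def make_sample(i):
    # Hypothetical stand-in for the code that produces `input_vars`.
    return pd.DataFrame({"sample": [i], "value": [i * i]})


frames = []                              # plain Python list
for count in range(4):
    frames.append(make_sample(count))    # list.append, not DataFrame.append

# One concatenation after the loop.
features = pd.concat(frames)

# Same data, but with a fresh 0..n-1 index instead of the repeated labels.
features_reindexed = pd.concat(frames, ignore_index=True)

print(features.index.tolist())            # [0, 0, 0, 0]
print(features_reindexed.index.tolist())  # [0, 1, 2, 3]
```

Whether to pass `ignore_index=True` depends on whether the original index carries meaning; the question does not say, so this is left as an option.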