I'm trying to merge a small dataframe to another large one, looping through the small dataframes

Question

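I can print the small dataframe and see that it is being generated correctly; I wrote it with the code below. However, my final result only contains the result of the last merge, rather than carrying each result forward and merging them all.

MIK_Quantiles is the first, larger dataframe, and df2_t is the smaller dataframe generated inside the while loop. The dataframes are all generated correctly and the merge works, but I'm left with only the result of the last merge. I want the current df2_t to be merged with the already-merged result from the previous loop iteration (df_merged). I hope that makes sense!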

i = 0
while i < df_length - 1:
    cur_bound = MIK_Quantiles['bound'].iloc[i]
    cur_percentile = MIK_Quantiles['percentile'].iloc[i]
    cur_bin_low = MIK_Quantiles['auppm'].iloc[i]
    cur_bin_high = MIK_Quantiles['auppm'].iloc[i+1]

    ### Grades/Counts within bin, along with min and max
    df2 = df_orig['auppm'].loc[(df_orig['bound'] == cur_bound) & (df_orig['auppm'] >= cur_bin_low) & (df_orig['auppm'] < cur_bin_high)].describe()

    ### Add fields of interest to the output of describe for later merging together
    df2['bound'] = cur_bound
    df2['percentile'] = cur_percentile
    df2['bin_name'] = 'bin name'
    df2['bin_lower'] = cur_bin_low
    df2['bin_upper'] = cur_bin_high
    df2['temp_merger'] =  str(int(df2['bound'])) + '_' + str(df2['percentile'])

    # Write results of describe to a CSV file and transpose columns to rows
    df2.to_csv('df2.csv')
    df2_t = pd.read_csv('df2.csv').T
    df2_t.columns = ['count', 'mean', 'std', 'min', '25%', '50%', '75%', 'max', 'bound', 'percentile', 'bin_name', 'bin_lower', 'bin_upper', 'temp_merger']

    # Merge the results of the describe on the selected data with the table of quantile values to produce a final output    
    df_merged = MIK_Quantiles.merge(df2_t, how = 'inner', on = ['temp_merger'])
    pd.merge(df_merged, df2_t)
    print(df_merged)


    i = i + 1

Answer 1

Apart from incrementing i, your loop does not do anything meaningful.

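You merge 2 (static) dfs (MIK_Quantiles and df2_t), and you do that df_length times. Every time you do it (on the first, the i-th, and the last iteration of the loop), you overwrite the output variable df_merged.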
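The overwrite can be reproduced with a tiny example. This is only a sketch: `quantiles` and `piece` are hypothetical stand-ins for MIK_Quantiles and df2_t, with made-up values.

```python
import pandas as pd

# Hypothetical stand-in for MIK_Quantiles: one row per merge key.
quantiles = pd.DataFrame({
    "temp_merger": ["1_25", "1_50", "1_75"],
    "auppm": [0.1, 0.5, 1.2],
})

for key in quantiles["temp_merger"]:
    # Hypothetical one-row stand-in for df2_t from the question.
    piece = pd.DataFrame({"temp_merger": [key], "count": [10]})
    # Plain assignment rebinds df_merged on every pass, so the result of
    # the previous iteration is discarded.
    df_merged = quantiles.merge(piece, how="inner", on="temp_merger")

# Only the row matching the last key survives the loop.
print(df_merged)
```

Because each inner merge matches a single key, df_merged holds one row after the loop, the one for the final iteration.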

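To keep in the output whatever was created in the previous loop iterations, you need to concatenate all of the df2_t that you create:

df2 = pd.concat([df2, df2_t]) to 'append' the newly created data df2_t to the output dataframe df2 during each iteration of the loop, so that at the end all of the data is contained in df2

Then, after the loop, merge that one into MIK_Quantiles:

pd.merge(MIK_Quantiles, df2) (not df2_t (!)) to merge the accumulated output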

import pandas as pd

df2 = pd.DataFrame([])  # initialize your output
for i in range(0, df_length):
    df2_t = ...       # read your .csv files
    df2 = pd.concat([df2, df2_t])
df2 = ...      # do vector operations on df2 (process all of the df2_t at once)
out = pd.merge(MIK_Quantiles, df2)
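For reference, here is one way the whole pattern might look end to end. It is a sketch under assumptions, not the asker's actual pipeline: the sample values in df_orig and MIK_Quantiles are invented, the names `pieces`, `all_stats` and `in_bin` are hypothetical, and it swaps two details of the original code: the per-bin rows are collected in a list and concatenated once, and `to_frame().T` replaces the round trip through a CSV file.

```python
import pandas as pd

# Hypothetical sample data standing in for df_orig and MIK_Quantiles.
df_orig = pd.DataFrame({
    "bound": [1, 1, 1, 1, 1, 1],
    "auppm": [0.1, 0.2, 0.4, 0.6, 0.9, 1.5],
})
MIK_Quantiles = pd.DataFrame({
    "bound": [1, 1, 1],
    "percentile": [0, 50, 100],
    "auppm": [0.0, 0.5, 2.0],
})
# Build the merge key once, as a vectorized string operation.
MIK_Quantiles["temp_merger"] = (
    MIK_Quantiles["bound"].astype(str)
    + "_"
    + MIK_Quantiles["percentile"].astype(str)
)

pieces = []  # one single-row DataFrame per bin
for i in range(len(MIK_Quantiles) - 1):
    cur_bound = MIK_Quantiles["bound"].iloc[i]
    cur_low = MIK_Quantiles["auppm"].iloc[i]
    cur_high = MIK_Quantiles["auppm"].iloc[i + 1]
    in_bin = (
        (df_orig["bound"] == cur_bound)
        & (df_orig["auppm"] >= cur_low)
        & (df_orig["auppm"] < cur_high)
    )
    # describe() returns a Series; to_frame().T turns it into one row.
    stats = df_orig.loc[in_bin, "auppm"].describe().to_frame().T
    stats["temp_merger"] = MIK_Quantiles["temp_merger"].iloc[i]
    stats["bin_lower"] = cur_low
    stats["bin_upper"] = cur_high
    pieces.append(stats)

# Concatenate once after the loop, then merge once.
all_stats = pd.concat(pieces, ignore_index=True)
out = MIK_Quantiles.merge(all_stats, how="inner", on="temp_merger")
print(out)
```

Collecting the rows in a list and calling pd.concat a single time avoids copying the growing dataframe on every iteration, which tends to matter once the number of bins gets large.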


Tags: python, merge, loops, pandas