如何获取特定索引处的列名并使用 python 和 pandas 将其打印到另一列

How to get the name of a column at a specific index and print it into another column using python and pandas

我是 python 新手,需要帮助完成特定任务。我的主要目标是识别所有具有特定值和列名的指标,这些指标在一行中大于 0,并将这些值相加到同一行中的另一列中。

这是我尝试过的:

import pandas as pd
import numpy as np
table = {
    'A':[0, 2, 0, 5],
    'B' :[4, 1, 3, 0],
    'C':[2, 9, 0, 6],
    'D':[1, 0, 1, 6]
              }
df = pd.DataFrame(table)
print(df)

# create a new column that sums up the row
df['summary'] = 'NoData'

# print the header
print(df.columns.values)

   A  B  C  D summary
0  0  4  2  1  NoData
1  2  1  9  0  NoData
2  0  3  0  1  NoData
3  5  0  6  6  NoData

# get length of rows and columns
row = len(df.index)
column = len(df.columns)

# If a value at a spefic index is greater
# than 0, take the column name and the value at that index and print it into the column
#'summary'. Also write all values greater than 0 within a row below each other

for i in range(row):
    for j in range(column):
        if df.iloc[i][j] > 0:
            df.at[i,'summary'] = df.columns(df.iloc[i][j]) + '\n'

我希望我想达到的目标有点清楚。这是 'summary'

列中结果的图片

试试这个:

import pandas as pd
import numpy as np
table = {
    'A':[0, 2, 0, 5],
    'B' :[4, 1, 3, 0],
    'C':[2, 9, 0, 6],
    'D':[1, 0, 1, 6]
              }
df = pd.DataFrame(table)
print(df)

print(f'\n\n-------------BREAK-----------\n\n')
def func(line):
    templist = ''
    list_col = line.index.values.tolist()
    temp = line.values.tolist()
    for x in  range(0, len(temp)):
        if (temp[x] <= 0):
            pass
        else:
            if (x == 0 ):
                templist = f"{temp[x]}{list_col[x]}"
            else:
                templist = f"{templist}\n{temp[x]}{list_col[x]}"
    return templist
        

df['summary'] = df.apply(func, axis = 1)
print(df)

退出


   A  B  C  D
0  0  4  2  1
1  2  1  9  0
2  0  3  0  1
3  5  0  6  6


-------------BREAK-----------

   A  B  C  D       summary
0  0  4  2  1  \n4B\n2C\n1D
1  2  1  9  0    2A\n1B\n9C
2  0  3  0  1      \n3B\n1D
3  5  0  6  6    5A\n6C\n6D

你真的不需要for loop

df 开始:

    A   B   C   D
0   0   4   2   1
1   2   1   9   0
2   0   3   0   1
3   5   0   6   6

你可以这样做:

# Define an helper function
def f(val, col_name):
    # You can modify this function in order to customize the summary string
    return "" if val == 0 else str(val) + col_name + "\n"

# Assign summary column
df["summary"] = df.apply(lambda x: x.apply(f, args=(x.name,))).sum(axis=1).str[:-1]

输出:

   A    B   C   D   summary
0   0   4   2   1   4B\n2C\n1D
1   2   1   9   0   2A\n1B\n9C
2   0   3   0   1   3B\n1D
3   5   0   6   6   5A\n6C\n6D

它也适用于较长的列名:

      one  two  three four summary
0     0    4    2     1    4two\n2three\n1four
1     2    1    9     0    2one\n1two\n9three
2     0    3    0     1    3two\n1four
3     5    0    6     6    5one\n6three\n6four