Iterate over dataframe with NaN condition

I have a table, and I want to extract the data from Table A and rearrange it as Table B:

Table A:

    Day  City A  City B   City C
    Mon  NaN     Mike     NaN
    Tue  NaN     NaN      Joe
    Wed  Jack    Charlie  NaN

Table B:

    Day  Name     City
    Mon  Mike     City B
    Tue  Joe      City C
    Wed  Jack     City A
    Wed  Charlie  City B

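I extracted this information from an Excel sheet and am using Python for the task. My idea is to load the data into a dataframe, iterate over the rows to find the entries that are not NaN, and store their positions and associated data in a new dataframe.

Unfortunately, I am stuck on setting the condition that ignores NaN entries. I am trying to test it step by step, and this is as far as I have got: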

    import pandas as pd
    df = pd.read_excel('./csvtasks/rosta.xlsx',sheet_name='Sheet2')
    
    #open new excel to write to with new variable df2
    #determine whether null or not
    dg=df.notnull()
    #loop over rows
    for i,j in dg.iteritems():
        if dg.bool==FALSE:
            print('skipped something') #i want this to skip but using this print to see if it's actually skipped anything
        else:
            print (i,j)
            #this will be replace by some command that uses the df.iloc[something] and writes to df2, printing for now so i can see what it does
    #loop to end
    #close file

All this does is give me the whole dataframe as booleans, like this:

    Day   City A  City B  City C
    True  False   True    False
    True  False   False   True
    True  True    True    False
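
For reference, the loop in the attempt above never tests an individual cell: iteritems walks the frame column by column, and the condition compares an attribute of the whole frame against FALSE, which is not a Python name (the constant is False). A minimal sketch of the row-by-row idea that does work is below. It assumes Table A is rebuilt in memory instead of being read from the Excel file; records and df2 are illustrative names, not part of the original code.

```python
import numpy as np
import pandas as pd

# Table A rebuilt in memory (assumption: stands in for the Excel sheet).
df = pd.DataFrame({
    "Day": ["Mon", "Tue", "Wed"],
    "City A": [np.nan, np.nan, "Jack"],
    "City B": ["Mike", np.nan, "Charlie"],
    "City C": [np.nan, "Joe", np.nan],
})

records = []
# Visit every row, then every city column within that row.
for _, row in df.iterrows():
    for city in df.columns.drop("Day"):
        name = row[city]
        if pd.notna(name):  # test the single cell, not the whole frame
            records.append({"Day": row["Day"], "Name": name, "City": city})

# Collect the kept entries into the new dataframe.
df2 = pd.DataFrame(records, columns=["Day", "Name", "City"])
print(df2)
```

This works, but it loops in Python over every cell, so a reshaping method is usually the shorter and faster route on larger sheets.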

Try stack, which pivots the city columns into rows:

    s = df.set_index('Day').stack().reset_index()
    s.columns = ['Day','City','Name']
    print(s)

       Day    City     Name
    0  Mon  City B     Mike
    1  Tue  City C      Joe
    2  Wed  City A     Jack
    3  Wed  City B  Charlie
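
To try the stack approach without the Excel file, here is a self-contained sketch. It assumes Table A is rebuilt in memory, and table_b is an illustrative name. Newer pandas releases may keep NaN entries when stacking (an assumption worth checking against your installed version), so the sketch calls dropna explicitly rather than relying on the default.

```python
import numpy as np
import pandas as pd

# Table A rebuilt in memory (assumption: stands in for the Excel sheet).
df = pd.DataFrame({
    "Day": ["Mon", "Tue", "Wed"],
    "City A": [np.nan, np.nan, "Jack"],
    "City B": ["Mike", np.nan, "Charlie"],
    "City C": [np.nan, "Joe", np.nan],
})

s = (
    df.set_index("Day")  # keep Day as the row label
      .stack()           # one (Day, city) entry per cell
      .dropna()          # explicit, in case stack kept the NaN cells
      .reset_index()
)
s.columns = ["Day", "City", "Name"]

# Reorder the columns to match Table B.
table_b = s[["Day", "Name", "City"]]
print(table_b)
```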

You can try melt followed by dropna:

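    # var_name labels the former column headers (the cities),
    # value_name labels the cell values (the names)
    out = (df.melt(id_vars='Day', var_name='City', value_name='Name')
           .dropna())
    print(out)

       Day    City     Name
    2  Wed  City A     Jack
    3  Mon  City B     Mike
    5  Wed  City B  Charlie
    7  Tue  City C      Joe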
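
melt returns the rows column by column, so the result is ordered by city rather than by day as in Table B. If the original day order matters, one option is to keep the source row index through melt and sort on it. This is a minimal sketch under the assumption that Table A is rebuilt in memory; table_b is an illustrative name.

```python
import numpy as np
import pandas as pd

# Table A rebuilt in memory (assumption: stands in for the Excel sheet).
df = pd.DataFrame({
    "Day": ["Mon", "Tue", "Wed"],
    "City A": [np.nan, np.nan, "Jack"],
    "City B": ["Mike", np.nan, "Charlie"],
    "City C": [np.nan, "Joe", np.nan],
})

table_b = (
    # ignore_index=False keeps each row's original position as its index
    df.melt(id_vars="Day", var_name="City", value_name="Name", ignore_index=False)
      .dropna(subset=["Name"])
      # a stable sort restores the sheet's row order and keeps
      # City A before City B within the same day
      .sort_index(kind="mergesort")
      .reset_index(drop=True)
      [["Day", "Name", "City"]]
)
print(table_b)
```

Sorting on the retained index avoids hard-coding the weekday order, which would otherwise sort alphabetically.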