pandas DataFrame to OrderedDict: using a passed argument to specify the key column name instead of hard-coding it

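Based on this, I want to write a function that loads a CSV into an OrderedDict(), but I don't know how to pass the key column name as a string instead of hard-coding it. Here is my code to make this clearer: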

dic_key = 'uniqueID'
df.dic_key  # this raises AttributeError: 'DataFrame' object has no attribute 'dic_key'

I want this to behave like df.uniqueID, where uniqueID is the name of the column we want to use as the key.

The full code is below:

def csv_to_OrderedDic1(path, dic_key='uniqueID'):
    '''

    Parameters:
        dic_key: the name of the column to be used as the dictionary key

    '''
    df = pd.DataFrame.from_csv(path, sep='\t', header=0)
    # Get an unordered dictionary
    unordered_dict = df.set_index(dic_key).T.to_dict('list')
    # Then order it
    ordered_dict = OrderedDict((k,unordered_dict.get(k)) for k in df.dic_key)
    return ordered_dict

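I think it is better to use read_csv and to select the column with [] rather than dot notation:

from collections import OrderedDict
import pandas as pd

def csv_to_OrderedDic1(path, dic_key='uniqueID'):
    '''
    Load a tab-separated CSV into an OrderedDict keyed by one column.

    Parameters:
        dic_key: the name of the column to be used as the dictionary key

    '''
    df = pd.read_csv(path, sep='\t', header=0)
    # Map each key to the list of the remaining values in its row
    unordered_dict = df.set_index(dic_key).T.to_dict('list')
    # Rebuild the mapping in the row order of the file
    ordered_dict = OrderedDict((k, unordered_dict.get(k)) for k in df[dic_key])
    return ordered_dict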
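
To illustrate why the bracket form works where the dot form fails, here is a minimal sketch on a small in-memory DataFrame. The sample data and the variable names other than dic_key are made up for illustration; they are not from the question.

```python
import pandas as pd

# Hypothetical sample data, only for illustration
df = pd.DataFrame({'uniqueID': ['a', 'b', 'c'], 'x': [1, 2, 3], 'y': [4, 5, 6]})
dic_key = 'uniqueID'

# Bracket notation looks up the column whose name is stored in the variable
keys = df[dic_key].tolist()

# Dot notation looks for an attribute literally named "dic_key"
try:
    df.dic_key
    dot_access_failed = False
except AttributeError:
    dot_access_failed = True

# getattr also accepts a string, but it cannot reach columns whose names
# clash with existing DataFrame attributes, so [] is the safer choice
same_keys = getattr(df, dic_key).tolist()
```

Here df[dic_key] and getattr(df, dic_key) return the same column, while df.dic_key raises the AttributeError shown in the question.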

Another solution uses zip and removes the key column with drop:

def csv_to_OrderedDic1(path, dic_key='uniqueID'):
    '''

    Parameters:
        dic_key: the name of the column to be used as the dictionary key

    '''
    df = pd.read_csv(path, sep='\t', header=0)
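    # Pair each key with the list of the remaining values in its row;
    # columns= is used because recent pandas versions no longer accept axis positionally
    L = zip(df[dic_key], df.drop(columns=dic_key).values.tolist())
    ordered_dict = OrderedDict(L)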
    return ordered_dict
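
As a quick check of the zip approach, the sketch below runs it on tab-separated text held in memory instead of a file on disk. The function name csv_to_ordered_dict, the sample data, and the use of io.StringIO are assumptions made for this example only.

```python
import io
from collections import OrderedDict

import pandas as pd


def csv_to_ordered_dict(source, dic_key='uniqueID'):
    """Hypothetical helper: map each value of dic_key to the rest of its row."""
    df = pd.read_csv(source, sep='\t', header=0)
    # Drop the key column, then turn every remaining row into a plain list
    rows = df.drop(columns=dic_key).values.tolist()
    # zip keeps the row order of the file, and OrderedDict preserves it
    return OrderedDict(zip(df[dic_key], rows))


# Made-up sample input; read_csv accepts a file-like object as well as a path
sample = io.StringIO("uniqueID\tx\ty\nb\t1\t2\na\t3\t4\n")
result = csv_to_ordered_dict(sample, dic_key='uniqueID')
```

The keys come out in file order ('b' before 'a'), not sorted. If the key column contains repeated values, later rows overwrite earlier ones, as with any dict.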