pandas.dataframe to orderedDictionary: using a passed argument to specify the key column name instead of explicitly writing it

Question

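Based on this, I want to write a function that loads a csv into an OrderedDict(), but I can't work out how to pass the key column name as a string instead of hard-coding it. Here is my code, to make it clearer: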

dic_key = 'uniqueID'
df.dic_key #this gives AttributeError: 'DataFrame' object has no attribute 'dic_key'

instead of df.uniqueID, where uniqueID is the name of the column we want to use as the key.

The full code is below:

def csv_to_OrderedDic1(path, dic_key='uniqueID'):
    '''

    Parameters:
        dic_key: the name of the column to be used as the dictionary key

    '''
    df = pd.DataFrame.from_csv(path, sep='\t', header=0)
    # Get an unordered dictionary
    unordered_dict = df.set_index(dic_key).T.to_dict('list')
    # Then order it
    ordered_dict = OrderedDict((k,unordered_dict.get(k)) for k in df.dic_key)
    return ordered_dict

Answer 1

I think it is better to use read_csv and to select the column with [] rather than dot notation:

from collections import OrderedDict
import pandas as pd

def csv_to_OrderedDic1(path, dic_key='uniqueID'):
    '''

    Parameters:
        dic_key: the name of the column to be used as the dictionary key

    '''
    df = pd.read_csv(path, sep='\t', header=0)
    # Get an unordered dictionary
    unordered_dict = df.set_index(dic_key).T.to_dict('list')
    # Then order it
    ordered_dict = OrderedDict((k,unordered_dict.get(k)) for k in df[dic_key])
    return ordered_dict
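
To see why the bracket form works where the dot form fails, here is a minimal sketch on a small in-memory DataFrame. The sample data and the `value` column are made up for illustration; only `uniqueID` and `dic_key` come from the question.

```python
import pandas as pd

# Hypothetical sample data; the 'value' column is invented for this example.
df = pd.DataFrame({'uniqueID': ['a', 'b'], 'value': [1, 2]})
dic_key = 'uniqueID'

# Bracket indexing evaluates the variable, so this selects the 'uniqueID' column.
keys = df[dic_key].tolist()

# Dot notation uses the literal name after the dot, so pandas looks for a
# column or attribute called 'dic_key' and raises AttributeError.
try:
    df.dic_key
    dot_lookup_failed = False
except AttributeError:
    dot_lookup_failed = True

# getattr also accepts a string, but it only works for column names that are
# valid Python identifiers and do not clash with existing DataFrame attributes.
same_column = getattr(df, dic_key).equals(df[dic_key])
```

Bracket indexing is the safer choice in a function like this one, since it also handles column names containing spaces or matching DataFrame method names.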

Another solution uses zip and removes the key column with drop:

from collections import OrderedDict
import pandas as pd

def csv_to_OrderedDic1(path, dic_key='uniqueID'):
    '''

    Parameters:
        dic_key: the name of the column to be used as the dictionary key

    '''
    df = pd.read_csv(path, sep='\t', header=0)
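    # Pass the column by keyword; recent pandas versions no longer accept
    # the axis as a second positional argument to drop()
    L = zip(df[dic_key], df.drop(columns=dic_key).values.tolist())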
    ordered_dict = OrderedDict(L)
    return ordered_dict
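
As a quick check of the zip approach, the sketch below runs it on tab-separated text held in memory rather than a file on disk. The function name `csv_to_ordered_dict`, the sample rows, and the `x`/`y` columns are assumptions made for this example, not part of the original answer.

```python
import io
from collections import OrderedDict

import pandas as pd

# Hypothetical tab-separated input standing in for the CSV file on disk.
tsv = "uniqueID\tx\ty\nb\t1\t2\na\t3\t4\n"

def csv_to_ordered_dict(source, dic_key='uniqueID'):
    """Map each value of the key column to the rest of its row, in file order."""
    df = pd.read_csv(source, sep='\t', header=0)
    # Drop the key column, then turn the remaining columns into one list per row.
    rows = df.drop(columns=dic_key).values.tolist()
    # zip pairs each key with its row; OrderedDict keeps the insertion order.
    return OrderedDict(zip(df[dic_key], rows))

result = csv_to_ordered_dict(io.StringIO(tsv))
```

Here the keys come out as 'b' then 'a', matching the order of the rows in the input. If the key column contains duplicates, a later row replaces the earlier one under the same key.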


Tags: python, dictionary, ordereddictionary, dataframe, pandas