Extracting data out of a Pandas df into a list
I have a Pandas DataFrame whose headers and rows contain redundant data that I want to extract. For example, I have a df that looks like this:
import numpy as np
import pandas as pd
df = pd.DataFrame({'Your availability: Wednesday, December 25th, 2019 5:00AM-6:00AM': ['Wednesday, December 25th, 2019 5:00AM-6:00AM', np.nan, np.nan, 'Wednesday, December 25th, 2019 5:00AM-6:00AM'],
'Your availability: Tuesday, December 10th 2019 8:00AM-5:00PM': [np.nan, 'Tuesday, December 10th 2019 8:00AM-5:00PM', np.nan, np.nan]})
...and I want to extract the dates and put them into a dictionary for reference:
datetimes = {'P1': "Wednesday, December 25th, 2019 5:00AM-6:00AM", 'P2': "Tuesday, December 10th 2019 8:00AM-5:00PM", 'P3': np.nan, 'P4': "Wednesday, December 25th, 2019 5:00AM-6:00AM"}
Is this what you want:
df.drop_duplicates().stack().to_list()
Output:
['Wednesday, December 25th, 2019 5:00AM-6:00AM',
'Tuesday, December 10th 2019 8:00AM-5:00PM']
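For anyone who wants to run the one-liner above end to end, here is a minimal self-contained sketch. The variable names `wed`, `tue`, and `unique_slots` are made up for illustration, and the extra `.dropna()` is my own addition (not part of the answer) so the result does not depend on whether `stack()` discards missing values by itself.

```python
import numpy as np
import pandas as pd

# Hypothetical helper names; the strings are the ones from the question.
wed = 'Wednesday, December 25th, 2019 5:00AM-6:00AM'
tue = 'Tuesday, December 10th 2019 8:00AM-5:00PM'

df = pd.DataFrame({
    f'Your availability: {wed}': [wed, np.nan, np.nan, wed],
    f'Your availability: {tue}': [np.nan, tue, np.nan, np.nan],
})

# 1. drop_duplicates() removes the repeated (wed, NaN) row.
# 2. stack() flattens the frame row by row into a single Series.
# 3. dropna() discards any missing values that are still present.
unique_slots = df.drop_duplicates().stack().dropna().to_list()
print(unique_slots)
```

Note that this gives the distinct slots only. It loses the per-row position, so it does not produce the P1..P4 dictionary asked for in the question.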
IIUC, try this:
df.ffill(axis=1).iloc[:,-1].rename(lambda x: f'P{x+1}').to_dict()
Out[1159]:
{'P1': 'Wednesday, December 25th, 2019 5:00AM-6:00AM',
'P2': 'Tuesday, December 10th 2019 8:00AM-5:00PM',
'P3': nan,
'P4': 'Wednesday, December 25th, 2019 5:00AM-6:00AM'}
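In case the chain above is hard to follow, here is a step-by-step sketch of the same idea. It assumes each row has at most one non-missing value and that the index is the default 0, 1, 2, ...; the names `wed`, `tue`, `filled`, `last_col`, and `datetimes` are just for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical helper names; the strings are the ones from the question.
wed = 'Wednesday, December 25th, 2019 5:00AM-6:00AM'
tue = 'Tuesday, December 10th 2019 8:00AM-5:00PM'

df = pd.DataFrame({
    f'Your availability: {wed}': [wed, np.nan, np.nan, wed],
    f'Your availability: {tue}': [np.nan, tue, np.nan, np.nan],
})

# Forward-fill across columns, so each row's value is carried to the right.
filled = df.ffill(axis=1)

# The last column now holds the row's value, or NaN if the row was empty.
last_col = filled.iloc[:, -1]

# Relabel the integer index 0..3 as 'P1'..'P4' and convert to a dict.
datetimes = last_col.rename(lambda i: f'P{i + 1}').to_dict()
print(datetimes)
```

Rows with no availability at all (the third row here) stay NaN, which matches the 'P3' entry in the desired output.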