将字典转换为多索引数据框
Converting dictionary to a multi indexed dataframe
我有一个构造如下的 defaultdict:
data = defaultdict(dict)
symbol_list = [
'ETHUSDT',
'BTCUSDT'
]
for symbol in symbol_list:
data[symbol] = load_binance_data(c, symbol, '2021-12-23', timeframe='5m')
这是作为值存储在字典中的数据帧的轴:
[DatetimeIndex(['2021-12-23 00:05:00', '2021-12-23 00:10:00',
'2021-12-23 00:15:00', '2021-12-23 00:20:00',
'2021-12-23 00:25:00', '2021-12-23 00:30:00',
'2021-12-23 00:35:00', '2021-12-23 00:40:00',
'2021-12-23 00:45:00', '2021-12-23 00:50:00',
...
'2021-12-24 19:05:00', '2021-12-24 19:10:00',
'2021-12-24 19:15:00', '2021-12-24 19:20:00',
'2021-12-24 19:25:00', '2021-12-24 19:30:00',
'2021-12-24 19:35:00', '2021-12-24 19:40:00',
'2021-12-24 19:45:00', '2021-12-24 19:50:00'],
dtype='datetime64[ns]', name='time', length=526, freq=None), Index(['open', 'high', 'low', 'close', 'volume'],
dtype='object')]
我想将此字典转换为具有多个索引的单个数据框,如下所示:
[DatetimeIndex(['2021-12-23 00:05:00', '2021-12-23 00:10:00',
'2021-12-23 00:15:00', '2021-12-23 00:20:00',
'2021-12-23 00:25:00', '2021-12-23 00:30:00',
'2021-12-23 00:35:00', '2021-12-23 00:40:00',
'2021-12-23 00:45:00', '2021-12-23 00:50:00',
...
'2021-12-24 19:05:00', '2021-12-24 19:10:00',
'2021-12-24 19:15:00', '2021-12-24 19:20:00',
'2021-12-24 19:25:00', '2021-12-24 19:30:00',
'2021-12-24 19:35:00', '2021-12-24 19:40:00',
'2021-12-24 19:45:00', '2021-12-24 19:50:00'],
dtype='datetime64[ns]', name='time', freq=None),
MultiIndex([
('open', 'ETHUSDT'),
('open', 'BTCUSDT'),
('high', 'ETHUSDT'),
('high', 'BTCUSDT'),
('low', 'ETHUSDT'),
('low', 'BTCUSDT'),
('close', 'ETHUSDT'),
('close', 'BTCUSDT'),
('volume', 'ETHUSDT'),
('volume', 'BTCUSDT')],
names=['Attributes', 'Symbols'])]
我该如何进行这种转换?
提前致谢,
如果我没理解错的话,你有两个 DataFrame:
DataFrame 1,我们称它为SYMBOL1
:
open
high
low
close
volume
time
2021-12-23 00:05:00
1
3
5
7
9
2021-12-23 00:10:00
2
4
6
8
10
和 DataFrame 2,我们称它为 SYMBOL2
:
open
high
low
close
volume
time
2021-12-23 00:05:00
-1
-3
-5
-7
-9
2021-12-23 00:10:00
-2
-4
-6
-8
-10
您正试图将 DataFrame 转换为具有您上面给出的形状的轴。如果是这样,那么这是一种方法:
import pandas as pd
# Code to Create the DataFrames in the example :
d = {'open': [1, 2], 'high': [3, 4], 'low':[5,6], 'close':[7,8], 'volume':[9,10]}
df1 = pd.DataFrame(d, index=pd.DatetimeIndex(
['2021-12-23 00:05:00', '2021-12-23 00:10:00'], name='time'))
df2 = df1*-1
# The transformation :
new_axis_df1 = pd.MultiIndex.from_product(
[df1.axes[1].values, ['SYMBOL1']], names=['Attributes', 'Symbols'])
new_df1 = df1.set_axis(new_axis_df1, axis='columns')
new_axis_df2 = pd.MultiIndex.from_product(
[df2.axes[1].values, ['SYMBOL2']], names=['Attributes', 'Symbols'])
new_df2 = df2.set_axis(new_axis_df2, axis='columns')
# Merging the transformed DataFrames
final_df = new_df1.merge(new_df2, on='time')
#Result :
print(final_df.axes)
这将生成并输出以下 DataFrame 轴:
[DatetimeIndex(['2021-12-23 00:05:00', '2021-12-23 00:10:00'], dtype='datetime64[ns]', name='time', freq=None),
MultiIndex([( 'open', 'SYMBOL1'),
( 'high', 'SYMBOL1'),
( 'low', 'SYMBOL1'),
( 'close', 'SYMBOL1'),
('volume', 'SYMBOL1'),
( 'open', 'SYMBOL2'),
( 'high', 'SYMBOL2'),
( 'low', 'SYMBOL2'),
( 'close', 'SYMBOL2'),
('volume', 'SYMBOL2')],
names=['Attributes', 'Symbols'])]
final_df
的示例如下所示:
Attributes
open
high
low
close
volume
open
high
low
close
volume
Symbols
SYMBOL1
SYMBOL1
SYMBOL1
SYMBOL1
SYMBOL1
SYMBOL2
SYMBOL2
SYMBOL2
SYMBOL2
SYMBOL2
time
2021-12-23 00:05:00
1
3
5
7
9
-1
-3
-5
-7
-9
2021-12-23 00:10:00
2
4
6
8
10
-2
-4
-6
-8
-10
我有一个构造如下的 defaultdict:
data = defaultdict(dict)
symbol_list = [
'ETHUSDT',
'BTCUSDT'
]
for symbol in symbol_list:
data[symbol] = load_binance_data(c, symbol, '2021-12-23', timeframe='5m')
这是作为值存储在字典中的数据帧的轴:
[DatetimeIndex(['2021-12-23 00:05:00', '2021-12-23 00:10:00',
'2021-12-23 00:15:00', '2021-12-23 00:20:00',
'2021-12-23 00:25:00', '2021-12-23 00:30:00',
'2021-12-23 00:35:00', '2021-12-23 00:40:00',
'2021-12-23 00:45:00', '2021-12-23 00:50:00',
...
'2021-12-24 19:05:00', '2021-12-24 19:10:00',
'2021-12-24 19:15:00', '2021-12-24 19:20:00',
'2021-12-24 19:25:00', '2021-12-24 19:30:00',
'2021-12-24 19:35:00', '2021-12-24 19:40:00',
'2021-12-24 19:45:00', '2021-12-24 19:50:00'],
dtype='datetime64[ns]', name='time', length=526, freq=None), Index(['open', 'high', 'low', 'close', 'volume'],
dtype='object')]
我想将此字典转换为具有多个索引的单个数据框,如下所示:
[DatetimeIndex(['2021-12-23 00:05:00', '2021-12-23 00:10:00',
'2021-12-23 00:15:00', '2021-12-23 00:20:00',
'2021-12-23 00:25:00', '2021-12-23 00:30:00',
'2021-12-23 00:35:00', '2021-12-23 00:40:00',
'2021-12-23 00:45:00', '2021-12-23 00:50:00',
...
'2021-12-24 19:05:00', '2021-12-24 19:10:00',
'2021-12-24 19:15:00', '2021-12-24 19:20:00',
'2021-12-24 19:25:00', '2021-12-24 19:30:00',
'2021-12-24 19:35:00', '2021-12-24 19:40:00',
'2021-12-24 19:45:00', '2021-12-24 19:50:00'],
dtype='datetime64[ns]', name='time', freq=None),
MultiIndex([
('open', 'ETHUSDT'),
('open', 'BTCUSDT'),
('high', 'ETHUSDT'),
('high', 'BTCUSDT'),
('low', 'ETHUSDT'),
('low', 'BTCUSDT'),
('close', 'ETHUSDT'),
('close', 'BTCUSDT'),
('volume', 'ETHUSDT'),
('volume', 'BTCUSDT')],
names=['Attributes', 'Symbols'])]
我该如何进行这种转换?
提前致谢,
如果我没理解错的话,你有两个 DataFrame:
DataFrame 1,我们称它为
SYMBOL1
:open high low close volume time 2021-12-23 00:05:00 1 3 5 7 9 2021-12-23 00:10:00 2 4 6 8 10 和 DataFrame 2,我们称它为
SYMBOL2
:open high low close volume time 2021-12-23 00:05:00 -1 -3 -5 -7 -9 2021-12-23 00:10:00 -2 -4 -6 -8 -10
您正试图将 DataFrame 转换为具有您上面给出的形状的轴。如果是这样,那么这是一种方法:
import pandas as pd
# Code to Create the DataFrames in the example :
d = {'open': [1, 2], 'high': [3, 4], 'low':[5,6], 'close':[7,8], 'volume':[9,10]}
df1 = pd.DataFrame(d, index=pd.DatetimeIndex(
['2021-12-23 00:05:00', '2021-12-23 00:10:00'], name='time'))
df2 = df1*-1
# The transformation :
new_axis_df1 = pd.MultiIndex.from_product(
[df1.axes[1].values, ['SYMBOL1']], names=['Attributes', 'Symbols'])
new_df1 = df1.set_axis(new_axis_df1, axis='columns')
new_axis_df2 = pd.MultiIndex.from_product(
[df2.axes[1].values, ['SYMBOL2']], names=['Attributes', 'Symbols'])
new_df2 = df2.set_axis(new_axis_df2, axis='columns')
# Merging the transformed DataFrames
final_df = new_df1.merge(new_df2, on='time')
#Result :
print(final_df.axes)
这将生成并输出以下 DataFrame 轴:
[DatetimeIndex(['2021-12-23 00:05:00', '2021-12-23 00:10:00'], dtype='datetime64[ns]', name='time', freq=None),
MultiIndex([( 'open', 'SYMBOL1'),
( 'high', 'SYMBOL1'),
( 'low', 'SYMBOL1'),
( 'close', 'SYMBOL1'),
('volume', 'SYMBOL1'),
( 'open', 'SYMBOL2'),
( 'high', 'SYMBOL2'),
( 'low', 'SYMBOL2'),
( 'close', 'SYMBOL2'),
('volume', 'SYMBOL2')],
names=['Attributes', 'Symbols'])]
final_df
的示例如下所示:
Attributes | open | high | low | close | volume | open | high | low | close | volume |
---|---|---|---|---|---|---|---|---|---|---|
Symbols | SYMBOL1 | SYMBOL1 | SYMBOL1 | SYMBOL1 | SYMBOL1 | SYMBOL2 | SYMBOL2 | SYMBOL2 | SYMBOL2 | SYMBOL2 |
time | ||||||||||
2021-12-23 00:05:00 | 1 | 3 | 5 | 7 | 9 | -1 | -3 | -5 | -7 | -9 |
2021-12-23 00:10:00 | 2 | 4 | 6 | 8 | 10 | -2 | -4 | -6 | -8 | -10 |