Fill Dataframe with values from another Dataframe (not the same column names)
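I am trying to fill an empty DataFrame (OutputData) in Python with values from another DataFrame (InputData).
InputData has four columns ("Strike", "DTE", "IV", "Pred_IV").
OutputData has all unique Strikes from InputData as its index and all unique DTEs from InputData as its column names.
My goal is to fill OutputData with the corresponding "Pred_IV" values from InputData. Because both the index and the column name have to match, I cannot figure out how to do this with any function I know.
If InputData has no value matching a given index and column name, the cell can stay NaN.
Below are the DataFrames, extracted with df.to_dict(), for more detail.
Thanks a lot for your help.
Best,
Flo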
InputData.head()
    Strike    DTE      IV   Pred_IV
8   0.5131  2.784  0.3366  0.733360
9   0.5131  3.781  0.3291  0.735295
20  0.5864  2.784  0.3178  0.733476
21  0.5864  3.781  0.3129  0.735357
22  0.5864  4.778  0.3008  0.736143
InputData.head().to_dict()
{'Strike': {8: 0.5131, 9: 0.5131, 20: 0.5864, 21: 0.5864, 22: 0.5864},
'DTE': {8: 2.784, 9: 3.781, 20: 2.784, 21: 3.781, 22: 4.778},
'IV': {8: 0.33659999999999995,
9: 0.32909999999999995,
20: 0.3178,
21: 0.3129,
22: 0.30079999999999996},
'Pred_IV': {8: 0.7333602770095773,
9: 0.7352946387206533,
20: 0.7334762408944806,
21: 0.7353567361456718,
22: 0.7361431377881676}}
OutputData.head()
        0.025  0.101  0.197  0.274  0.523  0.772  1.769  2.267  2.784  3.781  4.778  5.774
0.5131    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN
0.5864    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN
0.6597    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN
0.7330    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN
0.7697    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN    NaN
OutputData.head().to_dict()
{0.025: {0.5131: nan,
0.5864: nan,
0.6597: nan,
0.733: nan,
0.7696999999999999: nan},
0.101: {0.5131: nan,
0.5864: nan,
0.6597: nan,
0.733: nan,
0.7696999999999999: nan},
0.197: {0.5131: nan,
0.5864: nan,
0.6597: nan,
0.733: nan,
0.7696999999999999: nan},
0.274: {0.5131: nan,
0.5864: nan,
0.6597: nan,
0.733: nan,
0.7696999999999999: nan},
0.523: {0.5131: nan,
0.5864: nan,
0.6597: nan,
0.733: nan,
0.7696999999999999: nan},
0.772: {0.5131: nan,
0.5864: nan,
0.6597: nan,
0.733: nan,
0.7696999999999999: nan},
1.769: {0.5131: nan,
0.5864: nan,
0.6597: nan,
0.733: nan,
0.7696999999999999: nan},
2.267: {0.5131: nan,
0.5864: nan,
0.6597: nan,
0.733: nan,
0.7696999999999999: nan},
2.784: {0.5131: nan,
0.5864: nan,
0.6597: nan,
0.733: nan,
0.7696999999999999: nan},
3.781: {0.5131: nan,
0.5864: nan,
0.6597: nan,
0.733: nan,
0.7696999999999999: nan},
4.778: {0.5131: nan,
0.5864: nan,
0.6597: nan,
0.733: nan,
0.7696999999999999: nan},
5.774: {0.5131: nan,
0.5864: nan,
0.6597: nan,
0.733: nan,
0.7696999999999999: nan}}
I think your problem looks like this:
import pandas as pd
import numpy as np

InputData = pd.DataFrame(
    columns='Strike,DTE,IV,Pred_IV'.split(','),
    index=[8, 9, 20, 21, 22],
    data=[[0.5131, 2.784, 0.3366, 0.733360],
          [0.5131, 3.781, 0.3291, 0.735295],
          [0.5864, 2.784, 0.3178, 0.733476],
          [0.5864, 3.781, 0.3129, 0.735357],
          [0.5864, 4.778, 0.3008, 0.736143]])

# Empty frame: unique Strikes as the index, unique DTEs as the columns.
# sorted() keeps the label order deterministic (a plain set does not).
OutputData = pd.DataFrame(data=np.nan,
    columns=pd.Index(name='DTE', data=sorted(InputData.DTE.unique())),
    index=pd.Index(name='Strike', data=sorted(InputData.Strike.unique())))

# Copy each row's Pred_IV into the cell addressed by (Strike, DTE).
def foo(x):
    OutputData.loc[x.Strike, x.DTE] = x.Pred_IV

InputData.apply(foo, axis=1)
print(OutputData)
Output:
DTE        2.784     3.781     4.778
Strike
0.5131  0.733360  0.735295       NaN
0.5864  0.733476  0.735357  0.736143
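One caveat with the `.loc` assignment when OutputData already exists, as in the question: the labels are floats, and the question's dict shows 0.7696999999999999 where the printed frame shows 0.7697. As far as I know, assigning through `.loc` with a label that is not present adds a new row or column instead of raising, so a tiny float difference could silently grow the frame. A minimal sketch of one way to guard against that, assuming four decimals are enough to identify a label (`DECIMALS` and the small sample frames are made up for illustration):

```python
import numpy as np
import pandas as pd

DECIMALS = 4  # assumption: labels are distinct at four decimal places

# Sample input with one Strike carrying floating-point noise.
InputData = pd.DataFrame({
    "Strike": [0.5131, 0.7696999999999999],
    "DTE": [2.784, 3.781],
    "Pred_IV": [0.733360, 0.735295],
})

# Hypothetical pre-built grid whose labels were entered in rounded form.
OutputData = pd.DataFrame(np.nan, index=[0.5131, 0.7697], columns=[2.784, 3.781])

# Round the labels on both sides the same way so they compare equal.
OutputData.index = np.round(OutputData.index.to_numpy(), DECIMALS)
OutputData.columns = np.round(OutputData.columns.to_numpy(), DECIMALS)
keys = InputData[["Strike", "DTE"]].round(DECIMALS)

for strike, dte, pred in zip(keys["Strike"], keys["DTE"], InputData["Pred_IV"]):
    # Only write into cells that already exist; skip anything off the grid.
    if strike in OutputData.index and dte in OutputData.columns:
        OutputData.loc[strike, dte] = pred

print(OutputData)
```

The membership check keeps the frame at its original shape even if InputData contains a Strike or DTE that the grid does not have.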
If you prefer unnamed indexes, you can do this instead:
OutputData = pd.DataFrame(data=np.nan,
    columns=sorted(InputData.DTE.unique()),
    index=sorted(InputData.Strike.unique()))
Output:
           2.784     3.781     4.778
0.5131  0.733360  0.735295       NaN
0.5864  0.733476  0.735357  0.736143
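As an alternative to the row-by-row `apply`, the same table can probably be built in one step with `DataFrame.pivot`, followed by `reindex` to stretch it onto a larger grid of labels. This is only a sketch: it assumes every (Strike, DTE) pair occurs at most once in InputData (`pivot` raises on duplicates), and `all_strikes` and `all_dtes` are made-up label lists standing in for the index and columns of the real OutputData.

```python
import pandas as pd

InputData = pd.DataFrame({
    "Strike": [0.5131, 0.5131, 0.5864, 0.5864, 0.5864],
    "DTE": [2.784, 3.781, 2.784, 3.781, 4.778],
    "IV": [0.3366, 0.3291, 0.3178, 0.3129, 0.3008],
    "Pred_IV": [0.733360, 0.735295, 0.733476, 0.735357, 0.736143],
}, index=[8, 9, 20, 21, 22])

# Long to wide: one row per Strike, one column per DTE, cells hold Pred_IV.
# Combinations that never occur in InputData come out as NaN.
pivoted = InputData.pivot(index="Strike", columns="DTE", values="Pred_IV")

# Hypothetical full grid; labels missing from InputData stay NaN after reindex.
all_strikes = [0.5131, 0.5864, 0.6597]
all_dtes = [2.267, 2.784, 3.781, 4.778]
OutputData = pivoted.reindex(index=all_strikes, columns=all_dtes)

print(OutputData)
```

If the (Strike, DTE) pairs are not unique, `pivot_table` with an explicit `aggfunc` would be the usual substitute, at the cost of having to decide how duplicates are combined.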