Formatting Pandas values and creating data sheets using mathematical models
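I want to group by Transaction Type so that Buy and Sell are handled separately from Short and Cover. I want to modify the function g in the code below, which separates Buy and Sell from Short and Cover. Gains/Loss and Percentage Return work for Buy and Sell but not for Short and Cover, and I want to modify the code so that they work for both.

There is a column that tracks the Gains/Loss of each stock, computed from the buy and sell values as (Buy - Sell), for example (2360.15-2160.36) + (1897-1936.2), because META has been bought and sold on 2 separate occasions. For Short and Cover it would be -(13.60 - 21.60): if the second value, 21.60, is higher than the first value, 13.60, the output is positive.

The % Gain/Loss is calculated with the equation (Buy-Sell)/Buy * 100, so for META the equation would be ((2366.15-2160.36)/2360.15 + (1897-1936.2)/1897) * 100. For Short and Cover it would be -(Cover - Short)/Short, so (-(13.60 - 21.60)/21.60) * 100. Some of the code has already been implemented, based on an earlier post.

How can I modify the table below to get the expected output?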
import pandas as pd
df = pd.DataFrame({
'Date': {0: '2/4/2022 1:33:40 PM', 1: '2/7/2022 3:09:46 PM',
2: '2/11/2022 9:35:44 AM',3: '2/12/2022 12:16:30 PM', 4: '2/14/2022 2:55:33 PM',
5: '2/15/2022 3:55:33 PM', 6: '2/15/2022 9:15:33 PM', 7:'3/1/2022 10:16:40 AM'},
'TransactionType': {0: 'Buy', 1: 'Buy', 2: 'Sell', 3:'Short', 4: 'Sell', 5: 'Buy', 6:'Sell', 7:'Cover'},
'Symbol': {0: 'META', 1: 'BABA', 2:'META', 3:'RDFN', 4: 'BABA',5: 'META', 6: 'META', 7:'RDFN'},
'Price': {0: 12.79, 1: 116.16, 2: 12.93, 3:21.81, 4: 121.82, 5: 13.55, 6:13.83, 7:1853.85},
'Amount': {0: -2366.15, 1: -2439.36, 2: 160.0, 3:21.65 , 4: 2558.22, 5:-1897, 6:1936.2, 7:13.60}})
out = df.groupby(['Symbol','TransactionType'])['TransactionType'].count().unstack().add_prefix('Number of ').add_suffix('s')
g = df.groupby(['Symbol', df['TransactionType'].eq('Buy').groupby(df['Symbol']).cumsum()])['Amount']
out['Gains/Losses'] = g.sum().groupby(level=0).sum()
out['Percentage change'] = g.pct_change().groupby(df['Symbol']).sum()
out = out.reset_index().rename_axis([None], axis=1)
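Expected output:
I don't tend to use groupby, although I'm sure others who use it regularly can comment on whether it is appropriate here.
It isn't clear how you are calculating the gains, but I believe this framework is easy to edit so that you can change the calculations.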
import pandas as pd

a = pd.DataFrame({
    'Date': {
        0: '2/4/2022 1:33:40 PM',
        1: '2/7/2022 3:09:46 PM',
        2: '2/11/2022 9:35:44 AM',
        3: '2/12/2022 12:16:30 PM',
        4: '2/14/2022 2:55:33 PM',
        5: '2/15/2022 3:55:33 PM',
        6: '2/15/2022 9:15:33 PM',
        7: '3/1/2022 10:16:40 AM'
    },
    'TransactionType': {
        0: 'Buy',
        1: 'Buy',
        2: 'Sell',
        3: 'Short',
        4: 'Sell',
        5: 'Buy',
        6: 'Sell',
        7: 'Cover'
    },
    'Symbol': {
        0: 'META',
        1: 'BABA',
        2: 'META',
        3: 'RDFN',
        4: 'BABA',
        5: 'META',
        6: 'META',
        7: 'RDFN'
    },
    'Price': {
        0: 12.79,
        1: 116.16,
        2: 12.93,
        3: 21.81,
        4: 121.82,
        5: 13.55,
        6: 13.83,
        7: 1853.85
    },
    'Amount': {
        0: -2366.15,
        1: -2439.36,
        2: 160.0,
        3: 21.65,
        4: 2558.22,
        5: -1897,
        6: 1936.2,
        7: 13.60
    }
})
print(a, '\n\n')
#### >>>>
#### ANSWER STARTS HERE ####
#### >>>>
results = {
    'Symbol': [],
    'Number of Buys': [],
    'Number of Sells': [],
    'Number of Shorts': [],
    'Number of Covers': [],
    'Gains/Loss ($)': [],
    'Percentage Return (%)': [],
}

for symbol in set(a['Symbol']):
    results['Symbol'].append(symbol)

    # now extract the rest of the data by isolating matching orders from the original
    # dataframe
    temp = a[a['Symbol'] == symbol]

    results['Number of Buys'].append(len(temp[temp['TransactionType']=='Buy']))
    results['Number of Sells'].append(len(temp[temp['TransactionType']=='Sell']))
    results['Number of Shorts'].append(len(temp[temp['TransactionType']=='Short']))
    results['Number of Covers'].append(len(temp[temp['TransactionType']=='Cover']))

    # do calculations
    # for buys and sells, multiply the price by the amount of the relevant rows
    buys = sum([
        a['Price'][i] * a['Amount'][i] for i in temp[temp['TransactionType']=='Buy'].index
    ])
    sells = sum([
        a['Price'][i] * a['Amount'][i] for i in temp[temp['TransactionType']=='Sell'].index
    ])

    # manage raw and percentage difference
    # set to 0 if either is 0 - this avoids any divide by zero errors
    gain = buys - sells
    if buys == 0.0 or gain == 0.0:
        percentage = 0.0
    else:
        percentage = 100 * gain / buys

    results['Gains/Loss ($)'].append(gain)
    results['Percentage Return (%)'].append(percentage)

# pass the dictionary to a dataframe and print
dataframe = pd.DataFrame(results)
print(dataframe)
Running this gives:
                    Date  TransactionType  Symbol    Price    Amount
0    2/4/2022 1:33:40 PM              Buy    META    12.79  -2366.15
1    2/7/2022 3:09:46 PM              Buy    BABA   116.16  -2439.36
2   2/11/2022 9:35:44 AM             Sell    META    12.93    160.00
3  2/12/2022 12:16:30 PM            Short    RDFN    21.81     21.65
4   2/14/2022 2:55:33 PM             Sell    BABA   121.82   2558.22
5   2/15/2022 3:55:33 PM              Buy    META    13.55  -1897.00
6   2/15/2022 9:15:33 PM             Sell    META    13.83   1936.20
7   3/1/2022 10:16:40 AM            Cover    RDFN  1853.85     13.60

   Symbol  Number of Buys  Number of Sells  Number of Shorts  Number of Covers  Gains/Loss ($)  Percentage Return (%)
0    META               2                2                 0                 0     -84813.8545             151.541507
1    RDFN               0                0                 1                 1          0.0000               0.000000
2    BABA               1                1                 0                 0    -594998.4180             209.982600
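If you would rather follow the formulas from the question, here is a rough sketch of one way to do it. It rests on several assumptions that are mine rather than confirmed by the question: the magnitude of `Amount` is the dollar value of each trade, the rows are in chronological order, and every Buy or Short opens a new round trip that is closed by the following Sell or Cover for the same symbol. Under those assumptions both formulas reduce to the same thing, since (Buy - Sell) and -(Cover - Short) are both "opening value minus closing value". The names `opens`, `trip`, `legs` and `trips` are just illustrative.

```python
import pandas as pd

# Reduced copy of the question's data (Date and Price are not needed here).
df = pd.DataFrame({
    'TransactionType': ['Buy', 'Buy', 'Sell', 'Short', 'Sell', 'Buy', 'Sell', 'Cover'],
    'Symbol': ['META', 'BABA', 'META', 'RDFN', 'BABA', 'META', 'META', 'RDFN'],
    'Amount': [-2366.15, -2439.36, 160.0, 21.65, 2558.22, -1897, 1936.2, 13.60],
})

# Assumption: Buy and Short both open a position; Sell and Cover close it.
opens = df['TransactionType'].isin(['Buy', 'Short'])

# Number the round trips per symbol: the counter goes up at every opening trade,
# so a closing trade shares the number of the opening trade before it.
trip = opens.astype(int).groupby(df['Symbol']).cumsum()

# Assumption: only the magnitude of Amount matters, not its sign.
value = df['Amount'].abs()

legs = pd.DataFrame({
    'Symbol': df['Symbol'],
    'Trip': trip,
    'Open': value.where(opens, 0.0),    # value of the Buy or Short
    'Close': value.where(~opens, 0.0),  # value of the Sell or Cover
})

# One row per round trip, then the question's formulas:
# (Buy - Sell) and -(Cover - Short) are both Open - Close.
trips = legs.groupby(['Symbol', 'Trip'])[['Open', 'Close']].sum()
trips['Gains/Loss ($)'] = trips['Open'] - trips['Close']
trips['Percentage Return (%)'] = 100 * trips['Gains/Loss ($)'] / trips['Open']

# Count the transaction types and attach the per-symbol totals.
counts = (
    pd.crosstab(df['Symbol'], df['TransactionType'])
    .add_prefix('Number of ')
    .add_suffix('s')
)
totals = trips.groupby(level='Symbol')[['Gains/Loss ($)', 'Percentage Return (%)']].sum()
out = counts.join(totals).reset_index()
out.columns.name = None
print(out)
```

Note that with the question's sign convention a positive value for a Buy/Sell pair means the stock was bought for more than it was sold for. If you want positive to mean profit for long trades, negate the result for those rows. A closing trade with no matching opening trade would give a zero `Open` and a division by zero in the percentage, so that case would need its own handling.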