Calculate adjusted cost base using Python pandas (portfolio analysis of stock Buy/Sell trades)
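I'm trying to do a portfolio analysis of my trades and calculate the adjusted cost base price. I've tried almost everything, but nothing seems to work. I can calculate the adjusted quantity, but I can't get the adjusted purchase price. Can anyone help?
Here is the raw data from a sample trade log: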
import pandas as pd
import numpy as np
raw_data = {
    'Date': ['04-23-2020', '05-05-2020', '05-05-2020', '05-11-2020', '05-11-2020',
             '05-12-2020', '05-12-2020', '05-27-2020', '06-03-2020', '06-03-2020',
             '06-03-2020', '06-03-2020', '06-03-2020'],
    'Type': ['Buy', 'Buy', 'Buy', 'Buy', 'Buy', 'Buy', 'Buy',
             'Sell', 'Sell', 'Sell', 'Buy', 'Sell', 'Sell'],
    'Symbol': ['TSE:AC', 'TSE:AC', 'TSE:HEXO', 'TSE:BPY.UN', 'TSE:BPY.UN',
               'TSE:BPY.UN', 'TSE:AC', 'TSE:BPY.UN', 'TSE:AC', 'TSE:BPY.UN',
               'TSE:AC', 'TSE:BPY.UN', 'TSE:HEXO'],
    'Quantity': [75, 100, 1450, 200, 50, 80, 150, 100, 125, 100, 100, 50, 1450],
    'Amount per unit': [18.04, 17.29, 0.73, 13.04, 13.06, 12.65, 15.9, 15.01,
                        18.05, 14.75, 15.8, 14.7, 1.07],
    'Turnover': [1353, 1729, 1058.5, 2608, 653, 1012, 2385, 1501, 2256.25, 1475, 1580, 735, 1551.5],
}
df = pd.DataFrame(raw_data, columns=['Date', 'Type', 'Symbol', 'Quantity', 'Amount per unit', 'Turnover']).sort_values(['Date', 'Symbol']).reset_index(drop=True)
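I can get the adjusted quantity without any problem, but I can't get the correct adjusted price per unit. The condition is that when I sell a stock, my adjusted price per unit should not change: it should stay the same as the last adjusted price from when I bought that stock.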
#to calculate adjusted quantity. this works as expected
df['Adjusted Quantity'] = df.apply(lambda x: ((x.Type == "Buy") - (x.Type == "Sell")) * x['Quantity'], axis = 1)
df['Adjusted Quantity'] = df.groupby('Symbol')['Adjusted Quantity'].cumsum()
#section where I am having problem. Works good until I reach the row where sell was made
df['Adjusted Price Per Unit'] = df.apply(lambda x: ((x.Type == "Buy") - (x.Type == "Sell")) * x['Turnover'], axis = 1)
df['Adjusted Price Per Unit'] = df.groupby('Symbol')['Adjusted Price Per Unit'].cumsum().div(df['Adjusted Quantity'])
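Running this code produces incorrect adjusted prices on the Sell rows.
For example: the adjusted price for the row at index 7 should be 12.948 (the same as the row at index 6), not 12.052. Also, the adjusted price in the last row should be 0.73 (the same as the row at index 2), because I bought and sold the same number of shares.
Example 2: at index 6 I bought 80 shares of BPY at 12.65, which brought my average price down to 12.94 on a total of 330 shares (250 + 80). Then I sold 100 shares at 15.01 (index 7). My code changes the adjusted cost to 12.05. I need the adjusted cost to stay at 12.94, not 12.05. Put simply: if the trade type is Sell, ignore the adjusted price and use the last adjusted price from the most recent Buy trade for that stock.
The last two lines of my code are incorrect. Can you help me calculate the adjusted price per unit correctly? Thanks :)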
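The numbers in Example 2 can be checked by hand. The sketch below uses only the BPY.UN figures from the sample data (the variable names are illustrative) and suggests why the cumulative-sum approach drifts: it subtracts the sale proceeds, valued at the sale price, from a running total of purchase cost.

```python
import math

# Cumulative TSE:BPY.UN buys up to index 6: 200 + 50 + 80 shares.
buy_turnover = 2608 + 653 + 1012   # total purchase cost: 4273
buy_quantity = 200 + 50 + 80       # shares held: 330
avg_after_buys = buy_turnover / buy_quantity

# What the cumsum code does on the Sell row (100 shares at 15.01):
# it removes the sale proceeds (1501) from the cost total.
after_sell_cumsum = (buy_turnover - 1501) / (buy_quantity - 100)

# Removing the sold shares at the current average cost instead
# leaves the average unchanged.
after_sell_at_cost = (buy_turnover - 100 * avg_after_buys) / (buy_quantity - 100)

print(round(avg_after_buys, 4))     # 12.9485
print(round(after_sell_cumsum, 4))  # 12.0522
print(math.isclose(after_sell_at_cost, avg_after_buys))  # True
```

Because the sale price (15.01) is above the average cost (12.9485), subtracting the proceeds takes out more than the cost of the shares sold, which pulls the computed average down.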
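If, as you commented, you don't want to compute an adjusted price for sales, you can set the Sell rows to NA and fill them with the previous value for the same stock. One thing to confirm about your code: don't you need to group by the same stock before you start calculating 'Adjusted Quantity'?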
# Sort by symbol, then date; within one date 'Buy' sorts before 'Sell'
df.sort_values(['Symbol', 'Date', 'Type'], ascending=[True, True, True], inplace=True)
# your code
df['Adjusted Quantity'] = df.apply(lambda x: ((x.Type == "Buy") - (x.Type == "Sell")) * x['Quantity'], axis = 1)
df['Adjusted Quantity'] = df.groupby('Symbol')['Adjusted Quantity'].cumsum()
df['Adjusted Price Per Unit'] = df.apply(lambda x: ((x.Type == "Buy") - (x.Type == "Sell")) * x['Turnover'], axis = 1)
df['Adjusted Price Per Unit'] = df.groupby('Symbol')['Adjusted Price Per Unit'].cumsum().div(df['Adjusted Quantity'])
# Blank out the Sell rows (np.nan; the np.NaN alias was removed in NumPy 2.0)
df.loc[df['Type'] == 'Sell', 'Adjusted Price Per Unit'] = np.nan
# Carry the last Buy value forward within each symbol only
df['Adjusted Price Per Unit'] = df.groupby('Symbol')['Adjusted Price Per Unit'].ffill()
| | Date | Type | Symbol | Quantity | Amount per unit | Turnover | Adjusted Quantity | Adjusted Price Per Unit |
|---:|:-----------|:-------|:-----------|-----------:|------------------:|-----------:|--------------------:|--------------------------:|
| 0 | 04-23-2020 | Buy | TSE:AC | 75 | 18.04 | 1353 | 75 | 18.04 |
| 1 | 05-05-2020 | Buy | TSE:AC | 100 | 17.29 | 1729 | 175 | 17.6114 |
| 5 | 05-12-2020 | Buy | TSE:AC | 150 | 15.9 | 2385 | 325 | 16.8215 |
| 9 | 06-03-2020 | Buy | TSE:AC | 100 | 15.8 | 1580 | 425 | 16.5812 |
| 8 | 06-03-2020 | Sell | TSE:AC | 125 | 18.05 | 2256.25 | 300 | 16.5812 |
| 3 | 05-11-2020 | Buy | TSE:BPY.UN | 200 | 13.04 | 2608 | 200 | 13.04 |
| 4 | 05-11-2020 | Buy | TSE:BPY.UN | 50 | 13.06 | 653 | 250 | 13.044 |
| 6 | 05-12-2020 | Buy | TSE:BPY.UN | 80 | 12.65 | 1012 | 330 | 12.9485 |
| 7 | 05-27-2020 | Sell | TSE:BPY.UN | 100 | 15.01 | 1501 | 230 | 12.9485 |
| 10 | 06-03-2020 | Sell | TSE:BPY.UN | 100 | 14.75 | 1475 | 130 | 12.9485 |
| 11 | 06-03-2020 | Sell | TSE:BPY.UN | 50 | 14.7 | 735 | 80 | 12.9485 |
| 2 | 05-05-2020 | Buy | TSE:HEXO | 1450 | 0.73 | 1058.5 | 1450 | 0.73 |
| 12 | 06-03-2020 | Sell | TSE:HEXO | 1450 | 1.07 | 1551.5 | 0 | 0.73 |
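The forward-fill approach gives the table above because, for each symbol in the sample data, no Buy row comes after a Sell row once the rows are sorted. If a Buy does follow a Sell, the cumulative sum still contains the subtracted sale proceeds, so the Buy row's average would be off. A minimal sketch of one way to handle that case follows, assuming the average-cost convention in which a sale removes shares at the current average cost. The function name `add_average_cost` and the toy trades are hypothetical and not part of the original code.

```python
import pandas as pd

def add_average_cost(trades: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with a running quantity and average cost per symbol.

    Assumes the rows are already in the order the trades happened.
    """
    out = trades.copy()
    position = {}  # symbol -> (shares held, total cost of those shares)
    quantities, prices = [], []
    for symbol, kind, quantity, turnover in zip(
        out["Symbol"], out["Type"], out["Quantity"], out["Turnover"]
    ):
        held, cost = position.get(symbol, (0, 0.0))
        if kind == "Buy":
            held += quantity
            cost += turnover
            avg = cost / held
        else:
            # A sale leaves the average untouched and removes cost at that average.
            avg = cost / held if held else float("nan")
            held -= quantity
            cost -= quantity * avg
        position[symbol] = (held, cost)
        quantities.append(held)
        prices.append(avg)
    out["Adjusted Quantity"] = quantities
    out["Adjusted Price Per Unit"] = prices
    return out

# Toy trades (hypothetical) with a Buy after a Sell.
toy = pd.DataFrame({
    "Symbol": ["X", "X", "X"],
    "Type": ["Buy", "Sell", "Buy"],
    "Quantity": [100, 50, 50],
    "Turnover": [1000, 750, 1000],
})
result = add_average_cost(toy)
print(result[["Type", "Adjusted Quantity", "Adjusted Price Per Unit"]])
```

On the toy trades this gives averages of 10.0, 10.0 and 15.0, whereas the cumulative-sum formula would give (1000 - 750 + 1000) / 100 = 12.5 on the last row. The explicit loop trades speed for clarity, which seems acceptable for a personal trade log.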