Formatting Pandas values and creating data sheets using mathematical models

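I want to group by Transaction Type so that Buy and Sell are handled separately from Short and Cover. To do that, I want to modify the grouping `g` in the code below, which is what separates Buy and Sell from Short and Cover. Gains/Loss and Percentage Return work for Buy and Sell, but not for Short and Cover. I would like to modify the code so that it works for both.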

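One column tracks the Gains/Loss for each stock as the difference between the buy value and the sell value (Buy - Sell). For META this is (2360.15-2160.36) + (1897-1936.2), because META was bought and sold on two separate occasions. For Short and Cover it would be -(13.60 - 21.60): if the second value, 21.60, is higher than the first, 13.60, the output is positive.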

The % Gain/Loss is calculated with the equation (Buy-Sell)/Buy * 100, so for META it is ((2366.15-2160.36)/2360.15 + (1897-1936.2)/1897) * 100. For Short and Cover it is -(Cover - Short)/Short, so it would be (-(13.60 - 21.60)/21.60) * 100. Part of the code below has already been implemented. How can I modify the table below to get the expected output?
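
To make the two formulas above concrete, here is a minimal sketch that writes them as plain Python functions. The names `long_pct` and `short_pct` are made up for illustration, and the sign convention is taken literally from the wording above, so a Buy value larger than the Sell value comes out positive.

```python
def long_pct(buy, sell):
    """Percentage for a Buy/Sell pair, as written above: (Buy - Sell) / Buy * 100."""
    return (buy - sell) / buy * 100


def short_pct(short, cover):
    """Percentage for a Short/Cover pair: -(Cover - Short) / Short * 100."""
    return -(cover - short) / short * 100


# Second META round trip and the RDFN short, using the figures quoted above.
meta_second = long_pct(1897, 1936.2)
rdfn = short_pct(21.60, 13.60)
print(round(meta_second, 2), round(rdfn, 2))  # prints: -2.07 37.04
```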

import pandas as pd

df = pd.DataFrame({
'Date': {0: '2/4/2022 1:33:40 PM', 1: '2/7/2022 3:09:46 PM',
         2: '2/11/2022 9:35:44 AM',3: '2/12/2022 12:16:30 PM', 4: '2/14/2022 2:55:33 PM',
         5: '2/15/2022 3:55:33 PM', 6:  '2/15/2022 9:15:33 PM', 7:'3/1/2022 10:16:40 AM'},
'TransactionType': {0: 'Buy', 1: 'Buy', 2: 'Sell', 3:'Short', 4: 'Sell', 5: 'Buy', 6:'Sell', 7:'Cover'},
'Symbol': {0: 'META', 1: 'BABA', 2:'META', 3:'RDFN', 4: 'BABA',5: 'META', 6: 'META', 7:'RDFN'},
'Price': {0: 12.79, 1: 116.16, 2: 12.93, 3:21.81, 4: 121.82, 5: 13.55, 6:13.83, 7:1853.85},
'Amount': {0: -2366.15, 1: -2439.36, 2: 160.0, 3:21.65 , 4: 2558.22, 5:-1897, 6:1936.2, 7:13.60}})

out = df.groupby(['Symbol','TransactionType'])['TransactionType'].count().unstack().add_prefix('Number of ').add_suffix('s')
g = df.groupby(['Symbol', df['TransactionType'].eq('Buy').groupby(df['Symbol']).cumsum()])['Amount']
out['Gains/Losses'] = g.sum().groupby(level=0).sum()
out['Percentage change'] = g.pct_change().groupby(df['Symbol']).sum()
out = out.reset_index().rename_axis([None], axis=1)

Expected output:

I don't tend to use groupby, although I'm sure others who use it regularly can comment on whether it is appropriate here.

It isn't clear how you are calculating the gains, but I believe this framework is easy to edit so that you can change the calculation.

a = pd.DataFrame({
    'Date': {
        0: '2/4/2022 1:33:40 PM',
        1: '2/7/2022 3:09:46 PM',
        2: '2/11/2022 9:35:44 AM',
        3: '2/12/2022 12:16:30 PM',
        4: '2/14/2022 2:55:33 PM',
        5: '2/15/2022 3:55:33 PM',
        6: '2/15/2022 9:15:33 PM',
        7:'3/1/2022 10:16:40 AM'
    },
    'TransactionType': {
        0: 'Buy',
        1: 'Buy',
        2: 'Sell',
        3: 'Short',
        4: 'Sell',
        5: 'Buy',
        6:'Sell',
        7:'Cover'
    },
    'Symbol': {
        0: 'META',
        1: 'BABA',
        2: 'META',
        3: 'RDFN',
        4: 'BABA',
        5: 'META',
        6: 'META',
        7:'RDFN'
    },
    'Price': {
        0: 12.79,
        1: 116.16,
        2: 12.93,
        3: 21.81,
        4: 121.82,
        5: 13.55,
        6: 13.83,
        7: 1853.85
    },
    'Amount': {
        0: -2366.15,
        1: -2439.36,
        2: 160.0,
        3: 21.65 ,
        4: 2558.22,
        5: -1897,
        6: 1936.2,
        7: 13.60
    }
})
print(a, '\n\n')

#### >>>>
#### ANSWER STARTS HERE ####
#### >>>>
results = {
    'Symbol': [],
    'Number of Buys': [],
    'Number of Sells': [],
    'Number of Shorts': [],
    'Number of Covers': [],
    'Gains/Loss ($)': [],
    'Percentage Return (%)': [],
}

for symbol in set(a['Symbol']):
    results['Symbol'].append(symbol)

    # now extract the rest of the data by isolating matching orders from the original
    # dataframe
    temp = a[a['Symbol'] == symbol]

    results['Number of Buys'].append(len(temp[temp['TransactionType']=='Buy']))
    results['Number of Sells'].append(len(temp[temp['TransactionType']=='Sell']))
    results['Number of Shorts'].append(len(temp[temp['TransactionType']=='Short']))
    results['Number of Covers'].append(len(temp[temp['TransactionType']=='Cover']))

    # do calculations

    # for buys and sells, multiply the price by the amount of the relevant rows
    # (Short and Cover rows are not included in this calculation)
    buy_rows = temp[temp['TransactionType'] == 'Buy']
    sell_rows = temp[temp['TransactionType'] == 'Sell']
    buys = (buy_rows['Price'] * buy_rows['Amount']).sum()
    sells = (sell_rows['Price'] * sell_rows['Amount']).sum()

    # manage raw and percentage difference
    # set to 0 if either is 0 - this avoids any divide by zero errors
    gain = buys - sells
    if buys == 0.0 or gain == 0.0:
        percentage = 0.0
    else:
        percentage = 100 * gain / buys

    results['Gains/Loss ($)'].append(gain)
    results['Percentage Return (%)'].append(percentage)

# pass the dictionary to a dataframe and print
dataframe = pd.DataFrame(results)
print(dataframe)

Running this gives:

                    Date TransactionType Symbol    Price   Amount
0    2/4/2022 1:33:40 PM             Buy   META    12.79 -2366.15
1    2/7/2022 3:09:46 PM             Buy   BABA   116.16 -2439.36
2   2/11/2022 9:35:44 AM            Sell   META    12.93   160.00
3  2/12/2022 12:16:30 PM           Short   RDFN    21.81    21.65
4   2/14/2022 2:55:33 PM            Sell   BABA   121.82  2558.22
5   2/15/2022 3:55:33 PM             Buy   META    13.55 -1897.00
6   2/15/2022 9:15:33 PM            Sell   META    13.83  1936.20
7   3/1/2022 10:16:40 AM           Cover   RDFN  1853.85    13.60 


  Symbol  Number of Buys  Number of Sells  Number of Shorts  Number of Covers  Gains/Loss ($)  Percentage Return (%)
0   META               2                2                 0                 0     -84813.8545             151.541507
1   RDFN               0                0                 1                 1          0.0000               0.000000
2   BABA               1                1                 0                 0    -594998.4180             209.982600
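
The loop above only looks at Buy and Sell rows, which is why RDFN reports 0. As a rough sketch of how Short and Cover could be folded in, the snippet below treats every trade as a signed cash flow and groups trades into round trips. It rests on several assumptions that are not confirmed by the question: `Amount` is a cash value whose sign is not reliable (the Cover row is positive in the data), so the magnitude is used and the sign is assigned from `TransactionType`; each Buy or Short opens a new round trip that the following Sell or Cover closes; a positive result means a profit, which is the opposite sign from the (Buy - Sell) wording in the question; and percentages are summed across round trips, as the question does for META. The column names `Trip`, `Cash` and `Basis` are hypothetical.

```python
import pandas as pd

# Same trades as above, reduced to the columns this calculation needs.
trades = pd.DataFrame({
    'TransactionType': ['Buy', 'Buy', 'Sell', 'Short', 'Sell', 'Buy', 'Sell', 'Cover'],
    'Symbol': ['META', 'BABA', 'META', 'RDFN', 'BABA', 'META', 'META', 'RDFN'],
    'Amount': [-2366.15, -2439.36, 160.0, 21.65, 2558.22, -1897, 1936.2, 13.60],
})

# Assumption: a Buy or a Short opens a new round trip for its symbol.
is_open = trades['TransactionType'].isin(['Buy', 'Short'])
trades['Trip'] = is_open.astype(int).groupby(trades['Symbol']).cumsum()

# Assumption: money leaves the account on Buy and Cover, and comes in on Sell and Short.
sign = trades['TransactionType'].map({'Buy': -1, 'Sell': 1, 'Short': 1, 'Cover': -1})
trades['Cash'] = sign * trades['Amount'].abs()

# The opening trade's value is the basis for the percentage of each round trip.
# A symbol whose first trade is a Sell or Cover would get a zero basis here.
trades['Basis'] = trades['Amount'].abs().where(is_open, 0.0)

per_trip = trades.groupby(['Symbol', 'Trip'])[['Cash', 'Basis']].sum()
per_trip['Pct'] = 100 * per_trip['Cash'] / per_trip['Basis']

# Sum gains and percentages over the round trips of each symbol.
summary = per_trip.groupby(level='Symbol')[['Cash', 'Pct']].sum().rename(
    columns={'Cash': 'Gains/Loss ($)', 'Pct': 'Percentage Return (%)'})

# Count each transaction type per symbol.
counts = (pd.crosstab(trades['Symbol'], trades['TransactionType'])
          .add_prefix('Number of ').add_suffix('s'))

out = counts.join(summary).reset_index().rename_axis(None, axis=1)
print(out)
```

Under these assumptions the Gains/Loss column comes out at roughly 118.86 for BABA, -2166.95 for META and 8.05 for RDFN. If a different sign convention or basis is wanted, only the `sign` mapping and the `Basis` line need to change.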