How to sum values in a new column, based on conditions and on recurrence of the same value in another column

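I am trying to find a way to update the values in new columns. I wrote code that, at each step (row by row), shows the running sums for the buy/sell orders with the best price.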

import pandas as pd

stock_buy_sell = {
    "Id": [1, 2, 3, 4, 3, 5],
    "Order": ["Buy", "Sell", "Buy", "Buy", "Buy", "Sell"],
    "Type": ["Add", "Add", "Add", "Add", "Remove", "Add"],
    "Price": [21.0, 25.0, 23.0, 23.0, 23.0, 28.0],
    "Quantity": [100, 200, 50, 70, 50, 100]}

df = pd.DataFrame(stock_buy_sell)

    Id  Order   Type    Price   Quantity
0   1   Buy     Add     21.0    100
1   2   Sell    Add     25.0    200
2   3   Buy     Add     23.0    50
3   4   Buy     Add     23.0    70
4   3   Buy     Remove  23.0    50
5   5   Sell    Add     28.0    100

Since an update can occur for a specific Id, I need to find a way to use this factor to populate the new columns correctly: Sum of income and Stock quantity.


    Id  Order   Type    Price   Quantity    Sum Of Income   Stock Quantity  Total Profit
0   1   Buy     Add     21.0    100               0                 0                  0
1   2   Sell    Add     25.0    200               0                 0                  0
2   3   Buy     Add     23.0    50                0                 0                  0
3   4   Buy     Add     23.0    70                0                 0                  0
4   3   Buy     Remove  23.0    50                0                 0                  0
5   5   Sell    Add     28.0    100               0                 0                  0

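In this simple example, I need to calculate Sum of income and Stock quantity from the previous rows (row by row), based on the buy/sell operations. On top of that, a problem appears in row 4, where Id 3 should be handled based on the same Id in row 2. In other words, to populate Sum of income and Stock quantity correctly, I need a way to subtract the price and quantity values whenever an Id has already appeared earlier.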

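I tried to find a way to do this with df.apply() and pd.Series.apply(). I also looked into the possibility of using the shift method. However, I do not know how to structure the logic or which method to use.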

Expected output (I calculated it by hand):

 Id  Order   Type Price Quantity Sum of Income Stock Quantity Total Profit
1 1   Buy     Add    21      100           -21            100        -2100
2 2  Sell     Add    25      200             4           -100         5000
3 3   Buy     Add    23       50           -19            -50        -1150
4 4   Buy     Add    23       70           -42             20        -1610
5 3   Buy  Remove    23       50           -19            -30         1150
6 5  Sell     Add    28      100             9           -130         2800
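
The hand calculation above can be reproduced with a plain row-by-row loop. This is only a minimal sketch of the arithmetic, not a pandas solution. The function name running_totals is made up for illustration, and the sketch assumes that a "Remove" row always undoes an earlier "Add" with the same price and quantity (it does not look the Id up).

```python
# Rows from the question: (Id, Order, Type, Price, Quantity)
rows = [
    (1, "Buy", "Add", 21.0, 100),
    (2, "Sell", "Add", 25.0, 200),
    (3, "Buy", "Add", 23.0, 50),
    (4, "Buy", "Add", 23.0, 70),
    (3, "Buy", "Remove", 23.0, 50),
    (5, "Sell", "Add", 28.0, 100),
]


def running_totals(rows):
    """Return (sum_of_income, stock_quantity, total_profit) for every row.

    Hypothetical helper: assumes a "Remove" simply reverses the sign
    of the order it refers to.
    """
    income = 0.0
    stock = 0
    results = []
    for _id, order, kind, price, quantity in rows:
        # +1 when the row increases the position, -1 when it reduces it.
        direction = 1 if order == "Buy" else -1
        if kind == "Remove":
            direction = -direction  # a removal undoes the original order
        income -= direction * price        # buying costs money
        stock += direction * quantity      # buying adds stock
        profit = -direction * price * quantity
        results.append((income, stock, profit))
    return results


results = running_totals(rows)
for row, result in zip(rows, results):
    print(row, result)
```

Each printed tuple holds Sum of income, Stock quantity and Total profit for that row, in the same order as the expected table.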

=====================================================================

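The following part of my post is not directly related to the question, so answerers may skip it. It is a solution to the problem for the case where the input arrives as a sequence of dictionary objects, from which we can immediately build the complete table (the same one as in the question).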

That is, at the beginning I have no data. The shareholder performs a buy/sell operation (the first step), for example:

apples_dct1 = {1: ["Buy", "Add", 21.0, 100]}

Then comes the next step:

apples_dct2 = {2: ["Sell", "Add", 25.0, 200]}

and so on.

import pandas as pd

apples_dct1 = {1:["Buy", "Add", 21.0, 100]}
apples_dct2 = {2:["Sell", "Add", 25.0, 200]}
apples_dct3 = {3:["Buy", "Add", 23.0, 50]}
apples_dct4 = {4:["Buy", "Add", 23.0, 70]}
apples_dct5 = {3:["Buy", "Remove", 23.0, 50]}
apples_dct6 = {5:["Sell", "Add", 28.0, 100]}

engine_dict = {}
def magic_engine(dict_apples):
    """
    creating objects from dict_apples:
    """
    dict_key = list(dict_apples.keys())[0]
    order = dict_apples[dict_key][0]
    type_buy_sell = dict_apples[dict_key][1]
    price = dict_apples[dict_key][2]
    quantity = dict_apples[dict_key][3]
#     print(dict_key)
#     print("dict_key[1] ", dict_apples[dict_key][1]) # test
    """
    First instance of data in a new dict `engine_dict`:
    """
    if (not engine_dict and
        type_buy_sell == "Add" and
        order == "Buy"):
            
            sum_of_income_extend = -price
            stock_quantity_extended = quantity
            profit_extended = -(price * quantity)
            base_list = [
                order,
                type_buy_sell,
                price,
                quantity,
                sum_of_income_extend,
                stock_quantity_extended,
                profit_extended
            ]
    #         print("base_list ", base_list)
            engine_dict[dict_key] = base_list
    #         print(engine_dict) # Test
            return engine_dict
    
    elif (not engine_dict and
        type_buy_sell == "Add" and
        order == "Sell"):
            
            sum_of_income_extend = price
            stock_quantity_extended = -quantity  # selling reduces the stock
            profit_extended = price * quantity
            base_list = [
                order, type_buy_sell,
                price,
                quantity,
                sum_of_income_extend,
                stock_quantity_extended,
                profit_extended
            ]
    #         print("base_list ", base_list)
            engine_dict[dict_key] = base_list
    #         print(engine_dict) # Test
            return engine_dict
    
    """
    Adding new key-value pairs to `engine_dict` 
    where 
    `update_sum_of_income_extend`,
    `stock_quantity_extend`,
    `profit_extended`
    are based on the previous `engine_dict` key. 
    With that, we can update the income, 
    stock quantity and total profit for stock holder.
    """
    if (engine_dict and
        type_buy_sell == "Add" and
        order == "Buy"):
        
        update_sum_of_income_extend = (
            engine_dict[list(engine_dict.keys())[-1]][4] - (price)
        )
        stock_quantity_extend = (
        engine_dict[list(engine_dict.keys())[-1]][5] + quantity
        )
        profit_extended = -(price * quantity)
        base_list = [
            order,
            type_buy_sell,
            price,
            quantity,
            update_sum_of_income_extend,
            stock_quantity_extend,
            profit_extended
        ]
#         print("base_list ", base_list)
        engine_dict[dict_key] = base_list
#         print(engine_dict) # Test
        return engine_dict

    elif (engine_dict and
          type_buy_sell == "Add" and
          order == "Sell"):
        
        update_sum_of_income_extend = (
                engine_dict[list(engine_dict.keys())[-1]][4] + (price)
            )
        stock_quantity_extend = (
            engine_dict[list(engine_dict.keys())[-1]][5] - quantity
        )
        profit_extended = price * quantity
#         print("engine_dict[list(engine_dict.keys())[-1]][2] ", engine_dict[list(engine_dict.keys())[-1]][2])
#         print("price ", price)
        base_list = [
            order,
            type_buy_sell,
            price,
            quantity,
            update_sum_of_income_extend,
            stock_quantity_extend,
            profit_extended
        ]
        engine_dict[dict_key] = base_list
        return engine_dict
    
    elif (engine_dict and
          type_buy_sell == "Remove" and
          order == "Buy"):

        
        update_sum_of_income_extend = (
            engine_dict[list(engine_dict.keys())[-1]][4] + (price)
            )
        stock_quantity_extend = (
            engine_dict[list(engine_dict.keys())[-1]][5] - quantity
            )
        profit_extended = price * quantity
#         print("engine_dict[list(engine_dict.keys())[-1]][2] ", engine_dict[list(engine_dict.keys())[-1]][2])
#         print("price ", price)
        base_list = [
            order,
            type_buy_sell,
            price,
            quantity,
            update_sum_of_income_extend,
            stock_quantity_extend,
            profit_extended
        ]
        """
        Because a dictionary can have just unique keys, for "removing action"
        I create a new key build: key + instance number of action.
        With that, it will be easy to find all removing actions (they will be floats)
        If there would be more "removing action" instances, then I will have for example:
        main key 3
        first "removing action" with key 3.1
        second "removing action" with key 3.2
        third "removing action" with key 3.3
        etc.
        """
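        if dict_key in engine_dict:
            # Take the first free key 3.1, 3.2, ... so that earlier
            # "removing action" entries are not overwritten.
            removal_key = round(dict_key + 0.1, 1)
            while removal_key in engine_dict:
                removal_key = round(removal_key + 0.1, 1)
            engine_dict[removal_key] = base_list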
        
        return engine_dict
    
"""
Below I have all the steps taken by the shareholder
"""    
magic_engine(apples_dct1)
magic_engine(apples_dct2)
magic_engine(apples_dct3)
magic_engine(apples_dct4)
magic_engine(apples_dct5)
magic_engine(apples_dct6)

"""
Based on a dictionary that includes all shareholder activities, 
I am building a dataframe in Pandas:
"""

df_col = [
    'Order',
    'Type',
    'Price',
    'Quantity',
    'Sum of income',
    'Stock quantity',
    'total profit'
]

new_table_buy_sell = pd.DataFrame(engine_dict)
final_table = new_table_buy_sell.transpose()
final_table.set_index([pd.Index(range(1, len(final_table) + 1)), list(engine_dict.keys())], inplace=True)
final_table.columns = df_col
final_table.columns = final_table.columns.rename("id")
final_table

Output:

  Id    Order  Type Price Quantity Sum Of Income Stock Quantity Total Profit
1 1.0   Buy     Add    21      100           -21            100        -2100
2 2.0  Sell     Add    25      200             4           -100         5000
3 3.0   Buy     Add    23       50           -19            -50        -1150
4 4.0   Buy     Add    23       70           -42             20        -1610
5 3.1   Buy  Remove    23       50           -19            -30         1150
6 5.0  Sell     Add    28      100             9           -130         2800
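
As a side note, the per-step dictionaries can also be flattened into the plain table from the question before any totals are computed. The sketch below assumes the steps are collected in a list in the order they happened (the names steps and records are made up for illustration). Because a list keeps repeated Ids, no 3.1-style keys are needed for "Remove" steps.

```python
import pandas as pd

# The per-step inputs from the post, kept in the order they happened.
steps = [
    {1: ["Buy", "Add", 21.0, 100]},
    {2: ["Sell", "Add", 25.0, 200]},
    {3: ["Buy", "Add", 23.0, 50]},
    {4: ["Buy", "Add", 23.0, 70]},
    {3: ["Buy", "Remove", 23.0, 50]},
    {5: ["Sell", "Add", 28.0, 100]},
]

# Flatten each {id: [order, type, price, quantity]} into one row.
records = [
    [step_id, *values]
    for step in steps
    for step_id, values in step.items()
]

df = pd.DataFrame(records, columns=["Id", "Order", "Type", "Price", "Quantity"])
print(df)
```

The resulting frame has the same columns as the first table in the question, so any row-wise or cumulative calculation can start from it.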

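We can use mapping dictionaries on "Order" and "Type" to decide the sign of each row, and then compute the cumulative price and quantity with cumsum. Finally, we assign the "Total" column by multiplying "Quantity" by "Price" (the cumulative price has already been renamed to "Sum of income", so "Price" here is the original per-row price):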

order_mapping = {'Buy': 1, 'Sell': -1}
type_mapping = {'Add': 1, 'Remove': -1}

df = (df.join(df[['Price','Quantity']]
             .mul(df['Order'].map(order_mapping) * df['Type'].map(type_mapping), axis=0)
             .assign(Price=lambda x: -x['Price'])
             .cumsum()
             .rename(columns={'Price':'Sum of income', 'Quantity':'Stock quantity'}))
      .assign(Total=lambda x: x['Quantity']*x['Price']))

Output:

   Id Order    Type  Price  Quantity  Sum of income  Stock quantity   Total
0   1   Buy     Add   21.0       100          -21.0             100   2100.0
1   2  Sell     Add   25.0       200            4.0            -100   5000.0
2   3   Buy     Add   23.0        50          -19.0             -50   1150.0
3   4   Buy     Add   23.0        70          -42.0              20   1610.0
4   3   Buy  Remove   23.0        50          -19.0             -30   1150.0
5   5  Sell     Add   28.0       100            9.0            -130   2800.0

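The general idea is that we use the "Order" and "Type" columns to decide whether to add or subtract each value while accumulating the sums of "Price" and "Quantity". That is what map + mul does. After we have the cumulative sums of these columns (note that a cumulative sum works on one column at a time), we find the total by multiplying two columns together.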
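
To make the individual steps easier to follow, here is the same idea split into separate statements. It is a sketch rather than a replacement for the chained version: the intermediate name sign is introduced only for illustration, and the signed "Total profit" column is an assumption taken from the expected output in the question (negative for money spent, positive for money received).

```python
import pandas as pd

df = pd.DataFrame({
    "Id": [1, 2, 3, 4, 3, 5],
    "Order": ["Buy", "Sell", "Buy", "Buy", "Buy", "Sell"],
    "Type": ["Add", "Add", "Add", "Add", "Remove", "Add"],
    "Price": [21.0, 25.0, 23.0, 23.0, 23.0, 28.0],
    "Quantity": [100, 200, 50, 70, 50, 100],
})

order_mapping = {"Buy": 1, "Sell": -1}
type_mapping = {"Add": 1, "Remove": -1}

# Step 1: one sign per row; +1 grows the position, -1 shrinks it.
# A "Remove" flips the sign of the order it cancels.
sign = df["Order"].map(order_mapping) * df["Type"].map(type_mapping)

# Step 2: cumulative sums of the signed columns.
# Buying costs money, so the price enters with the opposite sign.
df["Sum of income"] = (-sign * df["Price"]).cumsum()
df["Stock quantity"] = (sign * df["Quantity"]).cumsum()

# Step 3 (assumption): a signed per-row total, as in the question's
# hand-calculated table.
df["Total profit"] = -sign * df["Price"] * df["Quantity"]

print(df)
```

With the sample data, the three new columns agree with the hand-calculated table in the question, including the negative totals for the "Buy"/"Add" rows.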