How to sum values in a new column based on conditions, using recurrence of the same value in a column as a factor
I'm trying to find a way to update values in new columns. I have written code that, at each step (row by row), displays the running sum of buy/sell orders with the best price.
stock_buy_sell = {
"Id":[1, 2, 3, 4, 3, 5],
"Order":["Buy", "Sell", "Buy", "Buy", "Buy", "Sell"],
"Type":["Add", "Add", "Add", "Add", "Remove", "Add"],
"Price":[21.0, 25.0, 23.0, 23.0, 23.0, 28],
"Quantity":[100, 200, 50, 70, 50, 100]}
Id Order Type Price Quantity
0 1 Buy Add 21.0 100
1 2 Sell Add 25.0 200
2 3 Buy Add 23.0 50
3 4 Buy Add 23.0 70
4 3 Buy Remove 23.0 50
5 5 Sell Add 28.0 100
Because updates can occur for a particular Id, I need a way to use this factor to correctly populate the new columns Sum of income and Stock quantity.
Id Order Type Price Quantity Sum Of Income Stock Quantity Total Profit
0 1 Buy Add 21.0 100 0 0 0
1 2 Sell Add 25.0 200 0 0 0
2 3 Buy Add 23.0 50 0 0 0
3 4 Buy Add 23.0 70 0 0 0
4 3 Buy Remove 23.0 50 0 0 0
5 5 Sell Add 28.0 100 0 0 0
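In this simple example, besides having to compute Sum of income and Stock quantity row by row from the previous rows, based on the buy/sell operations, a problem arises at row 4: Id 3 there should be based on the same Id in row 2. In other words, to populate Sum of income and Stock quantity correctly, I need a way to subtract the price and quantity values with a function that applies when a given Id has already appeared.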
I tried to find a way to use df.apply() and pd.Series.apply(). I also looked into using the shift method. However, I don't know how to build the logic or which method to use.
Expected output (calculated by hand):
Id Order Type Price Quantity Sum of Income Stock Quantity Total Profit
1 1 Buy Add 21 100 -21 100 -2100
2 2 Sell Add 25 200 4 -100 5000
3 3 Buy Add 23 50 -19 -50 -1150
4 4 Buy Add 23 70 -42 20 -1610
5 3 Buy Remove 23 50 -19 -30 1150
6 5 Sell Add 28 100 9 -130 2800
============================================= =======================
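The following part of my post is not directly related to the question, so answerers may skip it.
It is a solution for the case where the input arrives as successive dictionary objects, from which we can build the complete table (the same one as in the question) as we go.
That is, at the start I have no data. The shareholder performs a buy/sell operation (the first step), for example:
apples_dct1 = {1: ["Buy", "Add", 20.0, 100]}
Then comes the next step:
apples_dct2 = {2: ["Sell", "Add", 25.0, 200]}
and so on.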
import pandas as pd
apples_dct1 = {1:["Buy", "Add", 21.0, 100]}
apples_dct2 = {2:["Sell", "Add", 25.0, 200]}
apples_dct3 = {3:["Buy", "Add", 23.0, 50]}
apples_dct4 = {4:["Buy", "Add", 23.0, 70]}
apples_dct5 = {3:["Buy", "Remove", 23.0, 50]}
apples_dct6 = {5:["Sell", "Add", 28.0, 100]}
engine_dict = {}
def magic_engine(dict_apples):
    """
    creating objects from dict_apples:
    """
    dict_key = list(dict_apples.keys())[0]
    order = dict_apples[dict_key][0]
    type_buy_sell = dict_apples[dict_key][1]
    price = dict_apples[dict_key][2]
    quantity = dict_apples[dict_key][3]
    # print(dict_key)
    # print("dict_key[1] ", dict_apples[dict_key][1]) # test
    """
    First instance of data in a new dict `engine_dict`:
    """
    if (bool(engine_dict) == False and
            dict_apples[dict_key][1] == "Add" and
            dict_apples[dict_key][0] == "Buy"):
        sum_of_income_extend = -price
        stock_quantity_extended = quantity
        profit_extended = -(price * quantity)
        base_list = [
            order,
            type_buy_sell,
            price,
            quantity,
            sum_of_income_extend,
            stock_quantity_extended,
            profit_extended
        ]
        # print("base_list ", base_list)
        engine_dict[dict_key] = base_list
        # print(engine_dict) # Test
        return engine_dict
    elif (bool(engine_dict) == False and
            dict_apples[dict_key][1] == "Add" and
            dict_apples[dict_key][0] == "Sell"):
        sum_of_income_extend = price
        stock_quantity_extended = quantity
        profit_extended = price * quantity
        base_list = [
            order, type_buy_sell,
            price,
            quantity,
            sum_of_income_extend,
            stock_quantity_extended,
            profit_extended
        ]
        # print("base_list ", base_list)
        engine_dict[dict_key] = base_list
        # print(engine_dict) # Test
        return engine_dict
    """
    Adding new key-value pairs to `engine_dict`
    where
    `update_sum_of_income_extend`,
    `stock_quantity_extend`,
    `profit_extended`
    are based on the previous `engine_dict` key.
    With that, we can update the income,
    stock quantity and total profit for stock holder.
    """
    if (bool(engine_dict) == True and
            dict_apples[dict_key][1] == "Add" and
            dict_apples[dict_key][0] == "Buy"):
        update_sum_of_income_extend = (
            engine_dict[list(engine_dict.keys())[-1]][4] - (price)
        )
        stock_quantity_extend = (
            engine_dict[list(engine_dict.keys())[-1]][5] + quantity
        )
        profit_extended = -(price * quantity)
        base_list = [
            order,
            type_buy_sell,
            price,
            quantity,
            update_sum_of_income_extend,
            stock_quantity_extend,
            profit_extended
        ]
        # print("base_list ", base_list)
        engine_dict[dict_key] = base_list
        # print(engine_dict) # Test
        return engine_dict
    elif (bool(engine_dict) == True and
            dict_apples[dict_key][1] == "Add" and
            dict_apples[dict_key][0] == "Sell"):
        update_sum_of_income_extend = (
            engine_dict[list(engine_dict.keys())[-1]][4] + (price)
        )
        stock_quantity_extend = (
            engine_dict[list(engine_dict.keys())[-1]][5] - quantity
        )
        profit_extended = price * quantity
        # print("engine_dict[list(engine_dict.keys())[-1]][2] ", engine_dict[list(engine_dict.keys())[-1]][2])
        # print("price ", price)
        base_list = [
            order,
            type_buy_sell,
            price,
            quantity,
            update_sum_of_income_extend,
            stock_quantity_extend,
            profit_extended
        ]
        engine_dict[dict_key] = base_list
        return engine_dict
    elif (bool(engine_dict) == True and
            dict_apples[dict_key][1] == "Remove" and
            dict_apples[dict_key][0] == "Buy"):
        update_sum_of_income_extend = (
            engine_dict[list(engine_dict.keys())[-1]][4] + (price)
        )
        stock_quantity_extend = (
            engine_dict[list(engine_dict.keys())[-1]][5] - quantity
        )
        profit_extended = price * quantity
        # print("engine_dict[list(engine_dict.keys())[-1]][2] ", engine_dict[list(engine_dict.keys())[-1]][2])
        # print("price ", price)
        base_list = [
            order,
            type_buy_sell,
            price,
            quantity,
            update_sum_of_income_extend,
            stock_quantity_extend,
            profit_extended
        ]
        """
        Because a dictionary can have just unique keys, for "removing action"
        I create a new key build: key + instance number of action.
        With that, it will be easy to find all removing actions (they will be floats)
        If there would be more "removing action" instances, then I will have for example:
        main key 3
        first "removing action" with key 3.1
        second "removing action" with key 3.2
        third "removing action" with key 3.3
        etc.
        """
        for i in list(engine_dict.keys())[:]:
            if i == dict_key:
                dict_key = dict_key + 0.1
        engine_dict[dict_key] = base_list
        return engine_dict
"""
Below I have all the steps taken by the shareholder
"""
magic_engine(apples_dct1)
magic_engine(apples_dct2)
magic_engine(apples_dct3)
magic_engine(apples_dct4)
magic_engine(apples_dct5)
magic_engine(apples_dct6)
"""
Based on a dictionary that includes all shareholder activities,
I am building a dataframe in Pandas:
"""
df_col = [
'Order',
'Type',
'Price',
'Quantity',
'Sum of income',
'Stock quantity',
'total profit'
]
new_table_buy_sell = pd.DataFrame(engine_dict)
final_table = new_table_buy_sell.transpose()
final_table.set_index([pd.Index([1,2,3,4,5,6]), list(engine_dict.keys())], inplace=True)
final_table.columns = df_col
final_table.columns = final_table.columns.rename("id")
final_table
Output:
Id Order Type Price Quantity Sum Of Income Stock Quantity Total Profit
1 1.0 Buy Add 21 100 -21 100 -2100
2 2.0 Sell Add 25 200 4 -100 5000
3 3.0 Buy Add 23 50 -19 -50 -1150
4 4.0 Buy Add 23 70 -42 20 -1610
5 3.1 Buy Remove 23 50 -19 -30 1150
6 5.0 Sell Add 28 100 9 -130 2800
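The "key + instance number" idea above can also be sketched directly on the DataFrame, without float keys. This is only a minimal sketch under assumptions: the column name Occurrence and the variable repeated are hypothetical names introduced here for illustration, and groupby(...).cumcount() is swapped in for the dictionary-key trick. It numbers each appearance of an Id (0 for the first, 1 for the second, and so on), which makes rows that refer to an already existing Id easy to find.

```python
import pandas as pd

# Same sample data as in the question.
df = pd.DataFrame({
    "Id": [1, 2, 3, 4, 3, 5],
    "Order": ["Buy", "Sell", "Buy", "Buy", "Buy", "Sell"],
    "Type": ["Add", "Add", "Add", "Add", "Remove", "Add"],
    "Price": [21.0, 25.0, 23.0, 23.0, 23.0, 28],
    "Quantity": [100, 200, 50, 70, 50, 100],
})

# Hypothetical helper column: 0 for the first time an Id appears,
# 1 for the second time, and so on.
df["Occurrence"] = df.groupby("Id").cumcount()

# Rows whose Id was already seen in an earlier row
# (in this sample: the "Remove" action for Id 3).
repeated = df[df["Occurrence"] > 0]
print(repeated)
```

An integer counter like this may be easier to extend than adding 0.1 to a key repeatedly, since repeated float additions can pick up rounding error.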
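We can use mapping dictionaries on "Order" and "Type" to compute the cumulative price and quantity (calculated with cumsum). Finally, the "Total" column is assigned by multiplying "Quantity" by "Price" (the cumulative price column has by then been renamed "Sum of income"):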
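import pandas as pd

# stock_buy_sell is the dictionary defined in the question.
df = pd.DataFrame(stock_buy_sell)

order_mapping = {'Buy': 1, 'Sell': -1}
type_mapping = {'Add': 1, 'Remove': -1}
# Sign each row's Price and Quantity by Order and Type, flip Price so that
# buying counts as negative income, then accumulate row by row.
df = (df.join(df[['Price', 'Quantity']]
              .mul(df['Order'].map(order_mapping) * df['Type'].map(type_mapping), axis=0)
              .assign(Price=lambda x: -x['Price'])
              .cumsum()
              .rename(columns={'Price': 'Sum of income', 'Quantity': 'Stock quantity'}))
        .assign(Total=lambda x: x['Quantity'] * x['Price']))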
Output:
Id Order Type Price Quantity Sum of income Stock quantity Total
0 1 Buy Add 21.0 100 -21.0 100 -2100.0
1 2 Sell Add 25.0 200 4.0 -100 5000.0
2 3 Buy Add 23.0 50 -19.0 -50 1150.0
3 4 Buy Add 23.0 70 -42.0 20 1610.0
4 3 Buy Remove 23.0 50 -19.0 -30 1150.0
5 5 Sell Add 28.0 100 9.0 -130 2800.0
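The general idea is that we use the "Order" column (together with "Type") to decide whether to add or subtract each value as we compute the cumulative sums of "Price" and "Quantity". That is what map + mul does. Then, after computing the cumulative sums of those columns (note that each cumulative sum applies to a single column), we get the total by multiplying two columns together (this step uses both columns).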
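The sign logic described above can be broken into separate steps. The following is a minimal sketch, not the answer's exact code: the names sign and result are introduced here for illustration, and the last column is an assumption on my part. It applies the sign to the per-row product so that it matches the "Total Profit" column in the question's expected output, whereas the one-liner above multiplies the unsigned "Quantity" and "Price".

```python
import pandas as pd

# Same sample data as in the question.
df = pd.DataFrame({
    "Id": [1, 2, 3, 4, 3, 5],
    "Order": ["Buy", "Sell", "Buy", "Buy", "Buy", "Sell"],
    "Type": ["Add", "Add", "Add", "Add", "Remove", "Add"],
    "Price": [21.0, 25.0, 23.0, 23.0, 23.0, 28],
    "Quantity": [100, 200, 50, 70, 50, 100],
})

order_mapping = {"Buy": 1, "Sell": -1}
type_mapping = {"Add": 1, "Remove": -1}

# +1 when the row increases the stock (Buy/Add),
# -1 when it decreases the stock (Sell/Add or Buy/Remove).
sign = df["Order"].map(order_mapping) * df["Type"].map(type_mapping)

result = df.assign(**{
    # Buying costs money, so income moves opposite to the stock sign.
    "Sum of income": (-sign * df["Price"]).cumsum(),
    "Stock quantity": (sign * df["Quantity"]).cumsum(),
    # Per-row signed amount (assumption: matches the question's "Total Profit").
    "Total profit": -sign * df["Price"] * df["Quantity"],
})
print(result)
```

Keeping sign as its own Series makes it easy to inspect which rows are treated as additions and which as subtractions before any cumulative sum is taken.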