Netting in a dataframe
I have a dataframe that I would like to clean up by removing some offsetting lines (boxed positions) and netting others.
Here is the source table:
Type Name Strike Maturity Nominal
Call Amazon 10 10/12/2018 1000
Put Amazon 10 10/12/2018 1000
Call Ebay 8 2/8/2018 800
Put Ebay 8 2/8/2018 500
Call Facebook 5 5/5/2018 900
Call Google 2 23/4/2018 250
Put Google 2 23/4/2018 350
Call Microsoft 2 19/3/2018 250
Put Microsoft 2.5 19/3/2018 350
Put Ebay 8 2/8/2018 100
Here is the expected result of the code:
Type Name Strike Maturity Nominal
Call Ebay 8 2/8/2018 200
Call Facebook 5 5/5/2018 900
Put Google 2 23/4/2018 100
Call Microsoft 2 19/3/2018 250
Put Microsoft 2.5 19/3/2018 350
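I am trying to write R code that performs these 3 tasks:
1// Remove all pairs that offset each other.
An offsetting pair is one that satisfies both of these conditions:
- The 2 rows have the same Name, Strike, Maturity and Nominal.
- 1 row is a "Call" and the other is a "Put".
Example: the 2 "Amazon" rows are removed from the table.
2// Net the Nominal on rows that do not fully offset each other.
A partially offsetting pair is one that satisfies both of these criteria:
- The 2 rows have the same Name, Strike and Maturity but a different Nominal.
- 1 row is a "Call" and the other is a "Put".
Example: the 2 "Ebay" rows net in favor of the Call, or the 2 "Google" rows net in favor of the Put.
3// Do nothing to all the other rows.
Example: the 2 "Microsoft" rows. They have different strikes, so no netting should be done at all.
Please see my first attempt below.
My idea was to first create a new column with a unique key, then sort alphabetically, then test row by row.
I find it laborious, so I was wondering whether someone could help me find a more direct and efficient solution?
Thank you very much!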
library(data.table)
dt <- data.table(Type=c("Call", "Put", "Call", "Put", "Call", "Call", "Put", "Call", "Put","Put"),
                 Name=c("Amazon", "Amazon", "Ebay", "Ebay", "Facebook", "Google", "Google", "Microsoft", "Microsoft","Ebay"),
                 Strike=c(10,10,8,8,5,2,2,2,2.5,8),
                 Maturity=c("10/12/2018", "10/12/2018", "2/8/2018", "2/8/2018", "5/5/2018", "23/4/2018", "23/4/2018", "19/3/2018", "19/3/2018","2/8/2018"),
                 Nominal=c(1000,1000,800,500,900,250,350,250,350,100))
##idea
# key identifying rows that may net against each other
dt$key <- paste(dt$Name,dt$Strike,dt$Maturity)
# sort by key so candidate pairs become adjacent (the result must be assigned to be kept)
dt <- dt[order(dt$key,decreasing = FALSE),]
# 1 for a Call, 0 for a Put (comparison needs ==, not =)
dt$Type2 <- ifelse(dt$Type == "Call",1,0)
#for each line k, test value in the column "key" and the column "Type2":
#if key(k) = key(k+1) and Type2(k)+Type2(k+1)=1 then
#if Nominal(k) > Nominal(k+1), delete the line k+1 and do the netting on nominal of the line k
#else (Nominal(k) < Nominal(k+1)), delete the line k and do the netting on nominal of the line k+1
#next k
dt <- dt[dt$Nominal!=0,]
dt$key <- NULL
Following the suggested idea, I tried a dcast solution, but it does not seem to do the netting correctly, as shown below:
> dt <- data.table(Type=c("Call", "Put", "Call", "Put", "Call", "Call", "Put", "Call", "Put","Put"),
+ Name=c("Amazon", "Amazon", "Ebay", "Ebay", "Facebook", "Google", "Google", "Microsoft", "Microsoft","Ebay"),
+ Strike=c(10,10,8,8,5,2,2,2,2.5,8),
+ Maturity=c("10/12/2018", "10/12/2018", "2/8/2018", "2/8/2018", "5/5/2018", "23/4/2018", "23/4/2018", "19/3/2018", "19/3/2018","2/8/2018"),
+ Nominal=c(1000,1000,800,500,900,250,350,250,350,100))
> dcast(dt, Name + Maturity + Strike ~ Type, value.var="Nominal", fill = 0)[, Net := Call - Put][Net != 0]
Aggregate function missing, defaulting to 'length'
Name Maturity Strike Call Put Net
1: Ebay 2/8/2018 8.0 1 2 -1
2: Facebook 5/5/2018 5.0 1 0 1
3: Microsoft 19/3/2018 2.0 1 0 1
4: Microsoft 19/3/2018 2.5 0 1 -1
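The message "Aggregate function missing, defaulting to 'length'" suggests that dcast found more than one row for at least one cell (here, the two Ebay Put rows) and therefore counted rows instead of summing Nominal. A minimal sketch, assuming that summing per cell with fun.aggregate = sum is what the suggestion intended (the names wide and net are illustrative):

```r
library(data.table)

dt <- data.table(
  Type     = c("Call", "Put", "Call", "Put", "Call", "Call", "Put", "Call", "Put", "Put"),
  Name     = c("Amazon", "Amazon", "Ebay", "Ebay", "Facebook", "Google", "Google",
               "Microsoft", "Microsoft", "Ebay"),
  Strike   = c(10, 10, 8, 8, 5, 2, 2, 2, 2.5, 8),
  Maturity = c("10/12/2018", "10/12/2018", "2/8/2018", "2/8/2018", "5/5/2018",
               "23/4/2018", "23/4/2018", "19/3/2018", "19/3/2018", "2/8/2018"),
  Nominal  = c(1000, 1000, 800, 500, 900, 250, 350, 250, 350, 100)
)

# Sum Nominal per cell instead of counting rows;
# fill = 0 covers groups that have only a Call or only a Put.
wide <- dcast(dt, Name + Maturity + Strike ~ Type,
              value.var = "Nominal", fun.aggregate = sum, fill = 0)

# Positive Net means a net Call, negative Net a net Put; exact offsets are dropped.
net <- wide[, Net := Call - Put][Net != 0]
print(net)
```

With the sum in place, the sign of Net tells which side remains, so the Type column can be rebuilt from it afterwards.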
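Here is a tidyverse solution. Essentially, since you want to group all rows that share the same Name, Strike and Maturity, I think it is simplest to convert Call and Put into signed numbers and use summarise. Your special offsetting case then just amounts to removing the netted groups whose total ends up as 0.
The approach is:
- Use ifelse and mutate to convert Put rows to negative values of Nominal,
- Use group_by and summarise to reduce each group to a single value,
- Remove the perfect offsets with filter,
- Replace the Type column and turn the negative values positive.
Code: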
library(tidyverse)
tbl <- read_table2(
"Type Name Strike Maturity Nominal
Call Amazon 10 10/12/2018 1000
Put Amazon 10 10/12/2018 1000
Call Ebay 8 2/8/2018 800
Put Ebay 8 2/8/2018 500
Call Facebook 5 5/5/2018 900
Call Google 2 23/4/2018 250
Put Google 2 23/4/2018 350
Call Microsoft 2 19/3/2018 250
Put Microsoft 2.5 19/3/2018 350
Put Ebay 8 2/8/2018 100"
)
tbl %>%
  mutate(actual = ifelse(Type == "Call", Nominal, -Nominal)) %>%
  group_by(Name, Strike, Maturity) %>%
  summarise(Net = sum(actual)) %>%
  filter(Net != 0) %>%
  mutate(
    Type = ifelse(Net > 0, "Call", "Put"),
    Net = abs(Net)
  )
# A tibble: 5 x 5
# Groups: Name, Strike [5]
Name Strike Maturity Net Type
<chr> <dbl> <chr> <int> <chr>
1 Ebay 8.00 2/8/2018 200 Call
2 Facebook 5.00 5/5/2018 900 Call
3 Google 2.00 23/4/2018 100 Put
4 Microsoft 2.00 19/3/2018 250 Call
5 Microsoft 2.50 19/3/2018 350 Put
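Since the question starts from a data.table, the same grouped netting can also be sketched without leaving data.table. This is only an illustration of the idea above under that assumption, not part of the original answer; the column name signed and the object res are hypothetical:

```r
library(data.table)

dt <- data.table(
  Type     = c("Call", "Put", "Call", "Put", "Call", "Call", "Put", "Call", "Put", "Put"),
  Name     = c("Amazon", "Amazon", "Ebay", "Ebay", "Facebook", "Google", "Google",
               "Microsoft", "Microsoft", "Ebay"),
  Strike   = c(10, 10, 8, 8, 5, 2, 2, 2, 2.5, 8),
  Maturity = c("10/12/2018", "10/12/2018", "2/8/2018", "2/8/2018", "5/5/2018",
               "23/4/2018", "23/4/2018", "19/3/2018", "19/3/2018", "2/8/2018"),
  Nominal  = c(1000, 1000, 800, 500, 900, 250, 350, 250, 350, 100)
)

# Sign the nominal: Calls count as positive, Puts as negative.
dt[, signed := ifelse(Type == "Call", Nominal, -Nominal)]

# One net value per (Name, Strike, Maturity); groups that cancel exactly are dropped.
res <- dt[, .(Net = sum(signed)), by = .(Name, Strike, Maturity)][Net != 0]

# Recover Type from the sign and report Nominal as a positive number.
res[, `:=`(Type = ifelse(Net > 0, "Call", "Put"), Nominal = abs(Net))]
res[, Net := NULL]
setcolorder(res, c("Type", "Name", "Strike", "Maturity", "Nominal"))
print(res)
```

Grouping with by keeps the groups in order of first appearance, so the rows come out in the same order as the expected result in the question.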