Netting in a dataframe

I have a dataframe that I would like to clean up by removing some offsetting lines (boxed positions) and netting the rest.

Here is the source table:

    Type  Name     Strike  Maturity    Nominal
    Call  Amazon    10     10/12/2018  1000
    Put   Amazon    10     10/12/2018  1000
    Call  Ebay      8      2/8/2018    800
    Put   Ebay      8      2/8/2018    500
    Call  Facebook  5      5/5/2018    900
    Call  Google    2      23/4/2018   250
    Put   Google    2      23/4/2018   350
    Call  Microsoft 2      19/3/2018   250
    Put   Microsoft 2.5    19/3/2018   350
    Put   Ebay      8      2/8/2018    100

Here is the result the code should produce:

    Type  Name      Strike  Maturity   Nominal
    Call  Ebay      8       2/8/2018   200
    Call  Facebook  5       5/5/2018   900
    Put   Google    2       23/4/2018  100
    Call  Microsoft 2       19/3/2018  250
    Put   Microsoft 2.5     19/3/2018  350

I am trying to write code in R to perform these 3 tasks:

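1// Remove all pairs that offset each other. A pair offsets each other when it meets these two criteria: the two rows have the same Name, Strike, Maturity and Nominal, and they have opposite Types (one Call, one Put).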

Example: the 2 "Amazon" rows are removed from the table.

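2// Net the nominals of rows that do not fully offset each other. A pair does not fully offset when it meets these two criteria: the two rows have the same Name, Strike and Maturity but opposite Types, and their Nominals differ.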

Example: the "Ebay" rows net to a Call of 200, and the 2 "Google" rows net to a Put of 100.

3// Do nothing to all the other rows.

Example: the 2 "Microsoft" rows. They have different strikes, so no netting should be done at all.

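Please see my first attempt below. My idea was to first create a new column with a unique key, then sort by that key alphabetically, then test row by row. I find this very laborious, so I wonder whether anyone could help me find a more direct and efficient solution? Thanks a lot!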

library(data.table)

dt <- data.table(Type=c("Call", "Put", "Call", "Put", "Call", "Call", "Put", "Call", "Put","Put"),
                 Name=c("Amazon", "Amazon", "Ebay", "Ebay", "Facebook", "Google", "Google", "Microsoft", "Microsoft","Ebay"),
                 Strike=c(10,10,8,8,5,2,2,2,2.5,8),
                 Maturity=c("10/12/2018", "10/12/2018", "2/8/2018", "2/8/2018", "5/5/2018", "23/4/2018", "23/4/2018", "19/3/2018", "19/3/2018","2/8/2018"),
                 Nominal=c(1000,1000,800,500,900,250,350,250,350,100))

##idea
dt$key <- paste(dt$Name,dt$Strike,dt$Maturity)
dt <- dt[order(key)]                         # assign the sorted result back to dt
dt$Type2 <- ifelse(dt$Type == "Call", 1, 0)  # == tests equality; a single = is an error here

# for each row k, test the value in the column "key" and the column "Type2":
# if key(k) == key(k+1) and Type2(k) + Type2(k+1) == 1 then
#     if Nominal(k) > Nominal(k+1), delete row k+1 and put the net nominal on row k
#     else (Nominal(k) < Nominal(k+1)), delete row k and put the net nominal on row k+1
# next k

dt <- dt[dt$Nominal!=0,]
dt$key <- NULL

Following the suggested idea, I tried the dcast solution, but it does not seem to do the netting correctly, as shown below:

> dt <- data.table(Type=c("Call", "Put", "Call", "Put", "Call", "Call", "Put", "Call", "Put","Put"),
+                  Name=c("Amazon", "Amazon", "Ebay", "Ebay", "Facebook", "Google", "Google", "Microsoft", "Microsoft","Ebay"),
+                  Strike=c(10,10,8,8,5,2,2,2,2.5,8),
+                  Maturity=c("10/12/2018", "10/12/2018", "2/8/2018", "2/8/2018", "5/5/2018", "23/4/2018", "23/4/2018", "19/3/2018", "19/3/2018","2/8/2018"),
+                  Nominal=c(1000,1000,800,500,900,250,350,250,350,100))
> dcast(dt, Name + Maturity + Strike ~ Type, value.var="Nominal", fill = 0)[, Net := Call - Put][Net != 0]
Aggregate function missing, defaulting to 'length'
        Name  Maturity Strike Call Put Net
1:      Ebay  2/8/2018    8.0    1   2  -1
2:  Facebook  5/5/2018    5.0    1   0   1
3: Microsoft 19/3/2018    2.0    1   0   1
4: Microsoft 19/3/2018    2.5    0   1  -1
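
The message "Aggregate function missing, defaulting to 'length'" suggests that dcast counted the rows in each cell instead of summing their nominals, because the three "Ebay" rows share one key. A minimal sketch of a possible fix, assuming the same dt as above and passing sum as the aggregate function (the names wide and Net are only illustrative):

```r
library(data.table)

dt <- data.table(
  Type     = c("Call", "Put", "Call", "Put", "Call", "Call", "Put", "Call", "Put", "Put"),
  Name     = c("Amazon", "Amazon", "Ebay", "Ebay", "Facebook", "Google", "Google",
               "Microsoft", "Microsoft", "Ebay"),
  Strike   = c(10, 10, 8, 8, 5, 2, 2, 2, 2.5, 8),
  Maturity = c("10/12/2018", "10/12/2018", "2/8/2018", "2/8/2018", "5/5/2018",
               "23/4/2018", "23/4/2018", "19/3/2018", "19/3/2018", "2/8/2018"),
  Nominal  = c(1000, 1000, 800, 500, 900, 250, 350, 250, 350, 100)
)

# Sum the nominals per cell instead of counting rows;
# fill = 0 covers keys that have only a Call or only a Put
wide <- dcast(dt, Name + Maturity + Strike ~ Type,
              value.var = "Nominal", fun.aggregate = sum, fill = 0)

# Net position per key: positive means net Call, negative means net Put;
# fully offsetting keys (Net == 0) are dropped
wide <- wide[, Net := Call - Put][Net != 0]

# Rebuild the Type and Nominal columns from the sign of Net
wide[, Type := ifelse(Net > 0, "Call", "Put")]
wide[, Nominal := abs(Net)]
print(wide)
```

With this change the "Ebay" key nets 800 against 500 + 100, and the "Amazon" key disappears because its net is exactly 0.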

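Here is a tidyverse solution. Basically, since you want to group all rows that have the same Name, Strike and Maturity, I think it is simplest to convert Call and Put into actual signed numbers and use summarise. Your special offsetting case is then really just removing the netted cases whose total ends up being 0.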

The approach is:

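  1. Use ifelse inside mutate to convert Put rows into negative values of Nominal,
  2. Use group_by and summarise to reduce each group to a single value,
  3. Remove the perfect offsets with filter,
  4. Replace the Type column and turn negative values into positive ones.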

Code:

library(tidyverse)
tbl <- read_table2(
  "Type  Name     Strike  Maturity    Nominal
  Call  Amazon    10     10/12/2018  1000
  Put   Amazon    10     10/12/2018  1000
  Call  Ebay      8      2/8/2018    800
  Put   Ebay      8      2/8/2018    500
  Call  Facebook  5      5/5/2018    900
  Call  Google    2      23/4/2018   250
  Put   Google    2      23/4/2018   350
  Call  Microsoft 2      19/3/2018   250
  Put   Microsoft 2.5    19/3/2018   350
  Put   Ebay      8      2/8/2018    100"
)

tbl %>%
  mutate(actual = ifelse(Type == "Call", Nominal, -Nominal)) %>%
  group_by(Name, Strike, Maturity) %>%
  summarise(Net = sum(actual)) %>%
  filter(Net != 0) %>%
  mutate(
    Type = ifelse(Net > 0, "Call", "Put"),
    Net = abs(Net)
    )
# A tibble: 5 x 5
# Groups:   Name, Strike [5]
  Name      Strike Maturity    Net Type 
  <chr>      <dbl> <chr>     <int> <chr>
1 Ebay        8.00 2/8/2018    200 Call 
2 Facebook    5.00 5/5/2018    900 Call 
3 Google      2.00 23/4/2018   100 Put  
4 Microsoft   2.00 19/3/2018   250 Call 
5 Microsoft   2.50 19/3/2018   350 Put
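
Since the question starts from a data.table, the same signed-sum idea can also be sketched there. This is only an illustration under the assumption that exact comparison with 0 is acceptable for these integer-valued nominals; the column names Signed and Net and the object net are made up for the example:

```r
library(data.table)

dt <- data.table(
  Type     = c("Call", "Put", "Call", "Put", "Call", "Call", "Put", "Call", "Put", "Put"),
  Name     = c("Amazon", "Amazon", "Ebay", "Ebay", "Facebook", "Google", "Google",
               "Microsoft", "Microsoft", "Ebay"),
  Strike   = c(10, 10, 8, 8, 5, 2, 2, 2, 2.5, 8),
  Maturity = c("10/12/2018", "10/12/2018", "2/8/2018", "2/8/2018", "5/5/2018",
               "23/4/2018", "23/4/2018", "19/3/2018", "19/3/2018", "2/8/2018"),
  Nominal  = c(1000, 1000, 800, 500, 900, 250, 350, 250, 350, 100)
)

# Step 1: sign the nominal, Calls positive and Puts negative
dt[, Signed := ifelse(Type == "Call", Nominal, -Nominal)]

# Steps 2 and 3: one net value per (Name, Strike, Maturity), dropping perfect offsets
net <- dt[, .(Net = sum(Signed)), by = .(Name, Strike, Maturity)][Net != 0]

# Step 4: recover Type from the sign and make the nominal positive again
net[, `:=`(Type = ifelse(Net > 0, "Call", "Put"), Nominal = abs(Net))]
net[, Net := NULL]
setcolorder(net, c("Type", "Name", "Strike", "Maturity", "Nominal"))
print(net)
```

Grouping with by keeps the groups in order of first appearance, so the rows should come out in the same order as the expected result in the question. If the nominals could be non-integer, comparing abs(Net) against a small tolerance may be safer than Net != 0.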