Create new column based on replicate observation
I have a question about how to create a new column based on another one.
Here is part of my data:
Category Brand Time1 value Time2 number
2 HTC 2015-01-01 1724 NA 1
6 APPLE 2015-10-10 3000 2015-10-30 1
2 APPLE 2016-01-15 430 NA 1
NA Samsung 2016-10-20 860 2016-12-20 1
I have shown 4 observations of my data above. Let me explain the data in more detail.
First, look at the structure.
> str(data)
Classes ‘data.table’ and 'data.frame': 105907 obs. of 6 variables:
$ Category : num 2 2 2 2 2 2 2 2 2 2 ...
$ Brand : chr "HTC" "APPLE" "INFOCUS" "APPLE" ...
$ Time1 : POSIXct, format: "2015-01-01" "2015-01-01" "2015-01-01" "2015-01-01" ...
$ value : num 1724 2946 330 2946 2946 ...
$ Time2 : POSIXct, format: NA NA NA "2015-01-03" ...
$ number : chr "1" "1" "1" "1" ...
- attr(*, ".internal.selfref")=<externalptr>
Second, I want to replicate each observation based on Time1.
Here is my code:
data[,rep:=ifelse(year(Time1)==2016, 12-month(Time1)+1, 13)][rep(1:.N,rep)][]
Now, data looks like:
Category Brand Time1 value Time2 number rep
2 HTC 2015-01-01 1724 NA 1 13
2 HTC 2015-01-01 1724 NA 1 13
2 HTC 2015-01-01 1724 NA 1 13
2 HTC 2015-01-01 1724 NA 1 13
2 HTC 2015-01-01 1724 NA 1 13
2 HTC 2015-01-01 1724 NA 1 13
2 HTC 2015-01-01 1724 NA 1 13
2 HTC 2015-01-01 1724 NA 1 13
2 HTC 2015-01-01 1724 NA 1 13
2 HTC 2015-01-01 1724 NA 1 13
2 HTC 2015-01-01 1724 NA 1 13
2 HTC 2015-01-01 1724 NA 1 13
2 HTC 2015-01-01 1724 NA 1 13
6 APPLE 2015-10-10 3000 2015-10-30 1 13
6 APPLE 2015-10-10 3000 2015-10-30 1 13
6 APPLE 2015-10-10 3000 2015-10-30 1 13
6 APPLE 2015-10-10 3000 2015-10-30 1 13
6 APPLE 2015-10-10 3000 2015-10-30 1 13
6 APPLE 2015-10-10 3000 2015-10-30 1 13
6 APPLE 2015-10-10 3000 2015-10-30 1 13
6 APPLE 2015-10-10 3000 2015-10-30 1 13
6 APPLE 2015-10-10 3000 2015-10-30 1 13
6 APPLE 2015-10-10 3000 2015-10-30 1 13
6 APPLE 2015-10-10 3000 2015-10-30 1 13
6 APPLE 2015-10-10 3000 2015-10-30 1 13
6 APPLE 2015-10-10 3000 2015-10-30 1 13
2 APPLE 2016-01-15 430 NA 1 12
2 APPLE 2016-01-15 430 NA 1 12
2 APPLE 2016-01-15 430 NA 1 12
2 APPLE 2016-01-15 430 NA 1 12
2 APPLE 2016-01-15 430 NA 1 12
2 APPLE 2016-01-15 430 NA 1 12
2 APPLE 2016-01-15 430 NA 1 12
2 APPLE 2016-01-15 430 NA 1 12
2 APPLE 2016-01-15 430 NA 1 12
2 APPLE 2016-01-15 430 NA 1 12
2 APPLE 2016-01-15 430 NA 1 12
2 APPLE 2016-01-15 430 NA 1 12
NA Samsung 2016-10-20 860 2016-12-20 1 3
NA Samsung 2016-10-20 860 2016-12-20 1 3
NA Samsung 2016-10-20 860 2016-12-20 1 3
Third, I want to create a new column, Lapse. The result I want is:
Category Brand Time1 value Time2 number rep Lapse
2 HTC 2015-01-01 1724 NA 1 13 0
2 HTC 2015-01-01 1724 NA 1 13 1
2 HTC 2015-01-01 1724 NA 1 13 2
2 HTC 2015-01-01 1724 NA 1 13 3
2 HTC 2015-01-01 1724 NA 1 13 4
2 HTC 2015-01-01 1724 NA 1 13 5
2 HTC 2015-01-01 1724 NA 1 13 6
2 HTC 2015-01-01 1724 NA 1 13 7
2 HTC 2015-01-01 1724 NA 1 13 8
2 HTC 2015-01-01 1724 NA 1 13 9
2 HTC 2015-01-01 1724 NA 1 13 10
2 HTC 2015-01-01 1724 NA 1 13 11
2 HTC 2015-01-01 1724 NA 1 13 12
6 APPLE 2015-10-10 3000 2015-10-30 1 13 0
6 APPLE 2015-10-10 3000 2015-10-30 1 13 1
6 APPLE 2015-10-10 3000 2015-10-30 1 13 2
6 APPLE 2015-10-10 3000 2015-10-30 1 13 3
6 APPLE 2015-10-10 3000 2015-10-30 1 13 4
6 APPLE 2015-10-10 3000 2015-10-30 1 13 5
6 APPLE 2015-10-10 3000 2015-10-30 1 13 6
6 APPLE 2015-10-10 3000 2015-10-30 1 13 7
6 APPLE 2015-10-10 3000 2015-10-30 1 13 8
6 APPLE 2015-10-10 3000 2015-10-30 1 13 9
6 APPLE 2015-10-10 3000 2015-10-30 1 13 10
6 APPLE 2015-10-10 3000 2015-10-30 1 13 11
6 APPLE 2015-10-10 3000 2015-10-30 1 13 12
2 APPLE 2016-01-15 430 NA 1 12 0
2 APPLE 2016-01-15 430 NA 1 12 1
2 APPLE 2016-01-15 430 NA 1 12 2
2 APPLE 2016-01-15 430 NA 1 12 3
2 APPLE 2016-01-15 430 NA 1 12 4
2 APPLE 2016-01-15 430 NA 1 12 5
2 APPLE 2016-01-15 430 NA 1 12 6
2 APPLE 2016-01-15 430 NA 1 12 7
2 APPLE 2016-01-15 430 NA 1 12 8
2 APPLE 2016-01-15 430 NA 1 12 9
2 APPLE 2016-01-15 430 NA 1 12 10
2 APPLE 2016-01-15 430 NA 1 12 11
NA Samsung 2016-10-20 860 2016-12-20 1 3 0
NA Samsung 2016-10-20 860 2016-12-20 1 3 1
NA Samsung 2016-10-20 860 2016-12-20 1 3 2
The above is the result I want. I tried code like this:
data[,Lapse := seq_len(.N)-1, by = .(Category,Brand,Time1,value,Time2,number)]
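However, this is wrong.
If it were right, I would get
unique(data$Lapse)
[1] 0 1 2 3 4 5 6 7 8 9 10 11 12
Instead, I get values from 0 to 999, so I think my code is wrong.
Any suggestions?
Or is there perhaps a better way to do this?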
Update
data <- " Category Brand Time1 value Time2 number
1: 2 HTC 2015-01-01 1724 NA 1
2: 2 APPLE 2015-01-01 2946 NA 1
3: 2 INFOCUS 2015-01-01 330 NA 1
4: 2 APPLE 2015-01-01 2946 2015-01-03 1
5: 2 APPLE 2015-01-01 2946 NA 1
6: 2 Samsung 2015-01-01 2189 NA 1
7: 2 HTC 2015-01-01 730 NA 1
8: 2 Samsung 2015-01-01 2189 NA 1
9: 2 Samsung 2015-01-01 2189 NA 1
10: 2 HTC 2015-01-01 1296 NA 1
11: 2 HTC 2015-01-01 730 NA 1
12: 2 APPLE 2015-01-01 2189 NA 1
13: 2 INFOCUS 2015-01-01 330 2015-01-02 1
14: 2 HTC 2015-01-01 2189 NA 1
15: 2 SONY 2015-01-01 1296 NA 1
16: 2 HTC 2015-01-01 730 NA 1
17: 2 APPLE 2015-01-01 2946 NA 1
18: 2 APPLE 2015-01-01 2946 NA 1
19: 2 HTC 2015-01-01 1724 NA 1
20: 2 Samsung 2015-01-02 1724 NA 1
21: 2 Samsung 2015-01-02 2189 NA 1
22: 2 HTC 2015-01-02 730 NA 1
23: 2 Samsung 2015-01-02 2189 NA 1
24: 2 HTC 2015-01-02 730 NA 1
25: 2 APPLE 2015-01-02 2946 NA 1
26: 2 HTC 2015-01-02 1724 NA 1
27: 2 HTC 2015-01-02 730 NA 1
28: 2 ASUS 2015-01-02 330 NA 1
29: 2 ASUS 2015-01-02 330 NA 1
30: 2 Samsung 2015-01-02 1724 NA 1
31: 2 APPLE 2015-01-02 2189 NA 1
32: 2 HTC 2015-01-02 730 NA 1
33: 2 Samsung 2015-01-02 730 NA 1
34: 2 HTC 2015-01-02 730 NA 1
35: 2 HTC 2015-01-02 730 NA 1
36: 2 HTC 2015-01-02 730 NA 1
37: 2 Samsung 2015-01-02 730 NA 1
38: 2 APPLE 2015-01-03 2189 NA 1
39: 2 APPLE 2015-01-03 2946 NA 1
40: 2 Benten 2015-01-03 330 NA 1
41: 2 APPLE 2015-01-03 2946 NA 1
42: 2 INFOCUS 2015-01-03 330 NA 1
43: 2 Samsung 2015-01-03 1296 NA 1
44: 2 HTC 2015-01-03 730 NA 1
45: 2 Samsung 2015-01-03 2189 NA 1
46: 2 SONY 2015-01-03 2189 NA 1
47: 2 TaiwanMobile 2015-01-03 730 NA 1
48: 2 HTC 2015-01-03 1296 NA 1
49: 2 HTC 2015-01-03 730 NA 1
50: 2 APPLE 2015-01-03 2189 NA 1
51: 2 APPLE 2015-01-03 2189 NA 1
52: 2 HTC 2015-01-03 730 NA 1
53: 2 Samsung 2015-01-03 330 NA 1
54: 2 TaiwanMobile 2015-01-03 730 NA 1
55: 2 HTC 2015-01-03 730 NA 1
56: 2 HTC 2015-01-03 730 NA 1
57: 2 TaiwanMobile 2015-01-03 330 NA 1
58: 2 Samsung 2015-01-03 1724 2015-01-04 1
59: 2 HTC 2015-01-03 730 NA 1
60: 2 INFOCUS 2015-01-03 330 NA 1
61: 2 SONY 2015-01-03 730 NA 1
62: 2 HTC 2015-01-04 730 NA 1
63: 2 HTC 2015-01-04 730 NA 1
64: 2 APPLE 2015-01-04 2189 2015-01-05 1
65: 2 TaiwanMobile 2015-01-04 730 2015-01-05 1"
library(data.table)
data <- read.table(text=data, header = TRUE)
data <- as.data.table(data)
data <- data[,rep:=ifelse(year(Time1)==2016, 12-month(Time1)+1, 13)][rep(1:.N,rep)][]
data[, Lapse := seq_len(.N)-1 , .(Category, Brand, Time1, value, Time2, number)]
dput(droplevels(head(data,65)))
structure(list(Category = c(2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2), Brand = c("HTC", "APPLE",
"INFOCUS", "APPLE", "APPLE", "Samsung", "HTC", "Samsung", "Samsung",
"HTC", "HTC", "APPLE", "INFOCUS", "HTC", "SONY", "HTC", "APPLE",
"APPLE", "HTC", "Samsung", "Samsung", "HTC", "Samsung", "HTC",
"APPLE", "HTC", "HTC", "ASUS", "ASUS", "Samsung", "APPLE", "HTC",
"Samsung", "HTC", "HTC", "HTC", "Samsung", "APPLE", "APPLE",
"Benten", "APPLE", "INFOCUS", "Samsung", "HTC", "Samsung", "SONY",
"TaiwanMobile", "HTC", "HTC", "APPLE", "APPLE", "HTC", "Samsung",
"TaiwanMobile", "HTC", "HTC", "TaiwanMobile", "Samsung", "HTC",
"INFOCUS", "SONY", "HTC", "HTC", "APPLE", "TaiwanMobile"), Time1 = structure(c(1420070400,
1420070400, 1420070400, 1420070400, 1420070400, 1420070400, 1420070400,
1420070400, 1420070400, 1420070400, 1420070400, 1420070400, 1420070400,
1420070400, 1420070400, 1420070400, 1420070400, 1420070400, 1420070400,
1420156800, 1420156800, 1420156800, 1420156800, 1420156800, 1420156800,
1420156800, 1420156800, 1420156800, 1420156800, 1420156800, 1420156800,
1420156800, 1420156800, 1420156800, 1420156800, 1420156800, 1420156800,
1420243200, 1420243200, 1420243200, 1420243200, 1420243200, 1420243200,
1420243200, 1420243200, 1420243200, 1420243200, 1420243200, 1420243200,
1420243200, 1420243200, 1420243200, 1420243200, 1420243200, 1420243200,
1420243200, 1420243200, 1420243200, 1420243200, 1420243200, 1420243200,
1420329600, 1420329600, 1420329600, 1420329600), class = c("POSIXct",
"POSIXt"), tzone = "UTC"), value = c(1724, 2946, 330, 2946,
2946, 2189, 730, 2189, 2189, 1296, 730, 2189, 330, 2189, 1296,
730, 2946, 2946, 1724, 1724, 2189, 730, 2189, 730, 2946, 1724,
730, 330, 330, 1724, 2189, 730, 730, 730, 730, 730, 730, 2189,
2946, 330, 2946, 330, 1296, 730, 2189, 2189, 730, 1296, 730,
2189, 2189, 730, 330, 730, 730, 730, 330, 1724, 730, 330, 730,
730, 730, 2189, 730), Time2 = structure(c(NA, NA, NA, 1420243200,
NA, NA, NA, NA, NA, NA, NA, NA, 1420156800, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, 1420329600, NA, NA, NA, NA, NA, 1420416000,
1420416000), class = c("POSIXct", "POSIXt"), tzone = "UTC"),
number = c("1", "1", "1", "1", "1", "1", "1", "1", "1", "1",
"1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1",
"1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1",
"1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1",
"1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1",
"1", "1", "1", "1", "1", "1", "1")), .Names = c("Category",
"Brand", "Time1", "value", "Time2", "number"), row.names = c(NA,
-65L), .internal.selfref = <pointer: 0x003e24a0>, class = c("data.table",
"data.frame"))
The problem is that the result is strange.
unique(data$Lapse)
[1] 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36
[38] 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73
[75] 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90
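The problem is that the original data is never updated with the replicated rows: it still has the same number of rows as before. The := step adds the 'rep' column by reference, but the row replication in the second pair of brackets returns a new data.table, which is discarded unless it is assigned. This is clear if we compare the output of
data[,rep:=ifelse(year(Time1)==2016, 12-month(Time1)+1, 13)][rep(1:.N,rep)]
with
data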
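The check described above can be sketched on a small toy table. This is only an illustration: the object names toy and expanded and the two rows of data are made up, while the replication expression is the one from the question.

```r
library(data.table)

# Toy table (hypothetical values, not the asker's data)
toy <- data.table(
  Brand = c("HTC", "Samsung"),
  Time1 = as.POSIXct(c("2015-01-01", "2016-10-20"), tz = "UTC")
)

# ':=' adds 'rep' to toy by reference; the second [ ] returns a new, longer table
expanded <- toy[, rep := ifelse(year(Time1) == 2016, 12 - month(Time1) + 1, 13)][rep(1:.N, rep)]

nrow(expanded)  # 16 rows: 13 copies of the first row, 3 of the second
nrow(toy)       # still 2 rows, although toy now has the 'rep' column
```

Only the object that receives the result of the second bracket contains the replicated rows.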
So, we assign the output of these two steps back to the original object ('data'), or to another object if we don't want to change the original:
data <- data[,rep:=ifelse(year(Time1)==2016, 12-month(Time1)+1, 13)][rep(1:.N,rep)]
and then create the 'Lapse' column as a sequence within each group:
data[, Lapse := seq_len(.N)-1 , .(Category, Brand, Time1, value, Time2, number)]
data
Category Brand Time1 value Time2 number rep Lapse
1: 2 HTC 2015-01-01 1724 <NA> 1 13 0
2: 2 HTC 2015-01-01 1724 <NA> 1 13 1
3: 2 HTC 2015-01-01 1724 <NA> 1 13 2
4: 2 HTC 2015-01-01 1724 <NA> 1 13 3
5: 2 HTC 2015-01-01 1724 <NA> 1 13 4
6: 2 HTC 2015-01-01 1724 <NA> 1 13 5
7: 2 HTC 2015-01-01 1724 <NA> 1 13 6
8: 2 HTC 2015-01-01 1724 <NA> 1 13 7
9: 2 HTC 2015-01-01 1724 <NA> 1 13 8
10: 2 HTC 2015-01-01 1724 <NA> 1 13 9
11: 2 HTC 2015-01-01 1724 <NA> 1 13 10
12: 2 HTC 2015-01-01 1724 <NA> 1 13 11
13: 2 HTC 2015-01-01 1724 <NA> 1 13 12
14: 6 APPLE 2015-10-10 3000 2015-10-30 1 13 0
15: 6 APPLE 2015-10-10 3000 2015-10-30 1 13 1
16: 6 APPLE 2015-10-10 3000 2015-10-30 1 13 2
17: 6 APPLE 2015-10-10 3000 2015-10-30 1 13 3
18: 6 APPLE 2015-10-10 3000 2015-10-30 1 13 4
19: 6 APPLE 2015-10-10 3000 2015-10-30 1 13 5
20: 6 APPLE 2015-10-10 3000 2015-10-30 1 13 6
21: 6 APPLE 2015-10-10 3000 2015-10-30 1 13 7
22: 6 APPLE 2015-10-10 3000 2015-10-30 1 13 8
23: 6 APPLE 2015-10-10 3000 2015-10-30 1 13 9
24: 6 APPLE 2015-10-10 3000 2015-10-30 1 13 10
25: 6 APPLE 2015-10-10 3000 2015-10-30 1 13 11
26: 6 APPLE 2015-10-10 3000 2015-10-30 1 13 12
27: 2 APPLE 2016-01-15 430 <NA> 1 12 0
28: 2 APPLE 2016-01-15 430 <NA> 1 12 1
29: 2 APPLE 2016-01-15 430 <NA> 1 12 2
30: 2 APPLE 2016-01-15 430 <NA> 1 12 3
31: 2 APPLE 2016-01-15 430 <NA> 1 12 4
32: 2 APPLE 2016-01-15 430 <NA> 1 12 5
33: 2 APPLE 2016-01-15 430 <NA> 1 12 6
34: 2 APPLE 2016-01-15 430 <NA> 1 12 7
35: 2 APPLE 2016-01-15 430 <NA> 1 12 8
36: 2 APPLE 2016-01-15 430 <NA> 1 12 9
37: 2 APPLE 2016-01-15 430 <NA> 1 12 10
38: 2 APPLE 2016-01-15 430 <NA> 1 12 11
39: NA Samsung 2016-10-20 860 2016-12-20 1 3 0
40: NA Samsung 2016-10-20 860 2016-12-20 1 3 1
41: NA Samsung 2016-10-20 860 2016-12-20 1 3 2
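One caveat, offered as a sketch rather than as part of the answer above: grouping by all six columns only restarts the counter at 0 for each original row if no two original rows are identical on those columns. In the sample from the Update, the combination HTC / 2015-01-02 / 730 occurs seven times, so after replication that group has 7 × 13 = 91 rows, which matches Lapse running from 0 to 90. Assuming the intent is one counter per original row, one option is to tag each row with a helper column before replicating and to group by that column instead. The names dt, n_rep and id below are hypothetical, and the data is a toy example.

```r
library(data.table)

# Toy data with two identical rows (hypothetical, for illustration only)
dt <- data.table(
  Brand = c("HTC", "HTC", "APPLE"),
  Time1 = as.POSIXct(c("2015-01-02", "2015-01-02", "2016-10-20"), tz = "UTC"),
  value = c(730, 730, 860)
)

# Number of copies per row, using the same rule as in the question
dt[, n_rep := ifelse(year(Time1) == 2016, 12 - month(Time1) + 1, 13)]

# Tag each original row before replicating ('id' is an assumed helper column)
dt[, id := .I]

# Replicate the rows and assign the result back
dt <- dt[rep(seq_len(.N), n_rep)]

# Count within each original row, not within each combination of values
dt[, Lapse := seq_len(.N) - 1L, by = id]

unique(dt$Lapse)  # 0 to 12, even though the two HTC rows are identical
```

Grouping by Brand, Time1 and value here would instead pool the two HTC rows into one group of 26 rows and produce Lapse values up to 25.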