How to use a reference table to insert rows into a data frame in R?
I have a data frame (labels) that I would like to use as a reference or lookup table, of the form:
V1 V2
1 1 WALKING
2 2 WALKING_UPSTAIRS
3 3 WALKING_DOWNSTAIRS
4 4 SITTING
5 5 STANDING
6 6 LAYING
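The data frame that will use the reference table is (test, ncol = 564, nrow = 2947), where the first three colnames are (test_subject, test_label (num 1-6), data_set) and test_label (1-6) corresponds to the strings in the reference table above.
Can someone help me figure out how to use my lookup table to insert a new column called "activity_label", where each observation in that column is the string equivalent of the referenced number in the reference table?
For example, if test_label in row 1 equals 5, then activity_label in row 1 would equal "Standing".
Thanks very much for your help!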
#
After using the merge method:
> test2[1:10, 564: 565]
angle(Z,gravityMean) activity_label
1 0.04404283 walking
2 0.04134032 walking
3 0.04295217 walking
4 0.03611571 walking
5 -0.09080307 walking
6 -0.08602478 walking
7 -0.07997668 walking
8 0.04372663 walking
9 0.19900166 walking
10 0.20350821 walking
Analyzing the structure of the remaining dfs:
> str(test1)
'data.frame': 2947 obs. of 565 variables:
$ test_labels : int 1 1 1 1 1 1 1 1 1 1 ...
$ test_subject : int 12 12 12 12 4 4 4 12 9 9 ...
$ observ_set : Factor w/ 1 level "test": 1 1 1 1 1 1 1 1 1 1 ...
$ tBodyAcc-mean()-X : num 0.228 0.303 0.237 0.306 0.29 ...
> str(train1)
'data.frame': 7352 obs. of 565 variables:
$ train_labels : int 1 1 1 1 1 1 1 1 1 1 ...
$ V1 : int 27 7 7 26 7 26 6 6 6 7 ...
$ observ_set : Factor w/ 1 level "train": 1 1 1 1 1 1 1 1 1 1 ...
$ tBodyAcc-mean()-X : num 0.262 0.354 0.344 0.292 0.314 ...
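I would do it as follows. The mapping is done through 'test_label' and 'id', which are matched up using merge(). If you want to keep all rows of df, including those with no match in the lookup table, use all.x = T. Otherwise drop it.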
set.seed(1237)
lookup <- data.frame(id = 1:6, activity = LETTERS[1:6])
df <- data.frame(test_label = sample(1:6, 10, replace = T))
merge(df, lookup, by.x = "test_label", by.y ="id", all.x = T)
test_label activity
1 1 A
2 1 A
3 2 B
4 2 B
5 3 C
6 5 E
7 5 E
8 6 F
9 6 F
10 6 F
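Note that merge() sorts its result on the key column by default (sort = TRUE), as the output above shows. That may be why the first ten rows of test2 in the question all read "walking". If the original row order matters, one alternative is to index the lookup table with match(). This is a minimal sketch, not the only way to do it: the labels and test data frames below are small hypothetical stand-ins for the ones in the question, and the values in them are made up for illustration.

```r
# Hypothetical stand-ins for the question's `labels` and `test` data frames
labels <- data.frame(
  V1 = 1:6,
  V2 = c("WALKING", "WALKING_UPSTAIRS", "WALKING_DOWNSTAIRS",
         "SITTING", "STANDING", "LAYING"),
  stringsAsFactors = FALSE
)
test <- data.frame(
  test_subject = c(12, 12, 4, 9, 9),
  test_label   = c(5, 1, 4, 6, 2)
)

# match() gives, for each test_label, the row of `labels` whose V1 equals it;
# indexing V2 with that vector keeps the rows of `test` in their original order
test$activity_label <- labels$V2[match(test$test_label, labels$V1)]
```

A test_label with no entry in labels ends up as NA, which mirrors what all.x = T does in the merge() version.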
One way is to use ifelse:
Assuming the data frame is test and the activity number column is activitynum:
test$activitylabel <- ifelse(test$activitynum == 1, "walking",
  ifelse(test$activitynum == 2, "walking_upstairs",
  ifelse(test$activitynum == 3, "walking_downstairs",
  ifelse(test$activitynum == 4, "sitting",
  ifelse(test$activitynum == 5, "standing",
  ifelse(test$activitynum == 6, "laying", NA))))))
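The nested ifelse chain can probably be shortened by letting factor() do the number-to-label mapping. The sketch below assumes the same column names as above (test, activitynum, activitylabel). The small test data frame and the activity_names vector are hypothetical and only there to make the example runnable.

```r
# Hypothetical stand-in for the `test` data frame
test <- data.frame(activitynum = c(5, 1, 4, 6, 2, 3))

# Label for activity number i sits at position i
activity_names <- c("walking", "walking_upstairs", "walking_downstairs",
                    "sitting", "standing", "laying")

# factor() maps each value in `levels` to the label at the same position;
# any value outside 1:6 becomes NA, like the final NA in the ifelse chain
test$activitylabel <- factor(test$activitynum, levels = 1:6,
                             labels = activity_names)
```

Wrap the result in as.character() if a plain character column is preferred over a factor.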
Another way is to create a lookup table and then do a merge, as @Jaehyeon suggested:
# lookup table: activity number -> activity name, in the same order as the question's table
lookup <- data.frame(activitynum = c(1,2,3,4,5,6), activity = c("walking", "walking_upstairs", "walking_downstairs", "sitting", "standing", "laying"))
# example data with a random activity number (1-6) per row
survey <- data.frame(id = 1:10, activitynum = floor(runif(10, 1, 7)), var1 = runif(10, 1, 100))
merge(survey, lookup, by = "activitynum", all.x = TRUE)
> str(lookup)
'data.frame': 6 obs. of 2 variables:
$ activitynum: num 1 2 3 4 5 6
$ activity : Factor w/ 6 levels "laying","sitting",..: 4 6 5 2 3 1
> str(survey)
'data.frame': 10 obs. of 3 variables:
$ id : int 1 2 3 4 5 6 7 8 9 10
$ activitynum: num 1 2 4 1 4 6 2 4 2 2
$ var1 : num 52.3 60.5 53.3 49.8 73.1 ...
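Applied to the column names from the question, the merge could look roughly like the sketch below. It assumes the lookup table is labels with columns V1 and V2 and that the key column in test is test_label. The data values and the row_id helper column are hypothetical. row_id is only there to restore the original row order, since merge() sorts on the key by default.

```r
# Hypothetical stand-ins for the question's `labels` and `test` data frames
labels <- data.frame(
  V1 = 1:6,
  V2 = c("WALKING", "WALKING_UPSTAIRS", "WALKING_DOWNSTAIRS",
         "SITTING", "STANDING", "LAYING"),
  stringsAsFactors = FALSE
)
test <- data.frame(test_subject = c(12, 12, 4, 9),
                   test_label   = c(5, 1, 4, 1))

# Remember the original row order (row_id is a hypothetical helper column)
test$row_id <- seq_len(nrow(test))

# Join on test_label (left) and V1 (right), keeping every row of test
test2 <- merge(test, labels, by.x = "test_label", by.y = "V1", all.x = TRUE)

# Restore the original order, rename V2, and drop the helper column
test2 <- test2[order(test2$row_id), ]
names(test2)[names(test2) == "V2"] <- "activity_label"
test2$row_id <- NULL
rownames(test2) <- NULL
```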