在 R 中定义具有多个索引值的组
Defining group with multiple index value in R
我想用以下数据做 Bartlett 检验
Sample
RdRp
E
N
Day
Temp
1
103307.0863
118355.7563
288771.7766
3
4
2
107374.2944
119575.9188
266808.8528
3
4
3
102086.9239
119169.198
278197.0355
3
4
4
103994.1128
115458.0308
251387.3436
3
-20
5
103584.6872
114639.1795
253843.8974
3
-20
6
106450.6667
108497.7949
255072.1744
3
-20
7
105704.8475
120569.5917
251049.0129
3
-80
8
107769.3953
117679.2248
249397.3747
3
-80
9
104466.1189
123459.9587
253113.5607
3
-80
10
105046.6462
104636.3077
260975.2615
5
4
11
105867.3231
109970.7077
266720
5
4
12
101763.9385
107098.3385
263026.9538
5
4
13
102250.4304
109121.0127
246532.6582
5
-20
14
103462.8861
123670.481
238045.4684
5
-20
15
96996.4557
113162.5316
254615.6962
5
-20
16
104711.3684
113938.807
242320.5614
5
-80
17
104310.1754
108723.2982
251949.193
5
-80
18
97891.08772
118351.9298
249943.2281
5
-80
19
101149.3671
100744.7696
253682.6127
8
4
20
97507.98987
102363.1595
267034.3291
8
4
21
101553.9646
102767.757
258537.7823
8
4
22
97977.74359
113555.7949
245969.2308
8
-20
23
102897.1282
113145.8462
255808
8
-20
24
109046.359
115195.5897
256627.8974
8
-20
25
105661.883
110944.9771
247899.0331
8
-80
26
106474.6667
111757.7608
250743.7761
8
-80
27
105661.883
116228.0712
258871.6132
8
-80
28
101923.4694
112931.2041
270301.0408
1
25
29
103554.2449
110077.3469
259293.3061
1
25
30
104369.6327
113338.898
242577.8571
1
25
31
58826.07634
57158.11705
130995.827
3
25
32
61185.6285
56425.84224
168016.3868
3
25
33
60860.17303
53496.743
172084.5802
3
25
然后我编写了以下工作代码。
stab=read.csv("519stability.csv")
bartlett.test(data=stab, RdRp ~ Day, subset = Temp == -80)
bartlett.test(data=stab, RdRp ~ Day, subset = Temp == -20)
bartlett.test(data=stab, RdRp ~ Day, subset = Temp == 4)
bartlett.test(data=stab, RdRp ~ Day, subset = Temp == 25)
bartlett.test(data=stab, E ~ Day, subset = Temp == -80)
bartlett.test(data=stab, E ~ Day, subset = Temp == -20)
bartlett.test(data=stab, E ~ Day, subset = Temp == 4)
bartlett.test(data=stab, E ~ Day, subset = Temp == 25)
bartlett.test(data=stab, N ~ Day, subset = Temp == -80)
bartlett.test(data=stab, N ~ Day, subset = Temp == -20)
bartlett.test(data=stab, N ~ Day, subset = Temp == 4)
bartlett.test(data=stab, N ~ Day, subset = Temp == 25)
这段代码产生了我想要的结果。但是是否有一些库(dplyr、reshape2 等)的智能和简单代码?
重点是定义一个有两个索引值的组;日期和温度
编辑
我在这里用 dput() 添加我的数据。
structure(list(Sample = 1:33, RdRp = c(103307.0863, 107374.2944,
102086.9239, 103994.1128, 103584.6872, 106450.6667, 105704.8475,
107769.3953, 104466.1189, 105046.6462, 105867.3231, 101763.9385,
102250.4304, 103462.8861, 96996.4557, 104711.3684, 104310.1754,
97891.08772, 101149.3671, 97507.98987, 101553.9646, 97977.74359,
102897.1282, 109046.359, 105661.883, 106474.6667, 105661.883,
101923.4694, 103554.2449, 104369.6327, 58826.07634, 61185.6285,
60860.17303), E = c(118355.7563, 119575.9188, 119169.198, 115458.0308,
114639.1795, 108497.7949, 120569.5917, 117679.2248, 123459.9587,
104636.3077, 109970.7077, 107098.3385, 109121.0127, 123670.481,
113162.5316, 113938.807, 108723.2982, 118351.9298, 100744.7696,
102363.1595, 102767.757, 113555.7949, 113145.8462, 115195.5897,
110944.9771, 111757.7608, 116228.0712, 112931.2041, 110077.3469,
113338.898, 57158.11705, 56425.84224, 53496.743), N = c(288771.7766,
266808.8528, 278197.0355, 251387.3436, 253843.8974, 255072.1744,
251049.0129, 249397.3747, 253113.5607, 260975.2615, 266720, 263026.9538,
246532.6582, 238045.4684, 254615.6962, 242320.5614, 251949.193,
249943.2281, 253682.6127, 267034.3291, 258537.7823, 245969.2308,
255808, 256627.8974, 247899.0331, 250743.7761, 258871.6132, 270301.0408,
259293.3061, 242577.8571, 130995.827, 168016.3868, 172084.5802
), Day = c(3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 1L, 1L,
1L, 3L, 3L, 3L), Temp = c(4L, 4L, 4L, -20L, -20L, -20L, -80L,
-80L, -80L, 4L, 4L, 4L, -20L, -20L, -20L, -80L, -80L, -80L, 4L,
4L, 4L, -20L, -20L, -20L, -80L, -80L, -80L, 25L, 25L, 25L, 25L,
25L, 25L)), class = "data.frame", row.names = c(NA, -33L))
您可以使用循环:
library(dplyr)
library(purrr)
map(unique(stab$Temp), ~ bartlett.test(data=stab, RdRp ~ Day, subset = Temp==.x))
map(unique(stab$Temp), ~ bartlett.test(data=stab, E ~ Day, subset = Temp ==.x))
map(unique(stab$Temp), ~ bartlett.test(data=stab, N ~ Day, subset = Temp ==.x))
这可以通过嵌套循环进一步简化:
library(purrr)
formulas<-c('RdRp ~ Day', 'E ~ Day', 'N ~ Day')
map(formulas, function(form)
map(unique(stab$Temp), ~ bartlett.test(data=stab,
as.formula(form),
subset = Temp == .x)))
不过,如果没有您的数据样本代码,我无法对此进行测试。
你可以试试lapply
coln <- c("RdRp", "E", "N")
temps <- c(-80, -20, 4, 25)
lapply(coln, function(x) {
lapply(temps, function(y){
bartlett.test(stab[[x]] ~ stab$Day, subset = stab$Temp == y)
})
})
部分结果如
[[1]]
[[1]][[1]]
Bartlett test of homogeneity of variances
data: stab[[x]] by stab$Day
Bartlett's K-squared = 5.1079, df = 2, p-value = 0.07777
[[1]][[2]]
Bartlett test of homogeneity of variances
data: stab[[x]] by stab$Day
Bartlett's K-squared = 2.2096, df = 2, p-value = 0.3313
您可以使用 outer
方法,首先将因变量和温度放入向量中,然后在 FUN
函数中定义 bartlett 骨架。给出一个可以很容易地 split
放入列表中的矩阵。要知道哪个是哪个,我们应该 setNames
.
dpv <- c('RdRp', 'E', 'N')
tmp <- c(-80, -20, 4, 25)
FUN <- Vectorize(\(x, y)
bartlett.test(data=stab, as.formula(paste(x, '~ Day')),
subset=Temp == y),
SIMPLIFY=F)
res <- outer(dpv, tmp, FUN) |>
split(rep(seq(tmp), each=length(dpv))) |>
setNames(make.names(tmp))
结果
res$X.80 ## the first list element, temp -80
# [[1]]
#
# Bartlett test of homogeneity of variances
#
# data: RdRp by Day
# Bartlett's K-squared = 5.1079, df = 2, p-value = 0.07777
#
#
# [[2]]
#
# Bartlett test of homogeneity of variances
#
# data: E by Day
# Bartlett's K-squared = 0.63384, df = 2, p-value = 0.7284
#
#
# [[3]]
#
# Bartlett test of homogeneity of variances
#
# data: N by Day
# Bartlett's K-squared = 1.797, df = 2, p-value = 0.4072
注:需要R > 4.1.*。
我想用以下数据做 Bartlett 检验
Sample | RdRp | E | N | Day | Temp |
---|---|---|---|---|---|
1 | 103307.0863 | 118355.7563 | 288771.7766 | 3 | 4 |
2 | 107374.2944 | 119575.9188 | 266808.8528 | 3 | 4 |
3 | 102086.9239 | 119169.198 | 278197.0355 | 3 | 4 |
4 | 103994.1128 | 115458.0308 | 251387.3436 | 3 | -20 |
5 | 103584.6872 | 114639.1795 | 253843.8974 | 3 | -20 |
6 | 106450.6667 | 108497.7949 | 255072.1744 | 3 | -20 |
7 | 105704.8475 | 120569.5917 | 251049.0129 | 3 | -80 |
8 | 107769.3953 | 117679.2248 | 249397.3747 | 3 | -80 |
9 | 104466.1189 | 123459.9587 | 253113.5607 | 3 | -80 |
10 | 105046.6462 | 104636.3077 | 260975.2615 | 5 | 4 |
11 | 105867.3231 | 109970.7077 | 266720 | 5 | 4 |
12 | 101763.9385 | 107098.3385 | 263026.9538 | 5 | 4 |
13 | 102250.4304 | 109121.0127 | 246532.6582 | 5 | -20 |
14 | 103462.8861 | 123670.481 | 238045.4684 | 5 | -20 |
15 | 96996.4557 | 113162.5316 | 254615.6962 | 5 | -20 |
16 | 104711.3684 | 113938.807 | 242320.5614 | 5 | -80 |
17 | 104310.1754 | 108723.2982 | 251949.193 | 5 | -80 |
18 | 97891.08772 | 118351.9298 | 249943.2281 | 5 | -80 |
19 | 101149.3671 | 100744.7696 | 253682.6127 | 8 | 4 |
20 | 97507.98987 | 102363.1595 | 267034.3291 | 8 | 4 |
21 | 101553.9646 | 102767.757 | 258537.7823 | 8 | 4 |
22 | 97977.74359 | 113555.7949 | 245969.2308 | 8 | -20 |
23 | 102897.1282 | 113145.8462 | 255808 | 8 | -20 |
24 | 109046.359 | 115195.5897 | 256627.8974 | 8 | -20 |
25 | 105661.883 | 110944.9771 | 247899.0331 | 8 | -80 |
26 | 106474.6667 | 111757.7608 | 250743.7761 | 8 | -80 |
27 | 105661.883 | 116228.0712 | 258871.6132 | 8 | -80 |
28 | 101923.4694 | 112931.2041 | 270301.0408 | 1 | 25 |
29 | 103554.2449 | 110077.3469 | 259293.3061 | 1 | 25 |
30 | 104369.6327 | 113338.898 | 242577.8571 | 1 | 25 |
31 | 58826.07634 | 57158.11705 | 130995.827 | 3 | 25 |
32 | 61185.6285 | 56425.84224 | 168016.3868 | 3 | 25 |
33 | 60860.17303 | 53496.743 | 172084.5802 | 3 | 25 |
然后我编写了以下工作代码。
stab=read.csv("519stability.csv")
bartlett.test(data=stab, RdRp ~ Day, subset = Temp == -80)
bartlett.test(data=stab, RdRp ~ Day, subset = Temp == -20)
bartlett.test(data=stab, RdRp ~ Day, subset = Temp == 4)
bartlett.test(data=stab, RdRp ~ Day, subset = Temp == 25)
bartlett.test(data=stab, E ~ Day, subset = Temp == -80)
bartlett.test(data=stab, E ~ Day, subset = Temp == -20)
bartlett.test(data=stab, E ~ Day, subset = Temp == 4)
bartlett.test(data=stab, E ~ Day, subset = Temp == 25)
bartlett.test(data=stab, N ~ Day, subset = Temp == -80)
bartlett.test(data=stab, N ~ Day, subset = Temp == -20)
bartlett.test(data=stab, N ~ Day, subset = Temp == 4)
bartlett.test(data=stab, N ~ Day, subset = Temp == 25)
这段代码产生了我想要的结果。但是是否有一些库(dplyr、reshape2 等)的智能和简单代码?
重点是定义一个有两个索引值的组;日期和温度
编辑 我在这里用 dput() 添加我的数据。
structure(list(Sample = 1:33, RdRp = c(103307.0863, 107374.2944,
102086.9239, 103994.1128, 103584.6872, 106450.6667, 105704.8475,
107769.3953, 104466.1189, 105046.6462, 105867.3231, 101763.9385,
102250.4304, 103462.8861, 96996.4557, 104711.3684, 104310.1754,
97891.08772, 101149.3671, 97507.98987, 101553.9646, 97977.74359,
102897.1282, 109046.359, 105661.883, 106474.6667, 105661.883,
101923.4694, 103554.2449, 104369.6327, 58826.07634, 61185.6285,
60860.17303), E = c(118355.7563, 119575.9188, 119169.198, 115458.0308,
114639.1795, 108497.7949, 120569.5917, 117679.2248, 123459.9587,
104636.3077, 109970.7077, 107098.3385, 109121.0127, 123670.481,
113162.5316, 113938.807, 108723.2982, 118351.9298, 100744.7696,
102363.1595, 102767.757, 113555.7949, 113145.8462, 115195.5897,
110944.9771, 111757.7608, 116228.0712, 112931.2041, 110077.3469,
113338.898, 57158.11705, 56425.84224, 53496.743), N = c(288771.7766,
266808.8528, 278197.0355, 251387.3436, 253843.8974, 255072.1744,
251049.0129, 249397.3747, 253113.5607, 260975.2615, 266720, 263026.9538,
246532.6582, 238045.4684, 254615.6962, 242320.5614, 251949.193,
249943.2281, 253682.6127, 267034.3291, 258537.7823, 245969.2308,
255808, 256627.8974, 247899.0331, 250743.7761, 258871.6132, 270301.0408,
259293.3061, 242577.8571, 130995.827, 168016.3868, 172084.5802
), Day = c(3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 1L, 1L,
1L, 3L, 3L, 3L), Temp = c(4L, 4L, 4L, -20L, -20L, -20L, -80L,
-80L, -80L, 4L, 4L, 4L, -20L, -20L, -20L, -80L, -80L, -80L, 4L,
4L, 4L, -20L, -20L, -20L, -80L, -80L, -80L, 25L, 25L, 25L, 25L,
25L, 25L)), class = "data.frame", row.names = c(NA, -33L))
您可以使用循环:
library(dplyr)
library(purrr)
map(unique(stab$Temp), ~ bartlett.test(data=stab, RdRp ~ Day, subset = Temp==.x))
map(unique(stab$Temp), ~ bartlett.test(data=stab, E ~ Day, subset = Temp ==.x))
map(unique(stab$Temp), ~ bartlett.test(data=stab, N ~ Day, subset = Temp ==.x))
这可以通过嵌套循环进一步简化:
library(purrr)
formulas<-c('RdRp ~ Day', 'E ~ Day', 'N ~ Day')
map(formulas, function(form)
map(unique(stab$Temp), ~ bartlett.test(data=stab,
as.formula(form),
subset = Temp == .x)))
不过,如果没有您的数据样本代码,我无法对此进行测试。
你可以试试lapply
coln <- c("RdRp", "E", "N")
temps <- c(-80, -20, 4, 25)
lapply(coln, function(x) {
lapply(temps, function(y){
bartlett.test(stab[[x]] ~ stab$Day, subset = stab$Temp == y)
})
})
部分结果如
[[1]]
[[1]][[1]]
Bartlett test of homogeneity of variances
data: stab[[x]] by stab$Day
Bartlett's K-squared = 5.1079, df = 2, p-value = 0.07777
[[1]][[2]]
Bartlett test of homogeneity of variances
data: stab[[x]] by stab$Day
Bartlett's K-squared = 2.2096, df = 2, p-value = 0.3313
您可以使用 outer
方法,首先将因变量和温度放入向量中,然后在 FUN
函数中定义 bartlett 骨架。给出一个可以很容易地 split
放入列表中的矩阵。要知道哪个是哪个,我们应该 setNames
.
dpv <- c('RdRp', 'E', 'N')
tmp <- c(-80, -20, 4, 25)
FUN <- Vectorize(\(x, y)
bartlett.test(data=stab, as.formula(paste(x, '~ Day')),
subset=Temp == y),
SIMPLIFY=F)
res <- outer(dpv, tmp, FUN) |>
split(rep(seq(tmp), each=length(dpv))) |>
setNames(make.names(tmp))
结果
res$X.80 ## the first list element, temp -80
# [[1]]
#
# Bartlett test of homogeneity of variances
#
# data: RdRp by Day
# Bartlett's K-squared = 5.1079, df = 2, p-value = 0.07777
#
#
# [[2]]
#
# Bartlett test of homogeneity of variances
#
# data: E by Day
# Bartlett's K-squared = 0.63384, df = 2, p-value = 0.7284
#
#
# [[3]]
#
# Bartlett test of homogeneity of variances
#
# data: N by Day
# Bartlett's K-squared = 1.797, df = 2, p-value = 0.4072
注:需要R > 4.1.*。