无法按年和按月找到变量的相关性并存储在结果列表或单独的数据框中
unable to find group by year and month wise correlation of the variables and store in either a list of results or separate dataframe
我有一个 daatframe df,其中包含变量:年、月和另外 4 个变量,如下所示:
structure(list(Year = c(2017L, 2017L, 2017L, 2017L, 2017L, 2017L,
2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L,
2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L,
2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L,
2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L,
2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L,
2017L, 2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 2018L,
2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 2018L,
2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 2018L,
2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 2018L,
2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 2018L,
2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 2019L,
2019L, 2019L, 2019L, 2019L, 2019L, 2019L, 2019L, 2019L, 2019L,
2019L, 2019L, 2019L, 2019L, 2019L, 2019L, 2019L, 2019L, 2019L,
2019L, 2019L, 2019L, 2019L, 2019L, 2019L, 2019L, 2019L, 2019L,
2019L, 2019L, 2019L, 2019L, 2019L, 2019L, 2019L, 2019L, 2019L,
2019L, 2019L, 2019L, 2019L, 2019L, 2019L, 2019L, 2019L, 2019L,
2019L, 2019L, 2019L, 2019L, 2019L, 2019L, 2020L, 2020L, 2020L,
2020L, 2020L, 2020L, 2020L, 2020L, 2020L, 2020L, 2020L, 2020L,
2020L, 2020L, 2020L, 2020L, 2020L, 2020L, 2020L, 2020L, 2020L,
2020L, 2020L, 2020L, 2020L, 2020L, 2020L, 2020L, 2020L, 2020L,
2020L, 2020L, 2020L, 2020L, 2020L, 2020L, 2020L, 2020L, 2020L,
2020L, 2020L, 2020L, 2020L, 2020L, 2020L, 2020L, 2020L, 2020L,
2020L, 2020L, 2020L, 2020L, 2021L, 2021L, 2021L, 2021L, 2021L,
2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L,
2021L, 2021L, 2021L, 2021L), Month = c(1L, 1L, 1L, 1L, 2L, 2L,
2L, 2L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 6L,
6L, 6L, 6L, 6L, 7L, 7L, 7L, 7L, 8L, 8L, 8L, 8L, 9L, 9L, 9L, 9L,
9L, 10L, 10L, 10L, 10L, 11L, 11L, 11L, 11L, 12L, 12L, 12L, 12L,
12L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 4L,
4L, 4L, 4L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 6L, 7L, 7L, 7L, 7L,
8L, 8L, 8L, 8L, 8L, 9L, 9L, 9L, 9L, 10L, 10L, 10L, 10L, 11L,
11L, 11L, 11L, 11L, 12L, 12L, 12L, 12L, 1L, 1L, 1L, 1L, 2L, 2L,
2L, 2L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L,
6L, 6L, 6L, 6L, 7L, 7L, 7L, 7L, 8L, 8L, 8L, 8L, 8L, 9L, 9L, 9L,
9L, 10L, 10L, 10L, 10L, 11L, 11L, 11L, 11L, 11L, 12L, 12L, 12L,
12L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 4L,
4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 7L, 7L, 7L, 7L,
7L, 8L, 8L, 8L, 8L, 9L, 9L, 9L, 9L, 10L, 10L, 10L, 10L, 10L,
11L, 11L, 11L, 11L, 12L, 12L, 12L, 12L, 1L, 1L, 1L, 1L, 1L, 2L,
2L, 2L, 2L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L), `NC 1234(BAR)` = c(86.68,
81.52, 82.53, 84.17, 82.52, 79.11, 77.2, 81.13, 81.56, 78.6,
80.8, 81.09, 82.45, 88.19, 83.83, 82.77, 79.83, 77.42, 72.42,
73.45, 73.66, 74.36, 80.79, 83.66, 84.18, 82.46, 82.62, 84.03,
91.04, 92.28, 97, 95.4, 100.1, 101.83, 98.16, 98.15, 99.44, 95.53,
95.37, 97.88, 94.85, 98.25, 97.46, 99.31, 98.66, 95.21, 93.36,
96.84, 96.78, 102.59, 103.97, 103.88, 105.25, 106.27, 106.88,
107.42, 107.14, 102.47, 103.43, 110.6, 103.6, 100.02, 95.2, 92.41,
90.68, 92.43, 93.81, 93.26, 95.25, 101.35, 101.64, 107.75, 109.82,
108.89, 113.3, 118.09, 116.27, 117.26, 117.58, 118.49, 122.57,
120.7, 120.41, 118.29, 117.68, 117.39, 118.73, 114.48, 115.74,
113.01, 113.58, 113.98, 108.81, 107.05, 108.23, 103.08, 103.66,
103.59, 98.76, 99.95, 101.95, 102.4, 100.56, 102.55, 99.54, 97.61,
99.3, 99.78, 99.5, 96.92, 90.71, 94.64, 96.04, 96.7, 93.25, 90.51,
89.17, 78.3, 85.93, 89.34, 87.72, 86.91, 85.86, 83.24, 81.06,
78.71, 71.39, 72.72, 71.53, 68.81, 72.48, 76.51, 73.98, 72.88,
67.09, 68.88, 64.85, 62.5, 62.32, 65.38, 65.42, 65.94, 64.99,
65.71, 66.97, 67.43, 68.34, 68.22, 66.78, 67.28, 68.95, 67.62,
67.33, 66.31, 66.5, 66.19, 64.8, 68.67, 73.78, 68.45, 66.83,
66.86, 69.32, 68.21, 66.77, 65.99, 65.42, 66.57, 68.37, 68.33,
63.54, 58.59, 53.74, 50.76, 51.71, 50.83, 53.98, 54.9, 55.25,
52.75, 50.48, 51.17, 51.58, 52.72, 52.48, 51.72, 50.73, 51.42,
49.52, 50.19, 47.99, 48.17, 49.85, 53.66, 59.97, 58.21, 57.73,
54.42, 60.09, 60.35, 62.45, 62.53, 62.82, 68.22, 72.31, 77.91,
83.96, 85.31, 83.72, 80.78, 87.98, 87.92, 90.46, 85.65, 88.4,
85.79, 82.16, 90.84, 90.3, 92.54, 103.95, 99.38, 92.7, 91.75,
90.89, 91.71), `TPI 2 1300(BAR)` = c(86.34, 88.75, 90.78, 88.04,
83.64, 82.25, 80.36, 82.78, 78.15, 74.46, 73.23, 71.04, 69.87,
77.56, 75.23, 74.85, 74.06, 73.66, 72.45, 74.6, 77.4, 77.37,
78.99, 78.87, 80.26, 80.35, 83.83, 84.41, 83.04, 82.48, 83.27,
84.38, 86.28, 87.34, 88.35, 89.36, 92.84, 94.2, 92.02, 91.99,
90.33, 93.2, 92.55, 94.94, 94.69, 94.14, 92.04, 91.9, 93.28,
95.94, 95.57, 95.83, 96.2, 95.76, 95.14, 93.91, 90.5, 84.11,
81.86, 86.61, 80.74, 80.89, 80.93, 78.46, 76.25, 80.09, 82.18,
81.75, 83.19, 85.47, 88.22, 92.79, 91.38, 94.8, 95.79, 97.33,
97.26, 96.99, 99.93, 102.71, 101.8, 98.8, 94.56, 96.51, 99.43,
98.16, 99.56, 100.64, 100.45, 99.38, 101.23, 101.45, 100.01,
98.93, 100.74, 97.66, 92.71, 88.39, 81.37, 82.3, 86.07, 88.1,
89.46, 86.18, 84.34, 81.21, 81.88, 79.77, 79.13, 74.25, 71.47,
72.87, 74.9, 73.64, 70.3, 67.29, 61.74, 54.62, 60.16, 59.64,
58.82, 58.1, 59.35, 58.2, 55.48, 51.12, 50.55, 49.65, 47.79,
47.68, 53.54, 59.03, 59.34, 61.55, 58.65, 55.44, 54.95, 53.43,
48.68, 56.82, 62.25, 62.89, 59.49, 59.82, 60.94, 60.41, 57.99,
56.22, 56.29, 54.7, 55.72, 57.4, 55.58, 53.06, 52.73, 53.16,
52.09, 52.81, 51.05, 47.67, 48.35, 48.78, 49.23, 48.99, 46.1,
47.25, 45.68, 48.03, 50.63, 48.1, 45.64, 44.61, 41.46, 38.26,
39.33, 38.26, 38.8, 38.27, 45.99, 44.91, 44.64, 46.82, 50.68,
50.04, 48.73, 49.94, 50.15, 51.45, 49.68, 47.58, 47.42, 51.78,
51.61, 52.55, 53.32, 57.34, 57.77, 56.17, 56.39, 54.19, 51.41,
52.28, 53.98, 57.48, 62.02, 64.56, 68.58, 69.71, 68.99, 65.75,
70.31, 66.44, 67.51, 64.86, 68.52, 64.48, 65.7, 66.35, 65.96,
69.21, 72.25, 69.9, 68.38, 70.8, 73.92, 75.79), `TPI 4 1300(BAR)` = c(82,
84.25, 89.48, 88.25, 85.97, 83.48, 81.23, 82.36, 83.24, 80.1,
78.22, 76.19, 75.87, 80.31, 77.92, 76.95, 74.14, 74.2, 71.3,
71.98, 72.2, 74.56, 79.84, 79.04, 77.41, 77.16, 82.04, 82.51,
82.36, 79.45, 85.16, 86.65, 87.43, 87.46, 89.36, 89.86, 94.31,
93.33, 90.76, 90.73, 90.13, 91.5, 91.61, 94.13, 92.67, 89.73,
88.64, 94.41, 94.5, 95.58, 95.09, 94.48, 95.81, 97.29, 97.77,
98.36, 96.27, 90.1, 93.41, 98.34, 94.42, 92.24, 90.81, 90.2,
89.32, 91.75, 94.34, 95.37, 95.09, 98.17, 102.66, 104.7, 103.64,
103.98, 105.38, 105.81, 103.34, 103.79, 107.84, 108.95, 108.19,
106.57, 103.43, 101.62, 98.16, 95.66, 94.91, 97.35, 100.3, 97.33,
98.01, 100.75, 99.55, 98.52, 98.99, 98.38, 96.69, 91.65, 85.98,
89.27, 93.37, 95.42, 97.69, 96.87, 94.07, 90.23, 90.88, 89.33,
88.72, 84.06, 77.46, 78.79, 82.3, 80.44, 75.19, 71.33, 67.96,
65.52, 70.74, 69.72, 68.32, 70.42, 71.54, 66.47, 62.08, 60.18,
62.1, 63.98, 62.19, 64.32, 65.41, 65.85, 65.36, 66.1, 63.06,
59.62, 59.54, 57.89, 56.25, 58.97, 59.87, 60.69, 58.73, 62.48,
68.18, 69.58, 70.18, 67.38, 70.55, 73.48, 81.09, 86.66, 82.2,
82.33, 84.09, 87.48, 77.37, 88.17, 92.28, 85.54, 87.73, 86.9,
85.51, 81.65, 76.05, 70.4, 62.12, 61.76, 66.71, 71.04, 59.66,
51.58, 44.56, 49.49, 51.87, 53.15, 54.88, 55.26, 59.22, 55.3,
54.2, 50.09, 53.13, 53.6, 54.8, 53.34, 52.27, 54.01, 54.19, 56.4,
55.78, 57.08, 55.83, 56.21, 58.98, 61.7, 59.93, 57.18, 57.37,
59.76, 61.34, 67.09, 69.74, 77.7, 81.86, 84.32, 97.91, 97.32,
91.24, 87.68, 95.43, 90.13, 87.67, 83.9, 90.14, 89.31, 85.65,
90.91, 91.61, 96.39, 100.02, 96.7, 96.03, 89.93, 90.58, 93.07
), `TCC 1200(BAR)` = c(37.25, 37.76, 39.13, 41, 42.69, 43.31,
43.86, 43.97, 44.6, 45.11, 45.56, 45.72, 46.01, 45.47, 44.69,
42.19, 39.28, 38.11, 37.2, 36.2, 36.7, 37.3, 39.11, 41.59, 42.04,
41.65, 40.91, 40.8, 41.85, 42.28, 42.55, 42.52, 42.41, 42.75,
43.02, 43.85, 44.75, 46.03, 46.35, 46.37, 46.41, 46.8, 46.44,
46.09, 44.88, 44.16, 43.99, 44.2, 45.38, 46.09, 46.34, 46.81,
47.31, 48.02, 48.59, 48.84, 49.58, 50.16, 50.32, 51.04, 49.98,
48.45, 46.12, 43.55, 42.19, 41.48, 41.08, 41.13, 41.85, 42.79,
44.02, 45.55, 45.85, 46.82, 47.74, 48.08, 48.71, 47.82, 47.63,
47.64, 47.48, 45.06, 42, 40.48, 39.66, 38.09, 37.72, 37.96, 38.39,
38.72, 38.56, 38.88, 38.43, 38.13, 37.31, 36.07, 34.64, 32.29,
29.87, 28.85, 29.56, 30.43, 30.54, 30.29, 30.26, 31, 32.3, 33.51,
34.37, 34.45, 35.4, 37.01, 38.48, 40.32, 39.76, 37.8, 36.75,
37.3, 37.59, 38.23, 38.44, 39, 39.15, 38.78, 38, 37.36, 36.66,
35.84, 36.06, 35.81, 35.99, 35.89, 34.71, 33.7, 33.32, 32.34,
31.45, 30.96, 31.22, 31.57, 32.17, 32.69, 33.1, 33.19, 33.65,
34.36, 35.1, 35.06, 34.51, 33.88, 33.52, 33.71, 34.43, 34.52,
34.16, 33.83, 33.77, 34.27, 35.08, 35.44, 35.48, 36.4, 36.67,
35.92, 34.82, 33.79, 32.77, 31.84, 31.5, 30.83, 30.34, 28.23,
26.03, 24.24, 24.15, 25.71, 27.98, 29.37, 27.66, 25.38, 24.34,
23.89, 23.56, 23.39, 24.24, 25.36, 25.48, 25.12, 24.05, 23.57,
22.86, 22.63, 22.65, 23.57, 25.13, 26.03, 27.02, 29.13, 29.47,
28.91, 28.93, 28.94, 29.88, 31.78, 33.47, 36.11, 41.11, 44.8,
44.9, 45.56, 47.78, 49.95, 46.69, 39.51, 38.41, 39.57, 39.4,
38.67, 37.45, 36.81, 39.2, 43.9, 45.44, 47.71, 50.35, 52.19)), row.names = c(NA,
-226L), class = c("tbl_df", "tbl", "data.frame"))
下面是我使用的代码:
library(readr)
library(tsibble)
library(dplyr)
library(tidyverse)
library(data.table)
not_all_na <- function(x) any(!is.na(x))
corrD <- df %>%
group_by(Year, Month) %>%
select(-Year, -Month) %>%
select_if(not_all_na) %>%
cor(use = "pairwise.complete.obs", method = "sp")
我也试过:
corrD <- df3 %>%
group_by(Year, Month) %>%
select(-Year, -Month) %>%
select_if(not_all_na) %>%
summarise(correlations = cor(., use = "pairwise.complete.obs", method = "sp"))
但这也行不通。
我想要的是:
- 对于每个年和月(即观察必须先按年分组,然后按月分组,然后再计算)数据中其余四个变量之间的相关性
- 如果任何变量中有 NA,则不应考虑这些
- 对于每个组,即年月(例如 2017-1 或 2017-2 等等),应该创建一个列表或单独的数据框,以便我能够下载它并用于进一步分析.
- 最终结果(相关性 table)不应包括年份和月份(在我的例子中这些仍然出现)
我的代码没有返回预期的结果。相反,它是这样的:
有人可以帮忙吗?
您可以return每年每个月的相关矩阵列表。
library(dplyr)
library(purrr)
not_all_na <- function(x) any(!is.na(x))
df %>%
group_split(Year, Month, .keep = FALSE) %>%
map(~.x %>%
select(where(not_all_na)) %>%
cor(use = "pairwise.complete.obs", method = "sp"))
#[[1]]
# NC 1234(BAR) TPI 2 1300(BAR) TPI 4 1300(BAR) TCC 1200(BAR)
#NC 1234(BAR) 1.0 -0.8 -0.4 -0.2
#TPI 2 1300(BAR) -0.8 1.0 0.8 0.4
#TPI 4 1300(BAR) -0.4 0.8 1.0 0.8
#TCC 1200(BAR) -0.2 0.4 0.8 1.0
#[[2]]
# NC 1234(BAR) TPI 2 1300(BAR) TPI 4 1300(BAR) TCC 1200(BAR)
#NC 1234(BAR) 1.0 1.0 0.8 -0.4
#TPI 2 1300(BAR) 1.0 1.0 0.8 -0.4
#TPI 4 1300(BAR) 0.8 0.8 1.0 -0.8
#TCC 1200(BAR) -0.4 -0.4 -0.8 1.0
#...
#...
定义要执行关联的变量,将它们存储在 vars_keep
中。然后使用 split
模拟 group_by
。 Select 感兴趣的变量并执行相关:
注意:split
函数将df
数据帧分成一组,returns数据帧列表。这些数据帧中的每一个都是由变量定义的原始数据帧的子集。
vars_keep <- names(df)[c(3, 4, 5,6)]
some <- data %>%
split(list(data$Year,data$Month)) %>%
map(select, vars_keep) %>%
map(cor)
这导致列表中每个拆分的相关列表 some
。
1 的输出:
some[[1]]
我有一个 daatframe df,其中包含变量:年、月和另外 4 个变量,如下所示:
structure(list(Year = c(2017L, 2017L, 2017L, 2017L, 2017L, 2017L,
2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L,
2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L,
2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L,
2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L,
2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L,
2017L, 2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 2018L,
2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 2018L,
2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 2018L,
2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 2018L,
2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 2018L,
2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 2019L,
2019L, 2019L, 2019L, 2019L, 2019L, 2019L, 2019L, 2019L, 2019L,
2019L, 2019L, 2019L, 2019L, 2019L, 2019L, 2019L, 2019L, 2019L,
2019L, 2019L, 2019L, 2019L, 2019L, 2019L, 2019L, 2019L, 2019L,
2019L, 2019L, 2019L, 2019L, 2019L, 2019L, 2019L, 2019L, 2019L,
2019L, 2019L, 2019L, 2019L, 2019L, 2019L, 2019L, 2019L, 2019L,
2019L, 2019L, 2019L, 2019L, 2019L, 2019L, 2020L, 2020L, 2020L,
2020L, 2020L, 2020L, 2020L, 2020L, 2020L, 2020L, 2020L, 2020L,
2020L, 2020L, 2020L, 2020L, 2020L, 2020L, 2020L, 2020L, 2020L,
2020L, 2020L, 2020L, 2020L, 2020L, 2020L, 2020L, 2020L, 2020L,
2020L, 2020L, 2020L, 2020L, 2020L, 2020L, 2020L, 2020L, 2020L,
2020L, 2020L, 2020L, 2020L, 2020L, 2020L, 2020L, 2020L, 2020L,
2020L, 2020L, 2020L, 2020L, 2021L, 2021L, 2021L, 2021L, 2021L,
2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L,
2021L, 2021L, 2021L, 2021L), Month = c(1L, 1L, 1L, 1L, 2L, 2L,
2L, 2L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 6L,
6L, 6L, 6L, 6L, 7L, 7L, 7L, 7L, 8L, 8L, 8L, 8L, 9L, 9L, 9L, 9L,
9L, 10L, 10L, 10L, 10L, 11L, 11L, 11L, 11L, 12L, 12L, 12L, 12L,
12L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 4L,
4L, 4L, 4L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 6L, 7L, 7L, 7L, 7L,
8L, 8L, 8L, 8L, 8L, 9L, 9L, 9L, 9L, 10L, 10L, 10L, 10L, 11L,
11L, 11L, 11L, 11L, 12L, 12L, 12L, 12L, 1L, 1L, 1L, 1L, 2L, 2L,
2L, 2L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L,
6L, 6L, 6L, 6L, 7L, 7L, 7L, 7L, 8L, 8L, 8L, 8L, 8L, 9L, 9L, 9L,
9L, 10L, 10L, 10L, 10L, 11L, 11L, 11L, 11L, 11L, 12L, 12L, 12L,
12L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 4L,
4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 7L, 7L, 7L, 7L,
7L, 8L, 8L, 8L, 8L, 9L, 9L, 9L, 9L, 10L, 10L, 10L, 10L, 10L,
11L, 11L, 11L, 11L, 12L, 12L, 12L, 12L, 1L, 1L, 1L, 1L, 1L, 2L,
2L, 2L, 2L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L), `NC 1234(BAR)` = c(86.68,
81.52, 82.53, 84.17, 82.52, 79.11, 77.2, 81.13, 81.56, 78.6,
80.8, 81.09, 82.45, 88.19, 83.83, 82.77, 79.83, 77.42, 72.42,
73.45, 73.66, 74.36, 80.79, 83.66, 84.18, 82.46, 82.62, 84.03,
91.04, 92.28, 97, 95.4, 100.1, 101.83, 98.16, 98.15, 99.44, 95.53,
95.37, 97.88, 94.85, 98.25, 97.46, 99.31, 98.66, 95.21, 93.36,
96.84, 96.78, 102.59, 103.97, 103.88, 105.25, 106.27, 106.88,
107.42, 107.14, 102.47, 103.43, 110.6, 103.6, 100.02, 95.2, 92.41,
90.68, 92.43, 93.81, 93.26, 95.25, 101.35, 101.64, 107.75, 109.82,
108.89, 113.3, 118.09, 116.27, 117.26, 117.58, 118.49, 122.57,
120.7, 120.41, 118.29, 117.68, 117.39, 118.73, 114.48, 115.74,
113.01, 113.58, 113.98, 108.81, 107.05, 108.23, 103.08, 103.66,
103.59, 98.76, 99.95, 101.95, 102.4, 100.56, 102.55, 99.54, 97.61,
99.3, 99.78, 99.5, 96.92, 90.71, 94.64, 96.04, 96.7, 93.25, 90.51,
89.17, 78.3, 85.93, 89.34, 87.72, 86.91, 85.86, 83.24, 81.06,
78.71, 71.39, 72.72, 71.53, 68.81, 72.48, 76.51, 73.98, 72.88,
67.09, 68.88, 64.85, 62.5, 62.32, 65.38, 65.42, 65.94, 64.99,
65.71, 66.97, 67.43, 68.34, 68.22, 66.78, 67.28, 68.95, 67.62,
67.33, 66.31, 66.5, 66.19, 64.8, 68.67, 73.78, 68.45, 66.83,
66.86, 69.32, 68.21, 66.77, 65.99, 65.42, 66.57, 68.37, 68.33,
63.54, 58.59, 53.74, 50.76, 51.71, 50.83, 53.98, 54.9, 55.25,
52.75, 50.48, 51.17, 51.58, 52.72, 52.48, 51.72, 50.73, 51.42,
49.52, 50.19, 47.99, 48.17, 49.85, 53.66, 59.97, 58.21, 57.73,
54.42, 60.09, 60.35, 62.45, 62.53, 62.82, 68.22, 72.31, 77.91,
83.96, 85.31, 83.72, 80.78, 87.98, 87.92, 90.46, 85.65, 88.4,
85.79, 82.16, 90.84, 90.3, 92.54, 103.95, 99.38, 92.7, 91.75,
90.89, 91.71), `TPI 2 1300(BAR)` = c(86.34, 88.75, 90.78, 88.04,
83.64, 82.25, 80.36, 82.78, 78.15, 74.46, 73.23, 71.04, 69.87,
77.56, 75.23, 74.85, 74.06, 73.66, 72.45, 74.6, 77.4, 77.37,
78.99, 78.87, 80.26, 80.35, 83.83, 84.41, 83.04, 82.48, 83.27,
84.38, 86.28, 87.34, 88.35, 89.36, 92.84, 94.2, 92.02, 91.99,
90.33, 93.2, 92.55, 94.94, 94.69, 94.14, 92.04, 91.9, 93.28,
95.94, 95.57, 95.83, 96.2, 95.76, 95.14, 93.91, 90.5, 84.11,
81.86, 86.61, 80.74, 80.89, 80.93, 78.46, 76.25, 80.09, 82.18,
81.75, 83.19, 85.47, 88.22, 92.79, 91.38, 94.8, 95.79, 97.33,
97.26, 96.99, 99.93, 102.71, 101.8, 98.8, 94.56, 96.51, 99.43,
98.16, 99.56, 100.64, 100.45, 99.38, 101.23, 101.45, 100.01,
98.93, 100.74, 97.66, 92.71, 88.39, 81.37, 82.3, 86.07, 88.1,
89.46, 86.18, 84.34, 81.21, 81.88, 79.77, 79.13, 74.25, 71.47,
72.87, 74.9, 73.64, 70.3, 67.29, 61.74, 54.62, 60.16, 59.64,
58.82, 58.1, 59.35, 58.2, 55.48, 51.12, 50.55, 49.65, 47.79,
47.68, 53.54, 59.03, 59.34, 61.55, 58.65, 55.44, 54.95, 53.43,
48.68, 56.82, 62.25, 62.89, 59.49, 59.82, 60.94, 60.41, 57.99,
56.22, 56.29, 54.7, 55.72, 57.4, 55.58, 53.06, 52.73, 53.16,
52.09, 52.81, 51.05, 47.67, 48.35, 48.78, 49.23, 48.99, 46.1,
47.25, 45.68, 48.03, 50.63, 48.1, 45.64, 44.61, 41.46, 38.26,
39.33, 38.26, 38.8, 38.27, 45.99, 44.91, 44.64, 46.82, 50.68,
50.04, 48.73, 49.94, 50.15, 51.45, 49.68, 47.58, 47.42, 51.78,
51.61, 52.55, 53.32, 57.34, 57.77, 56.17, 56.39, 54.19, 51.41,
52.28, 53.98, 57.48, 62.02, 64.56, 68.58, 69.71, 68.99, 65.75,
70.31, 66.44, 67.51, 64.86, 68.52, 64.48, 65.7, 66.35, 65.96,
69.21, 72.25, 69.9, 68.38, 70.8, 73.92, 75.79), `TPI 4 1300(BAR)` = c(82,
84.25, 89.48, 88.25, 85.97, 83.48, 81.23, 82.36, 83.24, 80.1,
78.22, 76.19, 75.87, 80.31, 77.92, 76.95, 74.14, 74.2, 71.3,
71.98, 72.2, 74.56, 79.84, 79.04, 77.41, 77.16, 82.04, 82.51,
82.36, 79.45, 85.16, 86.65, 87.43, 87.46, 89.36, 89.86, 94.31,
93.33, 90.76, 90.73, 90.13, 91.5, 91.61, 94.13, 92.67, 89.73,
88.64, 94.41, 94.5, 95.58, 95.09, 94.48, 95.81, 97.29, 97.77,
98.36, 96.27, 90.1, 93.41, 98.34, 94.42, 92.24, 90.81, 90.2,
89.32, 91.75, 94.34, 95.37, 95.09, 98.17, 102.66, 104.7, 103.64,
103.98, 105.38, 105.81, 103.34, 103.79, 107.84, 108.95, 108.19,
106.57, 103.43, 101.62, 98.16, 95.66, 94.91, 97.35, 100.3, 97.33,
98.01, 100.75, 99.55, 98.52, 98.99, 98.38, 96.69, 91.65, 85.98,
89.27, 93.37, 95.42, 97.69, 96.87, 94.07, 90.23, 90.88, 89.33,
88.72, 84.06, 77.46, 78.79, 82.3, 80.44, 75.19, 71.33, 67.96,
65.52, 70.74, 69.72, 68.32, 70.42, 71.54, 66.47, 62.08, 60.18,
62.1, 63.98, 62.19, 64.32, 65.41, 65.85, 65.36, 66.1, 63.06,
59.62, 59.54, 57.89, 56.25, 58.97, 59.87, 60.69, 58.73, 62.48,
68.18, 69.58, 70.18, 67.38, 70.55, 73.48, 81.09, 86.66, 82.2,
82.33, 84.09, 87.48, 77.37, 88.17, 92.28, 85.54, 87.73, 86.9,
85.51, 81.65, 76.05, 70.4, 62.12, 61.76, 66.71, 71.04, 59.66,
51.58, 44.56, 49.49, 51.87, 53.15, 54.88, 55.26, 59.22, 55.3,
54.2, 50.09, 53.13, 53.6, 54.8, 53.34, 52.27, 54.01, 54.19, 56.4,
55.78, 57.08, 55.83, 56.21, 58.98, 61.7, 59.93, 57.18, 57.37,
59.76, 61.34, 67.09, 69.74, 77.7, 81.86, 84.32, 97.91, 97.32,
91.24, 87.68, 95.43, 90.13, 87.67, 83.9, 90.14, 89.31, 85.65,
90.91, 91.61, 96.39, 100.02, 96.7, 96.03, 89.93, 90.58, 93.07
), `TCC 1200(BAR)` = c(37.25, 37.76, 39.13, 41, 42.69, 43.31,
43.86, 43.97, 44.6, 45.11, 45.56, 45.72, 46.01, 45.47, 44.69,
42.19, 39.28, 38.11, 37.2, 36.2, 36.7, 37.3, 39.11, 41.59, 42.04,
41.65, 40.91, 40.8, 41.85, 42.28, 42.55, 42.52, 42.41, 42.75,
43.02, 43.85, 44.75, 46.03, 46.35, 46.37, 46.41, 46.8, 46.44,
46.09, 44.88, 44.16, 43.99, 44.2, 45.38, 46.09, 46.34, 46.81,
47.31, 48.02, 48.59, 48.84, 49.58, 50.16, 50.32, 51.04, 49.98,
48.45, 46.12, 43.55, 42.19, 41.48, 41.08, 41.13, 41.85, 42.79,
44.02, 45.55, 45.85, 46.82, 47.74, 48.08, 48.71, 47.82, 47.63,
47.64, 47.48, 45.06, 42, 40.48, 39.66, 38.09, 37.72, 37.96, 38.39,
38.72, 38.56, 38.88, 38.43, 38.13, 37.31, 36.07, 34.64, 32.29,
29.87, 28.85, 29.56, 30.43, 30.54, 30.29, 30.26, 31, 32.3, 33.51,
34.37, 34.45, 35.4, 37.01, 38.48, 40.32, 39.76, 37.8, 36.75,
37.3, 37.59, 38.23, 38.44, 39, 39.15, 38.78, 38, 37.36, 36.66,
35.84, 36.06, 35.81, 35.99, 35.89, 34.71, 33.7, 33.32, 32.34,
31.45, 30.96, 31.22, 31.57, 32.17, 32.69, 33.1, 33.19, 33.65,
34.36, 35.1, 35.06, 34.51, 33.88, 33.52, 33.71, 34.43, 34.52,
34.16, 33.83, 33.77, 34.27, 35.08, 35.44, 35.48, 36.4, 36.67,
35.92, 34.82, 33.79, 32.77, 31.84, 31.5, 30.83, 30.34, 28.23,
26.03, 24.24, 24.15, 25.71, 27.98, 29.37, 27.66, 25.38, 24.34,
23.89, 23.56, 23.39, 24.24, 25.36, 25.48, 25.12, 24.05, 23.57,
22.86, 22.63, 22.65, 23.57, 25.13, 26.03, 27.02, 29.13, 29.47,
28.91, 28.93, 28.94, 29.88, 31.78, 33.47, 36.11, 41.11, 44.8,
44.9, 45.56, 47.78, 49.95, 46.69, 39.51, 38.41, 39.57, 39.4,
38.67, 37.45, 36.81, 39.2, 43.9, 45.44, 47.71, 50.35, 52.19)), row.names = c(NA,
-226L), class = c("tbl_df", "tbl", "data.frame"))
下面是我使用的代码:
library(readr)
library(tsibble)
library(dplyr)
library(tidyverse)
library(data.table)
not_all_na <- function(x) any(!is.na(x))
corrD <- df %>%
group_by(Year, Month) %>%
select(-Year, -Month) %>%
select_if(not_all_na) %>%
cor(use = "pairwise.complete.obs", method = "sp")
我也试过:
corrD <- df3 %>%
group_by(Year, Month) %>%
select(-Year, -Month) %>%
select_if(not_all_na) %>%
summarise(correlations = cor(., use = "pairwise.complete.obs", method = "sp"))
但这也行不通。
我想要的是:
- 对于每个年和月(即观察必须先按年分组,然后按月分组,然后再计算)数据中其余四个变量之间的相关性
- 如果任何变量中有 NA,则不应考虑这些
- 对于每个组,即年月(例如 2017-1 或 2017-2 等等),应该创建一个列表或单独的数据框,以便我能够下载它并用于进一步分析.
- 最终结果(相关性 table)不应包括年份和月份(在我的例子中这些仍然出现)
我的代码没有返回预期的结果。相反,它是这样的:
有人可以帮忙吗?
您可以return每年每个月的相关矩阵列表。
library(dplyr)
library(purrr)
not_all_na <- function(x) any(!is.na(x))
df %>%
group_split(Year, Month, .keep = FALSE) %>%
map(~.x %>%
select(where(not_all_na)) %>%
cor(use = "pairwise.complete.obs", method = "sp"))
#[[1]]
# NC 1234(BAR) TPI 2 1300(BAR) TPI 4 1300(BAR) TCC 1200(BAR)
#NC 1234(BAR) 1.0 -0.8 -0.4 -0.2
#TPI 2 1300(BAR) -0.8 1.0 0.8 0.4
#TPI 4 1300(BAR) -0.4 0.8 1.0 0.8
#TCC 1200(BAR) -0.2 0.4 0.8 1.0
#[[2]]
# NC 1234(BAR) TPI 2 1300(BAR) TPI 4 1300(BAR) TCC 1200(BAR)
#NC 1234(BAR) 1.0 1.0 0.8 -0.4
#TPI 2 1300(BAR) 1.0 1.0 0.8 -0.4
#TPI 4 1300(BAR) 0.8 0.8 1.0 -0.8
#TCC 1200(BAR) -0.4 -0.4 -0.8 1.0
#...
#...
定义要执行关联的变量,将它们存储在 vars_keep
中。然后使用 split
模拟 group_by
。 Select 感兴趣的变量并执行相关:
注意:split
函数将df
数据帧分成一组,returns数据帧列表。这些数据帧中的每一个都是由变量定义的原始数据帧的子集。
vars_keep <- names(df)[c(3, 4, 5,6)]
some <- data %>%
split(list(data$Year,data$Month)) %>%
map(select, vars_keep) %>%
map(cor)
这导致列表中每个拆分的相关列表 some
。
1 的输出:
some[[1]]