Summary statistics for a variable based on another variable
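I am trying to find the number of X values within each ID, where some values are repeated, and then use that result to find the overall minimum, maximum, IQR, and median: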
ID <- c("ID004", "ID004", "ID004", "ID004", "ID006", "ID006", "ID006", "ID006", "ID006", "ID006", "ID006", "ID006", "ID006", "ID006", "ID006", "ID009", "ID009", "ID009", "ID009", "ID009", "ID009", "ID020", "ID020")
D <- c("CMP-001", "CMP-001","CMP-001","CMP-001","CMP-001", "CMP-001","CMP-002", "CMP-002", "CMP-002", "CMP-003", "CMP-003", "CMP-003", "CMP-004", "CMP-004", "CMP-004", "CMP-001", "CMP-001", "CMP-001", "CMP-001", "CMP-002", "CMP-002", "CMP-001", "CMP-001")
X <- c(3,3,3,3,1,1,3,3,3,1,1,1,4,4,4,4,4,4,4,2,2,2,2)
data <- data.frame(ID, D, X)
First we find how many X values each ID has:
ID      No. of X values
ID004   1
ID006   4
ID009   2
ID020   1
Then, based on that result, we should get the following:
                     Min.   Median   Max.   IQR
Number of X per ID   1      1.5      4      3-1
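I think we need to create a new variable that holds the number of X values for each ID, and then find the summary statistics of that new variable.
Thanks for your help.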
Hope this answers it:
> library(dplyr)
> data %>% group_by(ID) %>% summarise(Min = min(X), Median = median(X), Max = max(X), IQR = IQR(X), No_of_X_values = length(rle(X)[[1]]))
`summarise()` ungrouping output (override with `.groups` argument)
# A tibble: 4 x 6
  ID      Min Median   Max   IQR No_of_X_values
  <chr> <dbl>  <dbl> <dbl> <dbl>          <int>
1 ID004     3      3     3   0                1
2 ID006     1      3     4   2.5              4
3 ID009     2      4     4   1.5              2
4 ID020     2      2     2   0                1
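One thing worth noting about `length(rle(X)[[1]])`: it counts runs of consecutive equal values, not distinct values. That appears to be what the question expects, since ID006 should give 4. A minimal base R sketch on the ID006 values from the question (the variable names here are only for illustration):

```r
# X values for ID006, copied from the question's data
x_id006 <- c(1, 1, 3, 3, 3, 1, 1, 1, 4, 4, 4)

# rle() collapses consecutive repeats into runs; here the run values are 1, 3, 1, 4
runs <- rle(x_id006)
n_runs <- length(runs$lengths)

# unique() ignores position, so the two separate runs of 1 are counted once
n_distinct_values <- length(unique(x_id006))

print(c(runs = n_runs, distinct = n_distinct_values))
```

If distinct values were wanted instead (3 for ID006), `length(unique(X))` would be the one to use inside `summarise()`.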
You can store the ID and the number of X values in a new data frame, then compute summary statistics on the number of X values:
> x_values <- data %>% group_by(ID) %>% summarise(No_of_X_values = length(rle(X)[[1]]))
`summarise()` ungrouping output (override with `.groups` argument)
> x_values
# A tibble: 4 x 2
  ID    No_of_X_values
  <chr>          <int>
1 ID004              1
2 ID006              4
3 ID009              2
4 ID020              1
> summary(x_values$No_of_X_values)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
    1.0     1.0     1.5     2.0     2.5     4.0
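`summary()` reports the quartiles but not the IQR as a single number. If you want exactly Min, Median, Max and IQR, something like the following should work. It is a base R sketch without dplyr, assuming the same run-counting approach as above; `runs_per_id` and `overall` are just illustrative names.

```r
# Data from the question (ID written with rep() for brevity)
ID <- rep(c("ID004", "ID006", "ID009", "ID020"), times = c(4, 11, 6, 2))
X  <- c(3, 3, 3, 3, 1, 1, 3, 3, 3, 1, 1, 1, 4, 4, 4, 4, 4, 4, 4, 2, 2, 2, 2)

# Split X by ID and count the runs of consecutive equal values in each group
runs_per_id <- vapply(split(X, ID),
                      function(x) length(rle(x)$lengths),
                      integer(1))

# Overall statistics across IDs; IQR() uses R's default quantile method (type 7)
overall <- c(Min    = min(runs_per_id),
             Median = median(runs_per_id),
             Max    = max(runs_per_id),
             IQR    = IQR(runs_per_id))

print(runs_per_id)
print(overall)
```

With the default quantile method the IQR is the 3rd quartile minus the 1st quartile from the `summary()` output, i.e. 2.5 - 1.0. A different `type` argument to `IQR()` may give a different value on such a small sample.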