Create boxplot for detection probabilities in R
I have a dataset similar to this:
sex | observed | date | idtag |
---|---|---|---|
M | 0 | 10/20/2019 | 12 |
M | 0 | 10/20/2019 | 12 |
F | 0 | 10/20/2019 | 21 |
F | 0 | 10/20/2019 | 21 |
M | 0 | 10/21/2019 | 12 |
M | 1 | 10/21/2019 | 14 |
F | 0 | 10/21/2019 | 21 |
M | 1 | 10/21/2019 | 14 |
M | 1 | 10/21/2019 | 14 |
F | 1 | 10/21/2019 | 21 |
M | 0 | 10/23/2019 | 12 |
M | 0 | 10/23/2019 | 12 |
F | 0 | 10/23/2019 | 21 |
F | 0 | 10/23/2019 | 22 |
M | 0 | 10/23/2019 | 14 |
M | 1 | 10/23/2019 | 12 |
F | 0 | 10/23/2019 | 22 |
M | 1 | 10/23/2019 | 14 |
M | 1 | 10/23/2019 | 12 |
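I want to create a boxplot of detection rate by sex. That is, I want to compare (number of 1s for each sex / total number of observations for each sex). I use this code to calculate the detection rate by sex: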
library(dplyr)
drrate_sex <- detectiondata %>%
  group_by(sex) %>%
  summarise(dr = mean(observed))
This is the standard boxplot code I usually use:
boxplot(? ~ sex, data=drdata, main="Detection by sex",
xlab="Sex", ylab="Detection rate (%)", notch=T, par(mar=c(4,12,4,12)))
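I'm not sure how to incorporate the detection rate (where I put the ? in the code) to produce a boxplot in R that compares detection rates for females and males. Any help would be greatly appreciated.
For this kind of data, the closest thing to a boxplot is probably a plot of the proportions, with confidence intervals shown as error bars.
One way to get the numbers you need is to run a logistic regression and use its coefficients to obtain confidence intervals for the proportions: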
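# Logistic regression model
regression <- glm(observed ~ sex, data = detectiondata, family = binomial)
# Extract the coefficient table (estimates in column 1, standard errors in column 2)
coefs <- coef(summary(regression))
# Row 2 holds the male-minus-female difference in log-odds;
# adding the intercept turns it into the male log-odds
coefs[2, 1] <- coefs[1, 1] + coefs[2, 1]
# Note: the standard error in row 2 is still that of the difference,
# so the male interval is wider than one based on the male log-odds alone
# Convert log-odds (with approximate 95% limits) to odds
odds <- exp(cbind(mean = coefs[,1],
                  upper = coefs[,1] + 1.96 * coefs[,2],
                  lower = coefs[,1] - 1.96 * coefs[,2]))
# Convert odds to probabilities
probs <- odds/(1 + odds)
# Create data frame and add column for sex
df <- as.data.frame(probs)
df$sex <- c("Female", "Male")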
This gives the following data frame:
df
#> mean upper lower sex
#> (Intercept) 0.1428571 0.5806110 0.01966987 Female
#> sexM 0.5000000 0.9168654 0.08313460 Male
Which we can plot like this:
library(ggplot2)
ggplot(df, aes(sex, mean)) +
geom_errorbar(aes(ymin = lower, ymax = upper, color = sex), size = 3,
width = 0.3) +
geom_point(size = 4, shape = 21, fill = "black")
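A possible refinement, sketched here rather than taken from the answer itself: `predict()` with `se.fit = TRUE` returns a standard error for each sex's own log-odds, whereas row 2 of the coefficient table carries the standard error of the male-female difference. The data frame below repeats the `sex` and `observed` columns from the question so the block runs on its own; the names `newdata`, `pred` and `ci` are only illustrative.

```r
# Data copied from the question (sex and observed columns only)
detectiondata <- data.frame(
  sex = c("M", "M", "F", "F", "M", "M", "F", "M", "M", "F",
          "M", "M", "F", "F", "M", "M", "F", "M", "M"),
  observed = c(0L, 0L, 0L, 0L, 0L, 1L, 0L, 1L, 1L, 1L,
               0L, 0L, 0L, 0L, 0L, 1L, 0L, 1L, 1L)
)

# Same logistic regression as in the answer
regression <- glm(observed ~ sex, data = detectiondata, family = binomial)

# Predicted log-odds for each sex, each with its own standard error
newdata <- data.frame(sex = c("F", "M"))
pred <- predict(regression, newdata = newdata, type = "link", se.fit = TRUE)

# Approximate 95% interval on the log-odds scale,
# back-transformed to probabilities with plogis()
ci <- data.frame(
  sex   = c("Female", "Male"),
  mean  = plogis(pred$fit),
  lower = plogis(pred$fit - 1.96 * pred$se.fit),
  upper = plogis(pred$fit + 1.96 * pred$se.fit)
)
ci
```

`ci` uses the same column names as `df`, so it can be passed to the `ggplot()` call above unchanged; for these data the male interval comes out narrower than the one in `df`.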
Data
detectiondata <- structure(list(sex = c("M", "M", "F", "F", "M", "M", "F", "M",
"M", "F", "M", "M", "F", "F", "M", "M", "F", "M", "M"), observed = c(0L,
0L, 0L, 0L, 0L, 1L, 0L, 1L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 1L, 0L,
1L, 1L), date = c("10/20/2019", "10/20/2019", "10/20/2019", "10/20/2019",
"10/21/2019", "10/21/2019", "10/21/2019", "10/21/2019", "10/21/2019",
"10/21/2019", "10/23/2019", "10/23/2019", "10/23/2019", "10/23/2019",
"10/23/2019", "10/23/2019", "10/23/2019", "10/23/2019", "10/23/2019"
)), class = "data.frame", row.names = c(NA, -19L))
Created on 2021-11-05 by the reprex package (v2.0.0)
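If a literal boxplot is still wanted, one option is to plot a detection rate per individual rather than per observation. The sketch below assumes that `idtag` identifies individual animals, which the question does not state, and copies the `sex`, `observed` and `idtag` columns from the question's table; `per_id` and `dr` are illustrative names.

```r
# Data copied from the question's table (date column omitted)
detectiondata <- data.frame(
  sex = c("M", "M", "F", "F", "M", "M", "F", "M", "M", "F",
          "M", "M", "F", "F", "M", "M", "F", "M", "M"),
  observed = c(0, 0, 0, 0, 0, 1, 0, 1, 1, 1,
               0, 0, 0, 0, 0, 1, 0, 1, 1),
  idtag = c(12, 12, 21, 21, 12, 14, 21, 14, 14, 21,
            12, 12, 21, 22, 14, 12, 22, 14, 12)
)

# One detection rate per individual (assumption: idtag = individual)
per_id <- aggregate(observed ~ sex + idtag, data = detectiondata, FUN = mean)
names(per_id)[names(per_id) == "observed"] <- "dr"

# Boxplot of per-individual detection rates by sex, in percent
boxplot(100 * dr ~ sex, data = per_id, main = "Detection by sex",
        xlab = "Sex", ylab = "Detection rate (%)")
```

With only two tags per sex in the sample shown, each box is drawn from two values, so this is only likely to be informative on a dataset with more individuals.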