How to calculate S.D. per group in a data frame in R and plot it groupwise

Question

My data frame is:

T_ID   S1    S2
1      21    26
1      20    25
1      21    22
2      20    53
2      41    62
2      30    23

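I plotted S1 against S2 in a single graph, colored by T_ID (one color for T_ID "1", another for T_ID "2", and so on). Now I want to plot the standard deviation for each T_ID in the same graph. I don't know how to do that.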

Answer 1

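My answer uses three packages: tidyr, dplyr and ggplot2. It's a bit hacky, but I think it gives you the output you want. It does require you to get your data into the correct format and to calculate the values that define the aesthetics before plotting, which is typical with ggplot2. If someone else has an easier way to do this I'd love to see it; for now this is the best I could come up with from the data provided.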

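First, get your data into the correct format (assuming your data frame is called "df"), calculate the mean and standard deviation for each time point (T_ID 1, 2) and group (S1, S2), then draw a bar plot with error bars representing mean +/- SD.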

library(tidyr)
library(dplyr)
library(ggplot2)

# Reshape to long format: one row per T_ID, group and measurement
df2 <- df %>% gather(group, measurement, S1:S2)
# summarise() (rather than mutate()) leaves one row per T_ID and group,
# so each bar is drawn once at the group mean instead of once per raw value
df3 <- df2 %>% group_by(T_ID, group) %>% summarise(sd = sd(measurement), m = mean(measurement))
gg1 <- ggplot(df3, aes(x=group, y=m, fill=factor(T_ID)))
gg1 + geom_bar(width=0.4, position=position_dodge(width=0.5), stat="identity") + geom_errorbar(aes(ymin=m-sd, ymax=m+sd), position=position_dodge(width=0.5), width=0.4, size=0.1)

This gives you a dodged bar plot of the group means with SD error bars.
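As a variation on the reshaping step, here is a minimal sketch that uses pivot_longer(), which supersedes gather() in tidyr 1.0.0 and later. The data frame is typed in from the question; the names long and stats are only illustrative and are not part of the original answer.

```r
library(tidyr)
library(dplyr)

# Example data copied from the question
df <- data.frame(
  T_ID = c(1, 1, 1, 2, 2, 2),
  S1   = c(21, 20, 21, 20, 41, 30),
  S2   = c(26, 25, 22, 53, 62, 23)
)

# Long format: one row per T_ID, group and measurement
long <- df %>%
  pivot_longer(cols = c(S1, S2), names_to = "group", values_to = "measurement")

# One row per T_ID and group, holding the mean and SD of the measurements
stats <- long %>%
  group_by(T_ID, group) %>%
  summarise(m = mean(measurement), sd = sd(measurement), .groups = "drop")

print(stats)
```

The resulting stats table has four rows (two T_ID values times two groups) and can be passed to ggplot() in place of df3.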

Edit following OP's specifications

First attempt, which did not work.

df4 <- df %>% group_by(T_ID) %>% mutate(SD1 = sd(S1)) %>% mutate(SD2 = sd(S2)) %>% mutate(mean_s1 = mean(S1)) %>% mutate(mean_s2 = mean(S2))

df4
Source: local data frame [6 x 7]
Groups: T_ID

  T_ID S1 S2        SD1       SD2  mean_s1  mean_s2
1    1 21 26  0.5773503  2.081666 20.66667 24.33333
2    1 20 25  0.5773503  2.081666 20.66667 24.33333
3    1 21 22  0.5773503  2.081666 20.66667 24.33333
4    2 20 53 10.5039675 20.420578 30.33333 46.00000
5    2 41 62 10.5039675 20.420578 30.33333 46.00000
6    2 30 23 10.5039675 20.420578 30.33333 46.00000

gg2 <- ggplot(df4, aes(x=S1, y=S2, fill=factor(T_ID)))
gg2 + geom_point(aes(col=factor(T_ID)))+geom_errorbar(aes(ymin=mean_s1-SD1, ymax=mean_s1+SD2))+geom_errorbarh(aes(xmin=mean_s2-SD2, xmax=mean_s2+S2))

### This doesn't really work: too many error bars, mapped all over the place.

# Create a new data frame with plotting coordinates for geom_errorbar(). I tried this because the help page says you can supply a new data frame to geom_errorbar() to override the plot data, but it fails:

df2 <- df %>% group_by(T_ID) %>% summarise(mean_s1=mean(S1), sd_s1=sd(S1), mean_s2=mean(S2), sd_s2=sd(S2))
gg2 <- ggplot(df, aes(x=S1, y=S2, group=factor(T_ID), colour=factor(T_ID)))
gg2 + geom_point()+geom_errorbar(aes(ymax=mean_s1+sd_s1, ymin=mean_s1-sd_s1), data=df2)
Error in eval(expr, envir, enclos) : object 'S1' not found

# doesn't work
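The "object 'S1' not found" error most likely comes from aesthetic inheritance: the error-bar layer inherits x=S1 and y=S2 from ggplot(), but the summary data frame has no S1 or S2 columns. A minimal sketch of one way around this, assuming the data from the question, is to set inherit.aes = FALSE and map every aesthetic the layer needs explicitly. The name stats is illustrative only.

```r
library(dplyr)
library(ggplot2)

# Example data copied from the question
df <- data.frame(
  T_ID = c(1, 1, 1, 2, 2, 2),
  S1   = c(21, 20, 21, 20, 41, 30),
  S2   = c(26, 25, 22, 53, 62, 23)
)

# One row per T_ID with the means and SDs of S1 and S2
stats <- df %>%
  group_by(T_ID) %>%
  summarise(mean_s1 = mean(S1), sd_s1 = sd(S1),
            mean_s2 = mean(S2), sd_s2 = sd(S2))

p <- ggplot(df, aes(x = S1, y = S2, colour = factor(T_ID))) +
  geom_point() +
  # Vertical bar: SD of S2 (the y variable), drawn at the group's mean S1
  geom_errorbar(data = stats, inherit.aes = FALSE,
                aes(x = mean_s1, ymin = mean_s2 - sd_s2, ymax = mean_s2 + sd_s2,
                    colour = factor(T_ID))) +
  # Horizontal bar: SD of S1 (the x variable), drawn at the group's mean S2
  geom_errorbarh(data = stats, inherit.aes = FALSE,
                 aes(y = mean_s2, xmin = mean_s1 - sd_s1, xmax = mean_s1 + sd_s1,
                     colour = factor(T_ID)))

print(p)
```

Because stats has a single row per T_ID, each group gets exactly one vertical and one horizontal error bar, centred on its (mean S1, mean S2) point.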

Second attempt.

Edit - a possible solution to OP's problem

df4 <- df %>% group_by(T_ID) %>% mutate(SD1 = sd(S1)) %>% mutate(SD2 = sd(S2)) %>% mutate(mean_s1 = mean(S1)) %>% mutate(mean_s2 = mean(S2))

    df4
    Source: local data frame [6 x 7]
    Groups: T_ID

      T_ID S1 S2        SD1       SD2  mean_s1  mean_s2
    1    1 21 26  0.5773503  2.081666 20.66667 24.33333
    2    1 20 25  0.5773503  2.081666 20.66667 24.33333
    3    1 21 22  0.5773503  2.081666 20.66667 24.33333
    4    2 20 53 10.5039675 20.420578 30.33333 46.00000
    5    2 41 62 10.5039675 20.420578 30.33333 46.00000
    6    2 30 23 10.5039675 20.420578 30.33333 46.00000
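gg2 <- ggplot(df4, aes(x=S1, y=S2, fill=factor(T_ID)))
# Points are the raw (S1, S2) pairs; error bars are centred on each group's (mean_s1, mean_s2).
# S1 is on the x axis, so SD1 sets the horizontal bar and SD2 the vertical one.
gg2 + geom_point(aes(col=factor(T_ID))) +
    geom_errorbar(aes(x=mean_s1, ymin=mean_s2-SD2, ymax=mean_s2+SD2, colour=factor(T_ID))) +
    geom_errorbarh(aes(y=mean_s2, xmin=mean_s1-SD1, xmax=mean_s1+SD1, colour=factor(T_ID)))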

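This gives you a plot in which the error bars are drawn both horizontally and vertically. I gather that with your real data the error bars will look better.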

Answer 2

Here is a simple solution, assuming your data frame is named df1:

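# Per-group means and SDs: rows are T_ID values, columns are S1 and S2
means <- apply(df1[,2:3], 2, tapply, df1[,1], mean)
sds <- apply(df1[,2:3], 2, tapply, df1[,1], sd)
# ylim must leave room for the tallest mean + SD (about 66.4 for these data)
m <- barplot(means, beside=TRUE, ylim=c(0, 70), legend.text=TRUE)
# Draw a vertical segment from mean - SD to mean + SD at each bar midpoint
segments(m, means - sds, m, means + sds, lwd=2)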

This gives us a grouped bar plot of the means, with a vertical segment on each bar marking mean +/- SD.
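If only the per-group numbers are needed, a minimal base R sketch with aggregate() returns them as a data frame rather than a matrix. The data frame below is typed in from the question; group_sds and group_means are illustrative names.

```r
# Example data copied from the question
df1 <- data.frame(
  T_ID = c(1, 1, 1, 2, 2, 2),
  S1   = c(21, 20, 21, 20, 41, 30),
  S2   = c(26, 25, 22, 53, 62, 23)
)

# SD and mean of each measurement column within each T_ID
group_sds   <- aggregate(cbind(S1, S2) ~ T_ID, data = df1, FUN = sd)
group_means <- aggregate(cbind(S1, S2) ~ T_ID, data = df1, FUN = mean)

print(group_sds)
print(group_means)
```

Each result has one row per T_ID, with the S1 and S2 columns holding the statistic for that group.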
