Normalize data per row in R
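How can I scale/normalize my data per row (observation)? Something like [-1, 1], or like a z-score?
I have seen a previous post about normalizing a whole dataset, like this one: https://stats.stackexchange.com/questions/178626/how-to-normalize-data-between-1-and-1
But I would like to normalize each row separately so that the rows can be plotted in the same box plot, since they all show the same pattern across the x-axis.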
library(tidyr)    # pivot_longer(); also re-exports the %>% pipe
library(ggplot2)

Obs <- c("A", "B", "C")
count1 <- c(100, 15, 3)
count2 <- c(250, 30, 5)
count3 <- c(290, 20, 8)
count4 <- c(80, 12, 2)
df <- data.frame(Obs, count1, count2, count3, count4)

# Reshape to long format: one row per (Obs, count) pair
dff <- df %>% pivot_longer(cols = !Obs, names_to = "count", values_to = "Value")

ggplot(dff, aes(x = count, y = Value)) +
  geom_jitter(alpha = 0.1, color = "tomato") +
  geom_boxplot()
Following the link you shared, you can use apply with the corresponding function to rescale each row of the data frame to [-1, 1].
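For reference, the per-row mapping used in the code that follows can be written out as below. The symbols $x_i$ and $x'_i$ are notation introduced here for illustration and do not appear in the original post.

```latex
% Min-max rescaling of one row x = (x_1, ..., x_n) to the interval [-1, 1]
x'_i = 2\,\frac{x_i - \min(x)}{\max(x) - \min(x)} - 1

% Worked value for row A = (100, 250, 290, 80), first entry:
x'_1 = 2\,\frac{100 - 80}{290 - 80} - 1 = -\frac{17}{21} \approx -0.81
```

The row minimum maps to -1 and the row maximum to 1. If a row is constant, so that max(x) equals min(x), the denominator is zero and R returns NaN for that row.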
library(ggplot2)
library(tidyr)    # pivot_longer(); also re-exports the %>% pipe

Obs <- c("A", "B", "C")
count1 <- c(100, 15, 3)
count2 <- c(250, 30, 5)
count3 <- c(290, 20, 8)
count4 <- c(80, 12, 2)
df <- data.frame(count1, count2, count3, count4)

# Rescale each row to [-1, 1]. apply() over rows returns the result
# transposed, hence the t(). A constant row (max == min) would give NaN.
df <- as.data.frame(t(apply(df, 1, function(x) 2 * (x - min(x)) / (max(x) - min(x)) - 1)))
df <- cbind(Obs, df)

# Reshape to long format for plotting
dff <- df %>%
  pivot_longer(cols = !Obs, names_to = "count", values_to = "Value")

ggplot(dff, aes(x = count, y = Value)) +
  geom_jitter(alpha = 0.1, color = "tomato") +
  geom_boxplot()
If you pivot the data to long format, you can group by observation and scale within each group:
library(dplyr)    # group_by() and mutate(); tidyr and ggplot2 as loaded above
df %>%
  pivot_longer(cols = !Obs, names_to = "count", values_to = "Value") %>%
  group_by(Obs) %>%
  mutate(z = as.numeric(scale(Value))) %>%
  ggplot(aes(x = count, y = z)) +
  geom_boxplot()
Or, in base R, simply do:
boxplot(t(scale(t(df[,-1]))))
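The double transpose in that one-liner is there because scale() standardizes columns, not rows. A minimal sketch that checks the result, assuming the same counts as in the question; the names counts, z, row_means and row_sds are made up for this illustration:

```r
# Same counts as in the question; row names stand in for the Obs column.
counts <- data.frame(
  count1 = c(100, 15, 3),
  count2 = c(250, 30, 5),
  count3 = c(290, 20, 8),
  count4 = c(80, 12, 2),
  row.names = c("A", "B", "C")
)

# scale() centers and scales each column, so transpose to put every
# observation in its own column, scale, then transpose back.
z <- t(scale(t(counts)))

# After row-wise z-scoring, each row should have mean 0 and sd 1.
row_means <- rowMeans(z)
row_sds <- apply(z, 1, sd)
print(round(row_means, 10))
print(round(row_sds, 10))
```

Because every row ends up on the same scale, the per-column box plots compare the pattern across counts rather than the absolute magnitudes. A row with identical values would have a standard deviation of zero and come out as NaN.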