How to separate variable "race" to make histograms for another variable

I have the following data frame:

head(SB_xlsx)

##   patnum hospstay    lowph pltct  race  bwt gest        inout twn lol magsulf
## 1      1       34       NA   100 white 1250   35 born at duke   0  NA      NA
## 2      2        9 7.250000   244 white 1370   32 born at duke   0  NA      NA
## 3      3       -2 7.059998   114 black  620   23 born at duke   0  NA      NA
## 4      4       40 7.250000   182 black 1480   32 born at duke   0  NA      NA
## 5      5        2 6.969997    54 black  925   28 born at duke   0  NA      NA
## 6      6       62 7.189999    NA white  940   28 born at duke   0  NA      NA
##   meth toc  delivery apg1 vent pneumo pda cld    sex dead
## 1    0   0 abdominal    8    0      0   0   0 female    0
## 2    1   0 abdominal    7    0      0   0   0 female    0
## 3    0   1   vaginal    1    1      0   0  NA female    1
## 4    1   0   vaginal    8    0      0   0   0   male    0
## 5    0   0 abdominal    5    1      1   0   0 female    1
## 6    1   0 abdominal    8    1      0   0   0 female    0

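I need to create 4 histograms of birth weight (bwt), one for each of the 4 races under the "race" variable ("white", "black", "native American", and "oriental"). How do I separate race so that I can make the histograms for bwt?

I already had to make one histogram for all bwts regardless of race, using the code shown below. I know how to make a histogram, but I am not sure how to separate the races so that I can make the 4 race-specific ones.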

hist(SB_xlsx$bwt, ylab="frequency", xlab="Birth Weight", main="Histogram of Birth Weight")

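The answer you are looking for is quite simple: use the filter() function from the dplyr package, loaded with library(dplyr).

Because I don't have your dataset, I made one with four race labels in one column and a second column, which I called "height", holding the values the histograms are drawn from. In your data, the race column is race and the histogram variable is bwt.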

# Race labels for the mock data
variable_names <- c("Black", "White", "Native-American", "Other")
# Four sets of 200 random values to stand in for the measurement
w <- rnorm(200, mean = 5, sd = 2)
x <- rnorm(200, mean = 5, sd = 2)
y <- rnorm(200, mean = 5, sd = 2)
z <- rnorm(200, mean = 5, sd = 2)
# Combine everything into a data frame (df).
# Note: the column "btw" holds the race labels (the role that race plays in your data).
# The 4 labels are recycled across the 800 rows, so each label gets 200 rows.
df <- data.frame("btw" = variable_names, "height" = c(w, x, y, z))
# Load the necessary package
library(dplyr)
# Use filter() to select the rows for each race
black <- filter(df, btw == "Black")
white <- filter(df, btw == "White")
native <- filter(df, btw == "Native-American")
other <- filter(df, btw == "Other")
# Make the histograms (I will only upload one image as an example here)
hist(black$height)
hist(white$height)
hist(native$height)
hist(other$height)

[![enter image description here][1]][1]
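
To connect this back to the question, here is a minimal sketch of the same filter() step using the question's own column names, race and bwt. The data frame `mock` and its values are made up for illustration, since the real SB_xlsx is not available here; with the real data you would pass SB_xlsx to filter() instead.

```r
library(dplyr)

# Hypothetical stand-in for SB_xlsx: only the two relevant columns, made-up values
set.seed(1)
mock <- data.frame(
  race = rep(c("white", "black", "native American", "oriental"), each = 50),
  bwt  = round(rnorm(200, mean = 1100, sd = 250))
)

# Keep only the rows for one race, then plot that subset's birth weights
white <- filter(mock, race == "white")
hist(white$bwt, ylab = "Frequency", xlab = "Birth Weight",
     main = "Histogram of Birth Weight (white)")
```

Repeating the last two statements with "black", "native American", and "oriental" would give the other three histograms.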

# you can also plot all 4 in one window

par(mfrow = c(2, 2), cex = 1)

hist(black$height,col="blue")
hist(white$height, col="green")
hist(native$height, col="orange")
hist(other$height, col="grey23")


[![enter image description here][2]][2]
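
If you would rather not create four separate objects, one possible alternative (not part of the approach above) is base R's split(), which groups one column by the values of another, followed by a loop. This is only a sketch: `mock` is a hypothetical stand-in with made-up values, and the column names race and bwt are taken from the question.

```r
# Hypothetical stand-in for SB_xlsx with made-up values
set.seed(1)
mock <- data.frame(
  race = rep(c("white", "black", "native American", "oriental"), each = 50),
  bwt  = round(rnorm(200, mean = 1100, sd = 250))
)

# split() returns a named list: one vector of birth weights per race
bwt_by_race <- split(mock$bwt, mock$race)

# Draw the four histograms in a 2 x 2 grid, restoring the old layout afterwards
old_par <- par(mfrow = c(2, 2))
for (r in names(bwt_by_race)) {
  hist(bwt_by_race[[r]], xlab = "Birth Weight", ylab = "Frequency",
       main = paste("Birth Weight:", r))
}
par(old_par)
```

The loop titles each panel with the race name, so the four plots stay distinguishable without relying on colour.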


  [1]: https://i.stack.imgur.com/OuWkx.png
  [2]: https://i.stack.imgur.com/t5rDp.png