Frequency/count variable in RStudio

Question

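Long-time lurker here. I usually use SPSS/GraphPad for statistics, and I am slowly but surely trying to learn RStudio.

In SPSS, I have a dataset with three variables: insurance (categorical, 4 levels); npo_violation (categorical, 2 levels); and frequency (scale; it gives, for example, how often Medicaid did or did not have an NPO violation). (Image: example dataset in SPSS)

I am trying to bring this dataset, with its frequency count variable, into RStudio so that I can make a grouped bar chart from the percentages of each combination.

I imported it into RStudio with foreign/haven/Hmisc, and I also built it by hand: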

df_sample <- data.frame(insurance = c("Medicaid", "Medicaid", "Blue Cross", "Blue Cross",
                                      "Managed Care", "Managed Care",
                                      "Other", "Other"), 
                        npo_violation=c("No", "Yes",
                                        "No", "Yes",
                                        "No", "Yes",
                                        "No", "Yes"),
                        wt=c(18075, 438, 14691, 109, 6006, 53, 3098, 25))

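I am not sure how to make the count/frequency variable usable for calculating the percentage/count of each combination of categories, for example to calculate (and then plot) the percentages for "Medicaid + no NPO violation" and "Medicaid + yes NPO violation". I tried the wtd.table function: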

wtd.table(df_sample$insurance, df_sample$npo_violation, weights=wt)

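But I know that is not right, and I get the error "Error in match.arg(type) : 'arg' must be NULL or a character vector".

I was terrified to post here, but I would greatly appreciate any help. Working in R takes me forever, but it is very satisfying. Thanks.

Edit: Ultimately, I want a plot whose x-axis has the two values "no" and "yes". The legend would have 4 categories: Medicaid, Blue Cross, Managed Care, Other. The y-axis would be the percentage that each insurance group accounts for within "yes" and "no", as in the crosstab I made in SPSS.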

Answer 1

Here are two plots made from your data:

library(dplyr)
library(magrittr)
library(ggplot2)

df_sample %>% 
   mutate(percent=wt/sum(wt)) %>%    # each row's share of the grand total (0 to 1)
   ggplot() +                        # starts the plot
   geom_bar(aes(x=insurance, y=percent, fill=npo_violation), 
        stat="identity",position=position_dodge())  # side-by-side bars, heights taken from percent

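This produces a grouped (dodged) bar chart, with one pair of bars per insurance type.

In the example above, you can swap the variables in x and fill to get the opposite grouping. You can also do this: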

df_sample %>% 
   mutate(tag=paste(insurance, npo_violation)) %>%     # combines the two labels into one
   mutate(percent=wt/sum(wt)) %>%                      # each row's share of the grand total (0 to 1)
   ggplot(aes(x=tag,y=percent)) +                      # starts the plot
   geom_bar(stat="identity") +                         # use percent as the bar height instead of counting rows
   theme(axis.text.x=element_text(angle=45, hjust=1))  # rotates the x axis labels
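
Both plots above divide by the grand total, so the bars are shares of all cases. If what you want is the crosstab itself, with percentages within each insurance group, the following is a minimal sketch using base R's xtabs and prop.table. It assumes the percentages should add up to 100 within each insurance type; that is my reading of your edit, and the name plot_df is just for illustration. As far as I can tell, Hmisc's wtd.table tabulates a single variable, so your second vector is probably being matched to its type argument, which would explain the match.arg error.

```r
library(ggplot2)

# Same data as df_sample in the question, rebuilt so this snippet runs on its own
df_sample <- data.frame(
  insurance     = rep(c("Medicaid", "Blue Cross", "Managed Care", "Other"), each = 2),
  npo_violation = rep(c("No", "Yes"), times = 4),
  wt            = c(18075, 438, 14691, 109, 6006, 53, 3098, 25)
)

# xtabs() sums wt within each insurance x npo_violation cell (a weighted crosstab)
counts <- xtabs(wt ~ insurance + npo_violation, data = df_sample)

# Row percentages: the No/Yes split of each insurance group sums to 100
# (assumption: use margin = 2 instead for percentages within "No" and within "Yes")
row_pct <- 100 * prop.table(counts, margin = 1)
print(counts)
print(round(row_pct, 2))

# Back to long format, one row per cell, for plotting
plot_df <- as.data.frame(row_pct, responseName = "percent")

# x axis: No/Yes; legend: the four insurance types
p <- ggplot(plot_df, aes(x = npo_violation, y = percent, fill = insurance)) +
  geom_col(position = "dodge") +
  labs(x = "NPO violation", y = "Percent within insurance group", fill = "Insurance")
print(p)
```

Because the "Yes" percentages are small next to the "No" ones, you may find it easier to read the plot if you filter plot_df to the "Yes" rows before plotting.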
