How to subset tables based on category value using variable name?

Question

I am trying to subset a table based on one category value. Say we want to keep only the adults in the Titanic data. What I did is:

data("Titanic")
subset(Titanic, Age == "Adult")

This results in the error object 'Age' not found. The same logic works on a data frame: subset(as.data.frame(Titanic), Age == "Adult"). But how can we subset tables directly, i.e. without converting them to a data frame?

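Edit: Here Adult is in the third dimension. In my case I do not know which dimension it is, i.e. I would like to be able to subset by variable name, as in subset(Titanic, Age == "Adult"). It can be any other base function, i.e. I am not stuck with subset. But I am looking for a base R solution.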

My expected output is:

structure(c(118, 154, 387, 670, 4, 13, 89, 3, 57, 14, 75, 192, 140, 80, 76, 20), .Dim = c(4L, 2L, 2L), .Dimnames = list(Class = c("1st", "2nd", "3rd", "Crew"), Sex = c("Male", "Female"), Survived = c("No", "Yes")), class = "table")

Answer 1

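You are not dealing with a 2-dimensional data frame but with a 4-dimensional array.
So you have to specify your condition in the correct dimension, like this: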

Titanic[,,"Adult",]

When you display the array, you have the following 4 dimensions:
1- Class
2- Sex
3- Age
4- Survived

You can get the dimension names with str() or dimnames():

str(Titanic)
dimnames(Titanic)
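If the position of the dimension is not known in advance, the same positional indexing can be built from the dimension name. The following is only a sketch in base R: the helper name subset_by_dim and its arguments are hypothetical and do not come from this answer. It assumes the table has named dimnames, as Titanic does, builds one index per dimension (TRUE keeps everything), and passes them to `[` through do.call.

```r
data("Titanic")

# Hypothetical helper (not part of base R): keep one level of a named dimension.
subset_by_dim <- function(tab, dim_name, level) {
  dn <- names(dimnames(tab))
  stopifnot(dim_name %in% dn)

  # One index per dimension; TRUE selects every level of that dimension.
  idx <- rep(list(TRUE), length(dn))
  idx[[match(dim_name, dn)]] <- level

  # Equivalent to tab[TRUE, TRUE, "Adult", TRUE] for the Titanic example.
  do.call(`[`, c(list(tab), idx))
}

adults <- subset_by_dim(Titanic, "Age", "Adult")
dim(adults)
```

Because the default drop = TRUE applies, the selected dimension disappears from the result, just as it does with Titanic[,,"Adult",].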

Answer 2

Get the dimension index by matching against the dimnames, then use slice.index:

# user input
x = "Adult"

# get index: ix1 is the dimension containing x, ix2 is the position of x within it
ix1 <- which(sapply(dimnames(Titanic), function(i) sum(i == x)) == 1)
ix2 <- which(dimnames(Titanic)[[ ix1 ]] == x)

#subset and restore dimensions
res <- Titanic[ slice.index(Titanic, ix1) == ix2 ]
dim(res) <- dim(Titanic)[ -ix1 ]

#test
all(Titanic[,,"Adult",] == res)
# [1] TRUE

# not identical as the names are missing
identical(Titanic[,,"Adult",], res)
# [1] FALSE

res
# , , 1
# 
#      [,1] [,2]
# [1,]  118    4
# [2,]  154   13
# [3,]  387   89
# [4,]  670    3
# 
# , , 2
# 
#      [,1] [,2]
# [1,]   57  140
# [2,]   14   80
# [3,]   75   76
# [4,]  192   20
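The result above has lost its dimnames, which is why identical() returns FALSE. As a possible follow-up step (a sketch, not part of the original answer), the remaining dimnames can be copied over from Titanic and the result converted back with as.table():

```r
data("Titanic")

# Same steps as above: locate the dimension and the level by matching dimnames.
x <- "Adult"
ix1 <- which(sapply(dimnames(Titanic), function(i) sum(i == x)) == 1)
ix2 <- which(dimnames(Titanic)[[ix1]] == x)

# Subset, then restore the dimensions of the remaining margins.
res <- Titanic[slice.index(Titanic, ix1) == ix2]
dim(res) <- dim(Titanic)[-ix1]

# Added step: restore the names of the remaining margins and the table class.
dimnames(res) <- dimnames(Titanic)[-ix1]
res <- as.table(res)

res
```

Note that matching on the level alone assumes the value (here "Adult") occurs in exactly one dimension; if the same label appeared in several dimensions, ix1 would have length greater than one and the dimension name would need to be supplied as well.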


r

subset

multidimensional-array