Table of unique observations in R

Question

I have two factor variables, in a data.frame called "data", that look like this:

brand  Country
 "A"    "ITA"
 "A"    "ITA"
 "C"    "SPA"
 "B"    "POR"
 "C"    "SPA"
 "B"    "POR"
 "A"    "ITA"
 "D"    "ITA"
 "E"    "SPA"
 "D"    "ITA"

and I want a table listing the number of unique brands per country. For the example above, it should be:

# of unique brands  Country
        2             "ITA"
        2             "SPA"
        1             "POR"

First, I tried:

data$var <- with(data, ave(brand, Country, FUN = function(x){length(unique(x))}))

but it doesn't work with factors, so I converted my factors to character:

data$brand_t<-as.character(data$brand)
data$Country_t<-as.character(data$Country)

and then tried again:

data$var <- with(data, ave(brand_t, Country_t, FUN = function(x){length(unique(x))}))

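Now, if I apply unique(data$var), I get "2", "2", "1", which is correct, but I can't get the table I want. It's probably something silly, but I can't work it out.

I would also like to know whether there is a smarter way that uses the factors instead.

Thanks again.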

Answer 1

Here are two quick approaches using data.table v >= 1.9.5 or dplyr

library(data.table)
setDT(df)[, uniqueN(brand), by = Country]

Or

library(dplyr)
df %>%
  group_by(Country) %>%
  summarise(n = n_distinct(brand))

Or with base R

aggregate(brand ~ Country, df, function(x) length(unique(x)))

Or

tapply(df$brand, df$Country, function(x) length(unique(x)))

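Alternatively, if you like base R's simple syntax and your data set isn't too large, you can combine the approaches above by passing the package functions to aggregate (with the relevant package loaded):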

aggregate(brand ~ Country, df, uniqueN)

Or

aggregate(brand ~ Country, df, n_distinct)
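
To try the base R variants without installing anything, here is a minimal reproducible sketch. It assumes `df` holds the ten rows from the question with both columns stored as factors; the helper name `count_unique` is only illustrative and not part of any package.

```r
# Rebuild the example data from the question (both columns as factors).
df <- data.frame(
  brand   = c("A", "A", "C", "B", "C", "B", "A", "D", "E", "D"),
  Country = c("ITA", "ITA", "SPA", "POR", "SPA", "POR", "ITA", "ITA", "SPA", "ITA"),
  stringsAsFactors = TRUE
)

# Illustrative helper: number of distinct values in a vector or factor.
count_unique <- function(x) length(unique(x))

# Data frame result: one row per Country, with the count in the brand column.
res_aggregate <- aggregate(brand ~ Country, df, count_unique)

# Named array result: one element per Country level.
res_tapply <- tapply(df$brand, df$Country, count_unique)

print(res_aggregate)
print(res_tapply)
```

Both calls work on the factors directly, so the as.character conversion from the question should not be needed here. aggregate returns a data.frame, which is probably closer to the table asked for, while tapply returns a named array.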

Answer 2

In base R, you could try combining table with unique and colSums, like this:

colSums(table(unique(mydf)))
# ITA POR SPA 
#   2   1   2
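
To see why this works, the one-liner can be unpacked into its three steps. This is only a sketch: it assumes `mydf` contains exactly the two columns from the question, with brand first and Country second, and the intermediate names `pairs`, `tab` and `counts` are made up for illustration.

```r
# Rebuild the example data from the question (both columns as factors).
mydf <- data.frame(
  brand   = c("A", "A", "C", "B", "C", "B", "A", "D", "E", "D"),
  Country = c("ITA", "ITA", "SPA", "POR", "SPA", "POR", "ITA", "ITA", "SPA", "ITA"),
  stringsAsFactors = TRUE
)

# Step 1: keep one row per distinct (brand, Country) pair.
pairs <- unique(mydf)

# Step 2: cross-tabulate brand (rows) against Country (columns).
# After step 1 every cell is either 0 or 1.
tab <- table(pairs)

# Step 3: sum each Country column to count its distinct brands.
counts <- colSums(tab)

print(tab)
print(counts)
```

Because the column sums are taken over the brand dimension, the column order matters: with Country first and brand second, rowSums would be the one to use instead.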
