Assign a factor vector with more than 2 levels/labels for a given numeric vector

Hi all. I hope you can help me with my question. For a vector representing apple prices ($), e.g.

apple <- c(23, 26, 54, 34, 34, 34, 98, 23, 4, 34, 098, 45, 93, 20, 39, 83, 78, 34, 09, 8, 56, 98, 99, 62, 29)

I can assign a factor vector that marks an apple as "cheap" if its price is below $50 and as "expensive" if its price is $50 or more. For example, the factor variable can easily be specified as:

price <- factor(apple >= 50, labels = c("cheap", "expensive"))

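However, I can't work out how to assign three price levels to one factor variable, such as cheap, moderate, and expensive, where, say, an apple priced between $30 and $40 is considered moderate. Thanks.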

We can use cut:

 cut(apple, breaks = c(0, 30, 40, Inf), labels = c("Cheap", "Moderate", "Expensive"))
#>  [1] Cheap     Cheap     Expensive Moderate  Moderate  Moderate  Expensive Cheap    
#>  [9] Cheap     Moderate  Expensive Expensive Expensive Cheap     Moderate  Expensive
#> [17] Expensive Moderate  Cheap     Cheap     Expensive Expensive Expensive Expensive
#> [25] Cheap    
#> Levels: Cheap Moderate Expensive
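
One detail worth checking is how cut treats prices that fall exactly on a break. By default (right = TRUE) the intervals are closed on the right, so a price of 30 lands in "Cheap". The sketch below uses a short hypothetical vector, `boundary_prices`, chosen only to hit the breaks; it is not the data from the question.

```r
# Hypothetical prices chosen to sit on and around the breaks (not the question's data)
boundary_prices <- c(23, 30, 34, 40, 50)
price_labels <- c("Cheap", "Moderate", "Expensive")

# Default right = TRUE: intervals are (0,30], (30,40], (40,Inf]
right_closed <- cut(boundary_prices, breaks = c(0, 30, 40, Inf),
                    labels = price_labels)

# right = FALSE: intervals are [0,30), [30,40), [40,Inf)
left_closed <- cut(boundary_prices, breaks = c(0, 30, 40, Inf),
                   labels = price_labels, right = FALSE)

# Side-by-side comparison of the two conventions
data.frame(price = boundary_prices, right_closed, left_closed)
```

If a price of exactly 30 should count as "Moderate", right = FALSE is probably the variant to use.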

In base R, we can also use findInterval:

c('Cheap', 'Moderate', 'Expensive')[findInterval(apple, c(0, 30, 40))]
#[1] "Cheap"     "Cheap"     "Expensive" "Moderate"  "Moderate"  "Moderate"  "Expensive" "Cheap"     "Cheap"     "Moderate"  "Expensive"
#[12] "Expensive" "Expensive" "Cheap"     "Moderate"  "Expensive" "Expensive" "Moderate"  "Cheap"     "Cheap"     "Expensive" "Expensive"
#[23] "Expensive" "Expensive" "Cheap"
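
Note that the findInterval approach returns a character vector rather than a factor, and that findInterval uses left-closed intervals by default, so a price of exactly 30 would be labelled "Moderate" here. A minimal sketch of wrapping the result in factor() follows; the names `some_prices`, `price_labels`, and `price` are illustrative, and the vector is a short subset of the question's prices.

```r
# A short subset of the question's prices, for illustration
some_prices <- c(23, 26, 54, 34, 4, 98)
price_labels <- c("Cheap", "Moderate", "Expensive")

# findInterval returns the index of the interval each value falls in:
# [0,30) -> 1, [30,40) -> 2, [40,Inf) -> 3
idx <- findInterval(some_prices, c(0, 30, 40))

# Setting levels explicitly keeps the Cheap < Moderate < Expensive ordering
# of the levels, even if one category is absent from the data
price <- factor(price_labels[idx], levels = price_labels)

table(price)
```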