Recoding a discrete variable

Question

I have a discrete variable with scores from 1 to 3. I want to recode it so that 1 becomes 2, 2 becomes 1, and 3 stays 3.

I tried

recode(Data$GEB43, "c(1=2; 2=1; 3=3")

but this does not work.

I know this is a very silly question that could be solved in Excel in a few seconds, but I want to learn how to do basic things like this in R.

Answer 1

We should always provide a minimal reproducible example:

df <- data.frame(x=c(1,1,2,2,3,3))

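You didn't say which package recode comes from, so I'll assume dplyr. ?dplyr::recode tells us how arguments should be passed to the function. In the original question, "c(1=2; 2=1; 3=3" is a character string, not an R expression. To turn it into an R expression we would have to drop the double quotes, replace each ; with , and add the missing closing parenthesis, i.e. c(1=2, 2=1, 3=3). (Strictly, the names would also need quoting, e.g. c("1"=2, "2"=1, "3"=3), because bare numbers cannot be used as argument names.) However, as ?dplyr::recode tells us, a single vector like this is not how the mapping is passed to recode; each old value/new value pair is given as its own named argument.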

Solution using dplyr::recode:

dplyr::recode(df$x, "1"=2, "2"=1, "3"=3)

Returns:

[1] 2 2 1 1 3 3
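
If you would rather not depend on a package, the same mapping can be sketched in base R with a lookup vector. This is only a sketch: it assumes every value of x is one of 1, 2 or 3 (anything else, including NA, comes back as NA), and the names lookup and x_recoded are purely illustrative.

```r
# Example data from above
df <- data.frame(x = c(1, 1, 2, 2, 3, 3))

# Position i of the lookup vector holds the new code for old value i:
# 1 -> 2, 2 -> 1, 3 -> 3
lookup <- c(2, 1, 3)

# Index the lookup vector by the old values to obtain the new ones
df$x_recoded <- lookup[df$x]

df$x_recoded
# [1] 2 2 1 1 3 3
```

Storing the result in a new column rather than overwriting x keeps the original codes available for checking.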

Answer 2

Assuming you mean dplyr::recode, the syntax is

recode(.x, ..., .default = NULL, .missing = NULL)

From the documentation:

.x - A vector to modify

... - Replacements. For character and factor .x, these should be named and replacement is based only on their name. For numeric .x, these can be named or not. If not named, the replacement is done based on position i.e. .x represents positions to look for in replacements

So when you have numeric values, you can replace them directly by position:

recode(1:3, 2, 1, 3)
#[1] 2 1 3
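
As a quick check, the positional form and the named form should agree on the example vector from Answer 1. This sketch assumes dplyr is installed; by_position and by_name are illustrative names, not part of either answer.

```r
library(dplyr)

x <- c(1, 1, 2, 2, 3, 3)

# Unnamed replacements: the i-th argument is the new code for old value i
by_position <- recode(x, 2, 1, 3)

# Named replacements: the name is the old value, the argument is the new one
by_name <- recode(x, "1" = 2, "2" = 1, "3" = 3)

# Compare the two results
identical(by_position, by_name)
```

The positional form is shorter, but it only works when the old values are the integers 1, 2, 3, ... in order; the named form states the mapping explicitly.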
