Is there a way to let the largest value among X, Y, and Z decide the value of V in R?
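I have a dataset (ft.mutate.topics) with five variables. Four are numeric: ft_technical, ft_performative, ft_procedural, and ft_moral. The fifth is "topic_lab", which I want to take (as a character) the name associated with whichever of the other four variables has the highest value.
The code below generates a dataset similar to mine.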
set.seed(1)
Data <- data.frame(
X = sample(1:10),
Y = sample(1:10),
Z = sample(1:10))
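I want a variable, V, that takes the value "X", "Y", or "Z" for each observation, according to which of the three variables has the highest value. Taking X as an example, this is similar to what I tried: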
if (Data$X > Data$Y & Data$X > Data$Z) Data$label <- "X"
Warning message:
In if (Data$X > Data$Y & Data$X > Data$Z) Data$label <- "X":
the condition has length > 1 and only the first element will be used
For my original example, I tried the following combination of if statements:
if (ft.mutate.topics$ft_technical > ft.mutate.topics$ft_performative &
ft.mutate.topics$ft_technical > ft.mutate.topics$ft_procedural &
ft.mutate.topics$ft_technical > ft.mutate.topics$ft_moral)
ft.mutate.topics$topic_lab = "technical"
if (ft.mutate.topics$ft_performative > ft.mutate.topics$ft_technical &
ft.mutate.topics$ft_performative > ft.mutate.topics$ft_procedural &
ft.mutate.topics$ft_performative > ft.mutate.topics$ft_moral)
ft.mutate.topics$topic_lab = "performative"
if (ft.mutate.topics$ft_procedural > ft.mutate.topics$ft_performative &
ft.mutate.topics$ft_procedural > ft.mutate.topics$ft_technical &
ft.mutate.topics$ft_procedural > ft.mutate.topics$ft_moral)
ft.mutate.topics$topic_lab = "procedural"
if (ft.mutate.topics$ft_moral > ft.mutate.topics$ft_performative &
ft.mutate.topics$ft_moral > ft.mutate.topics$ft_procedural &
ft.mutate.topics$ft_moral > ft.mutate.topics$ft_technical)
ft.mutate.topics$topic_lab = "moral"
It says "the condition has length > 1 and only the first element will be used" and replaces the entire variable with "performative", because that has the highest value in row 1. Does anyone know what is going on?
Thanks!
Here is a possible approach using apply and which.max:
# create a fake input with random data
set.seed(123)
DF <- data.frame(ft_technical=sample(1:10,10),
ft_performative=sample(1:10,10),
ft_procedural=sample(1:10,10),
ft_moral=sample(1:10,10))
# add the columns using apply and which.max
mx <- DF[,c('ft_technical','ft_performative','ft_procedural','ft_moral')]
DF$topic_lab <- c('technical','performative','procedural','moral')[apply(mx,1,which.max)]
Output:
> DF
   ft_technical ft_performative ft_procedural ft_moral    topic_lab
1             3              10             9       10 performative
2             8               5             7        9        moral
3             4               6             6        6 performative
4             7               9            10        8   procedural
5             6               1             4        1    technical
6             1               7             8        3   procedural
7            10               8             3        4    technical
8             9               4             2        7    technical
9             2               3             1        5        moral
10            5               2             5        2    technical
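As for the warning in the question: `if` expects a single TRUE/FALSE, while the comparison of whole columns returns one value per row, so only the first row was ever tested (since R 4.2.0 this is an error rather than a warning). Below is a minimal sketch on the question's X/Y/Z data showing this, with the same apply/which.max idea applied. The names `is_x_largest` and `cols` are mine, used only for illustration.

```r
# Rebuild the toy data from the question
set.seed(1)
Data <- data.frame(
  X = sample(1:10),
  Y = sample(1:10),
  Z = sample(1:10))

# The comparison yields one logical value per row, not a single TRUE/FALSE,
# which is why it cannot be passed to `if` directly
is_x_largest <- Data$X > Data$Y & Data$X > Data$Z
length(is_x_largest)

# which.max() returns the position of the first maximum in each row,
# and that position is used to pick the matching column name
cols <- c("X", "Y", "Z")
Data$V <- cols[apply(Data[, cols], 1, which.max)]
```

Note that which.max picks the first column when two columns tie for the maximum.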
This looks straightforward. I will use a made-up dataset; adapting it to yours should be easy.
nms <- sub("^ft_", "", names(ft))
ft$topic.lab <- apply(ft, 1, function(x) nms[which.max(x)])
Data.
This is a simulated dataset.
set.seed(1234)
n <- 20
ft <- data.frame(ft_X = rnorm(n, 0, 2),
ft_Y = rnorm(n, 0, 3),
ft_Z = rnorm(n, 0, 4))
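If the data frame also holds non-numeric columns, apply would coerce every row to character, so it may be safer to restrict the call to the ft_ columns first. A rough sketch, assuming a hypothetical extra `id` column and a smaller simulated dataset (`num_cols` is a name I made up):

```r
# Simulated data with an extra non-numeric column (hypothetical `id`)
set.seed(1234)
n <- 5
ft <- data.frame(id = letters[1:n],
                 ft_X = rnorm(n, 0, 2),
                 ft_Y = rnorm(n, 0, 3),
                 ft_Z = rnorm(n, 0, 4))

# Select only the numeric ft_ columns by name
num_cols <- grep("^ft_", names(ft), value = TRUE)

# Strip the prefix to get the labels, then label each row by its largest column
nms <- sub("^ft_", "", num_cols)
ft$topic.lab <- nms[apply(ft[num_cols], 1, which.max)]
```

Selecting the columns by pattern also means the code can be rerun after topic.lab has been added without the label column being fed back into which.max.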
You can use max.col to get the column index of the maximum value in each row. You then subset the data frame's names with that index.
Data$V <- names(Data)[max.col(Data)]
By default, this breaks ties at random.
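If random tie-breaking is not wanted, max.col accepts a ties.method argument. A small sketch with values I picked so that the first row has a tie:

```r
# Small example with a tie between X and Y in the first row
Data <- data.frame(X = c(5, 1, 2),
                   Y = c(5, 3, 2),
                   Z = c(1, 2, 9))

# ties.method = "first" always picks the leftmost of the tied columns,
# so the result is reproducible without setting a seed
Data$V <- names(Data)[max.col(Data, ties.method = "first")]
Data$V
```

Here the first row is labelled "X" on every run, whereas the default could return either "X" or "Y".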