Convert df from factor to numeric
I am struggling to convert my dataset to numeric. My dataset looks like this:
customer_id 2012 2013 2013 2014 2015 2016 2017
15251 X N U D S C L
X1 - X7 are stored as factors. An excerpt of dput(head(df)) is:
structure(list(`2012` = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("N",
"X"), class = "factor"), `2013` = structure(c(6L, 6L, 6L, 6L,
6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L
), .Label = c("C", "D", "N", "S", "U", "X"), class = "factor"),
`2014` = structure(c(8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L,
8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L), .Label = c("C",
"D", "L", "N", "R", "S", "U", "X"), class = "factor"), ...
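I want numeric data, but I don't know how to convert the columns accordingly.
The goal is to feed df into a heatmap so that I can explore the differences visually. As far as I know, this is only possible with a numeric matrix, because I get the error: heatmap.2(input, trace = "none", : `x' must be a numeric matrix
Does anyone have an idea?
Thank you very much for your support!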
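This is doable. I think including the full df next time would help. heatmap.2 does not work because you gave it a character matrix. Getting heatmap.2 to show the color legend as letters is a bit complicated, so I suggest using ggplot as below.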
library(ggplot2)
library(dplyr)
library(tidyr)    # provides pivot_longer()
library(viridis)

# simulate data: 5 ids, 7 yearly columns of random letters
df = data.frame(id=1:5,
                replicate(7,sample(LETTERS[1:10],5)))
colnames(df)[-1] = 2012:2018

# convert to long format for plotting and re-level the letters alphabetically
# (sort(unique(...)) works whether the columns are factor or character)
df <- df %>% pivot_longer(-id) %>%
  mutate(value=as.character(value),
         value=factor(value,levels=sort(unique(value))))

# define color scale: one viridis color per letter present,
# named so that scale_fill_manual() matches colors to levels
present_letters = levels(df$value)
COLS = viridis_pal()(length(present_letters))
names(COLS) = present_letters

# plot
ggplot(data=df,aes(x=name,y=id,fill=value)) +
  geom_tile() +
  scale_fill_manual(values=COLS)
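If you still want the numeric matrix the question asks for, here is a minimal sketch in base R. The example data frame is made up to mimic the dput() excerpt (each column is a factor with its own level set), and the names per_column, shared_levels and num_mat are only illustrative. Note that the resulting integers are just alphabetical codes, so any distance or clustering computed from them treats the letters as ordered, which may not be meaningful for your data.

```r
# Hypothetical example data: three factor columns with different level sets,
# mimicking the structure shown in the question's dput() output.
df <- data.frame(
  `2012` = factor(c("X", "N", "X")),
  `2013` = factor(c("X", "U", "D")),
  `2014` = factor(c("X", "S", "L")),
  check.names = FALSE
)

# Pitfall: as.integer() on each column uses that column's own levels,
# so the same letter can get a different code in different columns.
per_column <- sapply(df, as.integer)

# Build one shared, alphabetically sorted level set across all columns.
shared_levels <- sort(unique(unlist(lapply(df, as.character))))

# Re-encode every column against the shared levels, then take integer codes.
num_mat <- sapply(df, function(col) {
  as.integer(factor(as.character(col), levels = shared_levels))
})
rownames(num_mat) <- seq_len(nrow(df))

# num_mat is now a numeric (integer) matrix in which a letter maps to the
# same code in every column. With gplots installed it could be passed on:
# gplots::heatmap.2(num_mat, trace = "none")
```

With the shared levels, "X" gets the same code in every column, whereas the naive per-column conversion assigns it a different code depending on which other letters occur in that year.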