Need to delete cells that contain "0" while retaining the other cells in R
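The cells of my data frame contain "0" values. I need to pull out the values (letters/non-zeros) together with their corresponding ticket numbers. The data frame (m) looks like this (note: each non-zero value is the same as its column name):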
     a1 b1 c1 d1 e1 f1 g1 h1 i1 j1 k1 l1
TKT1 a1 b1  0 d1  0  0  0 h1  0  0 k1  0
TKT2  0 b1  0  0 e1  0 g1 h1  0 j1 k1  0
TKT3 a1  0  0 d1 e1  0 g1 h1 i1  0 k1 l1
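Code to generate the dataset:
# sample data: a 3 x 12 matrix of random 0/1 values (not seeded, so it changes on every run)
m <- matrix(sample(0:1, 12*3, replace = TRUE), ncol = 12)
colnames(m) <- c("a1", "b1", "c1", "d1", "e1", "f1", "g1", "h1", "i1", "j1", "k1", "l1")
rownames(m) <- c("TKT1", "TKT2", "TKT3")
# replacement: every 1 becomes the name of its column
ones <- which(m == 1, arr.ind = TRUE)
m[ones] <- colnames(m)[ones[, 2]]
m <- as.data.frame(m)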
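Because the sample() call is not seeded, each run produces a different matrix. As a minimal sketch, the exact table shown above could be rebuilt like this; the 0/1 pattern is typed by hand from that table, and the name `pattern` is only illustrative:

```r
# Hypothetical reproducible version: 0/1 pattern copied by hand from the table above
pattern <- c(1, 1, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0,   # TKT1
             0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1, 0,   # TKT2
             1, 0, 0, 1, 1, 0, 1, 1, 1, 0, 1, 1)   # TKT3
m <- matrix(pattern, ncol = 12, byrow = TRUE)
colnames(m) <- c("a1", "b1", "c1", "d1", "e1", "f1", "g1", "h1", "i1", "j1", "k1", "l1")
rownames(m) <- c("TKT1", "TKT2", "TKT3")

# Replace every 1 with the name of its column; the matrix becomes character,
# so the remaining zeros turn into the string "0"
ones <- which(m == 1, arr.ind = TRUE)
m[ones] <- colnames(m)[ones[, 2]]
m <- as.data.frame(m)
```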
The output format I want is:
Tickets Values
TKT1 a1
TKT1 b1
TKT1 d1
TKT1 h1
TKT1 k1
TKT2 b1
TKT2 e1
TKT2 g1
TKT2 h1
TKT2 j1
TKT2 k1
TKT3 a1
TKT3 d1
TKT3 e1
TKT3 g1
TKT3 h1
TKT3 i1
TKT3 k1
TKT3 l1
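One approach I thought of is to delete the cells of the data frame that contain 0 and then shift all the data to the left. I am not sure how to go about it.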
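Since your data.frame was generated randomly without set.seed, the results will differ slightly:
library(dplyr); library(reshape2); library(tibble)
m %>%
  rownames_to_column("Tickets") %>%   # replaces the deprecated dplyr::add_rownames()
  melt(id.vars = "Tickets") %>%       # long format: Tickets, variable, value
  filter(value != "0") %>%            # drop the zero cells
  select(-variable) %>%
  arrange(Tickets)
This gives the expected result.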
This can be done in base R in one line:
setNames(expand.grid(dimnames(m))[m != "0",], c("Tickets", "Values"))
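expand.grid gives all combinations of the row names and column names, then m != "0" selects the entries that are not zero. setNames names the columns. Note that the rows come out in column order rather than sorted by ticket.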
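To make the one-liner easier to follow, here is a sketch that performs the same steps one at a time on the table from the question. The data is typed by hand from that table, the names `vals`, `grid`, `keep` and `res` are illustrative, and the final sort by ticket is an addition to match the requested output order:

```r
# Example data typed by hand from the question's table
vals <- c("a1", "b1", "0", "d1", "0", "0", "0", "h1", "0", "0", "k1", "0",
          "0", "b1", "0", "0", "e1", "0", "g1", "h1", "0", "j1", "k1", "0",
          "a1", "0", "0", "d1", "e1", "0", "g1", "h1", "i1", "0", "k1", "l1")
cols <- c("a1", "b1", "c1", "d1", "e1", "f1", "g1", "h1", "i1", "j1", "k1", "l1")
m <- as.data.frame(matrix(vals, nrow = 3, byrow = TRUE,
                          dimnames = list(c("TKT1", "TKT2", "TKT3"), cols)),
                   stringsAsFactors = FALSE)

# Step 1: every (row name, column name) pair. The first variable varies
# fastest, which is the same column-major order a matrix is stored in.
grid <- expand.grid(dimnames(m))

# Step 2: logical mask of the non-zero cells, flattened column by column
keep <- as.vector(as.matrix(m) != "0")

# Step 3: keep the matching pairs, name the columns, then sort by ticket
res <- setNames(grid[keep, ], c("Tickets", "Values"))
res <- res[order(res$Tickets), ]
rownames(res) <- NULL
```

Since each non-zero cell holds its own column name, the column-name half of the grid doubles as the value, which is why the cell contents themselves are never read after the mask is built.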
A single-pipe solution:
library(tidyverse)
dfLong <-
  m %>%
  rownames_to_column("Tickets") %>%   # tibble:: move the row names into a column
  gather(Keys, Values, a1:l1) %>%     # tidyr:: gather all columns into key, value pairs
  filter(Values == Keys) %>%          # select the matched cells
  select(-Keys) %>%                   # remove superfluous column
  arrange(Tickets, Values)            # order correctly for desired output
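gather() has been superseded in tidyr 1.0.0 and later by pivot_longer(). A sketch of the same pipeline with the newer function, assuming tidyr >= 1.0.0 is installed and using hand-typed data from the question's table, might look like this:

```r
library(dplyr)
library(tidyr)
library(tibble)

# Example data typed by hand from the question's table
vals <- c("a1", "b1", "0", "d1", "0", "0", "0", "h1", "0", "0", "k1", "0",
          "0", "b1", "0", "0", "e1", "0", "g1", "h1", "0", "j1", "k1", "0",
          "a1", "0", "0", "d1", "e1", "0", "g1", "h1", "i1", "0", "k1", "l1")
cols <- c("a1", "b1", "c1", "d1", "e1", "f1", "g1", "h1", "i1", "j1", "k1", "l1")
m <- as.data.frame(matrix(vals, nrow = 3, byrow = TRUE,
                          dimnames = list(c("TKT1", "TKT2", "TKT3"), cols)),
                   stringsAsFactors = FALSE)

dfLong <- m %>%
  rownames_to_column("Tickets") %>%                                  # row names -> column
  pivot_longer(-Tickets, names_to = "Keys", values_to = "Values") %>% # wide -> long
  filter(Values != "0") %>%                                          # drop the zero cells
  select(-Keys) %>%                                                  # remove superfluous column
  arrange(Tickets, Values)                                           # order by ticket, then value
```

Filtering on Values != "0" rather than Values == Keys does not rely on the values matching their column names, so it would also work on data where they differ.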
There is also a one-liner using library(data.table):
melt(setDT(m, keep.rownames = "Tickets"), id.vars = "Tickets")[, variable := NULL][value != "0"]
The use of melt is similar to @agenis's answer.