Create and print a product hierarchy tree without "NA" from data frame in R with data.tree
I have a categorical data frame in R as follows:
Cat_0    Cat_1   Cat_2  Cat_3   Cat_4
Baby     Travel  Bath   Towels  Age 0-1
Baby     Travel  Bath   Towels  Age 1-2
Baby     Travel  Box    NA      NA
Baby     Chairs  Sit    NA      NA
Animals  Horse   Rider  Safety  Chaps
Animals  Horse   Rider  Caps    NA
Animals  pig     NA     NA      NA
I want to define a tree with the data.tree package for later calculations. The tree should look like this:
                       |----Chairs----sit
                       |                                     |---age 0-1
          |----Baby----|              |----Bath----Towels----|
          |            |----Travel----|                      |---age 1-2
          |                           |----Box
Product --|
          |                                |---safety----chaps
          |            |---Horse---rider---|
          |--Animals---|                   |---caps
                       |---Pig
I can create the tree as above, but NA nodes appear, and I want to remove them from the data.tree structure. This is my code:
tree$pathString <- paste("product",
                         tree$Cat_0,
                         tree$Cat_1,
                         tree$Cat_2,
                         tree$Cat_3,
                         tree$Cat_4,
                         sep = "/")
tree <- as.Node(tree)
print(tree)
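For context, the NA nodes appear because paste() coerces missing values to the literal string "NA" before joining, so they end up as path components. A minimal base R illustration (the vector `parts` is made up for this example and is not part of the original code):

```r
# paste() turns NA into the string "NA", which then becomes part of the path
parts <- c("product", "Animals", "pig", NA, NA)
path <- paste(parts, collapse = "/")
print(path)
# [1] "product/Animals/pig/NA/NA"
```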
Using the data.tree package:
library(data.tree)
The package author provided the answer: you have to omit the NAs when pasting, using the alternative paste5 function from the Stack Overflow answer "suppress NAs in paste()":
paste5 <- function(..., sep = " ", collapse = NULL, na.rm = FALSE) {
  if (!na.rm) {
    # Default: behave exactly like paste()
    paste(..., sep = sep, collapse = collapse)
  } else {
    # Join the non-NA elements of x; return NA if nothing is left
    paste.na <- function(x, sep) {
      x <- gsub("^\\s+|\\s+$", "", x)  # trim leading/trailing whitespace
      ret <- paste(na.omit(x), collapse = sep)
      is.na(ret) <- ret == ""
      return(ret)
    }
    # Recycle the arguments into rows, then paste each row without its NAs
    df <- data.frame(..., stringsAsFactors = FALSE)
    ret <- apply(df, 1, FUN = function(x) paste.na(x, sep))
    if (is.null(collapse)) {
      ret
    } else {
      paste.na(ret, sep = collapse)
    }
  }
}
Then:
tree$pathString <- paste5("product",
                          tree$Cat_0,
                          tree$Cat_1,
                          tree$Cat_2,
                          tree$Cat_3,
                          tree$Cat_4,
                          sep = "/",
                          na.rm = TRUE)
htree <- as.Node(tree, na.rm = TRUE)
print(htree)
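If you would rather not carry a helper function around, the same path strings can be built row by row with na.omit(). The following is only a sketch: the data frame `df` is re-typed from the table in the question, `cat_cols` is a name chosen here for illustration, and the requireNamespace() guard is added so the snippet still runs when data.tree is not installed.

```r
# Example data re-typed from the question's table
df <- data.frame(
  Cat_0 = c("Baby", "Baby", "Baby", "Baby", "Animals", "Animals", "Animals"),
  Cat_1 = c("Travel", "Travel", "Travel", "Chairs", "Horse", "Horse", "pig"),
  Cat_2 = c("Bath", "Bath", "Box", "Sit", "Rider", "Rider", NA),
  Cat_3 = c("Towels", "Towels", NA, NA, "Safety", "Caps", NA),
  Cat_4 = c("Age 0-1", "Age 1-2", NA, NA, "Chaps", NA, NA),
  stringsAsFactors = FALSE
)

# For each row, drop the missing levels and join the rest into a "/" path
cat_cols <- paste0("Cat_", 0:4)
df$pathString <- apply(df[, cat_cols], 1, function(row) {
  paste(c("product", na.omit(row)), collapse = "/")
})

# Build and print the tree only if data.tree is available
if (requireNamespace("data.tree", quietly = TRUE)) {
  htree <- data.tree::as.Node(df)
  print(htree)
}
```

With this approach the last row becomes "product/Animals/pig" rather than a path ending in "/NA/NA", so no NA nodes are created.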