Error 'undefined columns selected' - Chord diagram (circlize package) in R

I need some help with an error message returned by the chordDiagram() function from the circlize package.

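I am working with fisheries landings data. A fishing vessel starts its trip at one port (home port, PORT_DE) and lands its catch (scallops in this case) at another port (landing port, PORT_LA). I am trying to draw a chord diagram with the circlize package to visualise the flow of landings between ports. I have 161 unique ports, and the port names are stored as character strings.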

Before calling chordDiagram() to draw the chord diagram, I store the relevant columns in a temporary object (m).

# Store relevant columns
m <- data.frame(PORT_DE = VMS_by_trips$PORT_DE_Label, 
            PORT_LA = VMS_by_trips$PORT_LA_Label, 
            SCALLOP_W = VMS_by_trips$Trip_SCALLOP_W)

head(m)
# PORT_DE  PORT_LA SCALLOP_W
# 1  Arbroath Arbroath  2.147143
# 2  Eyemouth Aberdeen  8.791970
# 3    Buckie Aberdeen  2.025833
# 4  Montrose Aberdeen  8.268540
# 5  Aberdeen Aberdeen  1.358286
# 6 Peterhead Aberdeen  0.797500

I then use dcast() to create the adjacency matrix and set the row names.

require(reshape2)
m <- as.matrix(dcast(m, PORT_DE ~ PORT_LA, value.var = "SCALLOP_W", fun.aggregate = sum))
dim(m) # adjacency matrix of port pairs (first column still holds PORT_DE)
#[1] 153 138

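row.names(m) <- m[, 1]   # use the PORT_DE column as row names
m <- m[, -1]             # drop the PORT_DE column, keeping only the landing ports
class(m) <- "numeric"    # as.matrix() produced a character matrix; convert it back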

Finally, I call the plotting function chordDiagram().

library(circlize) 
chordDiagram(m) 

Unfortunately, this results in an error message.

Error in `[.data.frame`(df, c(1, 2, 5)) : undefined columns selected

If I replace the row and column names with numbers, the function runs and returns the correct plot.

row.names(m) <- 1:153
colnames(m) <- 1:137

Any ideas on how to run the function with the actual port names?

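I have already tried removing special characters, replacing spaces (" ") with underscores ("_"), shortening the names, and keeping only a few port pairs. Unfortunately, the same error keeps appearing. Any help is appreciated.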

Please note that since posting this question, I have managed to create the visualisation needed. Here is a link to another related question, which also includes the code to adjust various settings of a chord diagram.

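Thanks to @ZuguangGu, the cause of the error message turned out to be NAs in my column names (missing port labels in the data). If you remove them first, the chord diagram plots fine. Following the same notation, see below.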

# Build the data frame of port pairs that will become the adjacency matrix
m <- data.frame(PORT_DE = VMS_by_trips$PORT_DE_Label, 
                PORT_LA = VMS_by_trips$PORT_LA_Label, 
                SCALLOP_W = VMS_by_trips$Trip_SCALLOP_W)


#Check for NA values in your dataset
which(is.na(m[, 1]))
which(is.na(m[, 2]))

# Remove the rows that have NA port labels; this avoids the error.
df = m
df = df[!(is.na(df[[1]]) | is.na(df[[2]])), ]

require(reshape2)
m <- dcast(df, PORT_DE ~ PORT_LA, value.var = "SCALLOP_W", fun.aggregate = sum)
row.names(m) <- m[,1]
m <- as.matrix(m[, -1])

# remove self-links
m2 = m
cn = intersect(rownames(m2), colnames(m2)) 
for(i in seq_along(cn)) {
  m2[cn[i], cn[i]] = 0
}
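
The loop above can also be written without iteration. The following is only a sketch on a small hypothetical matrix (the values and the three port names are made up for illustration); it relies on base R's support for indexing a matrix with a two-column character matrix of (row name, column name) pairs.

```r
# Small hypothetical matrix: rows are departure ports, columns are landing ports.
m2 <- matrix(1:6, nrow = 2,
             dimnames = list(c("Aberdeen", "Buckie"),
                             c("Aberdeen", "Buckie", "Eyemouth")))

# Ports that occur both as a row and as a column.
cn <- intersect(rownames(m2), colnames(m2))

# Each row of cbind(cn, cn) names one cell (row name, column name),
# so this zeroes every self-link in a single assignment.
m2[cbind(cn, cn)] <- 0
```

Cells such as m2["Aberdeen", "Buckie"] are left untouched, because only pairs with identical row and column names are addressed.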

# Export 3 versions of the chord diagram in a PDF

library(circlize) 

pdf("test.pdf")

# Use all data
chordDiagram(m)
title("using all data")

#remove self-links
chordDiagram(m2)
title("remove self-links")

# reduce = 0.01 removes ports whose share is less than 0.01 (1%) of the total across all ports.
chordDiagram(m2, reduce = 0.01)
title("remove self-links and small sectors")

dev.off()
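
For anyone who wants to try the NA clean-up without the original data, here is a minimal sketch on a toy data frame. The values are hypothetical, and tapply() from base R is swapped in for dcast() purely to keep the example free of extra packages; it is not the method used above. The final check confirms that no row or column name is missing before the matrix is handed to chordDiagram().

```r
# Toy trip data (hypothetical values); one landing port label is missing.
trips <- data.frame(
  PORT_DE   = c("Arbroath", "Eyemouth", "Buckie", "Buckie"),
  PORT_LA   = c("Arbroath", "Aberdeen", NA, "Aberdeen"),
  SCALLOP_W = c(2.1, 8.8, 2.0, 1.5),
  stringsAsFactors = FALSE
)

# Keep only trips where both port labels are present.
clean <- trips[complete.cases(trips[, c("PORT_DE", "PORT_LA")]), ]

# Sum landings per port pair; pairs with no trips come back as NA, so zero them.
adj <- tapply(clean$SCALLOP_W, list(clean$PORT_DE, clean$PORT_LA), sum)
adj[is.na(adj)] <- 0

# Sanity checks before plotting: numeric matrix, no missing dimnames.
stopifnot(is.numeric(adj), !anyNA(rownames(adj)), !anyNA(colnames(adj)))
```

If the checks pass, adj can be passed to chordDiagram() in the same way as m above.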