How to read a .nodes file in R

Question

The file was downloaded from networkrepository.com/fb-pages-company.php. The code below does not work. Any ideas?

read.nodes("fb-pages-company.nodes")

It would be even better to get an igraph object from the .nodes and .edges files.

Thanks!

Answer 1

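Assuming you are using the read.nodes function from the taxonomizr package, I believe that function is not meant for ingesting graphs. The taxonomizr documentation says: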

Functions for assigning taxonomy to NCBI accession numbers and taxon IDs based on NCBI's accession2taxid and taxdump files. This package allows the user to downloads NCBI data dumps and create a local database for fast and local taxonomic assignment.

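I think you are trying to ingest the fb-pages-company data as a graph. You get the complete graph from the edges file alone, but if you want access to the company names, you also need the nodes file. This answer shows how to read both files, create the graph, and associate the company names with the nodes. If that is not what you want to do, please explain what you are after.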

library(igraph)

Nodes = read.csv("fb-pages-company.nodes")
Edges = read.csv("fb-pages-company.edges")

g = graph_from_data_frame(Edges, directed = TRUE)
vcount(g)
[1] 14113
ecount(g)
[1] 52310

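The node and edge counts match the documentation for the graph on the source page. Right now, however, the nodes are named by their new_id, and we want to use the company names instead. We start by using new_id as the row names of the node table, which makes it easy to look up company names.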

row.names(Nodes) = Nodes$new_id
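To see why the row names matter, here is a minimal base R sketch on a made-up three-row table (the values are hypothetical; only the column layout id, name, new_id is assumed to mirror the real file). Indexing with character IDs matches row names, while indexing with numbers is positional, which is easy to get wrong when IDs start at 0.

```r
# Toy stand-in for the node table; all values are invented for illustration.
nodes <- data.frame(
  id     = c("x", "y", "z"),
  name   = c("Alpha", "Beta", "Gamma"),
  new_id = c(2L, 0L, 1L),
  stringsAsFactors = FALSE
)

# Use new_id as the row names, as in the answer.
row.names(nodes) <- nodes$new_id

# Character indices are matched against row names, in the order requested.
looked_up <- nodes[c("0", "2"), "name"]

# Numeric indices are positions: 0 selects nothing, 2 selects the second row.
positional <- nodes[c(0, 2), "name"]
```

Because igraph stores vertex names as character strings, the lookup in the answer takes the first path.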

Let's hold on to the new_id in case we need it later.

V(g)$new_id = V(g)$name

Now change the names to the company names.

V(g)$name = Nodes[V(g)$new_id, 2]
Encoding(V(g)$name) = "UTF-8"

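The reason for specifying the encoding is that some of the names are not in English and, in fact, are not written in the Latin alphabet. Without marking them as UTF-8, those names will not display properly.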
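As a small illustration of what setting the encoding does, the sketch below builds a string from three raw bytes that together form one UTF-8 character. This is an assumption-laden toy, not the actual file contents: the point is only that Encoding() changes how R interprets the bytes, not the bytes themselves.

```r
# Three bytes that together encode a single CJK character in UTF-8.
x <- rawToChar(as.raw(c(0xe8, 0x98, 0x8b)))

bytes_before <- nchar(x, type = "bytes")

# Declare that the bytes are UTF-8; the underlying bytes are not modified.
Encoding(x) <- "UTF-8"

marked      <- Encoding(x)
chars_after <- nchar(x, type = "chars")
bytes_after <- nchar(x, type = "bytes")
```

Depending on your platform, passing encoding = "UTF-8" to read.csv when reading the nodes file may achieve the same thing up front.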

Now let's take a look to make sure the result looks right.

cat(paste(head(V(g)$name, 10), collapse="\n"), "\n")
Sportsnavi
NRK P1
蘋果余艾苔
99.1 WQIK
TalkTalk
Tahiti Tourisme
United
NBC OUT
Time Out Los Angeles
ConnectSafely

You can check these against the nodes file using new_id.

head(V(g)$new_id, 10)
 [1] "0"    "1"    "8087" "2"    "3"    "4"    "4588" "6"    "8"    "9"

If you search the nodes file, you will find that new_id=0 corresponds to the company Sportsnavi, new_id=1 corresponds to the company NRK P1, and so on.
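As a possible alternative to renaming vertices afterwards, graph_from_data_frame also accepts a vertices data frame whose first column is matched against the IDs in the edge list, with the remaining columns attached as vertex attributes. The sketch below uses made-up toy data rather than the real files, and the attribute name company is my own choice. I also set directed = FALSE here purely for the toy example.

```r
library(igraph)

# Toy stand-ins for the .nodes and .edges files; all values are invented.
nodes <- data.frame(
  id     = c("x", "y", "z"),
  name   = c("Alpha", "Beta", "Gamma"),
  new_id = c(0L, 1L, 2L),
  stringsAsFactors = FALSE
)
edges <- data.frame(from = c(0L, 0L, 1L), to = c(1L, 2L, 2L))

# The first column of `vertices` must hold the IDs used in the edge list.
# The company name goes in a separate column (hypothetical name: company)
# so it does not clash with igraph's own "name" vertex attribute.
verts <- data.frame(
  new_id  = nodes$new_id,
  company = nodes$name,
  stringsAsFactors = FALSE
)

g <- graph_from_data_frame(edges, directed = FALSE, vertices = verts)

# Vertex names are the IDs as character strings; companies are an attribute.
vertex_ids <- V(g)$name
companies  <- V(g)$company
```

With this approach the vertex names stay as the new_id values, and V(g)$company gives the label to use when printing or plotting.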
