优化运行时:更改 igraph 中边的权重需要很长时间。有没有办法优化它?
Optimize the runtime: change the weight of edges in an igraph takes long time. Is there a way to optimize it?
我正在从 osmar 对象构建的 igraph 中搜索一组边,并想更改这些边的权重。
由于我的图表很大,这个任务需要很长时间。
因为我 运行 这个函数在循环中 运行 时间变得更大。
有什么方法可以优化它吗?
代码如下:
library(osmar)
library(igraph)
library(tidyr)
library(dplyr)
### Get data ----
src <- osmsource_api(url = "https://api.openstreetmap.org/api/0.6/")
muc_bbox <- center_bbox(11.575278, 48.137222, 1000, 1000)
muc <- get_osm(muc_bbox, src)
### Reduce to highways: ----
hways <- subset(muc, way_ids = find(muc, way(tags(k == "highway"))))
hways <- find(hways, way(tags(k == "name")))
hways <- find_down(muc, way(hways))
hways <- subset(muc, ids = hways)
#### Plot data ----
## Plot complete data and highways on top:
plot(muc)
plot_ways(muc, col = "lightgrey")
plot_ways(hways, col = "coral", add = TRUE)
### Define route start and end nodes: ----
id<-find(muc, node(tags(v %agrep% "Sendlinger Tor")))[1]
hway_start_node <-find_nearest_node(muc, id, way(tags(k == "highway")))
hway_start <- subset(muc, node(hway_start_node))
id <- find(muc, node(attrs(lon > 11.58 & lat > 48.15)))[1]
hway_end_node <- find_nearest_node(muc, id, way(tags(k == "highway")))
hway_end <- subset(muc, node(hway_end_node))
## Add the route start and and nodes to the plot:
plot_nodes(hway_start, add = TRUE, col = "red", pch = 19, cex = 2)
plot_nodes(hway_end, add = TRUE, col = "red", pch = 19, cex = 2)
### Create street graph ----
gr <- as.undirected(as_igraph(hways))
### Compute shortest route: ----
# Calculate route
route <- function(start_node,end_node) {
get.shortest.paths(gr,
from = as.character(start_node),
to = as.character(end_node),
mode = "all")[[1]][[1]]}
# Plot route
plot.route <- function(r,color) {
r.nodes.names <- as.numeric(V(gr)[r]$name)
r.ways <- subset(hways, ids = osmar::find_up(hways, node(r.nodes.names)))
plot_ways(r.ways, add = TRUE, col = color, lwd = 2)
}
nways <- 1
numway <- 1
r <- route(hway_start_node,hway_end_node)
# Plot route
color <- colorRampPalette(c("springgreen","royalblue"))(nways)[numway]
plot.route(r,color)
## Route details ----
# Construct a new osmar object containing only elements
# related to the nodes defining the route:
route_nodes <- as.numeric(V(gr)[r]$name)
route_ids <- find_up(hways, node(route_nodes))
osmar.route <- subset(hways, ids = route_ids)
osmar.nodes.ids <- osmar.route$nodes$attrs$id
# Extract the nodes’ coordinates,
osmar.nodes.coords <- osmar.route$nodes$attrs[, c("lon", "lat")]
osmar.nodes <- cbind(data.frame(ids = osmar.nodes.ids),
data.frame(ids_igraph = as.numeric(V(gr)[r]) ),
osmar.nodes.coords)
## Find edges ids containing points of interest ----
wished.coords <- data.frame(wlon = c(11.57631),
wlat = c(48.14016))
# Calculate all distances
distances <- crossing(osmar.nodes,wished.coords) %>%
mutate(dist = geosphere::distHaversine(cbind(lon,lat),cbind(wlon,wlat)))
# Select nodes below maximum distance :
mindist <- 50 #m
wished.nodes <- distances %>% filter(dist < mindist)
# Select edges incident to these nodes :
selected.edges <- unlist(incident_edges(gr,V(gr)[wished.nodes$ids_igraph]))
This is where the slowdown occurs: Weight of selected edges, change it by multiplying it with 10
E(gr)[selected.edges]$weight<-E(gr)[selected.edges]$weight*10
这是减速发生的地方:选定边的权重,通过将其乘以 10
来改变它
E(gr)[selected.edges]$weight<-E(gr)[selected.edges]$weight*10
也许我可以使用哈希图?
更新
哈希图
单位:秒
Hashmap:
expr min lq mean median uq max neval
Hashmap 3.248543 3.289474 3.472038 3.324417 3.734050 4.188924 100
Without 3.267549 3.333012 3.557179 3.367015 3.776429 5.643784 100
Sadly it does not seemt to bring a lot of improvement.
library(hashmap)
#https://github.com/nathan-russell/hashmap
H <- hashmap(E(gr)[selected.edges],E(gr)[selected.edges]$weight)
sapply(H$find(E(grr)[selected.edges]), function(x) x * 10)
更新:
根据 igraph 文档,igraph 是线程安全的,因此我可以使用并行。
我目前正在尝试这个:
no_cores <- detectCores(logical = FALSE)
data <- split(selected.edges,factor(sort(rank(selected.edges)%%no_cores)))
c_result <- mclapply(1:no_cores, function(x) {
E(gr)[unlist(data[[x]])]$weight * 1000 / mean_value }, mc.cores = no_cores)
E(gr)[unlist(data)]$weight<-unlist(c_result)
我想知道为什么我必须在并行循环之外执行“写入步骤”。
当我试图在循环中将权重写回 igraph 时,它不起作用,即权重没有得到更新。
提前致谢!
BR
如 Advanced R 中所示,R 中的实现性能会因语法的不同而有很大差异。
E(gr)[selected.edges]$weight<-E(gr)[selected.edges]$weight*10
是一个有效的语法,但也可以用其他方式表述:
set.edge.attribute(graph=gr,name="weight",index=selected.edges,value=10*get.edge.attribute(graph=gr,name="weight",index=selected.edges))
所以让我们比较一下这两种解决方案:
microbenchmark::microbenchmark(
ref={E(gr)[selected.edges]$weight<-E(gr)[selected.edges]$weight*10},
new={set.edge.attribute(graph=gr,name="weight",index=selected.edges,value=10*get.edge.attribute(graph=gr,name="weight",index=selected.edges))})
Unit: microseconds
expr min lq mean median uq max neval cld
ref 15920.404 16567.788 17793.4412 17111.583 18491.685 25867.477 100 b
new 246.974 266.462 296.5088 278.769 292.718 662.974 100 a
@Andreas,如果这可以解决您的问题,您能否查看更大的数据集?
我正在从 osmar 对象构建的 igraph 中搜索一组边,并想更改这些边的权重。 由于我的图表很大,这个任务需要很长时间。 因为我 运行 这个函数在循环中 运行 时间变得更大。
有什么方法可以优化它吗?
代码如下:
library(osmar)
library(igraph)
library(tidyr)
library(dplyr)
### Get data ----
src <- osmsource_api(url = "https://api.openstreetmap.org/api/0.6/")
muc_bbox <- center_bbox(11.575278, 48.137222, 1000, 1000)
muc <- get_osm(muc_bbox, src)
### Reduce to highways: ----
hways <- subset(muc, way_ids = find(muc, way(tags(k == "highway"))))
hways <- find(hways, way(tags(k == "name")))
hways <- find_down(muc, way(hways))
hways <- subset(muc, ids = hways)
#### Plot data ----
## Plot complete data and highways on top:
plot(muc)
plot_ways(muc, col = "lightgrey")
plot_ways(hways, col = "coral", add = TRUE)
### Define route start and end nodes: ----
id<-find(muc, node(tags(v %agrep% "Sendlinger Tor")))[1]
hway_start_node <-find_nearest_node(muc, id, way(tags(k == "highway")))
hway_start <- subset(muc, node(hway_start_node))
id <- find(muc, node(attrs(lon > 11.58 & lat > 48.15)))[1]
hway_end_node <- find_nearest_node(muc, id, way(tags(k == "highway")))
hway_end <- subset(muc, node(hway_end_node))
## Add the route start and and nodes to the plot:
plot_nodes(hway_start, add = TRUE, col = "red", pch = 19, cex = 2)
plot_nodes(hway_end, add = TRUE, col = "red", pch = 19, cex = 2)
### Create street graph ----
gr <- as.undirected(as_igraph(hways))
### Compute shortest route: ----
# Calculate route
route <- function(start_node,end_node) {
get.shortest.paths(gr,
from = as.character(start_node),
to = as.character(end_node),
mode = "all")[[1]][[1]]}
# Plot route
plot.route <- function(r,color) {
r.nodes.names <- as.numeric(V(gr)[r]$name)
r.ways <- subset(hways, ids = osmar::find_up(hways, node(r.nodes.names)))
plot_ways(r.ways, add = TRUE, col = color, lwd = 2)
}
nways <- 1
numway <- 1
r <- route(hway_start_node,hway_end_node)
# Plot route
color <- colorRampPalette(c("springgreen","royalblue"))(nways)[numway]
plot.route(r,color)
## Route details ----
# Construct a new osmar object containing only elements
# related to the nodes defining the route:
route_nodes <- as.numeric(V(gr)[r]$name)
route_ids <- find_up(hways, node(route_nodes))
osmar.route <- subset(hways, ids = route_ids)
osmar.nodes.ids <- osmar.route$nodes$attrs$id
# Extract the nodes’ coordinates,
osmar.nodes.coords <- osmar.route$nodes$attrs[, c("lon", "lat")]
osmar.nodes <- cbind(data.frame(ids = osmar.nodes.ids),
data.frame(ids_igraph = as.numeric(V(gr)[r]) ),
osmar.nodes.coords)
## Find edges ids containing points of interest ----
wished.coords <- data.frame(wlon = c(11.57631),
wlat = c(48.14016))
# Calculate all distances
distances <- crossing(osmar.nodes,wished.coords) %>%
mutate(dist = geosphere::distHaversine(cbind(lon,lat),cbind(wlon,wlat)))
# Select nodes below maximum distance :
mindist <- 50 #m
wished.nodes <- distances %>% filter(dist < mindist)
# Select edges incident to these nodes :
selected.edges <- unlist(incident_edges(gr,V(gr)[wished.nodes$ids_igraph]))
This is where the slowdown occurs: Weight of selected edges, change it by multiplying it with 10
E(gr)[selected.edges]$weight<-E(gr)[selected.edges]$weight*10
这是减速发生的地方:选定边的权重,通过将其乘以 10
来改变它E(gr)[selected.edges]$weight<-E(gr)[selected.edges]$weight*10
也许我可以使用哈希图?
更新
哈希图
单位:秒
Hashmap:
expr min lq mean median uq max neval
Hashmap 3.248543 3.289474 3.472038 3.324417 3.734050 4.188924 100
Without 3.267549 3.333012 3.557179 3.367015 3.776429 5.643784 100
Sadly it does not seemt to bring a lot of improvement.
library(hashmap)
#https://github.com/nathan-russell/hashmap
H <- hashmap(E(gr)[selected.edges],E(gr)[selected.edges]$weight)
sapply(H$find(E(grr)[selected.edges]), function(x) x * 10)
更新: 根据 igraph 文档,igraph 是线程安全的,因此我可以使用并行。
我目前正在尝试这个:
no_cores <- detectCores(logical = FALSE)
data <- split(selected.edges,factor(sort(rank(selected.edges)%%no_cores)))
c_result <- mclapply(1:no_cores, function(x) {
E(gr)[unlist(data[[x]])]$weight * 1000 / mean_value }, mc.cores = no_cores)
E(gr)[unlist(data)]$weight<-unlist(c_result)
我想知道为什么我必须在并行循环之外执行“写入步骤”。 当我试图在循环中将权重写回 igraph 时,它不起作用,即权重没有得到更新。
提前致谢! BR
如 Advanced R 中所示,R 中的实现性能会因语法的不同而有很大差异。
E(gr)[selected.edges]$weight<-E(gr)[selected.edges]$weight*10
是一个有效的语法,但也可以用其他方式表述:
set.edge.attribute(graph=gr,name="weight",index=selected.edges,value=10*get.edge.attribute(graph=gr,name="weight",index=selected.edges))
所以让我们比较一下这两种解决方案:
microbenchmark::microbenchmark(
ref={E(gr)[selected.edges]$weight<-E(gr)[selected.edges]$weight*10},
new={set.edge.attribute(graph=gr,name="weight",index=selected.edges,value=10*get.edge.attribute(graph=gr,name="weight",index=selected.edges))})
Unit: microseconds
expr min lq mean median uq max neval cld
ref 15920.404 16567.788 17793.4412 17111.583 18491.685 25867.477 100 b
new 246.974 266.462 296.5088 278.769 292.718 662.974 100 a
@Andreas,如果这可以解决您的问题,您能否查看更大的数据集?