识别矩阵每一行中的最小元素

Identifying Smallest Element in Each Row of a Matrix

我在 R 中有这个“成本矩阵”,表示从任何位置到任何其他位置(总共 5 个位置)的“成本”:

X<-matrix(rnorm(25) , nrow = 5)

rownames(X) <- colnames(X) <- c("Location 1", "Location 2", "Location 3", "Location 4", "Location 5")

               Location 1  Location 2  Location 3 Location 4 Location 5
Location 1  0.4501251  2.30029903 -0.26950735  0.1723589  0.5045694
Location 2  1.1208198  1.38557818  0.25250596 -0.6174514 -0.5324785
Location 3  0.4181011  0.01103208  0.83713132 -0.7649082 -0.5619196
Location 4  0.9372365 -1.04258420  0.08397031  0.1611555  1.8356483
Location 5  1.0201278 -0.56020913  1.14815210  1.0362332 -2.2052776

我想找出从“位置 1”开始到“位置 1”结束的“贪心最短路径”,同时访问每个位置一次。

我认为这看起来像这样 (R getting the minimum value for each row in a matrix, and returning the row and column name) - 此代码 return 是矩阵每一行中的最小值:

result <- t(sapply(seq(nrow(X)), function(i) {
  j <- which.min(X[i,])
  c(paste(rownames(X)[i], colnames(X)[j], sep='/'), X[i,j])
}))

当我查看结果时:

print(result)


     [,1]                    [,2]                
[1,] "Location 1/Location 3" "-0.269507349140081"
[2,] "Location 2/Location 4" "-0.617451368699149"
[3,] "Location 3/Location 4" "-0.764908186347014"
[4,] "Location 4/Location 2" "-1.04258420123991" 
[5,] "Location 5/Location 5" "-2.20527763537575" 

我认为这告诉我“贪心最短路径”(从“位置 1”开始)是:1 到 3、3 到 4、4 到 2、2 到 4 ... 但后来我明白了永远陷入“2 到 4、4 到 2”的循环中。

通过手动执行此操作:

我希望产生以下输出:

Path : (1,4), (4,3), (3,5), (5,2), (2,1)
Total Distance = -0.8441315 + (-0.7244259) + (-0.3775706) + 0.3796208 + 0.3015059 =  -1.265001

谢谢!

这会跟踪访问过的位置,但不会检查它们:

set.seed(123)

n <- 5L
X <- matrix(rnorm(n^2), nrow = n)

rownames(X) <- colnames(X) <- paste("Location", 1:n)

shortest_path <- function(x, start = 1L) {
  n <- nrow(x)
  nn <- 1:n
  used <- c(start, integer(n - 1L))

  for (step in 2:n) {
    used[step] <- nn[-used][which.min(x[used[step - 1L], -used])]
  }
  
  data.frame(path = colnames(x)[used], dist = c(0, x[used[1:(n - 1L)] + n*(used[2:n] - 1L)]))
}

df <- shortest_path(X)
X
#>             Location 1 Location 2 Location 3 Location 4 Location 5
#> Location 1 -0.56047565  1.7150650  1.2240818  1.7869131 -1.0678237
#> Location 2 -0.23017749  0.4609162  0.3598138  0.4978505 -0.2179749
#> Location 3  1.55870831 -1.2650612  0.4007715 -1.9666172 -1.0260044
#> Location 4  0.07050839 -0.6868529  0.1106827  0.7013559 -0.7288912
#> Location 5  0.12928774 -0.4456620 -0.5558411 -0.4727914 -0.6250393
df
#>         path       dist
#> 1 Location 1  0.0000000
#> 2 Location 5 -1.0678237
#> 3 Location 3 -0.5558411
#> 4 Location 4 -1.9666172
#> 5 Location 2 -0.6868529

这似乎是一个典型的Traveling Salesman Problem (TSP),相信你能找到一堆实现方法。


这是一个基本的 R 选项,通过定义如下递归函数(从 借用数据)

f <- function(i, S = setdiff(1:ncol(X), i), path = i) {
    if (length(S) == 1) {
        return(list(cost = X[i, S] + X[S, 1], path = c(path, S)))
    }
    vp <- Inf
    for (k in S) {
        r <- Recall(k, setdiff(S, k), c(path, k))
        v <- X[i, k] + r$cost
        if (v <= vp) {
            vp <- v
            l <- list(cost = v, path = r$path)
        }
    }
    l
}

这给出了

> f(1)
$cost
[1] -4.507312

$path
[1] 1 5 3 4 2

哪里

  • f(1)表示start/end点是1
  • $cost 是最低总成本
  • $path是描述汉密尔顿路径的列索引