如何加快 Log(x+1) 在 Julia 中应用于稀疏数组的速度
How can I speed up application of Log(x+1) to a sparse array in Julia
Julia 中的稀疏矩阵只存储非零元素。
一些功能,例如log(x+1)
(在所有基地),
将零映射到零,因此不需要应用于那些零元素。
(我想我们会称之为 Monoid homomorphism。)
我如何利用这个事实来加速操作?
示例代码:
X = sprand(10^4,10^4, 10.0^-5, rand)
function naiveLog2p1(N::SparseMatrixCSC{Float64,Int64})
log2(1+N) |> sparse
end
运行:
@time naiveLog2p1(X)
输出为:
elapsed time: 2.580125482 seconds (2289 MB allocated, 6.86% gc time in 3 pauses with 0 full sweep)
第二次(这样函数应该已经被编译):
elapsed time: 2.499118888 seconds (2288 MB allocated, 8.17% gc time in 3 pauses with 0 full sweep)
变化不大,大概是因为编译起来很简单。
您要查找的是sparse nonzeros function。
nonzeros(A)
Return a vector of the structural nonzero values in sparse matrix A
. This includes zeros that are explicitly stored in the sparse
matrix. The returned vector points directly to the internal nonzero
storage of A
, and any modifications to the returned vector will mutate
A
as well.
您可以如下使用:
function improvedLog2p1(N::SparseMatrixCSC{Float64,Int64})
M = copy(N)
ms = nonzeros(M) #Creates a view,
ms = log2(1+ms) #changes to ms, change M
M
end
@time improvedLog2p1(X)
运行第一次输出是:
elapsed time: 0.002447847 seconds (157 kB allocated)
运行第二次输出为:
0.000102335 seconds (109 kB allocated)
速度和内存使用量提高了 4 个数量级。
我的解决方案是实际操作数据结构本身的内部:
mysparselog(N::SparseMatrixCSC) =
SparseMatrixCSC(N.m, N.n, copy(N.colptr), copy(N.rowval), log2(1+N.nzval))
请注意,如果您想就地对稀疏矩阵进行操作(我想这在实践中很常见),这将是零内存操作。基准测试显示这与@Oxinabox 的答案相似,因为它在内存操作方面大致相同(尽管该答案实际上并不 return 新矩阵,如 mean
输出所示) :
with warmup times removed:
naiveLog2p1
elapsed time: 1.902405905 seconds (2424151464 bytes allocated, 10.35% gc time)
mean(M) => 0.005568094618997372
mysparselog
elapsed time: 0.022551705 seconds (24071168 bytes allocated)
elapsed time: 0.025841895 seconds (24071168 bytes allocated)
mean(M) => 0.005568094618997372
improvedLog2p1
elapsed time: 0.018682775 seconds (32068160 bytes allocated)
elapsed time: 0.027129497 seconds (32068160 bytes allocated)
mean(M) => 0.004995127985160583
根据 Julia 手册关于 "Sparse matrix operations" 的建议,我将使用 findnz()
将稀疏矩阵转换为密集矩阵,对值进行对数运算并使用 sparse()
.
function improvedLog2p1(N::SparseMatrixCSC{Float64,Int64})
I,J,V = findnz(N)
return sparse(I,J,log2(1+V))
end
@time improvedLog2p1(X)
elapsed time: 0.000553508 seconds (473288 bytes allocated)
Julia 中的稀疏矩阵只存储非零元素。
一些功能,例如log(x+1)
(在所有基地),
将零映射到零,因此不需要应用于那些零元素。
(我想我们会称之为 Monoid homomorphism。)
我如何利用这个事实来加速操作?
示例代码:
X = sprand(10^4,10^4, 10.0^-5, rand)
function naiveLog2p1(N::SparseMatrixCSC{Float64,Int64})
log2(1+N) |> sparse
end
运行:
@time naiveLog2p1(X)
输出为:
elapsed time: 2.580125482 seconds (2289 MB allocated, 6.86% gc time in 3 pauses with 0 full sweep)
第二次(这样函数应该已经被编译):
elapsed time: 2.499118888 seconds (2288 MB allocated, 8.17% gc time in 3 pauses with 0 full sweep)
变化不大,大概是因为编译起来很简单。
您要查找的是sparse nonzeros function。
nonzeros(A)
Return a vector of the structural nonzero values in sparse matrix
A
. This includes zeros that are explicitly stored in the sparse matrix. The returned vector points directly to the internal nonzero storage ofA
, and any modifications to the returned vector will mutateA
as well.
您可以如下使用:
function improvedLog2p1(N::SparseMatrixCSC{Float64,Int64})
M = copy(N)
ms = nonzeros(M) #Creates a view,
ms = log2(1+ms) #changes to ms, change M
M
end
@time improvedLog2p1(X)
运行第一次输出是:
elapsed time: 0.002447847 seconds (157 kB allocated)
运行第二次输出为:
0.000102335 seconds (109 kB allocated)
速度和内存使用量提高了 4 个数量级。
我的解决方案是实际操作数据结构本身的内部:
mysparselog(N::SparseMatrixCSC) =
SparseMatrixCSC(N.m, N.n, copy(N.colptr), copy(N.rowval), log2(1+N.nzval))
请注意,如果您想就地对稀疏矩阵进行操作(我想这在实践中很常见),这将是零内存操作。基准测试显示这与@Oxinabox 的答案相似,因为它在内存操作方面大致相同(尽管该答案实际上并不 return 新矩阵,如 mean
输出所示) :
with warmup times removed:
naiveLog2p1
elapsed time: 1.902405905 seconds (2424151464 bytes allocated, 10.35% gc time)
mean(M) => 0.005568094618997372
mysparselog
elapsed time: 0.022551705 seconds (24071168 bytes allocated)
elapsed time: 0.025841895 seconds (24071168 bytes allocated)
mean(M) => 0.005568094618997372
improvedLog2p1
elapsed time: 0.018682775 seconds (32068160 bytes allocated)
elapsed time: 0.027129497 seconds (32068160 bytes allocated)
mean(M) => 0.004995127985160583
根据 Julia 手册关于 "Sparse matrix operations" 的建议,我将使用 findnz()
将稀疏矩阵转换为密集矩阵,对值进行对数运算并使用 sparse()
.
function improvedLog2p1(N::SparseMatrixCSC{Float64,Int64})
I,J,V = findnz(N)
return sparse(I,J,log2(1+V))
end
@time improvedLog2p1(X)
elapsed time: 0.000553508 seconds (473288 bytes allocated)