Unexpected behavior in Julia boolean comparison
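I am testing different parameterizations of the logistic CDF and comparing the results and the effect of the different parameters on the curve.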
using Distributions
# Vector of x to test the different functions
x = collect(0:20)
Logis = Logistic(10, 1) # Logistic distribution with location 10 and scale 1
y = cdf.(Logis, x) # CDF of the Logistic distribution, evaluated elementwise
# This is a standard representation of the CDF for Logistic
LogisticV1(x, μ=10, θ=1) = 1 / (1 + exp(-(x-μ)/θ))
y1 = LogisticV1.(x)
# This is another representation of the CDF for Logistic
LogisticV2(x, μ=10, θ=1) = 1/2 + 1/2 * tanh((x-μ)/(2*θ))
y2 = LogisticV2.(x)
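As expected, the plots of all three functions are identical. All three y vectors also have the same type (Array{Float64,1}), and the three vectors look identical when printed.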
show(y)
[4.53979e-5, 0.000123395, 0.00033535, 0.000911051, 0.00247262, 0.00669285, 0.0179862, 0.0474259, 0.119203, 0.268941, 0.5, 0.731059, 0.880797, 0.952574, 0.982014, 0.993307, 0.997527, 0.999089, 0.999665, 0.999877, 0.999955 ]
show(y1)
[4.53979e-5, 0.000123395, 0.00033535, 0.000911051, 0.00247262, 0.00669285, 0.0179862, 0.0474259, 0.119203, 0.268941, 0.5, 0.731059, 0.880797, 0.952574, 0.982014, 0.993307, 0.997527, 0.999089, 0.999665, 0.999877, 0.999955 ]
show(y2)
[4.53979e-5, 0.000123395, 0.00033535, 0.000911051, 0.00247262, 0.00669285, 0.0179862, 0.0474259, 0.119203, 0.268941, 0.5, 0.731059, 0.880797, 0.952574, 0.982014, 0.993307, 0.997527, 0.999089, 0.999665, 0.999877, 0.999955 ]
However:
y == y1 # true
y == y2 # false
y1 == y2 # false
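Why is that? I assume this has something to do with floating point variations introduced by the tanh function in LogisticV2, but I'm not sure. I would appreciate any insight on this.
Edit: fixed some typos to make the code runnable.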
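To compare floating-point numbers, use isapprox instead of ==.
In your case, you will see that isapprox(y,y1) == isapprox(y,y2) == isapprox(y1,y2) == true. You can also check maximum(abs.(y-y2)) to see that the difference is on the order of floating-point precision (I found 1.1102230246251565e-16). Note, however, that by default isapprox checks the relative deviation.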
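The difference between exact and tolerant comparison, and the caveat about relative deviation, can be seen in a small sketch. It assumes Julia 1.x and uses only Base; the variable names `a` and `tiny` are illustrative and do not come from the question.

```julia
# Classic rounding example: the sum is not bit-identical to the literal 0.3.
a = 0.1 + 0.2
println(a == 0.3)          # exact comparison
println(isapprox(a, 0.3))  # tolerant comparison

# The default tolerance of isapprox is purely relative (atol = 0), so a tiny
# nonzero value is not treated as approximately equal to zero...
tiny = 1e-10
println(isapprox(tiny, 0.0))

# ...unless an absolute tolerance is supplied explicitly.
println(isapprox(tiny, 0.0; atol=1e-8))
```

When the values being compared may legitimately be zero, passing an explicit `atol` is usually the safer choice.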
I assume this has something to do with floating point variations introduced by the tanh function in LogisticV2
You are right:
julia> (y .== y1)'
1×21 RowVector{Bool,BitArray{1}}:
true true true true true true true true true true true true true true true true true true true true true
julia> (y .== y2)'
1×21 RowVector{Bool,BitArray{1}}:
false false false false false false false false false true true true false false true false false true false false false
However:
julia> y ≈ y2 # \approx<TAB> for: ≈ symbol
true
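To see how small the elementwise discrepancies are, the two closed-form expressions can be compared directly. This is a minimal sketch assuming Julia 1.x; it leaves out Distributions to stay self-contained, and the names `logistic_exp` and `logistic_tanh` are illustrative stand-ins for LogisticV1 and LogisticV2.

```julia
# Two mathematically equivalent forms of the logistic CDF (μ = 10, θ = 1 by default).
logistic_exp(x, μ=10, θ=1)  = 1 / (1 + exp(-(x - μ) / θ))
logistic_tanh(x, μ=10, θ=1) = 1/2 + 1/2 * tanh((x - μ) / (2θ))

x  = collect(0:20)
ya = logistic_exp.(x)
yb = logistic_tanh.(x)

# Count the entries that are not bit-identical and measure the largest gap.
absdiff = abs.(ya .- yb)
println("entries that differ exactly: ", count(ya .!= yb), " of ", length(x))
println("largest absolute difference: ", maximum(absdiff))
println("eps(Float64):                ", eps(Float64))

# The tolerant comparison treats the two vectors as equal.
println("ya ≈ yb: ", ya ≈ yb)
```

The exact number of mismatching entries may depend on the platform and the Julia version, which is why the sketch prints it rather than assuming a value.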
≈ is a Unicode alias for isapprox:
help?> ≈
"≈" can be typed by \approx
search: ≈
isapprox(x, y; rtol::Real=sqrt(eps), atol::Real=0, nans::Bool=false, norm::Function)
Inexact equality comparison: true if norm(x-y) <= atol + rtol*max(norm(x), norm(y)). The default atol is zero and the default rtol depends on the types of x and y. The keyword argument nans determines whether or not NaN values are considered equal (defaults to false).
For real or complex floating-point values, rtol defaults to sqrt(eps(typeof(real(x-y)))). This corresponds to requiring equality of about half of the significand digits. For other types, rtol defaults to zero.
x and y may also be arrays of numbers, in which case norm defaults to vecnorm but may be changed by passing a norm::Function keyword argument. (For numbers, norm is the same thing as abs.) When x and y are arrays, if norm(x-y) is not finite (i.e. ±Inf or NaN), the comparison falls back to checking whether all elements of x and y are approximately equal component-wise.
The binary operator ≈ is equivalent to isapprox with the default arguments, and x ≉ y is equivalent to !isapprox(x,y).
julia> 0.1 ≈ (0.1 - 1e-10)
true
julia> isapprox(10, 11; atol = 2)
true
julia> isapprox([10.0^9, 1.0], [10.0^9, 2.0])
true
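The last example may look surprising, since 1.0 and 2.0 are far apart. The following sketch redoes the check by hand using the norm-based rule quoted in the help text. It assumes Julia 1.x, where `norm` comes from the LinearAlgebra standard library; the names `a`, `b`, `lhs` and `rhs` are illustrative.

```julia
using LinearAlgebra  # provides norm on Julia 1.x

a = [10.0^9, 1.0]
b = [10.0^9, 2.0]

rtol = sqrt(eps(Float64))            # default relative tolerance for Float64
lhs  = norm(a - b)                   # Euclidean norm of the difference
rhs  = rtol * max(norm(a), norm(b))  # tolerance scaled by the larger norm

println(lhs, " <= ", rhs, " : ", lhs <= rhs)
println(isapprox(a, b))              # same verdict as the manual check

# Compared on their own, the second components are not approximately equal.
println(isapprox(1.0, 2.0))
```

Because the tolerance scales with the norm of the whole array, one large entry can mask a sizeable difference in a small one. Broadcasting the comparison (for example `all(a .≈ b)`) checks each component separately instead.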