Unexpected behavior in Julia boolean comparison

Question

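I am testing different parameterizations of the logistic CDF, comparing the results and the effect of the different parameters on the curve.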

using Distributions

# Vector of x to test the different functions
x = collect(0:20)

Logis = Logistic(10, 1)  # Logistic distribution with location 10 and scale 1
y = cdf.(Logis, x)       # CDF of the Logistic distribution, broadcast over x

# This is a standard representation of the CDF for Logistic
LogisticV1(x, μ=10, θ=1) = 1 / (1 + exp(-((x-μ)/θ)))
y1 = LogisticV1.(x)

# This is another representation of the CDF for Logistic
LogisticV2(x, μ=10, θ=1) = 1/2 + 1/2 * tanh((x-μ)/(2θ))  # divide by 2θ, not (x-μ)/2 times θ
y2 = LogisticV2.(x)

As expected, the plots of all three functions are identical. All three y vectors also have the same type (Array{Float64,1}), and their printed values look the same.

show(y)

[4.53979e-5, 0.000123395, 0.00033535, 0.000911051, 0.00247262, 0.00669285, 0.0179862, 0.0474259, 0.119203, 0.268941, 0.5, 0.731059, 0.880797, 0.952574, 0.982014, 0.993307, 0.997527, 0.999089, 0.999665, 0.999877, 0.999955 ]

show(y1)

[4.53979e-5, 0.000123395, 0.00033535, 0.000911051, 0.00247262, 0.00669285, 0.0179862, 0.0474259, 0.119203, 0.268941, 0.5, 0.731059, 0.880797, 0.952574, 0.982014, 0.993307, 0.997527, 0.999089, 0.999665, 0.999877, 0.999955 ]

show(y2)

[4.53979e-5, 0.000123395, 0.00033535, 0.000911051, 0.00247262, 0.00669285, 0.0179862, 0.0474259, 0.119203, 0.268941, 0.5, 0.731059, 0.880797, 0.952574, 0.982014, 0.993307, 0.997527, 0.999089, 0.999665, 0.999877, 0.999955 ]

But:

y == y1    # true
y == y2    # false
y1 == y2   # false

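Why is this? I assume this has something to do with floating point variations introduced by the tanh function in LogisticV2, but I am not sure. I would appreciate any insight on this.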

Edit: fixed some typos to make the code runnable.

Answer 1

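To compare floating-point numbers, use isapprox instead of ==.

In your case you will see that isapprox(y,y1) == isapprox(y,y2) == isapprox(y1,y2) == true. You can also check maximum(abs.(y-y2)) to see that the difference is on the order of floating-point precision (I found 1.1102230246251565e-16). (Note, however, that by default isapprox checks the relative deviation.)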
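
The size of the discrepancy can also be checked without Distributions.jl. The following is a minimal sketch that re-implements the two closed-form expressions from the question under hypothetical names (logistic_exp and logistic_tanh do not appear in the original post) and reports the largest element-wise difference, both in absolute terms and as a multiple of eps(Float64).

```julia
# Two algebraically equivalent forms of the logistic CDF.
# The names and the keyword defaults (μ = 10, θ = 1) are illustrative assumptions.
logistic_exp(x; μ=10, θ=1)  = 1 / (1 + exp(-(x - μ) / θ))
logistic_tanh(x; μ=10, θ=1) = 1/2 + 1/2 * tanh((x - μ) / (2θ))

x  = collect(0:20)
ya = logistic_exp.(x)
yb = logistic_tanh.(x)

# Largest absolute difference between corresponding elements.
maxdiff = maximum(abs.(ya .- yb))

println("max abs difference:        ", maxdiff)
println("as a multiple of eps(1.0): ", maxdiff / eps(Float64))
println("isapprox(ya, yb):          ", isapprox(ya, yb))
```

If the printed multiple is small (a handful of eps or less), the mismatch is ordinary rounding error rather than a mistake in either formula.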

Answer 2

"I assume this has something to do with floating point variations introduced by the tanh function in LogisticV2"

You are right:

julia> (y .== y1)'
1×21 RowVector{Bool,BitArray{1}}:
 true  true  true  true  true  true  true  true  true  true  true  true  true  true  true  true  true  true  true  true  true

julia> (y .== y2)'
1×21 RowVector{Bool,BitArray{1}}:
 false  false  false  false  false  false  false  false  false  true  true  true  false  false  true  false  false  true  false  false  false

But:

julia> y ≈ y2    # \approx<TAB> for: ≈ symbol
true
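
The reason a few differing elements make the whole comparison false is that == on arrays is true only when every pair of elements is exactly equal. A small sketch with illustrative vectors a and b (not taken from the question), assuming Julia 0.7 or later for findall:

```julia
# 0.1 + 0.2 is not exactly 0.3 in Float64, so the first elements differ
# in the last bit while the second elements are identical.
a = [0.1 + 0.2, 0.5]
b = [0.3, 0.5]

println(a .== b)            # element-wise exact comparison
println(a == b)             # true only if all elements match exactly
println(findall(a .!= b))   # indices of the mismatching elements
println(a ≈ b)              # tolerance-based comparison of the whole vectors
```

Locating the mismatching indices this way shows whether the disagreement is confined to the last bit of a few entries, as it is in the question.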

≈ is a Unicode alias for isapprox:

help?> ≈

"≈" can be typed by \approx

search: ≈

isapprox(x, y; rtol::Real=sqrt(eps), atol::Real=0, nans::Bool=false, norm::Function)

Inexact equality comparison: true if norm(x-y) <= atol + rtol*max(norm(x), norm(y)). The default atol is zero and the default rtol depends on the types of x and y. The keyword argument nans determines whether or not NaN values are considered equal (defaults to false).

For real or complex floating-point values, rtol defaults to sqrt(eps(typeof(real(x-y)))). This corresponds to requiring equality of about half of the significand digits. For other types, rtol defaults to zero.

x and y may also be arrays of numbers, in which case norm defaults to vecnorm but may be changed by passing a norm::Function keyword argument. (For numbers, norm is the same thing as abs.) When x and y are arrays, if norm(x-y) is not finite (i.e. ±Inf or NaN), the comparison falls back to checking whether all elements of x and y are approximately equal component-wise.

The binary operator ≈ is equivalent to isapprox with the default arguments, and x ≉ y is equivalent to !isapprox(x,y).

julia> 0.1 ≈ (0.1 - 1e-10)
true
true

julia> isapprox(10, 11; atol = 2)
true

julia> isapprox([10.0^9, 1.0], [10.0^9, 2.0])   
true
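
Two consequences of the tolerance rule quoted above are easy to miss, so here is a minimal sketch of both. The variable names are illustrative only. First, with the default atol of zero the test is purely relative, so a tiny number is never approximately equal to zero unless atol is set. Second, for arrays the test uses the norm of the whole difference, so one large element can hide a big relative error in a small one; broadcasting isapprox compares element by element instead.

```julia
# Purely relative tolerance: comparing against zero fails without atol.
tiny      = 1e-10
rel_only  = isapprox(tiny, 0.0)              # uses rtol = sqrt(eps(Float64)) only
with_atol = isapprox(tiny, 0.0; atol=1e-8)   # absolute tolerance supplied

# Whole-array comparison versus element-wise comparison.
a = [10.0^9, 1.0]
b = [10.0^9, 2.0]
whole = isapprox(a, b)         # norm-based test on the full vectors
each  = all(isapprox.(a, b))   # every element must pass on its own

println("default rtol:         ", sqrt(eps(Float64)))
println("tiny ≈ 0:             ", rel_only)
println("tiny ≈ 0 with atol:   ", with_atol)
println("a ≈ b (whole array):  ", whole)
println("a ≈ b (element-wise): ", each)
```

For the vectors in the question either form of comparison would do, since every element agrees to within rounding error. The element-wise form matters when the entries span many orders of magnitude.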
