Why is a Regexp object considered to be "falsy" in Ruby?

Question

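Ruby has a universal idea of "truthiness" and "falsiness".

Ruby does have two specific classes for Boolean objects, TrueClass and FalseClass, whose singleton instances are denoted by the special variables true and false, respectively.

However, truthiness and falsiness are not limited to instances of those two classes; the concept is universal and applies to every object in Ruby. Every object is either truthy or falsy. The rules are very simple. In particular, only two objects are falsy: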

nil, the singleton instance of NilClass, and
false, the singleton instance of FalseClass

Every other object is truthy. This even includes objects that are considered falsy in other programming languages, such as

the Integer 0,
the Float 0.0,
the empty String '',
the empty Array [], and
the empty Hash {}.

These rules are built into the language and are not user-definable. There is no to_bool implicit conversion or anything similar.
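As a quick check of the rules above, the following sketch classifies a handful of values by the branch a conditional takes for them. The variable names (samples, results) are only illustrative.

```ruby
# Values that are falsy in some other languages, plus Ruby's two falsy objects.
samples = [0, 0.0, '', [], {}, nil, false]

# Record which branch of a conditional each value selects.
results = samples.map { |value| [value, value ? :truthy : :falsy] }

results.each { |value, verdict| puts "#{value.inspect} is #{verdict}" }
```

Only nil and false should end up in the falsy group; everything else, including 0 and the empty collections, takes the truthy branch.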

Here is a quote from the ISO Ruby Language Specification:

6.6 Boolean values

An object is classified into either a trueish object or a falseish object.

Only false and nil are falseish objects. false is the only instance of the class FalseClass (see 15.2.6), to which a false-expression evaluates (see 11.5.4.8.3). nil is the only instance of the class NilClass (see 15.2.4), to which a nil-expression evaluates (see 11.5.4.8.2).

Objects other than false and nil are classified into trueish objects. true is the only instance of the class TrueClass (see 15.2.5), to which a true-expression evaluates (see 11.5.4.8.3).

The executable Ruby/Spec seems to agree:

it "considers a non-nil and non-boolean object in expression result as true" do
  if mock('x')
    123
  else
    456
  end.should == 123
end

According to these two sources, I would assume that Regexps are also truthy, but according to my tests, they aren't:

if // then 'Regexps are truthy' else 'Regexps are falsy' end
#=> 'Regexps are falsy'

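I have tested this on YARV 2.7.0-preview1, TruffleRuby 19.2.0.1, and JRuby 9.2.8.0. All three implementations agree with each other, and disagree with the ISO Ruby Language Specification and with my interpretation of the Ruby/Spec.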

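More precisely, Regexp objects that are the result of evaluating Regexp literals are falsy, whereas Regexp objects that are the result of some other expression are truthy: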

r = //
if r then 'Regexps are truthy' else 'Regexps are falsy' end
#=> 'Regexps are truthy'

Is this a bug, or desired behavior?

Answer 1

This is the result of (as far as I can tell) an undocumented feature of the ruby language, which is best explained by this spec:

it "matches against $_ (last input) in a conditional if no explicit matchee provided" do
  -> {
    eval <<-EOR
    $_ = nil
    (true if /foo/).should_not == true
    $_ = "foo"
    (true if /foo/).should == true
    EOR
  }.should complain(/regex literal in condition/)
end

You can generally think of $_ as the "last string read by gets".

To make matters even more confusing, $_ (along with $~) is not a global variable; it has local scope.
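The local scoping of $_ can be seen with a small sketch like the one below. The method name overwrite_last_line is hypothetical and exists only for this illustration.

```ruby
# Assigning $_ inside a method only affects that method's own frame.
def overwrite_last_line
  $_ = "set inside the method"
  $_
end

$_ = "set at the top level"
inside = overwrite_last_line   # value seen inside the method
outside = $_                   # the top-level value, read after the call

puts inside
puts outside
```

If $_ were a true global, the second line printed would be the string assigned inside the method; instead the top-level value is left alone.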

When a ruby script starts, $_ == nil.

So, the code:

// ? 'Regexps are truthy' : 'Regexps are falsey'

is interpreted like:

(// =~ nil) ? 'Regexps are truthy' : 'Regexps are falsey'

...which returns a falsey value (nil).

On the other hand, for a non-literal regexp (e.g. r = // or Regexp.new('')), this special interpretation does not apply.

// is truthy, just like all other objects in ruby besides nil and false.
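To make the difference concrete, here is a minimal sketch that compares the explicit match with the literal-in-condition form. The variable names are illustrative, and when run from a file the two ternaries are expected to trigger the "regex literal in condition" warning.

```ruby
empty_pattern = Regexp.new('')

# The explicit match that a literal in a condition is rewritten to.
match_against_nil  = empty_pattern =~ nil     # no match is possible, so nil
match_against_text = empty_pattern =~ 'foo'   # matches at index 0

# A Regexp held in a variable is an ordinary, truthy object.
object_verdict = empty_pattern ? :truthy : :falsy

# A Regexp literal used directly as a condition is matched against $_.
$_ = nil
literal_without_input = (// ? :truthy : :falsy)
$_ = 'any line'
literal_with_input = (// ? :truthy : :falsy)

p [match_against_nil, match_against_text, object_verdict,
   literal_without_input, literal_with_input]
```

The verdict for the literal changes with the content of $_, while the verdict for the object in the variable does not depend on it.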

Unless you run the ruby code directly on the command line (i.e. with the -e flag), the ruby parser will display a warning against such usage:

warning: regex literal in condition

You could make use of this behaviour in a script, with something like:

puts "Do you want to play again?"
gets
# (user enters e.g. 'Yes' or 'No')
/y/i ? play_again : back_to_menu

...but it would be more normal to assign the result of gets to a local variable and perform the regex check against that value explicitly.
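A minimal sketch of that more explicit style might look like this. The helper wants_to_play_again? is hypothetical, and the sample strings stand in for what gets would return in an interactive script.

```ruby
# Decide whether the user's answer counts as a "yes".
# `answer` may be nil, because gets returns nil at end of input.
def wants_to_play_again?(answer)
  !answer.nil? && answer.match?(/y/i)
end

# In an interactive script the answer would come from gets:
#   answer = gets
#   wants_to_play_again?(answer) ? play_again : back_to_menu
puts wants_to_play_again?("Yes\n")
puts wants_to_play_again?("No\n")
```

Passing the string in explicitly keeps the check independent of $_ and avoids the parser warning.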

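I'm not aware of any use case for performing this check with an empty regex, especially when it is defined as a literal value. The result you've highlighted would indeed catch most ruby developers off guard.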

Answer 2

This is not a bug. What is happening is that Ruby is rewriting the code so that

if /foo/
  whatever
end

effectively becomes

if /foo/ =~ $_
  whatever
end

If you are running this code in a normal script (and not using the -e option), then you should see a warning:

warning: regex literal in condition

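Most of the time this is probably somewhat confusing, which is why the warning is there, but it can be useful for one-liners using the -e option. For example, you can print all lines matching a given regexp from a file with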

$ ruby -ne 'print if /foo/' filename

(The default argument for print is $_ as well.)
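For comparison, a rough long-hand equivalent of that one-liner, with the line and the match written out explicitly, could be sketched as follows. The helper print_matching_lines and the sample input are assumptions made for this illustration; the one-liner itself relies on -n to loop over the input and on $_ to hold the current line.

```ruby
require 'stringio'

# Copy every line of `input` that matches `pattern` to `output`.
def print_matching_lines(input, pattern, output = $stdout)
  input.each_line do |line|
    output.print(line) if pattern.match?(line)
  end
end

# A StringIO stands in for the file here; File.open(filename) would work the same way.
sample = StringIO.new("foo bar\nbaz\nfood\n")
print_matching_lines(sample, /foo/)
```

Here the pattern is matched against an explicit block variable, so nothing depends on $_ and no warning is produced.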
