ActiveRecord Collection subtraction
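I have a question about subtracting one query from another when both return similar ActiveRecord collections.
Say I have the following queries: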
all_users = User.all
users_with_adequate_reviews = User.joins(:reviews).select("users.id, count(*) as num_reviews").group(:id).having("num_reviews > 5")
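If I do all_users - users_with_adequate_reviews, I get what I expect: the users with 5 or fewer reviews. How does ActiveRecord relation subtraction know to remove the matching records even though I only selected some of the attributes from users (mainly id)? I looked for documentation on this but could not find it anywhere.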
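Where is the subtraction defined?
Subtraction on ActiveRecord relations is defined in the ActiveRecord::Delegation module.
If you dig into that source code, you can see that the method is delegated to the Array class.
So we need to look at how Array subtraction works to understand how subtraction on ActiveRecord relations works.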
How does Array subtraction work?
This is taken from the documentation on Array subtraction/difference:
Array Difference
Returns a new array that is a copy of the original array, removing any
items that also appear in other_ary. The order is preserved from the
original array.
It compares elements using their hash and eql? methods for efficiency.
This means the subtraction relies on two methods of each object to do its job: hash and eql?.
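To make the quoted rule concrete, here is a minimal sketch in plain Ruby. The Point class is hypothetical and exists only for this illustration; it defines eql? and a matching hash, so Array#- treats two distinct objects with equal coordinates as the same element.

```ruby
# Point is a hypothetical value class, used only to illustrate Array#-.
class Point
  attr_reader :x, :y

  def initialize(x, y)
    @x = x
    @y = y
  end

  # Two points count as the same element when their coordinates match.
  def eql?(other)
    other.is_a?(Point) && x == other.x && y == other.y
  end
  alias_method :==, :eql?

  # Objects that are eql? must also return the same hash value.
  def hash
    [x, y].hash
  end
end

a = [Point.new(1, 1), Point.new(2, 2), Point.new(3, 3)]
b = [Point.new(2, 2)] # a different object, but eql? to a[1]

remaining = a - b
puts remaining.map { |p| [p.x, p.y] }.inspect # prints [[1, 1], [3, 3]]
```

Without the eql? and hash overrides, the two Point.new(2, 2) objects would be compared by identity and nothing would be removed.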
How do these methods behave on ActiveRecord objects?
The code below is taken from the ActiveRecord::Core module.
def ==(comparison_object)
  super ||
    comparison_object.instance_of?(self.class) &&
    !id.nil? &&
    comparison_object.id == id
end
alias :eql? :==

def hash
  if id
    self.class.hash ^ id.hash
  else
    super
  end
end
You can see that hash and eql? both look only at the class and the id.
This means all_users - users_with_adequate_reviews excludes an object only when both collections contain an object with the same class and the same id.
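The same behavior can be reproduced without a database. In the sketch below, FakeRecord is a hypothetical stand-in (not an ActiveRecord class) that copies the equality rule from the snippet quoted above; it shows that a "partially loaded" object with fewer attributes still cancels out its fully loaded counterpart, because only the class and the id are compared.

```ruby
# FakeRecord is a hypothetical stand-in, not an ActiveRecord class.
# It reuses the equality rule quoted above: same class and same non-nil id.
class FakeRecord
  attr_reader :id, :attributes

  def initialize(id, attributes)
    @id = id
    @attributes = attributes
  end

  def ==(other)
    super ||
      other.instance_of?(self.class) &&
      !id.nil? &&
      other.id == id
  end
  alias :eql? :==

  def hash
    if id
      self.class.hash ^ id.hash
    else
      super
    end
  end
end

bob_full    = FakeRecord.new(1, { name: "Bob", created_at: "2020-06-09 13:03:45" })
danny_full  = FakeRecord.new(2, { name: "Danny", created_at: "2020-06-09 13:04:14" })
bob_partial = FakeRecord.new(1, {}) # only the id was "selected"

result = [bob_full, danny_full] - [bob_partial]
puts result.map(&:id).inspect # prints [2]
```

The attributes hash never takes part in the comparison, which is why selecting only users.id is enough for the subtraction to work.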
Another example
irb(main):001:0> users = User.all
User Load (26.4ms) SELECT `users`.* FROM `users` LIMIT 11
=> #<ActiveRecord::Relation [
#<User id: 1, name: "Bob", created_at: "2020-06-09 13:03:45", updated_at: "2020-06-09 13:03:45">,
#<User id: 2, name: "Danny", created_at: "2020-06-09 13:04:14", updated_at: "2020-06-09 13:04:14">,
#<User id: 3, name: "Alan", created_at: "2020-06-09 13:05:30", updated_at: "2020-06-09 13:05:30">,
#<User id: 4, name: "Joe", created_at: "2020-06-09 13:07:00", updated_at: "2020-06-09 13:07:00">]>
irb(main):002:0> users_with_multiple_emails = User.joins(:user_emails).select("users.id, users.name, count(*) as num_emails").group(:id).having("num_emails > 1")
User Load (2.8ms) SELECT users.id, users.name, count(*) as num_emails FROM `users` INNER JOIN `user_emails` ON `user_emails`.`user_id` = `users`.`id` GROUP BY `users`.`id` HAVING (num_emails > 1) LIMIT 11
=> #<ActiveRecord::Relation [#<User id: 1, name: "Bob">]>
irb(main):003:0> users - users_with_multiple_emails
=> [
#<User id: 2, name: "Danny", created_at: "2020-06-09 13:04:14", updated_at: "2020-06-09 13:04:14">,
#<User id: 3, name: "Alan", created_at: "2020-06-09 13:05:30", updated_at: "2020-06-09 13:05:30">,
#<User id: 4, name: "Joe", created_at: "2020-06-09 13:07:00", updated_at: "2020-06-09 13:07:00">]
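As you can see, users - users_with_multiple_emails excludes the first object (Bob).
Why? Because the Bob in both collections has the same id and class (id: 1, class: User).
The subtraction returns a different result in the following case: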
irb(main):001:0> users = User.all
User Load (26.4ms) SELECT `users`.* FROM `users` LIMIT 11
=> #<ActiveRecord::Relation [
#<User id: 1, name: "Bob", created_at: "2020-06-09 13:03:45", updated_at: "2020-06-09 13:03:45">,
#<User id: 2, name: "Danny", created_at: "2020-06-09 13:04:14", updated_at: "2020-06-09 13:04:14">,
#<User id: 3, name: "Alan", created_at: "2020-06-09 13:05:30", updated_at: "2020-06-09 13:05:30">,
#<User id: 4, name: "Joe", created_at: "2020-06-09 13:07:00", updated_at: "2020-06-09 13:07:00">]>
irb(main):002:0> users_with_multiple_emails = User.joins(:user_emails).select("users.name, count(*) as num_emails").group(:id).having("num_emails > 1")
User Load (2.3ms) SELECT users.name, count(*) as num_emails FROM `users` INNER JOIN `user_emails` ON `user_emails`.`user_id` = `users`.`id` GROUP BY `users`.`id` HAVING (num_emails > 1) LIMIT 11
=> #<ActiveRecord::Relation [#<User id: nil, name: "Bob">]>
irb(main):003:0> users - users_with_multiple_emails
=> [
#<User id: 1, name: "Bob", created_at: "2020-06-09 13:03:45", updated_at: "2020-06-09 13:03:45">,
#<User id: 2, name: "Danny", created_at: "2020-06-09 13:04:14", updated_at: "2020-06-09 13:04:14">,
#<User id: 3, name: "Alan", created_at: "2020-06-09 13:05:30", updated_at: "2020-06-09 13:05:30">,
#<User id: 4, name: "Joe", created_at: "2020-06-09 13:07:00", updated_at: "2020-06-09 13:07:00">]
This time users_with_multiple_emails selects only the name and num_emails.
As you can see, users - users_with_multiple_emails does not exclude Bob.
Why? Because the two Bob objects have different ids:
Bob from users has id: 1
Bob from users_with_multiple_emails has id: nil
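The nil id case can also be sketched in plain Ruby. Row below is a hypothetical stand-in for a model instance (it is not ActiveRecord); it follows the same rule of identity, or same class and same non-nil id. The final check is only one possible way, assumed here for illustration, to detect that the right-hand collection was loaded without ids before subtracting.

```ruby
# Row is a hypothetical stand-in for a model instance, not ActiveRecord itself.
class Row
  attr_reader :id, :name

  def initialize(id, name)
    @id = id
    @name = name
  end

  # Equal when it is the very same object, or same class and same non-nil id.
  def ==(other)
    equal?(other) ||
      (other.instance_of?(self.class) && !id.nil? && other.id == id)
  end
  alias_method :eql?, :==

  # Objects without an id fall back to the default identity-based hash.
  def hash
    id ? self.class.hash ^ id.hash : super
  end
end

users      = [Row.new(1, "Bob"), Row.new(2, "Danny")]
without_id = [Row.new(nil, "Bob")] # like a select that omits users.id

result = users - without_id
puts result.map(&:name).inspect # prints ["Bob", "Danny"]

# Illustrative guard: a nil id means the subtraction cannot match anything.
missing_ids = without_id.any? { |row| row.id.nil? }
puts missing_ids # prints true
```

Because an object with a nil id is only ever equal to itself, keeping the primary key in the select list is what makes the subtraction behave as expected.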