Ensure Rails database record uniqueness when updating without aborting the update process

Question

Ruby 2.3.0, Rails 4.2.4, and actually using PostgreSQL rather than SQLite.

Updated for clarity.

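I have a large CSV file (updated and downloaded from an external source every day) and have written a method that updates a Rails database table from it. I don't want the method to append every row to the database without checking uniqueness, so I combined this great solution (How do I make a column unique and index it in a Ruby on Rails migration?) with add_index.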

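I keep the executable update code in a rake file and run it by typing $ rake update_task in my terminal (this works as long as the table has no duplicates of the imported CSV rows). The problem is that the database aborts the rake task (rake aborted!) as soon as it hits the first duplicate entry (ERROR: duplicate key value violates unique constraint).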

What can I do to remove, or simply not save, any duplicates while avoiding aborting/failing? I can't simply drop the database table and reload it every day. Here is the schema:

ActiveRecord::Schema.define(version: 20160117172450) do

  # These are extensions that must be enabled in order to support this database
  enable_extension "plpgsql"

  create_table "tablename", force: :cascade do |t|
    t.string   "attr1"
    t.string   "attr2"
    t.string   "attr3"
    t.datetime "created_at", null: false
    t.datetime "updated_at", null: false
  end

  add_index "tablename", ["attr1", "attr2", "attr3"], name: "index_tablename_on_attr1_and_attr2_and_attr3", unique: true, using: :btree

end

My rake task in lib/tasks/file_name.rake contains:

desc "Download data and update database table"

task :update_task => :environment do
  u = CorrectClassName.new
  u.perform_this
end

And CorrectClassName is in a .rb file in app/directory1:

class CorrectClassName

  def perform_this
    something = ClassWithUpdateCode.new
    something.update_database
  end

end

And ClassWithUpdateCode is in a .rb file in app/directory2:

require 'csv'

class ClassWithUpdateCode

  def update_database
    csv_update = File.read(Rails.root.join('lib', 'assets', "file_name.csv"))
    options = {:headers => true}

    csv = CSV.parse(csv_update, options)
    csv.each do |row|
      tm = TableModel.new

      tm.attr1 = row[0]
      tm.attr2 = row[1]
      tm.attr3 = row[2]
      tm.save # maybe I can use a different method or if statement here?
    end
  end

end

Update: @Kristan's solution below works, but here is where the begin/rescue/end handling has to go:

In the .rb file in app/directory2:

require 'csv'

class ClassWithUpdateCode

  def update_database
    csv_update = File.read(Rails.root.join('lib', 'assets', "file_name.csv"))
    options = {:headers => true}

    csv = CSV.parse(csv_update, options)
    csv.each do |row|
      tm = TableModel.new
      begin
        tm.attr1 = row[0]
        tm.attr2 = row[1]
        tm.attr3 = row[2]
        tm.save
      rescue ActiveRecord::RecordNotUnique
        # duplicate row: skip it and continue with the next one
      end
    end
  end

end

Answer 1

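rake is exiting because an exception is raised when you try to save a record that violates the table's uniqueness constraint. The simplest way to prevent that is to catch and ignore the exception. I'm assuming your records are created during u.perform_this.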

task :update_task => :environment do
  u = CorrectClassName.new
  begin
    u.perform_this
  rescue ActiveRecord::RecordNotUnique
    # move on
  end
end
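
Where the rescue sits matters: wrapped around the whole import, it stops at the first duplicate because the exception unwinds out of the loop, whereas a rescue per row only skips the offending row. The following is a minimal sketch of that difference in plain Ruby, with no Rails involved. DuplicateKey, FakeTable, import_rescue_outside, import_rescue_per_row and SAMPLE_ROWS are hypothetical stand-ins for ActiveRecord::RecordNotUnique, the indexed table and the import code; none of them are part of Rails or of the original code.

```ruby
require 'set'

# Hypothetical stand-in for ActiveRecord::RecordNotUnique.
class DuplicateKey < StandardError; end

# Hypothetical stand-in for a table with a unique index on all three columns.
class FakeTable
  attr_reader :rows

  def initialize
    @rows = []
    @seen = Set.new
  end

  # Raises on a duplicate, the way a unique index rejects an INSERT.
  def insert(row)
    raise DuplicateKey, "duplicate key: #{row.inspect}" unless @seen.add?(row)
    @rows << row
  end
end

# Rescue around the whole loop: the first duplicate ends the import.
def import_rescue_outside(table, rows)
  begin
    rows.each { |row| table.insert(row) }
  rescue DuplicateKey
    # the loop has already been abandoned at this point
  end
  table.rows.size
end

# Rescue inside the loop: only the offending row is skipped.
def import_rescue_per_row(table, rows)
  rows.each do |row|
    begin
      table.insert(row)
    rescue DuplicateKey
      next
    end
  end
  table.rows.size
end

# Hypothetical sample data: the second row duplicates the first.
SAMPLE_ROWS = [%w[a b c], %w[a b c], %w[d e f]].freeze
```

With SAMPLE_ROWS, import_rescue_outside(FakeTable.new, SAMPLE_ROWS) stores one row, while import_rescue_per_row(FakeTable.new, SAMPLE_ROWS) stores two, since the third row is still reached.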

Another option is to add a uniqueness validation to your Rails model, and then either check valid? before saving or call create (not create!), which does not raise a validation exception.

class CorrectClassName < ActiveRecord::Base
  validates_uniqueness_of :attr1, scope: [:attr2, :attr3]
end

task :update_task => :environment do
  u = CorrectClassName.new(data)
  u.perform_this if u.valid?
end
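
Independent of the two approaches above, duplicates that occur inside the CSV file itself can be dropped before any row reaches the database. This is a different, complementary technique (in-memory de-duplication with Array#uniq), not something the answer proposes, and it does not replace the rescue or the validation: rows that already exist in the table from an earlier import will still be rejected by the unique index. A minimal sketch, assuming the first three columns map to attr1, attr2 and attr3 as in the question; unique_attr_rows and SAMPLE_CSV are hypothetical names.

```ruby
require 'csv'

# Returns the first three fields of each data row, with repeated
# combinations removed (the first occurrence is kept).
def unique_attr_rows(csv_text)
  CSV.parse(csv_text, headers: true)
     .map { |row| row.fields.first(3) }
     .uniq
end

# Hypothetical sample data; the real file lives under lib/assets.
SAMPLE_CSV = <<~DATA
  attr1,attr2,attr3
  a,b,c
  a,b,c
  d,e,f
DATA

unique_attr_rows(SAMPLE_CSV).each do |attr1, attr2, attr3|
  # In the real task, this is where TableModel would be built and saved.
  puts [attr1, attr2, attr3].join(',')
end
```

The trade-off is that the whole file is held in memory, which the original code already does via File.read and CSV.parse.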

database

postgresql

ruby-on-rails

rake-task