How to sync new ActiveStorage mirrors?

With ActiveStorage you can now define mirrors for storing your files.

local:
  service: Disk
  root: <%= Rails.root.join("storage") %>

amazon:
  service: S3
  access_key_id: <%= Rails.application.credentials.dig(:aws, :access_key_id) %>
  secret_access_key: <%= Rails.application.credentials.dig(:aws, :secret_access_key) %>
  region: us-east-1
  bucket: mybucket

mirror:
  service: Mirror
  primary: local
  mirrors:
    - amazon
    - another_mirror

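If you add a mirror at a later point in time, you have to take care of copying all existing files yourself, e.g. from "local" to "amazon" or "another_mirror".

  1. Is there a convenient way to keep the files in sync?
  2. Or a way to run a validation that checks whether all files are available on each service?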

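Everything is stored under ActiveStorage's keys, so as long as your bucket names and file names don't change during the transfer, you can simply copy everything over to the new service.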
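
If you copy files by hand, it may help to know where a key ends up on disk. The sketch below rebuilds the path a Disk service would use, assuming the two-level sharded layout (first two characters, then the next two) that DiskService uses to my knowledge. The helper name disk_path_for is hypothetical, so check the result against path_for in your own Rails version.

```ruby
# Hypothetical helper: rebuilds the on-disk path for a blob key,
# assuming the Disk service shards files as root/ab/cd/abcd...
def disk_path_for(root, key)
  File.join(root, key[0..1], key[2..3], key)
end

puts disk_path_for("storage", "abcdef123456")
```

An object store such as S3 uses the key itself as the object name, so a copy from disk to a bucket has to drop the two shard directories rather than preserve the directory tree.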

I have a couple of solutions that might work for you, one for Rails <= 6.0 and one for Rails >= 6.1:

First, you need to iterate over your ActiveStorage blobs:

ActiveStorage::Blob.all.each do |blob|
  # work with blob
end

Then...

  1. Rails <= 6.0

    You will need the blob's key, its checksum, and the local file on disk.

    local_file = ActiveStorage::Blob.service.primary.path_for blob.key
    
    # I'm picking the first mirror as an example,
    # but you can select a specific mirror if you want
    mirror = blob.service.mirrors.first
    
    mirror.upload blob.key, File.open(local_file), checksum: blob.checksum
    

    You may also want to avoid uploading files that already exist on the mirror. You can do that like this:

    mirror = blob.service.mirrors.first
    
    # If the file doesn't exist on the mirror, upload it
    unless mirror.exist? blob.key
      # Upload file to mirror
    end
    

    Putting it all together, a rake task might look like this:

    # lib/tasks/active_storage.rake
    
    namespace :active_storage do
    
      desc 'Ensures all files are mirrored'
      task mirror_all: [:environment] do
    
        # Iterate through each blob
        ActiveStorage::Blob.all.each do |blob|
    
          # We assume the primary storage is local
          local_file = ActiveStorage::Blob.service.primary.path_for blob.key
    
          # Iterate through each mirror
          blob.service.mirrors.each do |mirror|
    
            # If the file doesn't exist on the mirror, upload it
            mirror.upload(blob.key, File.open(local_file), checksum: blob.checksum) unless mirror.exist? blob.key
    
          end
        end
      end
    end
    

    You may run into a similar situation where you need to mirror from somewhere other than the local disk. In that case, the rake task could look like this:

    # lib/tasks/active_storage.rake
    
    namespace :active_storage do
    
      desc 'Ensures all files are mirrored'
      task mirror_all: [:environment] do
    
        # All services in our rails configuration
        all_services = [ActiveStorage::Blob.service.primary, *ActiveStorage::Blob.service.mirrors]
    
        # Iterate through each blob
        ActiveStorage::Blob.all.each do |blob|
    
          # Select services where file exists
          services = all_services.select { |service| service.exist? blob.key }
    
          # Skip blob if file doesn't exist anywhere
          next unless services.present?
    
          # Select services where file doesn't exist
          mirrors = all_services - services
    
          # Prefer a local copy on a Disk service (if one exists)
          disk_service = services.find { |service| service.is_a? ActiveStorage::Service::DiskService }

          if disk_service
            # Upload the local file to every service that lacks it
            File.open(disk_service.path_for(blob.key), "rb") do |local_file|
              mirrors.each do |mirror|
                # Rewind so each upload reads the file from the beginning
                local_file.rewind
                mirror.upload blob.key, local_file, checksum: blob.checksum
              end
            end
          else
            # If no local file exists then download a remote file and upload it to the mirrors (thanks @Rystraum)
            services.first.open blob.key, checksum: blob.checksum do |temp_file|
              mirrors.each do |mirror|
                temp_file.rewind
                mirror.upload blob.key, temp_file, checksum: blob.checksum
              end
            end
          end
    
        end
      end
    end
    

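    While the first rake task answers the OP's question, the latter is far more versatile:

    • It can be used with any combination of services
    • A Disk service is not required
    • Uploading from a Disk service is prioritized
    • It avoids extra exist? calls, because we call it only once per blob per service
  2. Rails >= 6.1

    Super easy, just call this on each blob...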

    blob.mirror_later
    

    Wrapped up in a rake task, it looks like this:

    # lib/tasks/active_storage.rake
    
    namespace :active_storage do
    
      desc 'Ensures all files are mirrored'
      task mirror_all: [:environment] do
        ActiveStorage::Blob.all.each do |blob|
          blob.mirror_later
        end
      end
    end
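
For the second part of the question (validating that every file is available on every service), a report of missing keys per service may be enough. The following is a minimal sketch and not part of ActiveStorage: missing_keys and FakeService are hypothetical names, and the only thing assumed of a service is that it responds to exist?(key), as the services used in the tasks above do.

```ruby
# Hypothetical helper: returns { service_name => [missing keys] } for any
# objects that respond to exist?(key). Services with nothing missing are omitted.
def missing_keys(keys, services)
  services.each_with_object({}) do |(name, service), report|
    missing = keys.reject { |key| service.exist?(key) }
    report[name] = missing unless missing.empty?
  end
end

# In-memory stand-in for a storage service, used only to make the sketch runnable.
FakeService = Struct.new(:stored_keys) do
  def exist?(key)
    stored_keys.include?(key)
  end
end

services = {
  primary: FakeService.new(%w[a b c]),
  amazon:  FakeService.new(%w[a c])
}

report = missing_keys(%w[a b c], services)
puts report.inspect
```

In a Rails app you would presumably pass ActiveStorage::Blob.pluck(:key) as the keys and build the hash from ActiveStorage::Blob.service.primary and ActiveStorage::Blob.service.mirrors. Keep in mind that this makes one exist? call per blob per service, which can be slow and, on a cloud provider, billable.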
    

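I built on top of another answer so that the rake task doesn't assume the files are local.

I started out on S3, and because of cost concerns I decided to move the files to disk and use S3 and Azure as mirrors instead.

So in my situation, my primary (disk) sometimes doesn't have a given file, and my complete repository actually lives on my first mirror.

So, it's two things:

  1. Moving files from S3 to disk
  2. Having added a new mirror and wanting to bring it up to date

namespace :active_storage do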
  desc "Ensures all files are mirrored"
  task mirror_all: [:environment] do
    ActiveStorage::Blob.all.each do |blob|
      source_mirror = if blob.service.primary.exist? blob.key
                        blob.service.primary
                      else
                        blob.service.mirrors.find { |m| m.exist? blob.key }
                      end

      source_mirror.open(blob.key, checksum: blob.checksum) do |file|
        unless blob.service.primary.exist? blob.key
          blob.service.primary.upload(blob.key, file, checksum: blob.checksum)
        end

        blob.service.mirrors.each do |mirror|
          next if mirror == source_mirror
          next if mirror.exist? blob.key

          # Rewind so each upload reads the file from the beginning
          file.rewind
          mirror.upload(blob.key, file, checksum: blob.checksum)
        end
      end
    rescue StandardError => e
      # Report the failing key and carry on with the remaining blobs
      puts "#{blob.key}: #{e.message}"
    end
  end
end
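
One detail worth checking whenever the same file handle is uploaded to several services: an IO that has already been read to the end returns nothing on the next read unless it is rewound first. Whether a given storage service rewinds for you is something I would not rely on. The sketch below shows the effect with a StringIO and a hypothetical in-memory MemoryTarget class, so it says nothing about how a real service behaves.

```ruby
require "stringio"

# Hypothetical upload target that reads the IO from its current position,
# the way a plain stream copy would.
class MemoryTarget
  attr_reader :data

  def upload(io)
    @data = io.read
  end
end

# Uploads one IO to every target, rewinding before each upload
# so that every target receives the full contents.
def upload_to_all(io, targets)
  targets.each do |target|
    io.rewind
    target.upload(io)
  end
end

io = StringIO.new("file contents")
targets = [MemoryTarget.new, MemoryTarget.new]
upload_to_all(io, targets)
puts targets.map(&:data).inspect
```

Without the rewind, the second target in this sketch would receive an empty string, which for a real upload would typically surface as a checksum mismatch.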

(03-11-2021) On Rails > 6.1.4.1, using active_storage > 6.1.4.1, and with the following setup:

Gemfile:

gem 'azure-storage-blob', github: 'Azure/azure-storage-ruby'

config/environments/production.rb

  # Store uploaded files on the local file system (see config/storage.yml for options).
  config.active_storage.service = :mirror # or :microsoft or :amazon

config/storage.yml:

amazon:
  service: S3
  access_key_id: XXX
  secret_access_key: XXX
  region: XXX
  bucket: XXX

microsoft:
  service: AzureStorage
  storage_account_name: YYY
  storage_access_key: YYY
  container: YYY

mirror:
  service: Mirror
  primary: amazon
  mirrors: [ microsoft ]

What did NOT work is:

ActiveStorage::Blob.all.each do |blob|
  blob.mirror_later
end && puts("Mirroring done!")

What DID work is:

ActiveStorage::Blob.all.each do |blob|
  ActiveStorage::Blob.service.try(:mirror, blob.key, checksum: blob.checksum)
end && puts("Mirroring done!")

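I'm not sure why that is. Maybe a future version of Rails will support it, or it needs additional background job setup, or it would have happened eventually (which it never did for me).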

TL;DR

If you need to mirror your entire storage immediately, add this rake task and run it in your environment with bundle exec rails active_storage:mirror_all:

lib/tasks/active_storage.rake

namespace :active_storage do
  desc 'Ensures all files are mirrored'
  task mirror_all: [:environment] do
    ActiveStorage::Blob.all.each do |blob|
      ActiveStorage::Blob.service.try(:mirror, blob.key, checksum: blob.checksum)
    end && puts("Mirroring done!")
  end
end

Optional:
Once all blobs are mirrored, you may want to change their service names if you want them to actually be served from the right storage:

namespace :active_storage do
  desc 'Change each blob service name to microsoft'
  task switch_to_microsoft: [:environment] do
    ActiveStorage::Blob.all.each do |blob|
      blob.service_name = 'microsoft'
      blob.save
    end && puts("All blobs will now be served from microsoft!")
  end
end

Finally, change config.active_storage.service = in production.rb, or make the primary of your mirror the service you want future uploads to go to.