Create a tar.gz with the contents of a specific path (without chdir) with Ruby

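I am working on a method in Ruby that creates a tar.gz file archiving the directories and files under a specific path (cdpath). It should behave like tar -C cdpath -zcf targzfile srcs, but without changing the CWD (to stay thread-safe). I am using Gem::Package::TarWriter to create the tar object, with Zlib::GzipWriter handling the compression.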

Here is what I came up with (this is just a simple standalone test):

require 'rubygems/package'
require 'zlib'
require 'pathname'
require 'find'

cdpath="/absolute/path/to/some/place"
targzfile="test.tar.gz"
src=["some-dir-name-at-cdpath"]

BLOCKSIZE_TO_READ = 1024 * 1000

path = Pathname.new(cdpath)
raise "path #{cdpath} should be an absolute path" unless path.absolute?
raise "path #{cdpath} should be a directory" unless File.directory? cdpath
raise "Destination tar.gz file #{targzfile} already exists" if File.exist? targzfile
raise "no file or directory to tar" if !src || src.length == 0

src.each { |p| p.sub! /^/, "#{cdpath}/" }
File.open targzfile, 'wb' do |otargzfile|
  Zlib::GzipWriter.wrap otargzfile do |gz|
    Gem::Package::TarWriter.new gz do |tar|
      Find.find *src do |f|
        relative_path = f.sub "#{cdpath}/", ""
        mode = File.stat(f).mode
        if File.directory? f
          tar.mkdir relative_path, mode
        else
          File.open f, 'rb' do |rio|
            tar.add_file relative_path, mode do |tio|
              tio.write rio.read
            end
          end
        end
      end
    end
  end
end

However, I get the following exception, and I can't figure out what I am doing wrong.

/usr/lib/ruby/2.1.0/rubygems/package/tar_writer.rb:108:in `add_file': Gem::Package::NonSeekableIO (Gem::Package::NonSeekableIO)
        from tartest2.rb:29:in `block (5 levels) in <main>'
        from tartest2.rb:28:in `open'
        from tartest2.rb:28:in `block (4 levels) in <main>'
        from /usr/lib/ruby/2.1.0/find.rb:48:in `block (2 levels) in find'
        from /usr/lib/ruby/2.1.0/find.rb:47:in `catch'
        from /usr/lib/ruby/2.1.0/find.rb:47:in `block in find'
        from /usr/lib/ruby/2.1.0/find.rb:42:in `each'
        from /usr/lib/ruby/2.1.0/find.rb:42:in `find'
        from tartest2.rb:22:in `block (3 levels) in <main>'
        from /usr/lib/ruby/2.1.0/rubygems/package/tar_writer.rb:85:in `new'
        from tartest2.rb:21:in `block (2 levels) in <main>'
        from tartest2.rb:20:in `wrap'
        from tartest2.rb:20:in `block in <main>'
        from tartest2.rb:19:in `open'
        from tartest2.rb:19:in `<main>'

Edit: I was able to solve this by using TarWriter#add_file_simple instead of add_file. The file size has to be obtained up front with File.stat; details are in the answer below.

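As mentioned in the OP, the solution is to use the add_file_simple method instead of add_file, which also requires you to get the file size using File.stat. add_file needs a seekable stream because it goes back to fill in the entry header after the data has been written, and a Zlib::GzipWriter cannot seek; add_file_simple writes the header first, which is why it needs the size in advance.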

Here is a working method:

  # Similar to 'tar -C cdpath -zcf targzfile srcs': each entry in 'src' is resolved
  # relative to 'cdpath', but the process's working directory is never changed.
  def self.cdtargz(cdpath, targzfile, *src)
    path = Pathname.new(cdpath)
    raise "path #{cdpath} should be an absolute path" unless path.absolute?
    raise "path #{cdpath} should be a directory" unless File.directory? cdpath
    raise "Destination tar.gz file #{targzfile} already exists" if File.exist? targzfile
    raise "no file or directory to tar" if !src || src.length == 0

    # Build absolute paths without mutating the caller's strings.
    src = src.map { |p| "#{cdpath}/#{p}" }
    File.open targzfile, 'wb' do |otargzfile|
      Zlib::GzipWriter.wrap otargzfile do |gz|
        Gem::Package::TarWriter.new gz do |tar|
          Find.find *src do |f|
            relative_path = f.sub "#{cdpath}/", ""
            stat = File.stat(f) # one stat call supplies both mode and size
            mode = stat.mode
            size = stat.size
            if File.directory? f
              tar.mkdir relative_path, mode
            else
              tar.add_file_simple relative_path, mode, size do |tio|
                File.open f, 'rb' do |rio| # binary mode, so bytes are copied unchanged
                  tio.write rio.read
                end
              end
            end
          end
        end
      end
    end
  end
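
To see the add_file_simple approach in isolation, here is a minimal in-memory sketch. It assumes nothing about the filesystem: the helper names targz_bytes and targz_contents are made up for this illustration, and the archive is written to a StringIO rather than a file. It only shows that a size supplied up front lets TarWriter write through a gzip stream, and that the result can be read back with Gem::Package::TarReader.

```ruby
require 'rubygems/package'
require 'zlib'
require 'stringio'

# Hypothetical helper: packs {name => contents} into a gzip-compressed tar
# held in memory and returns the raw bytes.
def targz_bytes(entries)
  out = StringIO.new(String.new) # binary-encoded in-memory buffer
  Zlib::GzipWriter.wrap(out) do |gz|
    Gem::Package::TarWriter.new(gz) do |tar|
      entries.each do |name, data|
        # The size is passed up front, so the header can be written before
        # the data and the stream never has to seek backwards.
        tar.add_file_simple(name, 0o644, data.bytesize) { |tio| tio.write(data) }
      end
    end
  end
  out.string
end

# Hypothetical helper: reads the archive back as {name => contents}.
def targz_contents(bytes)
  contents = {}
  Zlib::GzipReader.wrap(StringIO.new(bytes)) do |gz|
    Gem::Package::TarReader.new(gz) do |tar|
      tar.each { |entry| contents[entry.full_name] = entry.read if entry.file? }
    end
  end
  contents
end

bytes = targz_bytes('dir/a.txt' => "hello\n", 'b.txt' => 'world')
puts targz_contents(bytes).keys.inspect
```

A round trip like this is also a cheap way to check entry names, for example that they are relative to cdpath and not absolute, before pointing the real method at a large directory.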

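Edit: After looking at the answers to this question, I modified the above slightly to avoid "slurping" whole files into memory. In my case 95% of the files are small, but a few are very large, so this makes sense. Here is the updated version: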

  BLOCKSIZE_TO_READ = 1024 * 1000

  def self.cdtargz(cdpath, targzfile, *src)
    path = Pathname.new(cdpath)
    raise "path #{cdpath} should be an absolute path" unless path.absolute?
    raise "path #{cdpath} should be a directory" unless File.directory? cdpath
    raise "Destination tar.gz file #{targzfile} already exists" if File.exist? targzfile
    raise "no file or directory to tar" if !src || src.length == 0

    # Build absolute paths without mutating the caller's strings.
    src = src.map { |p| "#{cdpath}/#{p}" }
    File.open targzfile, 'wb' do |otargzfile|
      Zlib::GzipWriter.wrap otargzfile do |gz|
        Gem::Package::TarWriter.new gz do |tar|
          Find.find *src do |f|
            relative_path = f.sub "#{cdpath}/", ""
            stat = File.stat(f) # one stat call supplies both mode and size
            mode = stat.mode
            size = stat.size
            if File.directory? f
              tar.mkdir relative_path, mode
            else
              tar.add_file_simple relative_path, mode, size do |tio|
                File.open f, 'rb' do |rio|
                  while buffer = rio.read(BLOCKSIZE_TO_READ)
                    tio.write buffer
                  end
                end
              end
            end
          end
        end
      end
    end
  end
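
The chunked read loop used above can also be pulled out and tried on its own. This is only a sketch: copy_in_chunks is a name invented for this example, and StringIO objects stand in for the source file and the tar entry stream, on the assumption that both sides only need read and write.

```ruby
require 'stringio'

# Hypothetical helper: copies src to dst in fixed-size chunks and returns
# the number of bytes copied. Both arguments only need #read / #write.
def copy_in_chunks(src, dst, chunk_size = 1024 * 1000)
  copied = 0
  # read(n) returns nil at end of input, which ends the loop.
  while (buffer = src.read(chunk_size))
    dst.write(buffer)
    copied += buffer.bytesize
  end
  copied
end

src = StringIO.new('x' * 2500)
dst = StringIO.new(String.new)
puts copy_in_chunks(src, dst, 1024)
```

Memory use is bounded by the chunk size rather than by the largest file, which is the point of the change when a few files are very large.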