Node.js archiver: need syntax for excluding file types via glob

Using archiver.js (for Node.js), I need to exclude images from a recursive (multi-subdirectory) archive. Here is my code:

const zip = archiver('zip', { zlib: { level: 9 } });
const output = await fs.createWriteStream(`backup/${fileName}.zip`);
res.setHeader('Content-disposition', `attachment; filename=${fileName}.zip`);
res.setHeader('Content-type', 'application/download');
output.on('close', function () {
  res.download(`backup/${fileName}.zip`, `${fileName}.zip`);
});
output.on('end', function () {
  res.download(`backup/${fileName}.zip`, `${fileName}.zip`);
});
zip.pipe(output);
zip.glob('**/*',
  {
    cwd: 'user_uploads',
    ignore: ['*.jpg', '*.png', '*.webp', '*.bmp'],
  },
  {});
zip.finalize();

The problem is that it does not exclude the ignored files. How can I correct the syntax?

The following code works with this directory structure:

node-app
    |
    |_ upload
         |_subdir1
         |_subdir2
         |_...

In the code, __dirname is the node-app directory (node-app is the directory where your app resides). The code is adapted from the Quick Start section at https://www.archiverjs.com/

// require modules
const fs = require('fs');
const archiver = require('archiver');

// create a file to stream archive data to.
const output = fs.createWriteStream(__dirname + '/example.zip');
const archive = archiver('zip', {
  zlib: { level: 9 } // Sets the compression level.
});

// listen for all archive data to be written
// 'close' event is fired only when a file descriptor is involved
output.on('close', function() {
  console.log(archive.pointer() + ' total bytes');
  console.log('archiver has been finalized and the output file descriptor has closed.');
});

// This event is fired when the data source is drained no matter what was the data source.
// It is not part of this library but rather from the NodeJS Stream API.
// @see: https://nodejs.org/api/stream.html#stream_event_end
output.on('end', function() {
  console.log('Data has been drained');
});

// good practice to catch warnings (ie stat failures and other non-blocking errors)
archive.on('warning', function(err) {
  if (err.code === 'ENOENT') {
    // log warning
  } else {
    // throw error
    throw err;
  }
});

// good practice to catch this error explicitly
archive.on('error', function(err) {
  throw err;
});

// pipe archive data to the file
archive.pipe(output);

    
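// append the files under upload/ that match the glob pattern;
// ignore patterns are matched against the path relative to cwd
archive.glob('**', {
  cwd: __dirname + '/upload',
  ignore: ['*.png', '*.jpg']
});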

// finalize the archive (ie we are done appending files but streams have to finish yet)
// 'close', 'end' or 'finish' may be fired right after calling this method so register to them beforehand
archive.finalize();

glob is short for 'global', so you can use wildcards such as * in file names (https://en.wikipedia.org/wiki/Glob_(programming)). One possible wildcard expression is therefore *.jpg, *.png, ... depending on the file type you want to exclude. In general, the asterisk wildcard * stands for an arbitrary number of literal characters or an empty string in the context of file systems (file and directory names, https://en.wikipedia.org/wiki/Wildcard_character).
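
To make the wildcard rule concrete, here is a minimal sketch that turns a single-name pattern into a regular expression. The helper name wildcardToRegExp is made up for this illustration and treats only * as special; it is not how archiver or minimatch implement matching.

```javascript
// Hypothetical helper (not part of archiver or minimatch): turns a simple
// wildcard pattern into a RegExp. Only '*' is treated as special.
function wildcardToRegExp(pattern) {
  const source = pattern
    .split('*')
    // escape regex metacharacters in the literal parts of the pattern
    .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, '\\$&'))
    // '*' = any run of characters, possibly empty, within one name
    .join('[^/]*');
  return new RegExp('^' + source + '$');
}

const jpg = wildcardToRegExp('*.jpg');
console.log(jpg.test('photo.jpg')); // true
console.log(jpg.test('.jpg'));      // true (the '*' matched an empty string)
console.log(jpg.test('photo.png')); // false
```

Note that in this sketch the * never crosses a directory separator, which is why a bare *.jpg says nothing about files in subdirectories.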

See also: node.js - Archiving folder using archiver generate an empty zip

Archiver uses Readdir-Glob for globbing, which in turn uses minimatch for matching.

The matching in Readdir-Glob (node-readdir-glob/index.js#L147) is done against the full filename including the pathname, and it does not allow us to apply the matchBase option, which would match against just the basename of the full path.
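
The difference between matching the full relative path and matching only the basename can be sketched with Node's built-in path module. The regular expression below is a hand-written stand-in for the pattern *.jpg, and the file name is an example; this is an illustration, not Readdir-Glob's actual code.

```javascript
const path = require('path');

// Stand-in for the glob '*.jpg': a name without '/' that ends in '.jpg'.
const starDotJpg = /^[^/]*\.jpg$/;

// Example relative path, as it would be seen below the cwd directory.
const file = 'subdir1/photo.jpg';

// Matching against the full relative path fails because of the directory part.
console.log(starDotJpg.test(file)); // false

// Matching against the basename only (the idea behind matchBase) succeeds.
console.log(starDotJpg.test(path.posix.basename(file))); // true
```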

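To make it work, you have 2 options:


1. Make your glob exclude the file extensions

You can use the glob negation !(...) to convert your glob expression so that it excludes all the file extensions you do not want in the archive. It will include everything except what matches the negated expression: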

zip.glob(
  '**/!(*.jpg|*.png|*.webp|*.bmp)',
  {
    cwd: 'user_uploads',
  },
  {}
);

2. Make minimatch work with the full file pathname

To make minimatch work without being able to set the matchBase option, we have to include the directory glob in each ignore pattern:

zip.glob(
  '**/*',
  {
    cwd: 'user_uploads',
    ignore: ['**/*.jpg', '**/*.png', '**/*.webp', '**/*.bmp'],
  },
  {}
);
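
If the list of extensions varies, the ignore array for this second option can be generated instead of typed by hand. A minimal sketch; toIgnorePatterns is a hypothetical helper name, not an archiver function.

```javascript
// Hypothetical helper: builds ignore patterns that match an extension at any
// directory depth. A leading dot in the extension is tolerated.
function toIgnorePatterns(extensions) {
  return extensions.map((ext) => '**/*.' + ext.replace(/^\./, ''));
}

const ignore = toIgnorePatterns(['jpg', '.png', 'webp', 'bmp']);
console.log(ignore);
// [ '**/*.jpg', '**/*.png', '**/*.webp', '**/*.bmp' ]
```

The resulting array can be passed as the ignore option, in place of the literal array in the zip.glob() call above.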

Behavior

This behavior of Readdir-Glob is a bit confusing with regard to the ignore option:

Options

ignore: Glob pattern or Array of Glob patterns to exclude matches. If a file or a folder matches at least one of the provided patterns, it's not returned. It doesn't prevent files from folder content to be returned.

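This means that the ignore items have to be actual glob expressions that include the whole path/file expression. When we specify *.jpg, it matches files in the root directory only, not files in subdirectories. If we want to exclude JPG files deep in the directory tree, we have to combine the include-all-directories pattern with the file extension pattern: **/*.jpg.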
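
The root-only versus whole-tree difference can be checked with a small path-aware matcher. This is a simplified stand-in written for illustration: globToRegExp is a hypothetical helper that understands only * and ** as a whole path segment, whereas minimatch supports far more syntax. The file list is made up.

```javascript
// Simplified, hypothetical glob matcher: '*' stays within one path segment,
// a '**' segment matches zero or more directories.
function globToRegExp(glob) {
  const source = glob
    .split('/')
    .map((segment) => {
      if (segment === '**') return '(?:[^/]+/)*';
      const escaped = segment
        .split('*')
        .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, '\\$&'))
        .join('[^/]*');
      return escaped + '/';
    })
    .join('')
    .replace(/\/$/, ''); // drop the separator added after the last segment
  return new RegExp('^' + source + '$');
}

// Example paths, relative to the cwd directory.
const files = ['a.jpg', 'notes.txt', 'sub/b.jpg', 'sub/deep/c.jpg'];
const matchedBy = (pattern) => files.filter((f) => globToRegExp(pattern).test(f));

console.log(matchedBy('*.jpg'));    // [ 'a.jpg' ]
console.log(matchedBy('**/*.jpg')); // [ 'a.jpg', 'sub/b.jpg', 'sub/deep/c.jpg' ]
```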

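Exclude only in subdirectories

If you want to exclude some file extensions only in certain subdirectories, you can add the subdirectory to the path, optionally with a negation pattern, like this:

// !(Subdir) matches any directory name except 'Subdir', so the ignore pattern
// '**/!(Subdir)/*.jpg' excludes JPG files whose parent directory is not 'Subdir'.
// To exclude the JPG files inside any 'Subdir/' directory instead, use '**/Subdir/*.jpg'.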

zip.glob(
  '**/*',
  {
    cwd: 'user_uploads',
    ignore: ['**/!(Subdir)/*.jpg'],
  },
  {}
);