Remove last n lines from file using nodejs

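I'm trying to remove the last 3 lines from a file using fs in Node.js. I currently read the whole file into memory and then write it back without the last 3 lines, but I'm sure there is a more efficient way that doesn't require reading the entire file into memory.

My current code: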

fs.readFile(filename, function (err, data) {
    if (err) throw err;
    theFile = data.toString().split("\n");
    theFile.splice(-3, 3);
    fs.writeFile(filename, theFile.join("\n"), function (err) {
        if (err) {
            return console.log(err);
        }
        console.log("Removed last 3 lines");
        console.log(theFile.length);

    });
});

Let's create a huge file:

$ base64 /dev/urandom | head -1000000 > /tmp/crap
$ wc -l /tmp/crap
1000000 /tmp/crap
$ du -sh /tmp/crap
74M /tmp/crap

Here is your code:

$ cat /tmp/a.js
var fs = require('fs');

var filename = '/tmp/crap1';

fs.readFile(filename, function(err, data) {
    if(err) throw err;
    theFile = data.toString().split("\n");
    theFile.splice(-3,3);
    fs.writeFile(filename, theFile.join("\n"), function(err) {
    if(err) {
        return console.log(err);
    }
    console.log("Removed last 3 lines");
    console.log(theFile.length);
    });
});

Here is mine:

$ cat /tmp/b.js
var fs = require('fs'),
    util = require('util'),
    cp = require('child_process');

var filename = '/tmp/crap2';
var lines2nuke = 3;
var command = util.format('tail -n %d %s', lines2nuke, filename);

cp.exec(command, (err, stdout, stderr) => {
    if (err) throw err;
    // Count bytes, not characters: the two differ for non-ASCII text.
    var to_vanquish = Buffer.byteLength(stdout);
    fs.stat(filename, (err, stats) => {
        if (err) throw err;
        fs.truncate(filename, stats.size - to_vanquish, (err) => {
            if (err) throw err;
            console.log('File truncated!');
        })
    });
});

Let's make two copies of the same file:

$ cp /tmp/crap /tmp/crap1
$ cp /tmp/crap /tmp/crap2

Let's see who is faster:

$ time node a.js
Removed last 3 lines
999998
node a.js  0.53s user 0.19s system 99% cpu 0.720 total

$ time node b.js
File truncated!
node b.js  0.08s user 0.01s system 100% cpu 0.091 total

When I increased the file size 10x, my system ran out of memory running a.js; b.js, however, took:

$ time node b.js
File truncated!
node b.js  0.07s user 0.03s system 6% cpu 1.542 total

My code uses tail, which doesn't read the whole file: it seeks to the end, then reads blocks backwards until the expected number of lines has been reached, then prints those lines in the proper order up to the end of the file. Now I know the number of bytes to remove. Then I use fs.stat, which tells me the total number of bytes in the file. Now I know how many bytes I actually want the file to contain once the last n lines are gone. Finally, I use fs.truncate, which truncates a regular file to the specified size in bytes.
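
One caveat with building the command via util.format is that a filename containing spaces or shell metacharacters breaks the command line. A minimal sketch of the same tail + stat + truncate steps that avoids the shell, assuming a POSIX tail is on the PATH (truncateWithTail is a hypothetical helper name, not part of any library):

```javascript
const fs = require('fs');
const cp = require('child_process');

// Hypothetical helper: remove the last `n` lines of `filename` in place.
// Assumes a POSIX `tail` is available on the PATH.
function truncateWithTail(filename, n) {
  // Arguments are passed as an array, so no shell parsing is involved.
  // Without an `encoding` option execFileSync returns a Buffer, so
  // `.length` is a byte count rather than a character count.
  const tailBytes = cp.execFileSync('tail', ['-n', String(n), '--', filename]).length;

  // Total size minus the bytes occupied by the last n lines.
  const newSize = fs.statSync(filename).size - tailBytes;

  fs.truncateSync(filename, newSize);
  return newSize;
}
```

It is synchronous only to keep the sketch short; the callback versions used in b.js work the same way.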

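Update:

OP says the platform is Windows. In that case, we can modify this program so that it does not call another utility and instead does everything in Node itself. Fortunately, the required functionality is already available as a Node module, read-last-lines. The updated, OS-agnostic code now looks like this: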

$ npm install read-last-lines
$ cat /tmp/c.js 
var fs = require('fs'),
    rll = require('read-last-lines');

var filename = '/tmp/crap2';
var lines2nuke = 3;

rll.read(filename, lines2nuke).then((lines) => {
    // Count bytes, not characters: the two differ for non-ASCII text.
    var to_vanquish = Buffer.byteLength(lines);
    fs.stat(filename, (err, stats) => {
        if (err) throw err;
        fs.truncate(filename, stats.size - to_vanquish, (err) => {
            if (err) throw err;
            console.log('File truncated!');
        })
    });
});

For the 10x-size file, it took:

$ time node c.js
File truncated!
node c.js  0.14s user 0.04s system 8% cpu 2.022 total
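
If you would rather not add a dependency, the same idea can be sketched with only the built-in fs module: read fixed-size chunks backwards from the end of the file, count newline bytes, and truncate right after the newline that precedes the last n lines. This is a rough sketch, not a tuned implementation. truncateLastLines and the 64 KiB default chunk size are my own choices, and it assumes lines end in "\n" (which also covers "\r\n") and that the file is a regular file.

```javascript
const fs = require('fs');

// Hypothetical helper: remove the last `n` lines of `filename` in place
// without reading the whole file. Returns the new size in bytes.
function truncateLastLines(filename, n, chunkSize = 64 * 1024) {
  const fd = fs.openSync(filename, 'r+');
  try {
    const size = fs.fstatSync(fd).size;
    if (n <= 0) return size; // nothing to remove

    const buf = Buffer.alloc(chunkSize);
    let pos = size;      // start offset of the part already scanned
    let remaining = n;   // line breaks still to be found
    let newSize = 0;     // if the file has n lines or fewer, empty it

    while (pos > 0 && remaining > 0) {
      const len = Math.min(chunkSize, pos);
      pos -= len;
      // Positional read; assumed to return `len` bytes for a regular file.
      fs.readSync(fd, buf, 0, len, pos);

      for (let i = len - 1; i >= 0; i--) {
        if (buf[i] !== 0x0a) continue;        // not a "\n"
        if (pos + i === size - 1) continue;   // terminator of the final line
        if (--remaining === 0) {
          newSize = pos + i + 1;              // keep everything up to this "\n"
          break;
        }
      }
    }

    fs.ftruncateSync(fd, newSize);
    return newSize;
  } finally {
    fs.closeSync(fd);
  }
}
```

Usage would be something like truncateLastLines('/tmp/crap2', 3). Like tail, it only touches the end of the file, so the cost depends on the length of the removed lines rather than on the file size.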