How to stop Papa Parse streaming after some results

Question

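I'm using Papa Parse to parse large CSV files in chunk mode.

I'm validating the CSV data, and I want to stop streaming as soon as validation fails.

But once parsing is under way, I can't stop the stream.

I tried returning false from the chunk callback, but that doesn't work.

Here is the code.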

$("#fileselect").on("change", function(e){
    if (this.files.length) {
        var file = this.files[0]
        count = 0;
        Papa.parse(file, {
            worker: true,
            delimiter: "~~",
            skipEmptyLines:true,
            chunk: function (result) {
                count += result.data.length;
                console.clear();
                console.log(count);
                if (count>60000) {
                    return false;
                }
            },
            complete: function (result, file) {
                console.log(result)
            }
        });
    }
});

Answer 1

The chunk and step callbacks both receive the parser as a second argument, and you can use it to pause, resume, or (as you want here) abort.

step: function(results, parser) {
    console.log("Row data:", results.data);
    console.log("Row errors:", results.errors);
}

So in your case you would do something like this (untested):

$("#fileselect").on("change", function(e){
    if (this.files.length) {
        var file = this.files[0];
        var count = 0; // declared locally to avoid an implicit global
        Papa.parse(file, {
            worker: true,
            delimiter: "~~",
            skipEmptyLines:true,
            chunk: function (result, parser) {
                count += result.data.length;
                console.clear();
                console.log(count);
                if (count>60000) {
                    //return false;
                    parser.abort(); // <-- stop streaming
                }
            },
            complete: function (result, file) {
                console.log(result)
            }
        });
    }
});

See the documentation for step and chunk.

https://www.papaparse.com/docs

Hope this helps!
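Since the question is about stopping when validation fails rather than at a fixed count, here is a minimal sketch of a chunk callback that aborts on the first invalid row. The helper name `createValidatingChunkHandler`, the "exactly 3 fields" rule, and the `fakeParser` object are assumptions made up for illustration; `fakeParser` only stands in for the parser handle so the snippet runs without Papa Parse.

```javascript
// Hypothetical helper: builds a `chunk` callback that validates every row
// and calls parser.abort() as soon as one row fails.
function createValidatingChunkHandler(isValidRow, onInvalidRow) {
  var rowsSeen = 0;
  var aborted = false;
  return function (results, parser) {
    if (aborted) return; // ignore anything delivered after the abort
    for (var i = 0; i < results.data.length; i++) {
      rowsSeen += 1;
      if (!isValidRow(results.data[i])) {
        aborted = true;
        parser.abort(); // stop streaming
        onInvalidRow(results.data[i], rowsSeen);
        return;
      }
    }
  };
}

// Stand-in for the parser handle (assumption, not part of Papa Parse).
var fakeParser = { aborted: false, abort: function () { this.aborted = true; } };
var failures = [];
var handler = createValidatingChunkHandler(
  function (row) { return row.length === 3; }, // example rule: exactly 3 fields
  function (row, rowNumber) { failures.push({ row: row, rowNumber: rowNumber }); }
);

// Simulate two chunks; the fourth row overall has only two fields.
handler({ data: [["a", "b", "c"], ["d", "e", "f"]] }, fakeParser);
handler({ data: [["g", "h", "i"], ["j", "k"], ["l", "m", "n"]] }, fakeParser);
console.log(fakeParser.aborted, failures);
```

With Papa Parse itself, the returned function would be passed as the `chunk` option in place of the inline callback, and the real parser handle would take the place of `fakeParser`.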

Answer 2

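In my case, I only needed the first 10 rows of the file. In case anyone needs a solution for that, here is an example of how I got it working:

To stop streaming after a certain number of rows, just pass the 'preview' option in the config.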

let fileInput = document.getElementById('myFile');
let file = fileInput.files[0];
let parsedData; //variable to store the chunked results
Papa.parse(file, {
    worker: true,
    preview: 10, //this is what you need to do the trick,
    chunk: function(results){
       parsedData = results; //set results to the parsedData variable.
       // I'm doing this because "When streaming, parse results are not available in
       // the 'complete' callback."
    },
    complete: function(){
       console.log(parsedData); //log the results once parsing is completed
        /**
         Do whatever else you want with parsedData here.
         In my case, I just created an html table to show a preview of the data.
        */

    }
});

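With this, you should be able to parse very large files without crashing the browser. I tested it with .csv files of more than 1 million rows and had no problems.

See the documentation: https://www.papaparse.com/docs#config-details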
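One caveat with assigning each chunk's results to a single variable: if the rows ever arrive in more than one chunk, only the last chunk is kept. A minimal sketch of a collector that appends rows from every chunk is below. The name `createRowCollector` and the sample rows are assumptions for illustration, and the chunks are simulated by hand so the snippet runs without Papa Parse.

```javascript
// Hypothetical helper: gathers rows from every chunk instead of keeping only the last one.
function createRowCollector() {
  var rows = [];
  return {
    rows: rows,
    chunk: function (results) {
      // Append row by row so earlier chunks are preserved.
      for (var i = 0; i < results.data.length; i++) {
        rows.push(results.data[i]);
      }
    }
  };
}

// Simulate two chunks being delivered to the callback.
var collector = createRowCollector();
collector.chunk({ data: [["id", "name"], ["1", "Ada"]] });
collector.chunk({ data: [["2", "Grace"]] });
console.log(collector.rows.length);
```

In a real config you would pass `chunk: collector.chunk` and read `collector.rows` inside the `complete` callback.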
