Incomplete file returned by GridFS

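I am working on a Java project that uses the GridFS specification to store files in MongoDB and retrieve them. I am using the code snippet provided in the MongoDB Java driver documentation: https://mongodb.github.io/mongo-java-driver/4.1/driver/tutorials/gridfs/

When retrieving a file with openDownloadStream, I noticed that if the file is split into multiple chunks, it returns only the first chunk instead of the complete file.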

ObjectId fileId;

GridFSDownloadStream downloadStream = gridFSBucket.openDownloadStream(fileId);
int fileLength = (int) downloadStream.getGridFSFile().getLength();
byte[] bytesToWriteTo = new byte[fileLength];
downloadStream.read(bytesToWriteTo);    /*read file contents */
downloadStream.close();

System.out.println(new String(bytesToWriteTo, StandardCharsets.UTF_8));

Is there a way to fix this?

Looking at GridFSDownloadStreamImpl, the class that implements GridFSDownloadStream, it appears that read(byte[]) reads one chunk at a time:

@Override
public int read(final byte[] b) {
    return read(b, 0, b.length);
}

@Override
public int read(final byte[] b, final int off, final int len) {
    checkClosed();

    if (currentPosition == length) {
        return -1;
    } else if (buffer == null) {
        buffer = getBuffer(chunkIndex);
    } else if (bufferOffset == buffer.length) {
        chunkIndex += 1;
        buffer = getBuffer(chunkIndex);
        bufferOffset = 0;
    }

    int r = Math.min(len, buffer.length - bufferOffset);
    System.arraycopy(buffer, bufferOffset, b, off, r);
    bufferOffset += r;
    currentPosition += r;
    return r;
}

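A single call therefore copies at most the remainder of the current chunk, so you have to loop until all the expected bytes have actually been read: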

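byte[] bytesToWriteTo = new byte[fileLength];
int bytesRead = 0;
while (bytesRead < fileLength) {
    // Pass the running offset so each chunk lands after the previous one
    // instead of overwriting the start of the array.
    int newBytesRead = downloadStream.read(bytesToWriteTo, bytesRead, fileLength - bytesRead);
    if (newBytesRead == -1) {
        // java.io.IOException: the stream ended before fileLength bytes arrived
        throw new IOException("Stream ended before the whole file was read");
    }
    bytesRead += newBytesRead;
}
downloadStream.close();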

Note that I was not able to test the code above, so use it with caution.
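
For anyone who wants to check the looping approach without a MongoDB instance, here is a minimal sketch. ChunkedStream is a hypothetical stand-in (it is not part of the driver) that hands back at most chunkSize bytes per read call, mimicking the chunk-by-chunk behaviour of the implementation quoted above; readFully is likewise just an illustrative helper name.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class ReadFullyExample {

    /** Hypothetical stand-in for a GridFS download stream: never crosses a chunk boundary in one read. */
    static class ChunkedStream extends InputStream {
        private final byte[] data;
        private final int chunkSize;
        private int position = 0;

        ChunkedStream(byte[] data, int chunkSize) {
            this.data = data;
            this.chunkSize = chunkSize;
        }

        @Override
        public int read() {
            return position < data.length ? (data[position++] & 0xFF) : -1;
        }

        @Override
        public int read(byte[] b, int off, int len) {
            if (len == 0) return 0;
            if (position == data.length) return -1;
            // Copy only what is left of the current chunk (or of the data).
            int leftInChunk = chunkSize - (position % chunkSize);
            int r = Math.min(len, Math.min(leftInChunk, data.length - position));
            System.arraycopy(data, position, b, off, r);
            position += r;
            return r;
        }
    }

    /** Keeps reading until exactly length bytes have arrived, advancing the offset each time. */
    static byte[] readFully(InputStream in, int length) throws IOException {
        byte[] result = new byte[length];
        int bytesRead = 0;
        while (bytesRead < length) {
            int n = in.read(result, bytesRead, length - bytesRead);
            if (n == -1) {
                throw new IOException("Stream ended after " + bytesRead + " of " + length + " bytes");
            }
            bytesRead += n;
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        byte[] original = "a file spanning several chunks".getBytes(StandardCharsets.UTF_8);

        // A single read() stops at the first chunk boundary.
        int single = new ChunkedStream(original, 8).read(new byte[original.length]);
        System.out.println("single read returned " + single + " of " + original.length + " bytes");

        // The loop collects every chunk.
        byte[] all = readFully(new ChunkedStream(original, 8), original.length);
        System.out.println(new String(all, StandardCharsets.UTF_8));
    }
}
```

With the real driver, the same helper should work if you pass it the stream from gridFSBucket.openDownloadStream(fileId) and the length from getGridFSFile().getLength(), though I have only run it against the stand-in stream.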

I ended up using the readAllBytes() method (inherited from InputStream, available since Java 9), which returns the entire file.

GridFSDownloadStream downloadStream = gridFSBucket.openDownloadStream(fileId);
// readAllBytes() allocates and returns the array itself, so no pre-sized buffer is needed.
byte[] bytesToWriteTo = downloadStream.readAllBytes();
downloadStream.close();
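
If you would rather not hand-roll a loop, the JDK offers two ways to do this for any InputStream, and GridFSDownloadStream is an InputStream. The sketch below is a rough illustration only: slowStream is a made-up helper (not a driver API) that returns at most 4 bytes per call, standing in for a real GridFS stream. InputStream.readAllBytes() needs Java 9 or later, while DataInputStream.readFully works on older versions when the length is known up front.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class StdlibReadExample {

    /** Hypothetical stand-in for a chunked stream: never returns more than 4 bytes per read call. */
    static InputStream slowStream(byte[] data) {
        return new FilterInputStream(new ByteArrayInputStream(data)) {
            @Override
            public int read(byte[] b, int off, int len) throws IOException {
                return super.read(b, off, Math.min(len, 4));
            }
        };
    }

    public static void main(String[] args) throws IOException {
        byte[] original = "contents longer than one chunk".getBytes(StandardCharsets.UTF_8);

        // Option 1 (Java 9+): the JDK keeps reading until end of stream.
        byte[] viaReadAllBytes;
        try (InputStream in = slowStream(original)) {
            viaReadAllBytes = in.readAllBytes();
        }

        // Option 2: fill a buffer of known size; throws EOFException if the stream ends early.
        byte[] viaReadFully = new byte[original.length];
        try (DataInputStream in = new DataInputStream(slowStream(original))) {
            in.readFully(viaReadFully);
        }

        System.out.println(new String(viaReadAllBytes, StandardCharsets.UTF_8));
        System.out.println(new String(viaReadFully, StandardCharsets.UTF_8));
    }
}
```

The try-with-resources blocks also close the stream if a read fails, which the explicit close() calls above would skip.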