readLine of BufferedReader does not change the file pointer even if the buffer size is small

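My application reads a text file line by line and records the offset of each line until the end of the file. The offset changes only the first time readLine is executed; after that it no longer changes. I tested bufferSize values from 10 to 16384. What is wrong with my code? I use RandomAccessFile instead of FileInputStream because seek() is faster than skip() when the file is large.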

String line;        
long offset;

RandomAccessFile raf = new RandomAccessFile("data.txt", "r");
FileInputStream is = new FileInputStream(raf.getFD());
InputStreamReader isr = new InputStreamReader(is, encoding);
BufferedReader br = new BufferedReader(isr, bufferSize);

while (true) {
    offset = raf.getFilePointer(); // offset remains the same after 1st readLine. why?
    if ((line = br.readLine()) == null) // line has correct value.
        return;
    ………………………………
}

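To update the file pointer in the RandomAccessFile, you need to use the read() methods that belong to the RandomAccessFile object.

Creating a separate Reader on top of it will not update the pointer line by line.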
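As a minimal sketch of that point, the snippet below reads lines with RandomAccessFile's own readLine() and records getFilePointer() before each call. The class name RafLineOffsets and the helper lineOffsets are made up for illustration. Note that RandomAccessFile.readLine() is unbuffered and maps each byte directly to a char, so this assumes ASCII (or Latin-1) content.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of the original post.
class RafLineOffsets {

    /** Returns the byte offset at which each line of the file starts. */
    static List<Long> lineOffsets(File file) throws IOException {
        List<Long> offsets = new ArrayList<>();
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            while (true) {
                long offset = raf.getFilePointer(); // position before the line is read
                // readLine() pulls bytes from raf itself, so afterwards the
                // pointer sits just past the line terminator.
                if (raf.readLine() == null) {
                    break;
                }
                offsets.add(offset);
            }
        }
        return offsets;
    }

    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("offsets", ".txt");
        tmp.deleteOnExit();
        Files.write(tmp.toPath(), "ab\ncde\nf\n".getBytes(StandardCharsets.US_ASCII));
        System.out.println(lineOffsets(tmp)); // prints [0, 3, 7]
    }
}
```

Here the offset changes on every iteration because nothing sits between the caller and the RandomAccessFile.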

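If you need to use BufferedReader, you can always wrap the RandomAccessFile in your own InputStream implementation so that reads on the InputStream delegate to reads on the RandomAccessFile.

I've had to do this before. It's not hard: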

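import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;

public final class RandomAccessFileInputStream extends InputStream{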

private final RandomAccessFile randomAccessFile;
private long bytesRead=0;
/**
 * The number of bytes to read in the stream;
 * or {@code null} if we should read the whole thing.
 */
private final Long length;
private final boolean ownFile;
/**
 * Creates a new {@link RandomAccessFileInputStream}
 * of the given file starting at the given position.
 * Internally, a new {@link RandomAccessFile} is created
 * and is seek'ed to the given startOffset
 * before reading any bytes.  The internal 
 * {@link RandomAccessFile} instance is managed by this
 * class and will be closed when {@link #close()} is called.
 * @param file the {@link File} to read.
 * @param startOffset the start offset to start reading
 * bytes from.
 * @throws IOException if the given file does not exist 
 * @throws IllegalArgumentException if the startOffset is less than 0
 * or greater than the file length.
 */
public RandomAccessFileInputStream(File file, long startOffset) throws IOException{
    assertStartOffValid(file, startOffset);
    this.randomAccessFile = new RandomAccessFile(file, "r");
    randomAccessFile.seek(startOffset);
    this.length = null;
    ownFile =true;
}
/**
 * Creates a new {@link RandomAccessFileInputStream}
 * of the given file starting at the given position
 * but will only read the given length.
 * Internally, a new {@link RandomAccessFile} is created
 * and is seek'ed to the given startOffset
 * before reading any bytes.  The internal 
 * {@link RandomAccessFile} instance is managed by this
 * class and will be closed when {@link #close()} is called.
 * @param file the {@link File} to read.
 * @param startOffset the start offset to start reading
 * bytes from.
 * @param length the maximum number of bytes to read from the file;
 *  this InputStream will read at most as many bytes as are in the file.
 * @throws IOException if the given file does not exist
 * @throws IllegalArgumentException if either startOffset or length are less than 0
 * or if startOffset > file.length().
 */
public RandomAccessFileInputStream(File file, long startOffset, long length) throws IOException{
    assertStartOffValid(file, startOffset);
    if(length < 0){
        throw new IllegalArgumentException("length can not be less than 0");
    }
    this.randomAccessFile = new RandomAccessFile(file, "r");
    randomAccessFile.seek(startOffset);
    this.length = length;
    ownFile =true;
}
private void assertStartOffValid(File file, long startOffset) {
    if(startOffset < 0){
        throw new IllegalArgumentException("start offset can not be less than 0");
    }

    if(file.length() < startOffset){
        throw new IllegalArgumentException(
                String.format("invalid startOffset %d: file is only %d bytes" ,
                        startOffset,
                        file.length()));
    }
}
/**
 * Creates a new RandomAccessFileInputStream that reads
 * bytes from the given {@link RandomAccessFile}.
 * Any external changes to the file pointer
 * via {@link RandomAccessFile#seek(long)} or similar
 * methods will also alter the subsequent bytes read
 * by this {@link InputStream}.
 * Closing the inputStream returned by this constructor
 * DOES NOT close the {@link RandomAccessFile} which 
 * must be closed separately by the caller.
 * @param file the {@link RandomAccessFile} instance 
 * to read as an {@link InputStream}; can not be null.
 * @throws NullPointerException if file is null.
 */
public RandomAccessFileInputStream(RandomAccessFile file){
    if(file ==null){
        throw new NullPointerException("file can not be null");
    }
    this.randomAccessFile = file;
    length = null;
    ownFile =false;
}

@Override
public synchronized int read() throws IOException {
    if(length !=null && bytesRead >=length){
        return -1;
    }
    int value = randomAccessFile.read();
    if(value !=-1){
        bytesRead++;
    }
    return value;

}

@Override
public synchronized int read(byte[] b, int off, int len) throws IOException {
    if(length != null && bytesRead >=length){
        return -1;
    }
    final int reducedLength = computeReducedLength(len);
    int numberOfBytesRead = randomAccessFile.read(b, off, reducedLength);
    //read() returns -1 at end of file; don't let that decrement the count
    if(numberOfBytesRead > 0){
        bytesRead+=numberOfBytesRead;
    }
    return numberOfBytesRead;
}
private int computeReducedLength(int len) {
    if(length ==null){
        return len;
    }
    //take the min as a long first so the int cast can not overflow
    return (int)Math.min(len, length - bytesRead);
}
/**
 * If this instance was created
 * using the {@link #RandomAccessFileInputStream(RandomAccessFile)}
 * constructor, then this method does nothing- the RandomAccessFile
 * will still be open.
 * If constructed using {@link #RandomAccessFileInputStream(File, long)}
 * or {@link #RandomAccessFileInputStream(File, long, long)},
 * then the internal {@link RandomAccessFile} will be closed.
 */
@Override
public void close() throws IOException {
    //if we created this randomaccessfile
    //then its our job to close it.
    if(ownFile){
        randomAccessFile.close();
    }
}
}

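EDIT: I tried running your code example with my RandomAccessFileInputStream. The problem is that data is still being buffered even when you set the BufferedReader's buffer size, so the file pointer advances by 8192 bytes every time the underlying InputStream is read. The likely cause is that InputStreamReader keeps its own internal byte buffer (8192 bytes in OpenJDK) regardless of the BufferedReader size. Even if the buffering worked as expected, the buffer would always read ahead past the current line, so offset would never be the position of the end of the line.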
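A rough way to see the read-ahead for yourself is sketched below. It uses a small anonymous InputStream as a stand-in for the wrapper class (ReadAheadDemo and pointerAfterFirstLine are names I made up), reads one line through a BufferedReader, and reports where the RandomAccessFile pointer ended up. The exact value depends on the JDK's internal buffering, so treat it as an illustration only.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Hypothetical demo class, not part of the original post.
class ReadAheadDemo {

    /** Returns raf's file pointer right after the first readLine() on a BufferedReader. */
    static long pointerAfterFirstLine(File file, int bufferSize) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            // Minimal stand-in for the wrapper: every read goes through raf.
            InputStream in = new InputStream() {
                @Override
                public int read() throws IOException {
                    return raf.read();
                }

                @Override
                public int read(byte[] b, int off, int len) throws IOException {
                    return raf.read(b, off, len);
                }
            };
            BufferedReader br = new BufferedReader(
                    new InputStreamReader(in, StandardCharsets.US_ASCII), bufferSize);
            br.readLine(); // consumes "ab", but the reader chain fetches more bytes than that
            return raf.getFilePointer();
        }
    }

    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("readahead", ".txt");
        tmp.deleteOnExit();
        Files.write(tmp.toPath(), "ab\ncde\nf\n".getBytes(StandardCharsets.US_ASCII));
        // The first line ends at byte 3; the pointer is expected to be past that.
        System.out.println(pointerAfterFirstLine(tmp, 16));
    }
}
```

With a file this small the whole content is typically pulled in by the first read, which matches the symptom in the question: the offset moves once and then stays put.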

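If you don't want the data buffered and don't want to write your own line-reading implementation, you can use DataInputStream, which has a deprecated readLine() method. The method is deprecated because it "does not properly convert bytes to characters", but if you are only using ASCII characters it should be fine.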

InputStream in = new RandomAccessFileInputStream(raf);
DataInputStream dataIn = new DataInputStream(in);

 ...
  if ((line = dataIn.readLine()) == null) 
  ...

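This works as expected: the offset advances by exactly the number of bytes in each line. However, since it is not buffered, reading the file will be slower.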
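If the unbuffered speed becomes a problem, one alternative (my own sketch, not something tested against the code in the question) is to do the line splitting yourself at the byte level, so you can buffer and still know the exact offsets. The version below assumes lines end in '\n' (which also covers "\r\n") and an encoding such as ASCII or UTF-8 where the byte 0x0A only ever means a newline. BufferedLineOffsets and lineOffsets are hypothetical names.

```java
import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of the original post.
class BufferedLineOffsets {

    /** Returns the byte offset of the start of every line, counting bytes ourselves. */
    static List<Long> lineOffsets(File file) throws IOException {
        List<Long> offsets = new ArrayList<>();
        try (InputStream in = new BufferedInputStream(new FileInputStream(file))) {
            long position = 0;          // number of bytes consumed so far
            boolean atLineStart = true; // true until the first byte of a line is seen
            int b;
            while ((b = in.read()) != -1) {
                if (atLineStart) {
                    offsets.add(position);
                    atLineStart = false;
                }
                position++;
                if (b == '\n') {
                    atLineStart = true;
                }
            }
        }
        return offsets;
    }

    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("offsets", ".txt");
        tmp.deleteOnExit();
        Files.write(tmp.toPath(), "ab\ncde\nf".getBytes(StandardCharsets.US_ASCII));
        System.out.println(lineOffsets(tmp)); // prints [0, 3, 7]
    }
}
```

The recorded offsets can later be passed to RandomAccessFile.seek() to jump straight to a given line.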