readLine of BufferedReader does not change the file pointer even if the buffer size is small
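My application reads a text file line by line and records the offset of each line until the end of the file. The offset changes only the first time readLine is executed. After that it no longer changes. I tested bufferSize values from 10 to 16384. What is wrong with my code? I use RandomAccessFile instead of FileInputStream because seek() is faster than skip() when the file is large.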
String line;
long offset;
RandomAccessFile raf = new RandomAccessFile("data.txt", "r");
FileInputStream is = new FileInputStream(raf.getFD());
InputStreamReader isr = new InputStreamReader(is, encoding);
BufferedReader br = new BufferedReader(isr, bufferSize);
while (true) {
offset = raf.getFilePointer(); // offset remains the same after 1st readLine. why?
if ((line = br.readLine()) == null) // line has correct value.
return;
………………………………
}
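To update the file pointer of the RandomAccessFile, you need to use the read() methods that belong to the RandomAccessFile object. Creating a separate Reader will not update it.
If you need to use a BufferedReader, you can always wrap the RandomAccessFile in your own InputStream implementation, so that reads on the InputStream delegate to the RandomAccessFile.
I've had to do this before. It's not hard: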
public final class RandomAccessFileInputStream extends InputStream{
private final RandomAccessFile randomAccessFile;
private long bytesRead=0;
/**
* The number of bytes to read in the stream;
* or {@code null} if we should read the whole thing.
*/
private final Long length;
private final boolean ownFile;
/**
* Creates a new {@link RandomAccessFileInputStream}
* of the given file starting at the given position.
* Internally, a new {@link RandomAccessFile} is created
* and is seek'ed to the given startOffset
* before reading any bytes. The internal
* {@link RandomAccessFile} instance is managed by this
* class and will be closed when {@link #close()} is called.
* @param file the {@link File} to read.
* @param startOffset the start offset to start reading
* bytes from.
* @throws IOException if the given file does not exist
* @throws IllegalArgumentException if the startOffset is less than 0.
*/
public RandomAccessFileInputStream(File file, long startOffset) throws IOException{
assertStartOffValid(file, startOffset);
this.randomAccessFile = new RandomAccessFile(file, "r");
randomAccessFile.seek(startOffset);
this.length = null;
ownFile =true;
}
/**
* Creates a new {@link RandomAccessFileInputStream}
* of the given file starting at the given position
* but will only read the given length.
* Internally, a new {@link RandomAccessFile} is created
* and is seek'ed to the given startOffset
* before reading any bytes. The internal
* {@link RandomAccessFile} instance is managed by this
* class and will be closed when {@link #close()} is called.
* @param file the {@link File} to read.
* @param startOffset the start offset to start reading
* bytes from.
* @param length the maximum number of bytes to read from the file.
* This inputStream will only read as many bytes as are in the file.
* @throws IOException if the given file does not exist
* @throws IllegalArgumentException if either startOffset or length are less than 0
* or if startOffset > file.length().
*/
public RandomAccessFileInputStream(File file, long startOffset, long length) throws IOException{
assertStartOffValid(file, startOffset);
if(length < 0){
throw new IllegalArgumentException("length can not be less than 0");
}
this.randomAccessFile = new RandomAccessFile(file, "r");
randomAccessFile.seek(startOffset);
this.length = length;
ownFile =true;
}
private void assertStartOffValid(File file, long startOffset) {
if(startOffset < 0){
throw new IllegalArgumentException("start offset can not be less than 0");
}
if(file.length() < startOffset){
throw new IllegalArgumentException(
String.format("invalid startOffset %d: file is only %d bytes" ,
startOffset,
file.length()));
}
}
/**
* Creates a new RandomAccessFileInputStream that reads
* bytes from the given {@link RandomAccessFile}.
* Any external changes to the file pointer
* via {@link RandomAccessFile#seek(long)} or similar
* methods will also alter the subsequent bytes read
* by this {@link InputStream}.
* Closing the inputStream returned by this constructor
* DOES NOT close the {@link RandomAccessFile} which
* must be closed separately by the caller.
* @param file the {@link RandomAccessFile} instance
* to read as an {@link InputStream}; can not be null.
* @throws NullPointerException if file is null.
*/
public RandomAccessFileInputStream(RandomAccessFile file){
if(file ==null){
throw new NullPointerException("file can not be null");
}
this.randomAccessFile = file;
length = null;
ownFile =false;
}
@Override
public synchronized int read() throws IOException {
if(length !=null && bytesRead >=length){
return -1;
}
int value = randomAccessFile.read();
if(value !=-1){
bytesRead++;
}
return value;
}
@Override
public synchronized int read(byte[] b, int off, int len) throws IOException {
if(length != null && bytesRead >=length){
return -1;
}
final int reducedLength = computeReducedLength(len);
int numberOfBytesRead = randomAccessFile.read(b, off, reducedLength);
//read() returns -1 at end of file; only count bytes actually read.
if(numberOfBytesRead > 0){
bytesRead+=numberOfBytesRead;
}
return numberOfBytesRead;
}
private int computeReducedLength(int len) {
if(length ==null){
return len;
}
//compare as longs first so a large remaining length can not overflow the int cast.
return (int)Math.min((long)len, length - bytesRead);
}
/**
* If this instance was created
* using the {@link #RandomAccessFileInputStream(RandomAccessFile)}
* constructor, then this method does nothing- the RandomAccessFile
* will still be open.
* If constructed using {@link #RandomAccessFileInputStream(File, long)}
* or {@link #RandomAccessFileInputStream(File, long, long)},
* then the internal {@link RandomAccessFile} will be closed.
*/
@Override
public void close() throws IOException {
//if we created this randomaccessfile
//then its our job to close it.
if(ownFile){
randomAccessFile.close();
}
}
}
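Edit
I tried running your code sample with my RandomAccessFileInputStream. The problem is that even when you set the buffer size, the reader chain still buffers: the InputStreamReader's decoder keeps its own byte buffer, independent of the size passed to BufferedReader, so the file pointer advances by 8192 every time the underlying inputStream is read. Even if the buffering worked as you expected, the buffer would always read ahead past the current line, so offset would never be the position of the end of the line.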
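The read-ahead described above can be observed without a real file. The following is a minimal sketch, not part of the original answer: `ReadAheadDemo`, `CountingInputStream`, and `bytesConsumedByFirstLine` are hypothetical names introduced for illustration, and the input is an in-memory byte array of short ASCII lines. It counts how many bytes the reader chain pulls from the underlying stream to return a single line.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

final class ReadAheadDemo {

    /** Hypothetical helper: counts the bytes handed out by the wrapped stream. */
    static final class CountingInputStream extends FilterInputStream {
        long consumed = 0;

        CountingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            int b = super.read();
            if (b != -1) {
                consumed++;
            }
            return b;
        }

        @Override
        public int read(byte[] buf, int off, int len) throws IOException {
            int n = super.read(buf, off, len);
            if (n > 0) {
                consumed += n;
            }
            return n;
        }
    }

    /** Reads one 5-byte line ("line\n") and reports how many bytes were consumed. */
    static long bytesConsumedByFirstLine(int bufferSize) throws IOException {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 5000; i++) {
            sb.append("line\n");
        }
        byte[] data = sb.toString().getBytes(StandardCharsets.US_ASCII);

        CountingInputStream counting =
                new CountingInputStream(new ByteArrayInputStream(data));
        BufferedReader reader = new BufferedReader(
                new InputStreamReader(counting, StandardCharsets.US_ASCII), bufferSize);

        reader.readLine(); // returns "line", but more than 5 bytes are pulled from the stream
        return counting.consumed;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("bytes consumed for one line: " + bytesConsumedByFirstLine(10));
    }
}
```

Whatever bufferSize is passed, the consumed count exceeds the length of the line that was returned, which is why a file pointer sampled between readLine calls does not mark line boundaries.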
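If you don't want to buffer the data and don't want to write your own line-reading implementation, you can use DataInputStream, which has a deprecated readLine() method. The method is deprecated because it "does not properly convert bytes to characters", but if you are using ASCII characters it should be fine.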
InputStream in = new RandomAccessFileInputStream(raf);
DataInputStream dataIn = new DataInputStream(in);
...
if ((line = dataIn.readLine()) == null)
...
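This works as expected. The offset advances by exactly the number of bytes in each line. However, since it is not buffered, reading the file will be slower.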
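If the unbuffered read is too slow, one alternative is to write the line scan yourself over a buffered byte stream and count bytes as you go. The following is a minimal sketch under stated assumptions, not the approach from the answer above: `LineOffsets` and `lineStartOffsets` are hypothetical names, lines are assumed to end in `\n` (so `\r\n` also works, but a lone `\r` is not treated as a line break), and the encoding is assumed to be ASCII-compatible (for example UTF-8), where the byte 0x0A only ever means a newline.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class LineOffsets {

    private LineOffsets() {
    }

    /**
     * Returns the byte offset at which each line starts.
     * Buffering is safe here because the position is counted
     * from the bytes we consume, not from the file pointer.
     */
    static List<Long> lineStartOffsets(InputStream in) throws IOException {
        List<Long> offsets = new ArrayList<>();
        BufferedInputStream buffered = new BufferedInputStream(in);
        long position = 0;          // offset of the byte about to be read
        boolean atLineStart = true; // true until the first byte of a line is seen
        int b;
        while ((b = buffered.read()) != -1) {
            if (atLineStart) {
                offsets.add(position);
                atLineStart = false;
            }
            position++;
            if (b == '\n') {
                atLineStart = true;
            }
        }
        return offsets;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "ab\ncd\r\ne".getBytes(StandardCharsets.US_ASCII);
        System.out.println(lineStartOffsets(new ByteArrayInputStream(data)));
    }
}
```

Each recorded offset can later be passed to RandomAccessFile.seek() to jump straight to the start of that line.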