FlatFileItemReader does not pick up lines starting with #
I am trying to read an ATM EJ file with Spring Batch using a FlatFileItemReader. It reads everything perfectly, except that it skips any line starting with #. Below is my item reader:
@Bean
@Scope(value = "step", proxyMode = ScopedProxyMode.TARGET_CLASS)
public FlatFileItemReader<FieldSet> flatFileItemReader(@Value("#{jobParameters}") Map<String, JobParameter> jobParameters) {
    return new FlatFileItemReaderBuilder<FieldSet>()
            .name("flatFileItemReader")
            .resource(new PathResource(ProducerUtil.getJobParameterByName(jobParameters, ProducerConstants.MULTILINE_JOB_PARAM_NAME, "inputFilePath")))
            //.lineTokenizer(ejFileTokenizer(null))
            .lineTokenizer(initiateNewTokenizer())
            .fieldSetMapper(new PassThroughFieldSetMapper())
            .linesToSkip(Integer.parseInt(ProducerUtil.getJobParameterByName(jobParameters, ProducerConstants.MULTILINE_JOB_PARAM_NAME, "linesToSkip")))
            .encoding("Cp1252")
            //.encoding("UTF-8")
            .build();
}
Line tokenizer:
private LineTokenizer initiateNewTokenizer() {
    return new AbstractLineTokenizer() {
        @Override
        protected List<String> doTokenize(String line) {
            // Pass the whole line through as a single token
            return Arrays.asList(line);
        }
    };
}
Sample input:
*TRANSACTION START*
[020t CARD INSERTED
[020tCARD: ****************9847
DATE 29-12-20 TIME 00:04:34
00:04:36 ATR RECEIVED T=0
[020t 00:04:53 PIN ENTERED
[020t 00:04:59 OPCODE = A C C B
00:04:59 GENAC 1 : ARQC
EXTERNAL AUTHENTICATE: NO ARPC
00:05:02 GENAC 2 : AAC
00:05:09 ATR RECEIVED T=0
[020t 00:05:11 OPCODE = A C C B
00:05:11 GENAC 1 : ARQC
00:05:14 GENAC 2 : TC
[020t 00:05:20 NOTES STACKED
[020t 00:05:25 CARD TAKEN
[020t 00:05:28 NOTES PRESENTED 0,1,0,0
#29/12/20 00:06 ATM0001
000607934460 1351 29/12/20
XXXXXXXXXXXXXXXXX
CUR100.00 CashWithdrawal 000
[020t 00:05:29 NOTES TAKEN
[000p[040q(1 *1351*1*E*000010000,M-00,R-10100
[020t 00:05:36 TRANSACTION END
This is the line that is not being read:
#29/12/20 00:06 ATM0001
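I am not sure what the problem is. Could it be an encoding issue, or something to do with the tokenizer? I tried debugging and found that the method below never receives the lines starting with #: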
@Override
protected List<String> doTokenize(String line) {
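"#" is the default comment prefix for FlatFileItemReader, which is why you do not receive that line.
The FlatFileItemReader source code contains:
public static final String[] DEFAULT_COMMENT_PREFIXES = new String[] { "#" };
So, if you want to specify different comment prefixes, set them in the builder:
.comments("")
In your code: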
return new FlatFileItemReaderBuilder<FieldSet>()
        .name("flatFileItemReader")
        .resource(new PathResource(ProducerUtil.getJobParameterByName(jobParameters, ProducerConstants.MULTILINE_JOB_PARAM_NAME, "inputFilePath")))
        //.lineTokenizer(ejFileTokenizer(null))
        .lineTokenizer(initiateNewTokenizer())
        .fieldSetMapper(new PassThroughFieldSetMapper())
        .linesToSkip(Integer.parseInt(ProducerUtil.getJobParameterByName(jobParameters, ProducerConstants.MULTILINE_JOB_PARAM_NAME, "linesToSkip")))
        .encoding("Cp1252")
        //.encoding("UTF-8")
        .comments("") // comment prefixes: lines starting with one of these are ignored
        .build();
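To see why the prefix choice matters, here is a minimal plain-Java sketch of prefix-based comment filtering. It is not the Spring Batch source: CommentPrefixDemo, isComment and readable are hypothetical names, and the sketch assumes the reader tests each configured prefix with String.startsWith. Under that assumption an empty-string prefix matches every line, so it is worth checking how .comments("") behaves in your Spring Batch version; a prefix that never occurs in the file may be a safer choice.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CommentPrefixDemo {

    // Hypothetical helper: true if the line starts with any configured prefix.
    // Assumption: this mirrors how a flat-file reader decides a line is a comment.
    static boolean isComment(String line, String... prefixes) {
        for (String prefix : prefixes) {
            if (line.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    // Hypothetical helper: keeps only the lines that are not treated as comments.
    static List<String> readable(List<String> lines, String... prefixes) {
        List<String> kept = new ArrayList<>();
        for (String line : lines) {
            if (!isComment(line, prefixes)) {
                kept.add(line);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Three lines taken from the sample input in the question
        List<String> lines = Arrays.asList(
                "[020t 00:05:28 NOTES PRESENTED 0,1,0,0",
                "#29/12/20 00:06 ATM0001",
                "000607934460 1351 29/12/20");

        System.out.println(readable(lines, "#").size()); // 2: the "#" line is dropped
        System.out.println(readable(lines).size());      // 3: no prefixes, nothing is dropped
        System.out.println(readable(lines, "").size());  // 0: every string starts with ""
    }
}
```

If the builder in your version appends to the default prefixes instead of replacing them, FlatFileItemReader also has a setComments(String[]) setter that can be called on the built reader.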