Splitting by pipe in Java yields different results
Before anyone jumps to conclusions: yes, I know the pipe symbol needs to be escaped :-)
...and in my code, I did exactly that:
String line = "C0000005|A13433185|SCUI|RB|C0036775|A7466261|SCUI||R86000559||MSHFRE|MSHFRE|||N||";
line = line.trim();
String[] columns_array = line.trim().split("\\|"); // length = 15
List<String> columns_list = Splitter.on("|").splitToList(line); // size = 17
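I am parsing a huge file (~5GB) in which every line is pipe-delimited. The line above is the first line of that file, and my code crashed on it with an index-out-of-bounds error. After debugging, I realized what was happening and added the Guava Splitter line as a sanity check. With the Splitter, I get the expected list.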
Why do the Guava Splitter and the native split give different results?
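String.split() removes trailing empty strings from the result array. The string being split ends with two delimiters (...||), so the two empty fields at the end are dropped and you get 15 elements instead of 17.
Here is an excerpt from the documentation: http://docs.oracle.com/javase/7/docs/api/java/lang/String.html#split%28java.lang.String%29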
This method works as if by invoking the two-argument split method with the given expression and a limit argument of zero. Trailing empty strings are therefore not included in the resulting array.
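To make the difference concrete, here is a minimal sketch using only the standard library. The class name SplitTrailingDemo, its two helper methods, and the short sample string are made up for illustration; the only real API used is String.split with one and two arguments.

```java
import java.util.Arrays;

class SplitTrailingDemo {
    // Default split: behaves like limit 0, so trailing empty strings are dropped.
    static String[] splitDefault(String s) {
        return s.split("\\|");
    }

    // Negative limit: trailing empty strings are kept.
    static String[] splitKeepTrailing(String s) {
        return s.split("\\|", -1);
    }

    public static void main(String[] args) {
        String sample = "a||b||"; // illustrative input, ends with two delimiters
        // The empty field in the middle survives in both cases;
        // only the empty fields at the end differ.
        System.out.println(Arrays.toString(splitDefault(sample)));
        System.out.println(Arrays.toString(splitKeepTrailing(sample)));
    }
}
```

Note that only trailing empty strings are affected: the empty field between "a" and "b" is present in both results.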
The API documentation for String.split() says:
This method works as if by invoking the two-argument split method with the given expression and a limit argument of zero. Trailing empty strings are therefore not included in the resulting array.
Because of this, your array is truncated.
As a commenter already pointed out, you can get the correct result with:
String[] columns_array = line.trim().split("\\|", -1); // length 17
The API documentation for the two-argument overload, split(String s, int n), says:
If n is non-positive then the pattern will be applied as many times as possible and the array can have any length
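The effect of the limit argument can be checked with a short sketch like the one below. The class name SplitLimitDemo and the helper countColumns are hypothetical names chosen for this example; the input is the line from the question.

```java
class SplitLimitDemo {
    // Hypothetical helper: number of columns produced for a given limit.
    static int countColumns(String line, int limit) {
        return line.split("\\|", limit).length;
    }

    public static void main(String[] args) {
        String line = "C0000005|A13433185|SCUI|RB|C0036775|A7466261|SCUI||R86000559||MSHFRE|MSHFRE|||N||";

        // limit = 0: same as split("\\|"), trailing empty strings removed.
        System.out.println(countColumns(line, 0));   // 15
        // limit < 0: pattern applied as often as possible, nothing removed.
        System.out.println(countColumns(line, -1));  // 17
        // limit > 0: at most that many elements; the last one holds the rest.
        System.out.println(countColumns(line, 3));   // 3
    }
}
```

For a file where every line is expected to have the same number of columns, passing -1 keeps the array length stable regardless of how many empty fields a line ends with.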