How to do text processing in Java
I have a CSV file:
input.csv
1,[103.85,1.28992],[103.89,1.294],[103.83,1.216]
2,[103.5,1.292],[103.9,1.4],[103.3,1.21]
3,[103.6,1.291],[103.6,1.39],[103.3,1.29]
From this I need to produce:
{
"type": "LineString",
"coordinates": [[103.85,1.28992],[103.89,1.294],[103.83,1.216]],
"properties": {
"id": "1"
}
},
{
"type": "LineString",
"properties": {
"id": "2"
},
"coordinates": [[103.5,1.292],[103.9,1.4],[103.3,1.21]]
},{
"type": "LineString",
"properties": {
"id": "3"
},
"coordinates": [[103.6,1.291],[103.6,1.39],[103.3,1.29]]
}
I am trying to do this in Java. So I open the file and read it with CSVReader:
try (CSVReader reader = new CSVReader(new FileReader(fileName))) {
    String[] nextLine;
    while ((nextLine = reader.readNext()) != null) {
        for (String e : nextLine) {
            // System.out.format("%s ", e);
            System.out.println(e.split(",", 1));
        }
    }
}
But I am having trouble splitting the line. If you look at the first line, I want
1
as one part and the rest, [103.85,1.28992],[103.89,1.294],[103.83,1.216],
as the other part, so that I can build the string
String json = "{\"type\": \"LineString\", \"coordinates\": [" + s[1] + "], "
        + "\"properties\": { \"id\": \"" + s[0] + "\" } }";
Any help is appreciated.
You can try:
(\d+),(.*)
You don't need to split. If you run it, you get two groups: group 1 is the number and group 2 is everything after it.
Try this example:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// The backslash must be doubled inside a Java string literal.
final String regex = "(\\d+),(.*)";
final String string = "1,[103.85,1.28992],[103.89,1.294],[103.83,1.216]\n"
        + "2,[103.5,1.292],[103.9,1.4],[103.3,1.21]\n"
        + "3,[103.6,1.291],[103.6,1.39],[103.3,1.29]";
final Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
final Matcher matcher = pattern.matcher(string);
while (matcher.find()) {
    System.out.println(matcher.group(1)); // the id
    System.out.println(matcher.group(2)); // the coordinate list
}
Use JSON.simple to create the JSON you need. In my opinion it is the simplest JSON library. See this example of its use.
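For what it's worth, here is a minimal sketch of putting the two groups together into one object per line. It uses plain String.format rather than JSON.simple, only so that it runs without an external library. The class name RegexLineString and the method toJson are made up for illustration, and it assumes every line looks like the samples in the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexLineString {
    // Group 1: the numeric id; group 2: everything after the first comma.
    private static final Pattern LINE = Pattern.compile("(\\d+),(.*)");

    /** Builds one LineString object from a line such as "1,[103.85,1.28992],[103.89,1.294]". */
    static String toJson(String line) {
        Matcher m = LINE.matcher(line.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Unexpected line: " + line);
        }
        // Wrap the coordinate pairs in an outer array and quote the id as a string.
        return String.format(
                "{\"type\": \"LineString\", \"coordinates\": [%s], \"properties\": {\"id\": \"%s\"}}",
                m.group(2), m.group(1));
    }

    public static void main(String[] args) {
        System.out.println(toJson("1,[103.85,1.28992],[103.89,1.294],[103.83,1.216]"));
    }
}
```

Concatenating strings like this is only safe because the ids are digits and the coordinates are already valid JSON arrays. If either could contain quotes or other text, a JSON library is the safer route.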
You can parse the lines yourself:
try (BufferedReader reader = new BufferedReader(new FileReader(fileName))) {
    String nextLine;
    while ((nextLine = reader.readLine()) != null) {
        int ix = nextLine.indexOf(',');
        if (ix >= 0) {
            String head = nextLine.substring(0, ix);
            String tail = nextLine.substring(ix + 1);
            doSomethingWith(head, tail);
        }
    }
}
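To make that concrete, here is one possible way to fill in doSomethingWith, sketched as a self-contained class. The names CsvToLineStrings and convert are hypothetical, and the method takes any Reader (a StringReader in the demo, a FileReader in practice) so it can be tried without a file. Lines without a comma are simply skipped, which is an assumption about how you want malformed input handled.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class CsvToLineStrings {
    /** Reads "id,[x,y],[x,y],..." lines and returns one JSON object string per line. */
    static List<String> convert(Reader source) {
        List<String> features = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                int ix = line.indexOf(',');
                if (ix < 0) {
                    continue; // no comma: blank or malformed line, skipped here
                }
                String head = line.substring(0, ix).trim();  // the id
                String tail = line.substring(ix + 1).trim(); // the coordinate pairs
                features.add("{\"type\": \"LineString\", \"properties\": {\"id\": \"" + head
                        + "\"}, \"coordinates\": [" + tail + "]}");
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return features;
    }

    public static void main(String[] args) {
        String csv = "1,[103.85,1.28992],[103.89,1.294],[103.83,1.216]\n"
                + "2,[103.5,1.292],[103.9,1.4],[103.3,1.21]\n";
        // Join the objects with commas, as in the desired output.
        System.out.println(String.join(",\n", convert(new StringReader(csv))));
    }
}
```

Splitting at the first comma only means the commas inside the brackets never need special treatment.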
The problem is that, to get the data the way you need it, whatever generates input.csv has to wrap the separate parts in quotes.
So either
input.csv
1,"[103.85,1.28992],[103.89,1.294],[103.83,1.216]"
2,"[103.5,1.292],[103.9,1.4],[103.3,1.21]"
3,"[103.6,1.291],[103.6,1.39],[103.3,1.29]"
or
input.csv
"1","[103.85,1.28992],[103.89,1.294],[103.83,1.216]"
"2","[103.5,1.292],[103.9,1.4],[103.3,1.21]"
"3","[103.6,1.291],[103.6,1.39],[103.3,1.29]"
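Because there are six commas between the 1 and the end of the line, any CSV parser will interpret the line as having seven columns rather than two.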
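The column count is easy to check with plain String.split, which stands in here for a CSV parser. The same sketch also shows why split(",", 1) in the question leaves the line untouched, and that a limit of 2 gives the two parts directly if the file cannot be regenerated with quotes. The class SplitDemo and the helper headAndTail are names invented for this example.

```java
import java.util.Arrays;

class SplitDemo {
    /** Splits at the first comma only: element 0 is the id, element 1 is the rest. */
    static String[] headAndTail(String line) {
        return line.split(",", 2);
    }

    public static void main(String[] args) {
        String line = "1,[103.85,1.28992],[103.89,1.294],[103.83,1.216]";

        // Splitting on every comma yields seven fields, one per "column".
        System.out.println(line.split(",").length);    // 7

        // A limit of 1 allows at most one piece, so the whole line comes back.
        System.out.println(line.split(",", 1).length); // 1

        // A limit of 2 stops after the first comma.
        System.out.println(Arrays.toString(headAndTail(line)));
    }
}
```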