Parsing a file and replacing whitespace found within double quotes using Java

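I am reading a file and trying to modify it in the following order:

  1. If the line is empty, trim() it.
  2. If the line ends with \, remove that character and append the next line.
  3. If the complete line contains double quotes with whitespace between them, replace the whitespace with ~. For example: "This is text within double quotes" becomes: "This~is~text~within~double~quotes"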

This code works, but it has a problem. The problem shows up when a line ending with \ is followed by other, already complete lines.

For example:
line 1 and \
line 2
line 3

So instead of

line 1 and line 2
line 3

I get this:

line 1 and line 2     line 3

Updated code:

public List<String> OpenFile() throws IOException {
    try (BufferedReader br = new BufferedReader(new FileReader(path))) {
        String line;
        //StringBuilder concatenatedLine = new StringBuilder();
        List<String> formattedStrings = new ArrayList<>();
        //Pattern matcher = Pattern.compile("\"([^\"]+)\"");
        while ((line = br.readLine()) != null) {
            boolean addToPreviousLine;
            if (line.isEmpty()) {
                line.trim();
            }

            // Replace whitespace inside each double-quoted section with ~
            if (line.contains("\"")) {
                Matcher matcher = Pattern.compile("\"([^\"]+)\"").matcher(line);
                while (matcher.find()) {
                    String match = matcher.group();
                    line = line.replace(match, match.replaceAll("\\s+", "~"));
                }
            }

            // A trailing backslash marks a line that continues on the next line
            if (line.endsWith("\\")) {
                addToPreviousLine = false;
                line = line.substring(0, line.length() - 1);
                formattedStrings.add(line);
            } else {
                addToPreviousLine = true;
            }

            if (addToPreviousLine) {
                int previousLineIndex = formattedStrings.size() - 1;
                if (previousLineIndex > -1) {
                    // Combine the previous line and current line
                    String previousLine = formattedStrings.remove(previousLineIndex);
                    line = previousLine + " " + line;
                    formattedStrings.add(line);
                }
            }
            testScan(formattedStrings);
            //concatenatedLine.setLength(0);
        }
        return formattedStrings;
    }
}

I'm not great with regular expressions, so here is a programmatic approach:

    String string = "He said, \"Hello Mr Nice Guy\"";

    // split it along the quotes
    String splitString[] = string.split("\"");

    // loop through; each odd-indexed item is inside quotations
    for(int i = 0; i < splitString.length; i++) {
        if(i % 2 > 0) {
            splitString[i] = splitString[i].replaceAll(" ", "~");
        }
    }

    String finalString = "";

    // re-build the string w/ quotes added back in
    for(int i = 0; i < splitString.length; i++) {
        if(i % 2 > 0) {
            finalString += "\"" + splitString[i] + "\"";
        } else {
            finalString += splitString[i];
        }
    }

    System.out.println(finalString);

Output: He said, "Hello~Mr~Nice~Guy"

Instead of

if (line.contains("\"")) {
    StringBuffer sb = new StringBuffer();
    Matcher matcher = Pattern.compile("\"([^\"]+)\"").matcher(line);
    while (matcher.find()) {
        matcher.appendReplacement(sb, matcher.group().replaceAll("\\s+", ""));
    }
}

I would do this

if (line.matches("\"([^\"]+)\"")) {
    line = line.replaceAll("\\s+", "");
}

How do I add this to what I have above?

concatenatedLine.append(line);
String fullLine = concatenatedLine.toString();
if (fullLine.contains("\"")) {
    StringBuffer sb = new StringBuffer();
    Matcher matcher = Pattern.compile("\"([^\"]+)\"").matcher(fullLine);
    while (matcher.find()) {
        matcher.appendReplacement(sb, matcher.group().replaceAll("\\s+", ""));
    }
    // Copy whatever follows the last match, otherwise it is dropped
    matcher.appendTail(sb);
    formattedStrings.add(sb.toString());
} else {
    formattedStrings.add(fullLine);
}

Step 3:

String text;
text = text.replaceAll("\\s", "~");

If you want to replace the spaces that occur inside double quotes with ~s,

String line = "\"This is a line with spaces\"";
String result = "";
if (line.contains("\"")) {
    Pattern p = Pattern.compile("\"([^\"]*)\"");
    Matcher m = p.matcher(line);
    while (m.find()) {
        // group(1) is the text between the quotes, without the quotes themselves
        result = m.group(1).replace(" ", "~");
    }
}
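
As a variation on the snippet above, here is a minimal sketch that rewrites every quoted section in place and leaves the rest of the line untouched. It assumes Java 9 or later, where Matcher.replaceAll accepts a function. The class name QuoteSpaces and the method name tildeQuoted are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuoteSpaces {
    // Matches a double-quoted section that contains no embedded quotes.
    private static final Pattern QUOTED = Pattern.compile("\"[^\"]*\"");

    // Replaces each run of whitespace inside double quotes with a single '~'.
    // Text outside the quotes is copied through unchanged.
    static String tildeQuoted(String line) {
        return QUOTED.matcher(line).replaceAll(
                // quoteReplacement keeps any '$' or '\' in the match literal
                m -> Matcher.quoteReplacement(m.group().replaceAll("\\s+", "~")));
    }

    public static void main(String[] args) {
        System.out.println(tildeQuoted("He said, \"Hello Mr Nice Guy\" and left"));
    }
}
```

Unlike line.replace(match, ...), this only touches the text at the matched position, so the same words appearing outside the quotes are not affected.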

Update

I'm giving you the pieces you need rather than trying to write all the code for you. You just need to work out where to put them.

If the line is empty, trim()

if (line.matches("\\s+")) {
    line = "";
    // I don't think you want to add an empty line to your return result.  If you do, just omit the continue;
    continue;
}

If the line contains double quotes with whitespace inside them, replace the whitespace with ~. For example: "This is text within double quotes" becomes: "This~is~text~within~double~quotes"

Matcher matcher = Pattern.compile("\"([^\"]+)\"").matcher(line);
while (matcher.find()) {
    String match = matcher.group();
    line = line.replace(match, match.replaceAll("\\s+", "~"));
}

If the line ends with \, remove that character and append the next line. You need a flag to keep track of when to do this.

if (line.endsWith("\\")) {
    addToPreviousLine = true;
    line = line.substring(0, line.length() - 1);
} else {
    addToPreviousLine = false;
}

Now, to append the next line to the previous line, you need something like this (work out where to put this snippet):

if (addToPreviousLine) {
    int previousLineIndex = formattedStrings.size() - 1;
    if (previousLineIndex > -1) {
        // Combine the previous line and current line
        String previousLine = formattedStrings.remove(previousLineIndex);
        line = previousLine + " " + line;
    }
}

You still don't need a StringBuffer or StringBuilder. Just modify the current line and add it to your formattedStrings List.
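
One possible way to fit the pieces together is sketched below; it is not the only arrangement. The key assumption is that the flag set while processing one line is consumed by the next line, so a line is merged only when the line before it ended with \. The class name LineJoiner and the method name format are hypothetical, and the method takes a BufferedReader so it can be tried without a file. Two deliberate differences from the snippets above: blank lines are skipped with "\\s*" so that empty strings are caught too, and the lines are joined without an extra " " because the text before the backslash usually already ends with a space.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LineJoiner {
    private static final Pattern QUOTED = Pattern.compile("\"([^\"]+)\"");

    static List<String> format(BufferedReader br) throws IOException {
        List<String> formattedStrings = new ArrayList<>();
        boolean addToPreviousLine = false; // set by the line read BEFORE this one
        String line;
        while ((line = br.readLine()) != null) {
            if (line.matches("\\s*")) {
                continue; // skip empty and whitespace-only lines
            }
            // Does THIS line continue onto the next one?
            boolean continues = line.endsWith("\\");
            if (continues) {
                line = line.substring(0, line.length() - 1);
            }
            // Merge only if the previous line asked for it
            if (addToPreviousLine && !formattedStrings.isEmpty()) {
                String previousLine = formattedStrings.remove(formattedStrings.size() - 1);
                line = previousLine + line;
            }
            // Step 3 applies to the complete line, so wait until it is fully joined
            if (!continues) {
                Matcher matcher = QUOTED.matcher(line);
                while (matcher.find()) {
                    String match = matcher.group();
                    line = line.replace(match, match.replaceAll("\\s+", "~"));
                }
            }
            formattedStrings.add(line);
            addToPreviousLine = continues;
        }
        return formattedStrings;
    }

    public static void main(String[] args) throws IOException {
        String input = "line 1 and \\\nline 2\nline 3\n";
        for (String s : format(new BufferedReader(new StringReader(input)))) {
            System.out.println(s);
        }
    }
}
```

For the three-line example from the question, this produces "line 1 and line 2" followed by "line 3" as separate entries.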