Java

Question

I am working with 3 classes: a Character class, a Scanner class, and a test class.

This is the Character class:

public class Character {
    private char cargo = '\u0007'; 
    private String sourceText = ""; 
    private int sourceIndex = 0; 
    private int lineIndex = 0;
    private int columnIndex = 0;
    public Character(String sourceText, char cargo, int sourceIndex, int lineIndex, int columnIndex) {
        this.sourceText = sourceText;
        this.cargo = cargo;
        this.sourceIndex = sourceIndex;
        this.lineIndex = lineIndex;
        this.columnIndex = columnIndex;
    }
    /*****************************************************************************************/
    /* Returns the String representation of the Character object                      */
    /*****************************************************************************************/
    @Override
    public String toString() {
        switch (cargo) {
            case ' ': return String.format("%6d %-6d " + "    blank", lineIndex, columnIndex);
            case '\t': return String.format("%6d %-6d " + "    tab", lineIndex, columnIndex);
            case '\n': return String.format("%6d %-6d " + "    newline", lineIndex, columnIndex);
            default: return String.format("%6d %-6d " + cargo, lineIndex, columnIndex);
        }
    }
}

This is my Scanner class:

public class Scanner {
    private String sourceText = ""; 
    private int sourceIndex = -1; 
    private int lineIndex = 0;
    private int columnIndex = -1;
    private int lastIndex = 0;
    /*****************************************************************************************/
    /* Assign proper values                                                                  */
    /*****************************************************************************************/ 
    public Scanner(String sourceText) {
        this.sourceText = sourceText;
        lastIndex = sourceText.length() - 1;
    }
    /*****************************************************************************************/
    /* Returns the next character in the source text                                         */
    /*****************************************************************************************/   
    public Character getNextCharacter() {
        if (sourceIndex > 0 && sourceText.charAt(sourceIndex - 1) == '\n') {
            ++lineIndex;
            columnIndex = -1;
        }
        ++sourceIndex;
        ++columnIndex;
        char currentChar = sourceText.charAt(sourceIndex);
        Character objCharacter = new Character(sourceText, currentChar, sourceIndex, lineIndex, columnIndex);
        return objCharacter;
    }
}

This is the main method of the test class:

public static void main(String[] args) {
    String sourceText = "";
    String filePath = "D:\\Somepath\\SampleCode.dat";
    try { sourceText = readFile(filePath, StandardCharsets.UTF_8); }
    catch (IOException io) { System.out.println(io.toString()); }
    LexicalAnalyzer.Scanner sca = new LexicalAnalyzer.Scanner(sourceText);
    LexicalAnalyzer.Character cha;
    int i =0;
    while(i < sourceText.length()) {
        cha = sca.getNextCharacter();
        System.out.println(cha.toString());
        i++;
    }
}

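Basically, what I am trying to do is print every character in the source file (including spaces, tabs, and newlines) along with other details about the character, such as its line number and column number. Also, note the switch and case statements in the toString() method of my Character class.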

Say my file contains the text:

This is line #1. 
This is line #2.

From my code, I expect to get:

 0 0      T
 0 1      h
 0 2      i
 0 3      s
 0 4          blank
 0 5      i
 0 6      s
 0 7          blank
 0 8      l
 0 9      i
 0 10     n
 0 11     e
 0 12         blank
 0 13     #
 0 14     1
 0 15     .
 0 16         newline
 1 0      T
 1 1      h
 1 2      i
 1 3      s
 1 4          blank
 1 5      i
 1 6      s
 1 7          blank
 1 8      l
 1 9      i
 1 10     n
 1 11     e
 1 12         blank
 1 13     #
 1 14     2
 1 15     .

However, I get:

 0 0      T
 0 1      h
 0 2      i
 0 3      s
 0 4          blank
 0 5      i
 0 6      s
 0 7          blank
 0 8      l
 0 9      i
 0 10     n
 0 11     e
 0 12         blank
 0 13     #
 0 14     1
 0 15     .
 0 16     
 0 17         newline
 0 18     T
 1 0      h
 1 1      i
 1 2      s
 1 3          blank
 1 4      i
 1 5      s
 1 6          blank
 1 7      l
 1 8      i
 1 9      n
 1 10     e
 1 11         blank
 1 12     #
 1 13     2
 1 14     .

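Notice what gets printed where the newline is. Spaces and tabs work fine: I get what I want for those, but not for the newline. By the way, this is just a Java version of the code at: http://parsingintro.sourceforge.net/#contents_item_4.2.

Please don't flame me. I have been trying to figure out the reason behind this for hours.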

Comments

Using %n in String.format, or System.getProperty("line.separator"), may also help. See this link: How do I get a platform-dependent new line character?
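To illustrate the comment above: %n and the line.separator property control the line ending you write, while text read from a file keeps whatever line endings the file already has. A minimal sketch, assuming it is acceptable to collapse every line ending to a single \n before scanning. LineEndings and normalize are made-up names for this example, not part of the asker's code.

```java
class LineEndings {
    // Hypothetical helper: collapse Windows (\r\n) and lone \r line endings into \n.
    static String normalize(String text) {
        return text.replace("\r\n", "\n").replace('\r', '\n');
    }

    public static void main(String[] args) {
        // %n expands to the platform line separator when formatting output.
        String formatted = String.format("a%nb");
        System.out.println(formatted.equals("a" + System.lineSeparator() + "b"));

        // Input read from a Windows file may contain \r\n on any platform.
        String windowsText = "This is line #1.\r\nThis is line #2.";
        String normalized = normalize(windowsText);
        System.out.println(windowsText.length() + " -> " + normalized.length());
        System.out.println(normalized.indexOf('\r')); // -1 once normalized
    }
}
```

Note that normalizing changes the source indices, so this only fits if you do not need to map positions back to the raw file bytes.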

Answer 1

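You are running on a Windows system.

The code does not handle newlines in the \r\n form, only \n.

I was able to produce output that makes sense with the following change. Add this case to the switch: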

case '\r': return String.format("%6d %-6d " + "    winNewline", lineIndex, columnIndex);

Resulting output:

 0 0      T
 0 1      h
 0 2      i
 0 3      s
 0 4          blank
 0 5      i
 0 6      s
 0 7          blank
 0 8      l
 0 9      i
 0 10     n
 0 11     e
 0 12         blank
 0 13     #
 0 14     1
 0 15     .
 0 16         blank
 0 17         winNewline
 0 18         newline
 0 19     T
 1 0      h
 1 1      i
 1 2      s
 1 3          blank
 1 4      i
 1 5      s
 1 6          blank
 1 7      l
 1 8      i
 1 9      n
 1 10     e
 1 11         blank
 1 12     #
 1 13     2
 1 14     .

Process finished with exit code 0
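In the output above, the T that follows the newline is still reported on line 0, and the line number only changes one character later. That appears to come from the scanner checking sourceText.charAt(sourceIndex - 1), which is the character before the one returned by the previous call. A minimal sketch of one possible fix, assuming the look-back should be at sourceIndex itself. PositionScanner, describe, and scanAll are hypothetical names for this example, and the "line:column:text" string format is only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class PositionScanner {
    private final String sourceText;
    private int sourceIndex = -1;
    private int lineIndex = 0;
    private int columnIndex = -1;

    PositionScanner(String sourceText) {
        this.sourceText = sourceText;
    }

    boolean hasNext() {
        return sourceIndex + 1 < sourceText.length();
    }

    // Returns "line:column:description" for the next character.
    String next() {
        // Look at the character returned by the previous call
        // (index sourceIndex, not sourceIndex - 1).
        if (sourceIndex >= 0 && sourceText.charAt(sourceIndex) == '\n') {
            ++lineIndex;
            columnIndex = -1;
        }
        ++sourceIndex;
        ++columnIndex;
        return lineIndex + ":" + columnIndex + ":" + describe(sourceText.charAt(sourceIndex));
    }

    // Names the whitespace characters, including the '\r' from Windows line endings.
    static String describe(char c) {
        switch (c) {
            case ' ': return "blank";
            case '\t': return "tab";
            case '\r': return "carriage return";
            case '\n': return "newline";
            default: return String.valueOf(c);
        }
    }

    static List<String> scanAll(String text) {
        PositionScanner scanner = new PositionScanner(text);
        List<String> result = new ArrayList<>();
        while (scanner.hasNext()) {
            result.add(scanner.next());
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints 0:0:a, 0:1:b, 0:2:carriage return, 0:3:newline, 1:0:c, 1:1:d
        for (String entry : scanAll("ab\r\ncd")) {
            System.out.println(entry);
        }
    }
}
```

With this look-back, the first character after \n is reported at column 0 of the next line, and \r is simply counted as one more character on the line it ends.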

Answer 2

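It is hard to tell just by looking at the output, but to debug it you could try modifying the default case statement in the Character class so that it prints the ASCII code of the character: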

default: return String.format("%6d %-6d " + Integer.valueOf(cargo), lineIndex, columnIndex);

This will show you the ASCII code of the extra character you are getting. Once you have the code, look up which character it is here: http://www.asciitable.com/

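My guess is that the extra character you are getting is '\r' (a carriage return, a different kind of line-break character from '\n').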
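If you would rather check this outside the Character class, the same idea can be tried on a plain string. A small sketch, assuming you only want to see the numeric codes. CharCodes and codes are made-up names for this example.

```java
class CharCodes {
    // Returns the numeric code of each character, separated by spaces.
    static String codes(String text) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append((int) text.charAt(i));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // A Windows line break shows up as 13 (\r) followed by 10 (\n).
        System.out.println(codes("1.\r\nT")); // 49 46 13 10 84
    }
}
```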

Hope this helps!

Java - An extra character in my code?

char

console-application

newline

scanning
