How are indents processed in a text block (Java 13)?

I just tried the new text block feature in Java 13 and ran into a small problem.

I have read this article from JAXenter.

The closing triple quotes affect the formatting.

String query = """
            select firstName,
            lastName,
            email
            from User
            where id= ?
        """;

System.out.println("SQL or JPL like query string :\n" + query);

The format above works well. Measured against the closing delimiter ("""), the multi-line string keeps spaces in front of every line.

But when I tried to compare the following two text block strings, they looked the same in the console output, yet they are not equal, even after stripIndent().

String hello = """
    Hello,
    Java 13
    """;

String hello2 = """
    Hello,
    Java 13
""";

System.out.println("Hello1:\n" + hello);
System.out.println("Hello2:\n" + hello);

System.out.println("hello is equals hello2:" + hello.equals(hello2));

System.out.println("hello is equals hello2 after stripIndent():" + hello.stripIndent().equals(hello2.stripIndent()));

The console output is:

hello is equals hello2:false
hello is equals hello2 after stripIndent():false

I don't know what went wrong. Or is this how text blocks are designed to work?

Update: when I just print hello2 after stripIndent(),

System.out.println("hello2 after stripIndent():\n" + hello2.stripIndent());

the spaces in front of each line are not removed by stripIndent, contrary to what I expected.

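Update: after reading the related Java docs, I think that once a text block has been compiled, the left indentation of its lines should already have been removed. So what is the purpose of stripIndent for text blocks? I know its use on a normal string is easy to understand.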

The complete code is here.

Testing with jshell:

String hello = """
    Hello,
    Java 13
    """;
hello.replace(" ", ".");

Result

"Hello,\nJava.13\n"

Note: there are no leading spaces at all; the only dot is the space inside "Java 13".

String hello2 = """
    Hello,
    Java 13
""";
hello2.replace(" ", ".");

Result

"....Hello,\n....Java.13\n"

Note that in both results the last line, after the final \n, contains no spaces, so stripIndent() does not strip anything.


stripIndent() does the same thing the compiler does for text blocks. Example:

String hello3 = ""
    + "    Hello\n"
    + "    Java13\n"
    + "  ";
hello3.stripIndent().replace(" ", ".");

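Result

"..Hello\n..Java13\n"

That is, 2 spaces are removed from all 3 lines. Two spaces, because the last line has 2 spaces (the other lines have more, so at most 2 spaces can be removed from every line).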
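
The jshell results above can also be reproduced in a standalone program. The sketch below is only an illustration: the class and method names are made up for this example, and it assumes JDK 15 or later, where text blocks are a standard feature (on Java 13 or 14 it would need --enable-preview).

```java
class TextBlockContents {

    // Closing delimiter aligned with the content:
    // all leading spaces are incidental and are removed by the compiler.
    static String aligned() {
        return """
            Hello,
            Java 13
            """;
    }

    // Closing delimiter indented 4 spaces less than the content:
    // the delimiter line sets the minimum, so 4 spaces stay on every line.
    static String delimiterFurtherLeft() {
        return """
                Hello,
                Java 13
            """;
    }

    // Same idea at run time: the last line has only 2 spaces,
    // so stripIndent() removes 2 spaces from each of the other lines.
    static String strippedAtRuntime() {
        return ("    Hello\n" + "    Java13\n" + "  ").stripIndent();
    }

    public static void main(String[] args) {
        // Dots make the remaining spaces visible.
        System.out.println(aligned().replace(" ", "."));
        System.out.println(delimiterFurtherLeft().replace(" ", "."));
        System.out.println(strippedAtRuntime().replace(" ", "."));
    }
}
```

The first two methods differ only in where the closing delimiter sits, which is enough to make the resulting strings unequal.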

There is a concept of incidental white space.

JEP 355: Text Blocks (Preview)

Compile-time processing

A text block is a constant expression of type String, just like a string literal. However, unlike a string literal, the content of a text block is processed by the Java compiler in three distinct steps:

  • Line terminators in the content are translated to LF (\u000A). The purpose of this translation is to follow the principle of least surprise when moving Java source code across platforms.

  • Incidental white space surrounding the content, introduced to match the indentation of Java source code, is removed.

  • Escape sequences in the content are interpreted. Performing interpretation as the final step means developers can write escape sequences such as \n without them being modified or deleted by earlier steps.

...

Incidental white space

Here is the HTML example using dots to visualize the spaces that the developer added for indentation:

String html = """
..............<html>
..............    <body>
..............        <p>Hello, world</p>
..............    </body>
..............</html>
..............""";

Since the opening delimiter is generally positioned to appear on the same line as the statement or expression which consumes the text block, there is no real significance to the fact that 14 visualized spaces start each line. Including those spaces in the content would mean the text block denotes a string different from the one denoted by the concatenated string literals. This would hurt migration, and be a recurring source of surprise: it is overwhelmingly likely that the developer does not want those spaces in the string. Also, the closing delimiter is generally positioned to align with the content, which further suggests that the 14 visualized spaces are insignificant.
...
Accordingly, an appropriate interpretation for the content of a text block is to differentiate incidental white space at the start and end of each line, from essential white space. The Java compiler processes the content by removing incidental white space to yield what the developer intended.
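
To make the migration argument in the quote concrete, here is a small sketch comparing the HTML text block with the equivalent concatenated string literals. The class and method names are hypothetical, and JDK 15 or later is assumed.

```java
class HtmlTextBlock {

    // Text block: the indentation up to the closing delimiter is incidental.
    static String asTextBlock() {
        return """
               <html>
                   <body>
                       <p>Hello, world</p>
                   </body>
               </html>
               """;
    }

    // The same content written as concatenated string literals.
    static String asConcatenation() {
        return "<html>\n"
             + "    <body>\n"
             + "        <p>Hello, world</p>\n"
             + "    </body>\n"
             + "</html>\n";
    }

    public static void main(String[] args) {
        // Both forms denote the same string once incidental white space is removed.
        System.out.println(asTextBlock().equals(asConcatenation()));
    }
}
```

Only the 4 and 8 spaces that indent the nested tags relative to the closing delimiter survive, because they are essential rather than incidental.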

Your assumption that

    Hello,
    Java 13
<empty line>

equals

....Hello,
....Java 13
<empty line>

is inaccurate, because those are essential white spaces, and they will not be removed by either the compiler or String#stripIndent.

For clarity, let's continue to represent incidental white space as dots.

String hello = """
....Hello,
....Java 13
....""";

String hello2 = """
    Hello,
    Java 13
""";

Let's print them.

Hello,
Java 13
<empty line>

    Hello,
    Java 13
<empty line>

Let's call String#stripIndent on both and print the results.

Hello,
Java 13
<empty line>

    Hello,
    Java 13
<empty line>

To understand why nothing changed, we need to look at the documentation.

String#stripIndent

Returns a string whose value is this string, with incidental white space removed from the beginning and end of every line.

Then, the minimum indentation (min) is determined as follows. For each non-blank line (as defined by isBlank()), the leading white space characters are counted. The leading white space characters on the last line are also counted even if blank. The min value is the smallest of these counts.

For each non-blank line, min leading white space characters are removed, and any trailing white space characters are removed. Blank lines are replaced with the empty string.

For both Strings, the minimum indentation is 0.

Hello,          // 0
Java 13         // 0    min(0, 0, 0) = 0 
<empty line>    // 0

    Hello,      // 4
    Java 13     // 4    min(4, 4, 0) = 0
<empty line>    // 0
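
The minimum computed above can be sketched in a few lines of Java. This is a simplified re-implementation written for this answer, not the JDK source; the class and method names are assumptions, and it only covers the "min" step described in the documentation.

```java
class MinIndent {

    // Number of leading white space characters on one line.
    static int leadingWhitespace(String line) {
        int i = 0;
        while (i < line.length() && Character.isWhitespace(line.charAt(i))) {
            i++;
        }
        return i;
    }

    // Minimum indentation: every non-blank line counts, and the last line
    // counts even if it is blank (or empty, after a final line terminator).
    static int minIndent(String s) {
        // The limit -1 keeps the empty "line" that follows a trailing terminator.
        String[] lines = s.split("\r\n|\r|\n", -1);
        int min = Integer.MAX_VALUE;
        for (int i = 0; i < lines.length; i++) {
            boolean isLast = i == lines.length - 1;
            if (isLast || !lines[i].isBlank()) {
                min = Math.min(min, leadingWhitespace(lines[i]));
            }
        }
        return min;
    }

    public static void main(String[] args) {
        System.out.println(minIndent("Hello,\nJava 13\n"));
        System.out.println(minIndent("    Hello,\n    Java 13\n"));
        System.out.println(minIndent("    Hello\n    Java13\n  "));
    }
}
```

For a string that ends with a line terminator the last line is empty, so the minimum is always 0 and stripIndent has no leading white space to remove.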

String#stripIndent gives developers access to a Java version of the re-indentation algorithm used by the compiler.

JEP 355

The re-indentation algorithm will be normative in The Java Language Specification. Developers will have access to it via String::stripIndent, a new instance method.

Specification for JEP 355

The string represented by a text block is not the literal sequence of characters in the content. Instead, the string represented by a text block is the result of applying the following transformations to the content, in order:

  1. Line terminators are normalized to the ASCII LF character (...)

  2. Incidental white space is removed, as if by execution of String::stripIndent on the characters in the content.

  3. Escape sequences are interpreted, as in a string literal.
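
The order of these steps is observable. In the following sketch (hypothetical names, JDK 15 or later assumed), an escaped \n inside a text block does not take part in the indentation calculation, whereas a real newline in a run-time string does.

```java
class EscapeOrder {

    // The \n escape is still two characters (backslash, n) while the
    // indentation is stripped, so "  second" is not a separate line yet.
    static String textBlock() {
        return """
            first\n  second
            """;
    }

    // For contrast: a run-time string that already contains a real newline.
    // Here "  second" is a line of its own and lowers the minimum to 2.
    static String alreadyInterpreted() {
        return "    first\n  second\n    ".stripIndent();
    }

    public static void main(String[] args) {
        System.out.println(textBlock().replace(" ", "."));
        System.out.println(alreadyInterpreted().replace(" ", "."));
    }
}
```

Because escape interpretation comes last, the two spaces written after \n in the text block are kept exactly as typed.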

TL;DR: your example strings are not equal, and Java is correct in telling you that they are not equal.

Consider reading the description of the String.stripIndent method. Here is a paraphrase from the jaxenter.com post:

The stripIndent method removes whitespace in front of multi-line strings that all lines have in common, i.e. moves the entire text to the left without changing the formatting.

Note the words "that all lines have in common".

Now apply "that all lines have in common" to the following string literal:

String hello2 = """
    Hello,
    First, notice that the final line of this example has zero spaces.
    Next, notice that all other lines of this example have non-zero spaces.
"""; // <--- This is a line in the text block.

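The point is that 0 != 4: the final line has 0 leading spaces while the other lines have 4, so no leading white space is common to all lines.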
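
One way to see that the final line is what blocks the stripping is to remove it before calling stripIndent. The sketch below is only an illustration under that assumption: the names are hypothetical, stripTrailing() is used purely to drop the final line terminator and is not something the quoted post recommends, and JDK 15 or later is assumed.

```java
class CommonIndent {

    // Same shape as hello2: content indented 4 spaces more than the delimiter.
    static String hello2() {
        return """
                Hello,
                Java 13
            """;
    }

    // With the final (empty, zero-indent) line present, nothing is common
    // to all lines, so the string comes back unchanged.
    static String stripped() {
        return hello2().stripIndent();
    }

    // Dropping the trailing line terminator first leaves only the two
    // content lines, which share 4 leading spaces.
    static String strippedWithoutFinalLine() {
        return hello2().stripTrailing().stripIndent();
    }

    public static void main(String[] args) {
        System.out.println(stripped().replace(" ", "."));
        System.out.println(strippedWithoutFinalLine().replace(" ", "."));
    }
}
```

Note that the second result no longer ends with a line terminator, so it still differs from the first text block in the question unless that one is trimmed the same way.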