When are string literals loaded into the StringTable in the Java HotSpot VM?

Question

This question came up while I was studying the java.lang.String Java API.

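I found an article written in Chinese: Java 中new String("字面量") 中 "字面量" 是何时进入字符串常量池的? (roughly: in new String("literal"), when does "literal" enter the string constant pool?)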

It says that CONSTANT_String entries are resolved lazily in the HotSpot VM, so a string literal is not loaded into the StringTable until it is first used.

I also found a related statement in the JVM specification:

For example, a Java Virtual Machine implementation may choose to resolve each symbolic reference in a class or interface individually when it is used ("lazy" or "late" resolution), or to resolve them all at once when the class is being verified ("eager" or "static" resolution).

I found some OpenJDK code for ldc:

IRT_ENTRY(void, InterpreterRuntime::ldc(JavaThread* thread, bool wide))  
  // access constant pool  
  constantPoolOop pool = method(thread)->constants();  
  int index = wide ? get_index_u2(thread, Bytecodes::_ldc_w) :get_index_u1(thread, Bytecodes::_ldc);  
  constantTag tag = pool->tag_at(index);  

  if (tag.is_unresolved_klass() || tag.is_klass()) {  
    klassOop klass = pool->klass_at(index, CHECK);  
    oop java_class = klass->java_mirror();  
    thread->set_vm_result(java_class);  
  } else {  
#ifdef ASSERT  
    // If we entered this runtime routine, we believed the tag contained  
    // an unresolved string, an unresolved class or a resolved class.  
    // However, another thread could have resolved the unresolved string  
    // or class by the time we go there.  
    assert(tag.is_unresolved_string()|| tag.is_string(), "expected string");  
#endif  
    oop s_oop = pool->string_at(index, CHECK);  
    thread->set_vm_result(s_oop);  
  }  
IRT_END

and the code behind pool->string_at(index, CHECK):

oop constantPoolOopDesc::string_at_impl(constantPoolHandle this_oop, int which, TRAPS) {  
  oop str = NULL;  
  CPSlot entry = this_oop->slot_at(which);  
  if (entry.is_metadata()) {  
    ObjectLocker ol(this_oop, THREAD);  
    if (this_oop->tag_at(which).is_unresolved_string()) {  
      // Intern string  
      Symbol* sym = this_oop->unresolved_string_at(which);  
      str = StringTable::intern(sym, CHECK_(constantPoolOop(NULL)));  
      this_oop->string_at_put(which, str);  
   } else {  
      // Another thread beat us and interned string, read string from constant pool  
     str = this_oop->resolved_string_at(which);  
    }  
  } else {  
    str = entry.get_oop();  
  }  
  assert(java_lang_String::is_instance(str), "must be string");  
  return str;  
}
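To make the control flow of string_at_impl easier to follow without C++ knowledge, here is a toy model in Java. It is only a sketch of the "resolve on first use, then cache" pattern. The class LazyConstantPool, the STRING_TABLE map and the stringAt method are invented for illustration and are not HotSpot code.

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Toy model only: the names echo the HotSpot snippet, but nothing here is HotSpot code.
class LazyConstantPool {
    // Stand-in for the VM-wide StringTable: content -> canonical instance.
    static final Map<String, String> STRING_TABLE = new ConcurrentHashMap<>();

    private final String[] symbols;  // raw entries, as if read from the class file
    private final String[] resolved; // null until the entry has been resolved

    LazyConstantPool(String... symbols) {
        this.symbols = symbols.clone();
        this.resolved = new String[symbols.length];
    }

    // Rough counterpart of string_at_impl: intern on first use, reuse afterwards.
    synchronized String stringAt(int index) {
        if (resolved[index] == null) {
            // Unresolved entry: look up or create the canonical instance.
            resolved[index] = STRING_TABLE.computeIfAbsent(symbols[index], String::new);
        }
        return resolved[index];
    }

    public static void main(String[] args) {
        LazyConstantPool pool = new LazyConstantPool("hello", "unused");
        System.out.println(STRING_TABLE.containsKey("hello"));  // false
        pool.stringAt(0);                                       // simulated ldc
        System.out.println(STRING_TABLE.containsKey("hello"));  // true
        System.out.println(STRING_TABLE.containsKey("unused")); // false
    }
}

In this model an entry that is never requested never reaches the table, which is the behaviour the question asks about.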

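However, this code only proves that a string literal may stay out of the StringTable until ldc executes. It does not prove that the HotSpot VM always resolves string literals lazily.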

Can someone explain this explicitly?

FYI, I know a little C, but no C++.

Thanks!

Answer 1

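There is a corner case that lets a Java application check whether a string already existed in the pool before the test, but the check works only once per string. Combined with string literals of the same content, it can be used to detect lazy loading: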

public class Test {
    public static void main(String[] args) {
        test('h', 'e', 'l', 'l', 'o');
        test('m', 'a', 'i', 'n');
    }
    static void test(char... arg) {
        String s1 = new String(arg), s2 = s1.intern();
        System.out.println('"'+s1+'"'
            +(s1!=s2? " existed": " did not exist")+" in the pool before");
        System.out.println("is the same as \"hello\": "+(s2=="hello"));
        System.out.println("is the same as \"main\": "+(s2=="main"));
        System.out.println();
    }
}

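The test first creates a new string instance, which is not in the pool. It then calls intern() on it and compares the references. There are three possible scenarios: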

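1. A string with the same content already exists in the pool. That string is returned, and it must be a different object from our string, which is not in the pool.
2. Our string is added to the pool and returned. In this case, both references are identical.
3. A new string with the same content is created and added to the pool. The returned reference is then different from ours.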

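We cannot distinguish between 1 and 3, so if a JVM generally adds a new string to the pool in intern(), we are out of luck. But if it adds the very instance we are calling intern() on, we can identify scenario 2 and know for sure that the string was not in the pool, but was added as a side effect of our test.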
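The scenarios can be told apart with two runtime-built strings of identical content. This is a minimal sketch. The class name InternScenarios, the helper addedItself and the content "qx7_probe" are made up for illustration, and the first result assumes a VM that pools the receiver of intern(), as described above.

class InternScenarios {
    // True when intern() returned the very instance it was called on,
    // i.e. scenario 2: the content was not pooled and this instance was added.
    static boolean addedItself(String fresh) {
        return fresh.intern() == fresh;
    }

    public static void main(String[] args) {
        // Arbitrary content, deliberately never written as a string literal.
        char[] content = {'q', 'x', '7', '_', 'p', 'r', 'o', 'b', 'e'};
        String first = new String(content);
        String second = new String(content);

        // Scenario 2 if the VM pools the receiver; otherwise scenario 3.
        System.out.println(addedItself(first));
        // The content is pooled by now, so this is scenario 1: prints false.
        System.out.println(addedItself(second));
    }
}

The second call shows why the check works only once per string: after the first intern() call the content is in the pool no matter which scenario applied.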

On my machine, it prints:

"hello" did not exist in the pool before
is the same as "hello": true
is the same as "main": false

"main" existed in the pool before
is the same as "hello": false
is the same as "main": true

Also on Ideone

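The output shows that "hello" did not exist in the pool when the test method was entered for the first time, even though there is a string literal "hello" later in the code. This proves that the string literal is resolved lazily. Since we had already added a hello string manually, the string literal with the same content then resolved to that same instance.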

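In contrast, the "main" string already exists in the pool, which is easy to explain: the Java launcher searches for the main method to execute and adds that string to the pool as a side effect.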

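If we swap the order of the tests to test('m', 'a', 'i', 'n'); test('h', 'e', 'l', 'l', 'o');, the "hello" string literal is used during the first test invocation and remains in the pool, so by the time we test it in the second invocation, the string already exists.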
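The effect of ordering can be isolated from the launcher's "main" string with two literals that appear nowhere else. This is a sketch under the same assumption as before (the VM pools the receiver of intern()). The class LiteralOrderDemo, its methods and the contents "qz1" and "qz2" are invented for illustration.

class LiteralOrderDemo {
    // True if a string with this content was already pooled before the call.
    // Assumes the VM pools the receiver of intern() when the content is new.
    static boolean existedBefore(char... chars) {
        String fresh = new String(chars);
        return fresh.intern() != fresh;
    }

    // The ldc instruction for each literal runs only when its method is invoked.
    static String probedFirst() { return "qz1"; }
    static String usedFirst()   { return "qz2"; }

    public static void main(String[] args) {
        // Probe before the literal "qz1" has ever been executed.
        boolean beforeUse = existedBefore('q', 'z', '1');
        probedFirst();

        // Execute the literal "qz2" first, then probe.
        usedFirst();
        boolean afterUse = existedBefore('q', 'z', '2');

        // Under lazy resolution the first value is expected to be false.
        System.out.println("probe before use: " + beforeUse);
        // The literal is already pooled here, so this prints true.
        System.out.println("probe after use: " + afterUse);
    }
}

Only the second result is guaranteed by the language rules for string literals. The first one is the observation that depends on lazy resolution.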

java

jvm-hotspot