Does String interning cause a String to be both in heap and in native memory?

Here is the Javadoc for Java's String#intern:

/**
 * Returns a canonical representation for the string object.
 * <p>
 * A pool of strings, initially empty, is maintained privately by the
 * class {@code String}.
 * <p>
 * When the intern method is invoked, if the pool already contains a
 * string equal to this {@code String} object as determined by
 * the {@link #equals(Object)} method, then the string from the pool is
 * returned. Otherwise, this {@code String} object is added to the
 * pool and a reference to this {@code String} object is returned.
 * <p>
 * It follows that for any two strings {@code s} and {@code t},
 * {@code s.intern() == t.intern()} is {@code true}
 * if and only if {@code s.equals(t)} is {@code true}.
 * <p>
 * All literal strings and string-valued constant expressions are
 * interned. String literals are defined in section 3.10.5 of the
 * <cite>The Java&trade; Language Specification</cite>.
 *
 * @return  a string that has the same contents as this string, but is
 *          guaranteed to be from a pool of unique strings.
 */

Suppose I have the following code:

String ref1 = "ref";
String ref2 = ref1.intern();

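At the point where ref2 is initialized, is ref1 still on the heap? I ask because, if it is, interning a string without dropping the original reference would double the RSS memory used by the Java process.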

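In your example, yes, ref1 is still on the heap, but only because ref1 and ref2 both point to the same instance. You initialize ref1 with a string literal, and string literals are interned automatically, as the Java Language Specification describes: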

Moreover, a string literal always refers to the same instance of class String. This is because string literals - or, more generally, strings that are the values of constant expressions (§15.28) - are "interned" so as to share unique instances, using the method String.intern.

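So there is no double memory usage (unless you count the fact that the string also exists in a separate memory area that holds the class's constant pool contents and all the class structure information).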
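A minimal sketch to confirm this for the two lines from the question. The class and method names (LiteralInternCheck, literalIsAlreadyInterned) are made up for illustration; only String#intern and the == comparison come from the discussion above.

```java
public class LiteralInternCheck {

    // Returns true when the literal and the result of intern() are the
    // very same object, i.e. no second copy of the string was created.
    static boolean literalIsAlreadyInterned() {
        String ref1 = "ref";          // literal: already in the string pool
        String ref2 = ref1.intern();  // looks up the pool, finds ref1 itself
        return ref1 == ref2;          // reference comparison, not equals()
    }

    public static void main(String[] args) {
        System.out.println(literalIsAlreadyInterned());
    }
}
```

Since both variables hold the same reference, calling intern() on a literal does not add a second String object.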

For a more detailed look at how interning actually works, consider this example:

public class Intern{
    public static void main(String... args){
        String str1="TestStr";
        String str2="TestStr";
        System.out.println("1. "+(str1==str2));
        String str3=str1.intern();
        System.out.println("2. "+(str1==str3));
        String str4=new String("TestStr");
        System.out.println("3. "+(str1==str4));
        String str5=str4.intern();
        System.out.println("4. "+(str4==str5));
        System.out.println("5. "+(str1==str5));
    }
}

You will get this output:

1. true

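Strings loaded from the constant pool are interned in the string pool automatically, so the result is true: both variables refer to the same interned object.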

2. true

str3 refers to a string instance that was already interned.

3. false

str4 is a new instance, unrelated to the earlier ones.

4. false

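The throwaway str4 instance is not the object that has been in the string pool since the start, so intern() returns the pooled object rather than str4.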

5. true

str5 points to our interned string, as expected.
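The example above only covers strings whose content is already in the pool. The complementary case, sketched below, is a string built at run time whose content is not yet pooled. According to the Javadoc quoted earlier, that very object is added to the pool and returned, so again no second copy appears. This sketch assumes a HotSpot JVM on Java 7 or later and assumes that no equal string has been interned before; the class and method names are hypothetical.

```java
public class RuntimeInternCheck {

    // Builds a string at run time so that it is not a compile-time constant
    // and is (almost certainly) not in the pool yet.
    static String buildAtRuntime() {
        return new StringBuilder("intern-demo-")
                .append(System.nanoTime())
                .toString();
    }

    // First intern() of unseen content: the pool stores this same object.
    static boolean firstInternReturnsSameObject() {
        String s = buildAtRuntime();
        return s.intern() == s;
    }

    // A later copy with equal content is a distinct object, but interning
    // it yields the object that was pooled first.
    static boolean copyInternsToPooledObject() {
        String s = buildAtRuntime();
        String pooled = s.intern();
        String copy = new String(s);
        return copy != pooled && copy.intern() == pooled;
    }

    public static void main(String[] args) {
        System.out.println(firstInternReturnsSameObject());
        System.out.println(copyInternsToPooledObject());
    }
}
```

Memory is duplicated only while a non-pooled copy such as copy above stays reachable; once it is dropped, only the pooled instance remains.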

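It is important to note that before Java 7 (Oracle implementation), interned strings were stored in PermGen (which no longer exists as of Java 8); since Java 7 they live on the heap. As a result, heavy use of interning on older JVM versions can cause peculiar memory problems.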

For more information on how interned strings are managed in different versions, check out this nice post.