无重复字符的最长公共子串的长度

Question

给定"abcabcbb"，答案是"abc"，长度是3

给定"bbbbb"，答案是"b"，长度为1。

给定"pwwkew"，答案是"wke"，长度为3。注意答案必须是子串，"pwke"是子序列，不是子字符串。

我想出了一个可行的解决方案，但在几个测试用例中都失败了。然后我找到了一个更好的解决方案并重写了它以尝试理解它。下面的解决方案完美无缺，但是在与这个东西斗争了大约 2 个小时之后，我仍然不明白为什么这行代码有效。

import java.util.*;
import java.math.*;

public class Solution {

  public int lengthOfLongestSubstring(String str) {

  if(str.length() == 0)
    return 0;

  HashMap<Character,Integer> map = new HashMap<>();
  int startingIndexOfLongestSubstring = 0;
  int max = 0;

  for(int i = 0; i < str.length(); i++){
      char currentChar = str.charAt(i); 
      if(map.containsKey(currentChar))
         startingIndexOfLongestSubstring = Math.max(startingIndexOfLongestSubstring, map.get(currentChar) + 1);

      map.put(currentChar, i);
      max = Math.max(max, i - startingIndexOfLongestSubstring + 1);

      }//End of loop

    return max;

   }
}

有问题的行是

max = Math.max(max, i - startingIndexOfLongestSubstring + 1);

我不明白为什么会这样。我们取之前最大值之间的最大值，以及当前索引与当前最长子字符串的起始索引之间的差异，然后加 1。我知道代码正在获取当前索引与startingIndexOfSubstring，但我无法概念化为什么它可以为我们提供预期的结果；有人可以向我解释这一步吗，特别是为什么它有效？

Answer 1

因为

i - startingIndexOfLongestSubstring + 1

是 i 和 startingIndexOfLongestSubstring 索引之间的字符数。例如位置 2 和 3 之间有多少个字符？ 3-2=1 但我们有 2 个字符：位置 2 和位置 3。

我已经在代码中描述了每个动作：

public class Solution {

    public int lengthOfLongestSubstring(String str) {

        if(str.length() == 0)
            return 0;

        HashMap<Character,Integer> map = new HashMap<>();
        int startingIndexOfLongestSubstring = 0;
        int max = 0;

        // loop over all characters in the string
        for(int i = 0; i < str.length(); i++){
            // get character at position i
            char currentChar = str.charAt(i);
            // if we already met this character
            if(map.containsKey(currentChar))
                // then get maximum of previous 'startingIndexOfLongestSubstring' and 
                // map.get(currentChar) + 1 (it is last occurrence of the current character in our word before plus 1)
                // "plus 1" - it is because we should start count from the next character because our current character 
                // is the same
                startingIndexOfLongestSubstring = Math.max(startingIndexOfLongestSubstring, map.get(currentChar) + 1);

            // save position of the current character in the map. If map already has some value for current character 
            // then it will override (we don't want to know previous positions of the character)
            map.put(currentChar, i);
            // get maximum between 'max' (candidate for return value) and such value for current character
            max = Math.max(max, i - startingIndexOfLongestSubstring + 1);

        }//End of loop

        return max;

    }
}

Answer 2

我平时不太会解释，举个例子试试吧。

字符串是 "wcabcdeghi".

暂时忘掉代码，假设我们正在努力想出一个逻辑。

我们从 w 开始，一直走到 c -> a -> b -> c。
我们需要在这一点上停下来，因为 "c" 正在重复。所以我们需要一个映射来存储一个字符是否重复。（在代码中： map.put(currentChar, i); ）
既然我们知道了一个字符是否重复，我们需要知道最大值是多少。到目前为止的长度。（在代码中-）max
现在我们知道跟踪前 2 个变量 w->c 的计数没有意义。这是因为包括这个在内，我们已经拿到了Max。价值。所以从下一次迭代开始，我们只需要检查 a -> b -> soon 的长度。
让我们有一个变量（在代码中 -）startingIndexOfLongestSubstring 来跟踪这个。（这应该被命名为 startingIndexOfNonRepetativeCharacter，然后我又不喜欢命名）。
现在我们再次继续，但是我们还没有最终确定如何跟踪我们当前正在解析的子字符串。（即来自 abcd...）
想到这一点，我只需要 "a" 所在的位置（即 startingIndexOfNonRepetativeCharacter），所以要知道当前子字符串的长度，我需要做的就是 (在代码中-)i - startingIndexOfLongestSubstring + 1（当前字符位置-非重复字符长度+（减法不包含两边所以加1）。我们称之为currentLength
等等，我们要用这个计数做什么。每次我们找到一个新变量时，我们都需要检查这个 currentLength 是否可以打破我们的最大值。所以（代码中-）max = Math.max(max, i - startingIndexOfLongestSubstring + 1);
现在我们已经涵盖了我们需要的大部分语句，并且根据我们的逻辑，每当我们遇到一个已经存在的变量时，我们所需要的只是 startingIndexOfLongestSubstring = map.get(currentChar)。那我们为什么要做 Max?
考虑 String 为 "wcabcdewghi" 的场景。当我们开始将新计数器处理为 a -> b -> c -> d -> e -> w 此时，我们的逻辑会检查该字符之前是否存在。从现在开始，它从索引“1”开始计数。这完全搞乱了整个计数。所以我们需要确保，我们从 map 中获取的下一个索引总是大于我们计数的起点（即，仅当字符出现在 startingIndexOfLongestSubstring 之前时，select 来自 map 的字符。

希望我已经回答了代码中的所有行，主要是如果解释是可以理解的。

无重复字符的最长公共子串的长度

Length of the Longest Common Substring without repeating characters

java

longest-substring