在 C++ 中查找不适用于特定测试用例的最长子字符串

Question

Q) 没有重复字符的最长子串

给定"abcabcbb"，答案是"abc"，长度是3。

给定"bbbbb"，答案是"b"，长度为1.

给定"pwwkew"，答案是"wke"，长度为3。注意答案必须是子串，"pwke"是子序列，不是子串。

我的回答：

class Solution {
public:
int lengthOfLongestSubstring(string s) {
    int count[256];
    int dummy=0;
    int c=0;
    for(int i=0;i<256;i++){
        count[i]=0;
    }
    for(int i=0;i<s.length();i++){
        count[s[i]]++;
        if(count[s[i]]==1){
        c++;
        i++;
        //cout<<c;
        }
        else{
        //cout<<C;
        dummy=dummy>c?dummy:c;
        //cout<<dummy;
        c=0;
            for(int j=0;j<256;j++){
                count[j]=0;
            }
            i++;
        }

    }
     return dummy;
}
};

我知道这不是最佳解决方案，但我是菜鸟。此代码适用于许多测试用例，但字符串 "pwwkew" 除外，原始答案为 3 但我得到 0.

我尝试过的：

对于字符串 "abcabcbb" 我得到正确的输出是 3.

对于字符串 "bbbb" 我得到正确的输出是 1.

对于特定的字符串 "pwwkew" 我得到的是 0。在这种情况下答案应该是 3。

我尝试打印这些值来检查哪里出错了。对于第三种情况，它似乎没有输入 else 语句。 cout 在 if 语句中打印 123。但它应该为 pw 打印 12。 else 语句中的 cout 不起作用。

为什么只有这种情况我得到错误的输出？

请帮我解决这个问题。

Answer 1

我认为 OP 的算法有问题（或至少缺失）。

我没有完全理解该算法，但至少在一定程度上可以计算字符的出现次数。如果特定字符的 count 不是 1，则发现重复项并且 count 数组被完全重置。恕我直言，这是错误的观点。 count 数组退化为标志数组（这还不够）。

反例pwwkwe:

重复 s[2]
重置count（即重启）
重复 s[4]
重置count（即重启）。

因此，检测到的最长子串的长度最多为 2。但从 k 开始，它实际上是 3 (kwe)。

所以，我想出了一个不同的主意：

一旦发现重复项，最长的子字符串实际上可能就在该字符之前出现的位置之后开始。考虑到这一点，副本不再是副本。

这听起来可能有点令人困惑。可能是，当我展示我是如何解决它时，它会变得更容易理解：

count 不再用作标志数组。相反，最后一次出现的索引存储在那里（对于每个字符）。因此，对于每个字符，可以检查到它之前出现的距离（没有重复的字符）。但是，它必须没有任何重复。因此，我额外引入了一个起始索引 (i0)，它总是在上一次出现的字符之后设置一次。（不考虑起始索引之前的重复项。）这样，在确定正确的子字符串时只考虑最后一个重复项。

在代码中：

#include <iostream>
#include <string>

/* determines longest substring without duplicated characters.
 * param s ... string to evaluate
 * return: pair of
 *   first  ... start index of found substring
 *   second ... length of found substring
 */
std::pair<int, int> lengthOfLongestSubstring(const std::string &s)
{
  // index of last occurrence for each character
  // (-1 ... character not yet occurred)
  int iChrLast[256];
  for (int &i : iChrLast) i = -1;
  // result start index, length
  std::pair<int, int> result(0, 0);
  // check all characters (i0 ... current start)
  for (int i = 0, i0 = 0; i < (int)s.length(); ++i) {
    // cast char to index (via (unsigned char) to prevent negative indices)
    const int c = (unsigned char)s[i];
    // check if there is a duplicate after current start
    if (iChrLast[c] >= i0) i0 = iChrLast[c] + 1;
    // compute length with current start
    const int len = i - i0 + 1;
    // check whether new max. length found (update in case)
    if (result.second < len) result = std::make_pair(i0, len);
    // remember last occurrence of this char
    iChrLast[c] = i;
  }
  // done
  return result;
}

int main()
{
  const std::string tests[] = {
    "abcabcbb",
    "bbbbb",
    "pwwkew"
  };
  for (const std::string &test : tests) {
    const std::pair<int, int> result = lengthOfLongestSubstring(test);
    std::cout << test << ": " << result.first << ", " << result.second
      << " ("<< test.substr(result.first, result.second) << ")\n";
  }
  return 0;
}

输出：

abcabcbb: 0, 3 (abc)
bbbbb: 0, 1 (b)
pwwkew: 2, 3 (wke)

Live Demo on coliru

我不得不承认我同意 Craig Young 的观点

Programming is not about learning what mistakes to "not repeat". It's about understanding the state changes going on, the cause-and-effect, and figuring when & why it doesn't do as you expect.

我假设需要某种回溯甚至递归来解决这个问题（作为一个谜题）。所以，当我终于找到这个解决方案时，我自己也有点惊讶。（一旦我得到它运行我无法抗拒 post 这个作为答案。）

你是如何学习算法的？可能是天赋，可能是经验，肯定还有很多练习...

在 C++ 中查找不适用于特定测试用例的最长子字符串

Find longest substring not working for a particular test case in c++

c++

string

hashtable