字符串解的秩

Question

我正在回答一个问题，它要求您在按字典顺序排序的排列中找到字符串的排名。

O(N^2) 很清楚。

一些网站也有 O(n) solution。优化的部分基本上是预填充 count array 这样

count[i] contains count of characters which are present in str and are smaller than i.

我知道这会降低复杂性，但我无法理解我们计算此数组的方式。这是执行此操作的函数（取自 link）：

// Construct a count array where value at every index
// contains count of smaller characters in whole string
void populateAndIncreaseCount (int* count, char* str)
{
    int i;

    for( i = 0; str[i]; ++i )
        ++count[ str[i] ];

    for( i = 1; i < 256; ++i )
        count[i] += count[i-1];
}

有人可以对这个功能进行直观的解释吗？

Answer 1

该解决方案是执行 Bucket Sort 然后对输出进行排序。

桶排序是 O(items + number_of_possible_distinct_inputs)，对于固定的字母表，可以将其宣传为 O(n)。

然而在实践中，UTF 会产生一个相当大的字母表。因此，我建议改用快速排序。因为分<、>、=三个桶的quicksort对大字符集效率高，但还是占了小字符集的优势。

Answer 2

再过一遍就明白了。由于 c++ 中的错误语法而感到困惑。它实际上在做一件非常简单的事情（这是 java 版本：

void populateAndIncreaseCount(int[] count, String str) {
    // count is initialized to zero for all indices
    for (int i = 0; i < str.length(); ++i) {
      count[str.charAt(i)]++;
    }

    for (int i = 1; i < 256; ++i)
      count[i] += count[i - 1];
  }

在第一步之后，字符串中出现的字符的索引是非零的。然后，对于计数数组中的每个索引，它将是 index-1 之前所有计数的总和，因为数组表示按字典顺序排序的字符。并且，在每次搜索之后，我们还更新计数数组：

// 从 count[] 数组中删除一个字符 ch // 由 populateAndIncreaseCount()

构造

void updatecount (int* count, char ch)
{
    int i;
    for( i = ch; i < MAX_CHAR; ++i )
        --count[i];
}

字符串解的秩

Rank of string solution

algorithm

string-algorithm