计算数组中的不同切片

Question

我正在尝试解决 this 问题。

An integer M and a non-empty zero-indexed array A consisting of N non-negative integers are given. All integers in array A are less than or equal to M.

A pair of integers (P, Q), such that 0 ≤ P ≤ Q < N, is called a slice of array A. The slice consists of the elements A[P], A[P + 1], ..., A[Q]. A distinct slice is a slice consisting of only unique numbers. That is, no individual number occurs more than once in the slice.

For example, consider integer M = 6 and array A such that:
A[0] = 3
A[1] = 4
A[2] = 5
A[3] = 5
A[4] = 2 
There are exactly nine distinct slices: (0, 0), (0, 1), (0, 2), (1, 1), (1,2), (2, 2), (3, 3), (3, 4) and (4, 4).

The goal is to calculate the number of distinct slices.

提前致谢。

#include <algorithm>
#include <cstring>
#include <cmath>
#define MAX 100002

// you can write to stdout for debugging purposes, e.g.
// cout << "this is a debug message" << endl;

using namespace std;

bool check[MAX];

int solution(int M, vector<int> &A) {
    memset(check, false, sizeof(check));
    int base = 0;
    int fibot = 0;
    int sum = 0;

    while(fibot < A.size()){

        if(check[A[fibot]]){
            base = fibot;
        }

        check[A[fibot]] = true;

   sum += fibot - base + 1;
        fibot += 1;  
    }

    return min(sum, 1000000000);
}

Answer 1

解法不正确，因为你的算法有误。

首先给大家举个反例。让A = {2, 1, 2}。第一次迭代：base = 0, fibot = 0, sum += 1. 没错。第二个：base = 0, fibot = 1、sum += 2。这也对。最后一步：fibot = 2，check[A[fibot]] is true，因此，base = 2。不过应该是1。所以你的代码 returns1 + 2 + 1 = 4 而正确答案 1 + 2 + 2 = 5.

正确的做法可能是这样的：从L = 0开始。对于从 0 到 n - 1 的每个 R，继续向右移动 L 直到子数组只包含不同的值（您可以维护每个值在一个数组并使用 A[R] 是唯一可以多次出现的元素这一事实）。

您的代码还有一个问题：如果 int 在测试平台上是 32 位类型，则 sum 变量可能溢出（例如，如果 [=30= 的所有元素] 是不同的）。

至于为什么你的算法不正确的问题，我一开始就不知道为什么它应该是正确的。你能证明这个吗？ base = fibot 分配对我来说看起来很随意。

Answer 2

解决方案 100% 使用 Ruby

LIMIT = 1_000_000_000

def solution(_m, a)
  a.each_with_index.inject([0, {}]) do |(result, slice), (back, i)|
    return LIMIT if result >= LIMIT
    slice[back] = true
    a[(i + slice.size)..-1].each do |front|
      break if slice[front]
      slice[front] = true
    end
    slice.delete back
    [result + slice.size, slice]
  end.first + a.size
end

Answer 3

使用 Caterpillar 算法和 S(n+1) = S(n) + n + 1 的公式，其中 S(n) 是 n 元素数组 java 的切片计数 java 解决方案可以是：

 public int solution(int top, int[] numbers) {
        int len = numbers.length;
        long count = 0;

        if (len == 1) return 1;

        int front = 0;
        int[] counter = new int[top + 1];

        for (int i = 0; i < len; i++) {
            while(front < len && counter[numbers[front]] == 0 ) {
                count += front - i + 1;
                counter[numbers[front++]] = 1;
            }

            while(front < len && numbers[i] != numbers[front] && i < front) {
                counter[numbers[i++]] = 0;
            }

            counter[numbers[i]] = 0;

            if (count > 1_000_000_000) {
                return 1_000_000_000;
            }
        }

        return count;
    }

Answer 4

100% python 解决方案对我有帮助，感谢 https://www.martinkysel.com/codility-countdistinctslices-solution/

def solution(M, A):
    
        the_sum = 0
    
        front = back = 0
    
        seen = [False] * (M+1)
    
        while (front < len(A) and back < len(A)):
    
            while (front < len(A) and seen[A[front]] != True):
    
                the_sum += (front-back+1)
    
                seen[A[front]] = True
    
                front += 1
    
            else:
    
                while front < len(A) and back < len(A) and A[back] != A[front]:
    
                    seen[A[back]] = False
    
                    back += 1
    
                seen[A[back]] = False
    
                back += 1
                   
        return min(the_sum, 1000000000)

Answer 5

我想分享一下我用C++实现的算法的解释，然后是实际实现。

请注意，不同切片的最小数量是 N，因为每个元素都是一个不同的单项切片。
从第一个元素开始向后索引。
从第一个元素开始前面的索引。
向前推进，直到我们在序列中找到重复项。
在每次迭代中，将计数器增加必要的数量，这是前后的区别。
如果我们在任何迭代中达到最大计数，则立即 return 进行轻微优化。
在序列的每次迭代中，记录出现的元素。
找到重复项后，将后向索引提前一个。
当我们推进后向索引时，清除所有出现的元素，因为我们在这些元素之外开始一个新的切片。

此解决方案的运行时复杂度为 O(N)，因为我们遍历每个
元素.

这个解决方案的 space 复杂度是 O(M) 因为我们有一个散列要存储序列中出现的元素。这个hash的最大元素是M.

int solution(int M, vector<int> &A)                                             
{                                                                               
  int N = A.size();                                                             
  int distinct_slices = N;                                                      
  vector<bool> seq_hash(M + 1, false);                                          
  for (int back = 0, front = 0; front < N; ++back) {                            
    while (front < N and !seq_hash[A[front]]) { distinct_slices += front - back; if (distinct_slices > 1000000000) return 1000000000; seq_hash[A[front++]] = true; }
    while (front < N and back < N and A[back] != A[front]) seq_hash[A[back++]] = false;
    seq_hash[A[back]] = false;                                                  
  }                                                                             
                                                                                
  return distinct_slices;                                                       
}

计算数组中的不同切片

count distinct slices in an array

c++

algorithm

vector

problem-steps-recorder