欺诈性 Activity 通知中的超时错误 HackerRank

Timeout Error in Fraudulent Activity Notification HackerRank

我正在解决这个问题:Farudulent Activity Notification 在 HackerRank 上。我已经完成了我的代码并且正在工作,但是它 低效 以及非常大的输入。

I don't know but after all my efforts, I am able to give out good solution to a problem of a MEDIUM LEVEL but this timeout error happens every time for very large inputs. I have tried optimizing my code and still I get timeout errors. My agendas for this question and upcoming questions are:

  • How to put efficiency for very large inputs. What kind of intellect it requires.
  • How to reach to that level. What should I prepare for this.
  • Code optimization

我乐于学习,我真的很想学习如何编写更高级和优化的代码来让自己变得更好。我愿意努力工作。

我的算法:

  1. For this problem we must go from incrementing variable i till len(givenArray)-d
  2. Take a variable for the next variable to be compared, my case iterate is the variable
  3. Pass the values to the particular array to the method for counting countFraud()
  4. Add it to the count variable
  5. Increment iterate variable

代码:

# this is for counting the trailing array
def countFraud(arr, nextNum):
    count = 0
    median = 0.0
    d = len(arr)
    #for calculating the median correctly
    arr.sort()
    if d%2 != 0:
        median = arr[int(d/2)]
    else:
        n = int(d/2)
        median = (arr[n] + arr[n-1]) / 2

    #now the opeartion for count from the array
    if nextNum >= 2*median: count += 1
    return count

# Complete the activityNotifications function below.
def activityNotifications(expenditure, d):
    count = 0
    iterate = d
    # it will go upto the len of array - d, so that it will go upto the d length
    # leaving the last element everytime for the comparision
    for i in range(len(expenditure)-d):
        count += countFraud(expenditure[i:iterate], expenditure[iterate])
        iterate += 1
    return count

之前我做了两个循环,将项目添加到 new_array 并将其传递到 countFraud()。但现在我已经优化了它并使它有点 O(N)

我不知道,但由于 Timeout Error,此代码并未提交给所有 TC。操作部分没有问题。这只是代码的效率。

超时错误输入示例:

200000 10000

Link 输入 - Input Data

预期输出:

633

我已阅读这篇文章:HackerRank Environment以了解计时问题。 Python/Python310秒。我的代码肯定比 values greater than 10^3 or 4 花费的更多。

虽然我的代码已经成功通过了 3 个 TC。 请帮忙。谢谢 :)

我在这个问题上花了很多时间,并提出了我自己的新算法,它也给出了超过时间限制(TLE)并且只通过了三个测试用例。

#include <iostream>
#include <cstring>
#include <cmath>
#include <cstring>
using namespace std;
int maxlen=1,minlen=1,heapsize;
double median=0,ct=0;
void min_heapify(double arr[],int i)
{
    int l=(2*i);
    int r=(2*i+1);
    int smallest;
    if(l<=heapsize && arr[l]<arr[i])
    {
        smallest=l;
    }
    else
    {
        smallest=i;
    }
    if(r<=heapsize && arr[r]<arr[smallest])
    {
        smallest=r;
    }
    if(smallest==i)
        return;
    if(smallest!=i)
    {
        double swap=arr[i];
        arr[i]=arr[smallest];
        arr[smallest]=swap;
    }
    min_heapify(arr,smallest);
}
void max_heapify(double arr[], int i)
{
    int l=(2*i);
    int r=(2*i+1);
    int largest;
    if(l<=heapsize && arr[l]>arr[i])
    {
        largest=l;
    }
    else
    {
        largest=i;
    }
    if(r<=heapsize && arr[r]>arr[largest])
    {
        largest=r;
    }
    if(largest==i)
        return;
    if(largest!=i)
    {
        double swap=arr[i];
        arr[i]=arr[largest];
        arr[largest]=swap;
    }
    max_heapify(arr,largest);
}
void insert_valuein_minheap(double minh[], int i, double val)
{
    minh[i]=val;
    while(i>1 && minh[i/2]>minh[i])
    {
        double temp=minh[i/2];
        minh[i/2]=minh[i];
        minh[i]=temp;
        i=i/2;
    }
}
void insert_valuein_maxheap(double maxh[], int i, double val)
{
    maxh[i]=val;
    while(i>1 && maxh[i/2]<maxh[i])
    {
        double temp=maxh[i/2];
        maxh[i/2]=maxh[i];
        maxh[i]=temp;
        i=i/2;
    }
}
void insert_element(double maxh[], double minh[], double val, int size)
{
    if(val<=maxh[1])
    {
        maxlen+=1;
        insert_valuein_maxheap(maxh,maxlen,val);
    }
    else
    {
        minlen+=1;
        insert_valuein_minheap(minh,minlen,val);
    }
    if(maxlen==minlen)
    {
        median=(maxh[1]+minh[1])/2;
        ct=1;
        return;
    }
    if(maxlen<minlen)
    {
        maxlen+=1;
        insert_valuein_maxheap(maxh,maxlen,minh[1]);
        double temp=minh[1];
        minh[1]=minh[minlen];
        minh[minlen]=temp;
        minlen-=1;
        heapsize=minlen;
        min_heapify(minh,1);
    }
    else
    {
        minlen+=1;
        insert_valuein_minheap(minh,minlen,maxh[1]);
        double temp=maxh[1];
        maxh[1]=maxh[maxlen];
        maxh[maxlen]=temp;
        maxlen-=1;
        heapsize=maxlen;
        max_heapify(maxh,1);
    }
}
int main()
{
    int n,td,notif=0;
    cin>>n>>td;
    double array[n+1],maxh[n+1]={},minh[n+1]={};
    for(int i=1;i<=n;i++)
    {
        cin>>array[i];
    }
    double first,second;
    for(int i=1,j;i<=n-td;i++)
    {
        int count=2;
        first=array[i];
        second=array[i+1];
        if(first<=second)
        {
            maxh[1]=first;
            minh[1]=second;
        }
        else
        {
            maxh[1]=first;
            minh[1]=second;
        }
        maxlen=1;minlen=1;ct=0;
        for(j=i+2;count!=td;j++)
        {
            insert_element(maxh,minh,array[j],j);
            count++;
        }
        if(td%2!=0)
        {
            if(maxlen>minlen)
                median=maxh[1];
            else
                median=minh[1];
        }
        else if(ct==0)
        {
            median=(maxh[1]+minh[1])/2;
        }
        float nota=array[j];
        if(nota>=2*median)
        {
            notif++;
        }
    }
    cout<<notif<<endl;
}

因为实际上没有人给我答案。我真的必须看看排行榜中的解决方案。我发现每一种解决方案都难以消化,只有一种解决方案才是好的解决方案。

免责声明:这是一些高级编码技术,因此在继续解决方案之前,您需要更好地了解语言。

解决方案的算法:

  1. This takes two arrays, one is t having total number of array elem and other one let us name it as listD just the first d elements in the sorted manner
  2. A function to return the median value with the list containing first d elements
  3. With the loop starting from the d and going till n-1, if t[i] >= 2*median(): increment var noti
  4. Remove the first element from the listD using PYTHON BISECT ALGORITHM and add it the t[i] to the listD using PYTHON INSORT ALGORITHM in sorted manner
  5. Return noti

代码:

from bisect import bisect_left, insort_left

n, d = map(int, input().split())
t = list(map(int, input().split()))
noti = 0

listD = sorted(t[:d])

def median():
  return listD[d//2] if d%2 == 1 else ((listD[d//2] + listD[d//2-1])/2)

for i in range(d,n):
  if t[i] >= 2*median(): noti += 1
  del listD[bisect_left(listD, t[i-d])]
  insort_left(listD, t[i])
print(noti)

在这里,我们使用了BISECTINSORT,它们的作用基本上是,return要添加的元素的位置和returns添加元素后的排序列表。因此减少了一次又一次对数组进行排序的头痛,从而降低了时间复杂度并解决了所有测试用例。

您可以在这里阅读:Python Bisect and Insort Algo。谢谢,希望它能帮助将来的某个人。

这个通过了所有测试用例:-

    public static double findMedian(int a[]) {
        int n = a.length;
        if (n % 2 != 0)
            return (double) a[n / 2];

        return (double) (a[(n - 1) / 2] + a[n / 2]) / 2.0;
    }

    static void swap(int[] arr, int i, int j) {
        int temp = arr[i];
        arr[i] = arr[j];
        arr[j] = temp;
    }

    static int activityNotifications(int[] expenditure, int d) {
        if (d >= expenditure.length) return 0;

        int numNotifications = 0;
        int[] trailingArr = new int[d];
        for (int i = 0; i < trailingArr.length; i++) {
            trailingArr[i] = expenditure[i];
        }
        Arrays.sort(trailingArr);
        for (int i = d; i < expenditure.length; i++) {
            double median = findMedian(trailingArr);
            if (expenditure[i] >= 2.0 * median) {
                numNotifications += 1;
            }
            int nextToRemoveElement = expenditure[i - d];
            int toInsertElement = expenditure[i];
            adjustTrailingArray(trailingArr, nextToRemoveElement, toInsertElement);
        }
        return numNotifications;
    }

    //This whole thing is O(d) time. Note that we are not sorting again as trailing array was already sorted
    // as preprocessing and now only one new element has to find its position in sorted array.

    private static void adjustTrailingArray(int[] trailingArr, int elementToRemove, int elementToInsert) {
        if (elementToInsert == elementToRemove) return;
        int foundIndex = 0;

        //The first element of unsorted trailing array will move out of the sliding window
        //Since the trailing array was sorted by us, we have lost the position of its first element in original array.
        //Hence, I search linearly for it and replace it with the new element.

        while (foundIndex < trailingArr.length) {
            if (trailingArr[foundIndex] != elementToRemove) {
                foundIndex++;
            } else {
                trailingArr[foundIndex] = elementToInsert;
                break;
            }
        }

        //Now we bubble the new element just inserted using bubble sort to left/right based on whether it was bigger
        //or smaller than the element that got removed.

        if (elementToInsert > elementToRemove) {
            int i = foundIndex;
            while (i < trailingArr.length - 1) {
                if (trailingArr[i] > trailingArr[i + 1]) {
                    swap(trailingArr, i, i + 1);
                    i += 1;
                } else break;
            }
        } else {
            int i = foundIndex;
            while (i > 0) {
                if (trailingArr[i] < trailingArr[i - 1]) {
                    swap(trailingArr, i, i - 1);
                    i -= 1;
                } else break;
            }
        }
    }

不知道为什么没有人提到中位数的中位数算法,从数组中求中位数的复杂度是O(n),而且与顺序无关数组。

我们可以在这里使用计数排序技术。这里棘手的是,我们不能在每次向前移动范围时对整个范围进行排序,因为这会增加时间复杂度,相反,我们应该只修改频率数组,然后我们可以找到中位数,简单地将频率从范围的开始,直到总和变得大于或等于 d/2.

这里要注意的重要一点:奇数和偶数的中位数 'd' 略有不同。

int median(int arr[], int d)
{
    int med;
    
    int sum = 0;
    for(int i = 0; i <= 200; i++)
    {
        sum = sum + arr[i];
        if(sum>=d)
        {
            med = i;
            break;
        }
    }
    return med;
}

int activityNotifications(vector<int> expenditure, int d) {
    int count  = 0;
    int n = expenditure.size();
    if(n==d)
    {
        return 0;
    }
    int temp[201]={0};
    for(int i = 0; i < d; i++)
    {
        temp[expenditure[i]]++;
    }
    
    int med = median(temp, d/2+d%2);
    for(int i = d; i < n; i++)
    {
        if(d%2==0)
        {
            int temp_med = median(temp, d/2+1);
            if(expenditure[i]>=med+temp_med)
            {
                count++;
            }
        }
        else
        {
            if(expenditure[i]>=med*2)
            {
                count++;
            }
        }
        
        temp[expenditure[i-d]]--;
        temp[expenditure[i]]++;
        med = median(temp,d/2+d%2);
    }
    return count;
}

与您所做的类似,此方法使用两个函数:一个用于 activity 通知,另一个用于查找中位数。

找到中位数很容易。使用的方法是检查中位数支出的回顾天数 d 是奇数还是偶数,并根据该信息进行相应计算。

然后说到activityNotifications,关键是要知道expenditure[i]在0到200之间,包括两个数字(201)。

总而言之

def findMedian(counter, d):
    count = 0
    median = 0

    if d%2 != 0:
        for i in range(len(counter)):
            count += counter[i]

            if count > d//2:
                median = i
                break
            
    else:
        first = 0
        second = 0

        for i, _ in enumerate(counter):
            count += counter[i]
            
            if first == 0 and count >= d//2:
                first = i
                
            if second == 0 and count >= d//2 + 1:
                second = i
                break
            
        median = (first+second) / 2
        
    return median


def activityNotifications(expenditure, d):
    count = 0
    counter = [0]*201
    
    for exp in range(d):
        counter[expenditure[exp]] += 1

    for i in range(d, len(expenditure)):
        new = expenditure[i]
        old = expenditure[i-d]
        median = findMedian(counter, d)
        
        if new >= 2*median:
            count += 1
            
        counter[old] -= 1
        counter[new] += 1
        
    return count

这将通过 HackerRank

中的所有当前 8 个测试用例

灵感来源:


请注意,在 Code Review 中使用 pandas 也有一个很好的答案。虽然解决问题非常有趣,但它在 HackerRank

中不起作用
import pandas as pd

def activityNotifications(expenditure, d):
    df = pd.DataFrame(expenditure)
    return (df.shift(-1) > 2 * df.rolling(d).median())[0].sum()

简单多了

  1. 对于第一个 window 排序 - 这需要 O(dlog(d))

  2. 因为它已经排序了,我们可以利用它,对于每个下一个 window 只需用离开 window 的数字将 new-incoming 数字替换为 window 并从那里对其正确位置进行排序 - 这需要 - O(d)

总复杂度 O(dlog(d) + (n-d-1)*(d)) = O(nd)

抱歉,如果代码看起来有点乱

static int activityNotifications(int[] expenditure, int d) {
    int count = 0;
    int days = expenditure.length;
    int[]tempArr = Arrays.copyOfRange(expenditure,0,d);
    Arrays.sort(tempArr);//

    for(int i=0;d+i<days;i++){
        
       if(i>0 ){
      //rearrange them based on outgoing and incoming into the window
       int outgo = expenditure[i-1];
       int income = expenditure[i+d-1];
       rearrange(outgo,income,tempArr);}

    //get medain
     float median;
     int size= d;
     if(size%2==0){
        int mid = size/2;
       median = (float)(tempArr[mid-1]+tempArr[mid])/2;          
     }
    else{
        median = tempArr[size/2];
    }
   
    //checking notification         

        if(expenditure[d+i]>=2*median){
            count++;
        }

    }
return count;
}


  public static void rearrange(int outgo,int income,int[]arr){
  
  int len = arr.length;
  int i=0;
  for( i=0;i<len;i++){
      if(arr[i]==outgo){
          arr[i]=income;
          break;
      }          
  }
  

if(i==0){        
 if(arr[i+1]>=income){return;}
 else{
      while(i+1<len  && arr[i+1]<income){
         arr[i]=arr[i+1];
         i++;
     }
     arr[i]=income;
 }
}
else if(i==len-1){
    if(arr[i-1]<=income){return;}
 else{
     while( i>=1 & arr[i-1]>income ){
         arr[i]=arr[i-1];
         i--;
     }
     arr[i]=income;
 }
 }


else{
    if(arr[i+1]<income){
         while(i+1<len  && arr[i+1]<income){
         arr[i]=arr[i+1];
         i++;
     }
     arr[i]=income;
    }
     if(arr[i-1]>income){

         while( i>=1 && arr[i-1]>income ){
         arr[i]=arr[i-1];
         i--;
     }
     arr[i]=income;            
    }
}