为什么我的蛮力(O((n1 + n2)log(n1 + n2)))解决方案比优化解决方案(O(n1 + n2))更快?
why is my brute force(O((n1+n2)log(n1+n2))) solution faster than the optimised solution(O(n1+n2))?
我正在努力将 2 个长度为 n1 和 n2 的排序数组合并到 return 一个长度为 n1+n2 的排序数组。
我的强力解决方案应该具有 O((n1+n2)log(n1+n2)) 的时间复杂度,而我的优化解决方案应该具有 O(n1+n2) 的时间复杂度。有人能解释一下我的蛮力是如何比优化解决方案快近 10 倍的吗?请在下面找到代码:
from time import time
import random
array1 = sorted([random.randint(1, 500) for i in range(100000)])
array2 = sorted([random.randint(1, 1000000) for i in range(100000)])
def brute_force(arr1, arr2):
'''
Brute force solution O((n1+n2)log(n1+n2)), Space complexity O(1)
'''
#sorted function has the time complexity on nlog(n)
return sorted(arr1+arr2)
def optimized_soln(arr1, arr2):
'''
More optimized soltuion Time Complexity O(n1+n2), Space complexity O(n1+n2)
'''
i = 0
j = 0
merged_array = []
s1 = time()
#if one of the arrays is empty , no need to merge
if len(arr1) == 0:
return arr2
if len(arr2) ==0:
return arr1
while i<len(arr1):
if arr1[i] <= arr2[j]:
if i < (len(arr1) - 1):
merged_array.append(arr1[i])
i = i + 1
else:
merged_array.append(arr1[i])
merged_array = merged_array + arr2[j:]
i = len(arr1)
else:
if j < (len(arr2) - 1):
merged_array.append(arr2[j])
j = j + 1
else:
merged_array.append(arr2[j])
merged_array = merged_array + arr1[i:]
i = len(arr1)
return merged_array
s1 = time()
print('Brute Force: O((n1+n2)log(n1+n2)), Space complexity O(1)')
brute_force(array1,array2)
print('Time Taken: ',(time()-s1))
s2 = time()
print('Optimized Soln: Time Complexity O(n1+n2), Space complexity O(n1+n2)')
optimized_soln(array1,array2)
print('Time Taken: ',(time()-s2))
Python 中的每个操作都有一些与之相关的开销。您的蛮力解决方案具有 O(1) 此类操作,而排序本身是在较低级别的代码中实现的。
优化的解决方案具有 O(n) Python 操作,因此在这种情况下开销要大得多。因此,在后一种情况下具有更好的算法复杂性的好处被开销所抑制。
您的强力解决方案在 Python 内以线性时间运行。 sort
的时间复杂度通常为 O(n log n),但这并不排除它在某些情况下更快。 Python 使用 timsort,它旨在利用输入数据中已排序的运行。
来自https://en.wikipedia.org/wiki/Timsort
Timsort was designed to take advantage of runs of consecutive ordered
elements that already exist in most real-world data, natural runs. It
iterates over the data collecting elements into runs and
simultaneously putting those runs in a stack. Whenever the runs on the
top of the stack match a merge criterion, they are merged.
因此,当您追加两个排序列表,然后使用 timsort 对它们进行排序时,它会找到两个范围,合并它们,然后停止。考虑到复杂度都是线性的,加速是简单的事实 sort
是用优化的 C 编写的,它很容易比直接用 python.
编写的代码快 10 倍或更多
我正在努力将 2 个长度为 n1 和 n2 的排序数组合并到 return 一个长度为 n1+n2 的排序数组。
我的强力解决方案应该具有 O((n1+n2)log(n1+n2)) 的时间复杂度,而我的优化解决方案应该具有 O(n1+n2) 的时间复杂度。有人能解释一下我的蛮力是如何比优化解决方案快近 10 倍的吗?请在下面找到代码:
from time import time
import random
array1 = sorted([random.randint(1, 500) for i in range(100000)])
array2 = sorted([random.randint(1, 1000000) for i in range(100000)])
def brute_force(arr1, arr2):
'''
Brute force solution O((n1+n2)log(n1+n2)), Space complexity O(1)
'''
#sorted function has the time complexity on nlog(n)
return sorted(arr1+arr2)
def optimized_soln(arr1, arr2):
'''
More optimized soltuion Time Complexity O(n1+n2), Space complexity O(n1+n2)
'''
i = 0
j = 0
merged_array = []
s1 = time()
#if one of the arrays is empty , no need to merge
if len(arr1) == 0:
return arr2
if len(arr2) ==0:
return arr1
while i<len(arr1):
if arr1[i] <= arr2[j]:
if i < (len(arr1) - 1):
merged_array.append(arr1[i])
i = i + 1
else:
merged_array.append(arr1[i])
merged_array = merged_array + arr2[j:]
i = len(arr1)
else:
if j < (len(arr2) - 1):
merged_array.append(arr2[j])
j = j + 1
else:
merged_array.append(arr2[j])
merged_array = merged_array + arr1[i:]
i = len(arr1)
return merged_array
s1 = time()
print('Brute Force: O((n1+n2)log(n1+n2)), Space complexity O(1)')
brute_force(array1,array2)
print('Time Taken: ',(time()-s1))
s2 = time()
print('Optimized Soln: Time Complexity O(n1+n2), Space complexity O(n1+n2)')
optimized_soln(array1,array2)
print('Time Taken: ',(time()-s2))
Python 中的每个操作都有一些与之相关的开销。您的蛮力解决方案具有 O(1) 此类操作,而排序本身是在较低级别的代码中实现的。
优化的解决方案具有 O(n) Python 操作,因此在这种情况下开销要大得多。因此,在后一种情况下具有更好的算法复杂性的好处被开销所抑制。
您的强力解决方案在 Python 内以线性时间运行。 sort
的时间复杂度通常为 O(n log n),但这并不排除它在某些情况下更快。 Python 使用 timsort,它旨在利用输入数据中已排序的运行。
来自https://en.wikipedia.org/wiki/Timsort
Timsort was designed to take advantage of runs of consecutive ordered elements that already exist in most real-world data, natural runs. It iterates over the data collecting elements into runs and simultaneously putting those runs in a stack. Whenever the runs on the top of the stack match a merge criterion, they are merged.
因此,当您追加两个排序列表,然后使用 timsort 对它们进行排序时,它会找到两个范围,合并它们,然后停止。考虑到复杂度都是线性的,加速是简单的事实 sort
是用优化的 C 编写的,它很容易比直接用 python.