Python: how can this code be made to run faster?

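I'm new to Python and am slowly learning through Codewars. I know this may be against the rules, but I have a question about efficiency.

You are given a list of integers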

ls = [100, 76, 56, 44, 89, 73, 68, 56, 64, 123, 2333, 144, 50, 132, 123, 34, 89]

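You have to write a function choose_best_sum(t, k, ls)

that finds the combination of k integers from ls whose sum is as close to t as possible without exceeding it.

My final solution passes the basic tests but fails the more detailed ones, probably because of efficiency. I'd like to understand efficiency better. Here is my code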

import itertools

def choose_best_sum(t, k, ls):
    if sum(sorted(ls)[:k]) > t or len(ls) < k:
       return None
    else:
       combos = itertools.permutations(ls, k)
       return max([[sum(i)] for i in set(combos) if sum(i) <= t])[0]

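Can someone point out where the bottleneck is (I assume it is in the permutations call) and how to make this function faster?

Edit:

The solution above gives

1806730 function calls in 0.458 seconds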

 ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    1    0.000    0.000    0.457    0.457 <string>:1(<module>)
    1    0.000    0.000    0.457    0.457 exercises.py:14(choose_best_sum)
742561    0.174    0.000    0.305    0.000 exercises.py:19(<genexpr>)
321601    0.121    0.000    0.425    0.000 exercises.py:20(<genexpr>)
    1    0.000    0.000    0.458    0.458 {built-in method builtins.exec}
    1    0.000    0.000    0.000    0.000 {built-in method builtins.len}
    1    0.032    0.032    0.457    0.457 {built-in method builtins.max}
    1    0.000    0.000    0.000    0.000 {built-in method builtins.sorted}
742561    0.131    0.000    0.131    0.000 {built-in method builtins.sum}
    1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
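
For reference, a profile like the one above can be produced with the standard-library cProfile module. This is a minimal sketch: the arguments t = 230 and k = 3 are hypothetical, chosen to keep the run short, so the counts will not match the figures shown.

```python
import cProfile
import io
import itertools
import pstats

def choose_best_sum(t, k, ls):
    # The permutation-based version from the question.
    if sum(sorted(ls)[:k]) > t or len(ls) < k:
        return None
    combos = itertools.permutations(ls, k)
    return max([[sum(i)] for i in set(combos) if sum(i) <= t])[0]

ls = [100, 76, 56, 44, 89, 73, 68, 56, 64, 123, 2333, 144, 50, 132, 123, 34, 89]

profiler = cProfile.Profile()
profiler.enable()
result = choose_best_sum(230, 3, ls)  # hypothetical arguments
profiler.disable()

# Collect the report in a string, sorted by time spent inside each function.
buffer = io.StringIO()
pstats.Stats(profiler, stream=buffer).sort_stats("tottime").print_stats(10)
report = buffer.getvalue()
print(report)
```

The rows with the largest tottime show where the work is actually done. In the profile above that is the repeated sum calls over every permutation.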

With the help I received, the final solution I arrived at is:

def choose_best_sum(t, k, ls):
   ls = [i for i in ls if i <= t and i <= (t - sum(sorted(ls)[:k-1]))]  # <= keeps values that land exactly on the bound
   if sum(sorted(ls)[:k]) > t or len(ls) < k:
      return None
   else:
      return max(s for s in (sum(i) for i in itertools.combinations(ls, k)) if s <= t)

Ordered by: standard name

7090 function calls in 0.002 seconds

 ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    1    0.000    0.000    0.003    0.003 <string>:1(<module>)
 2681    0.001    0.000    0.003    0.000 exercises.py:10(<genexpr>)
    1    0.000    0.000    0.003    0.003 exercises.py:5(choose_best_sum)
    1    0.000    0.000    0.000    0.000 exercises.py:6(<listcomp>)
    1    0.000    0.000    0.003    0.003 {built-in method builtins.exec}
    1    0.000    0.000    0.000    0.000 {built-in method builtins.len}
    1    0.000    0.000    0.003    0.003 {built-in method builtins.max}
   17    0.000    0.000    0.000    0.000 {built-in method builtins.sorted}
 4385    0.001    0.000    0.001    0.000 {built-in method builtins.sum}
    1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

Your expression has several obvious flaws:

max([[sum(i)] for i in set(combos) if sum(i) <= t])[0]
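  1. You are running sum(i) twice for no reason;

  2. You are packing the result into a list ([sum(i)]) and then unpacking it ([0]);

  3. You are converting combos to a set for no reason.

Try replacing it with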

sums = [sum(c) for c in combos]
return max(s for s in sums if s <= t)
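As a minimal sketch, the three fixes can be put together in a complete function. The name choose_best_sum_once is hypothetical, and it still uses permutations so that it matches the question's code at this point. The default=None argument to max is an addition that covers the case where no sum fits under t.

```python
import itertools

def choose_best_sum_once(t, k, ls):
    """Largest sum of k values from ls that is <= t, or None if there is none."""
    combos = itertools.permutations(ls, k)
    # Each sum is computed exactly once, with no wrapper list and no set().
    sums = (sum(c) for c in combos)
    # default=None avoids a ValueError when nothing qualifies.
    return max((s for s in sums if s <= t), default=None)

print(choose_best_sum_once(163, 3, [50, 55, 56, 57, 58]))
```

Because sums is a generator, the candidate sums are never all held in memory at once.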

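Edit: OK, some thoughts on a better algorithm:

Oh! First, use itertools.combinations instead of itertools.permutations. You are only taking the sum, so the order of the items makes no difference. If you are running with, say, k = 4, combinations will return 4! == 24 times fewer entries than permutations on the same input data.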
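
The factor of k! can be checked directly. The sketch below assumes Python 3.8 or later for math.perm and math.comb, and uses the list from the question with k = 4.

```python
import itertools
import math

ls = [100, 76, 56, 44, 89, 73, 68, 56, 64, 123, 2333, 144, 50, 132, 123, 34, 89]
k = 4

n_perms = math.perm(len(ls), k)  # ordered selections of k items
n_combs = math.comb(len(ls), k)  # unordered selections of k items
print(n_perms, n_combs, n_perms // n_combs)  # 57120 2380 24

# The same ratio, counted by actually enumerating a smaller slice.
small = ls[:6]
perm_count = sum(1 for _ in itertools.permutations(small, k))
comb_count = sum(1 for _ in itertools.combinations(small, k))
print(perm_count, comb_count)  # 360 15
```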

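Second, we want to discard as many items from ls as possible right at the start. Obviously we can discard any value > t, but we can get a tighter bound than that. If we add up the (k - 1) smallest values, then the maximum allowable value must be <= t - (k-1)_sum.

(If we were looking for an exact sum, we could run this trick in reverse - adding up the (k - 1) largest values would give us a minimum allowable value - and we could apply the two rules repeatedly to discard more possibilities. That does not apply here, though.)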
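
The pruning rule from the second point might be sketched as follows. The helper name prune_candidates and the values t = 200, k = 3 are hypothetical, chosen only to show the effect on the list from the question.

```python
def prune_candidates(t, k, ls):
    """Drop values too large to appear in any k-item sum that is <= t."""
    ls = sorted(ls)
    if len(ls) < k:
        return ls  # not enough items for a bound; leave the list alone
    # Any chosen value must still fit next to the (k - 1) smallest values.
    max_allowed = t - sum(ls[:k - 1])
    return [v for v in ls if v <= max_allowed]

ls = [100, 76, 56, 44, 89, 73, 68, 56, 64, 123, 2333, 144, 50, 132, 123, 34, 89]
pruned = prune_candidates(200, 3, ls)
print(pruned)  # [34, 44, 50, 56, 56, 64, 68, 73, 76, 89, 89, 100]
```

Here the two smallest values sum to 78, so nothing above 122 can be part of a valid triple. Five of the 17 values are dropped before any combinations are generated.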

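Third, we can look at all combinations of (k - 1) values and then use bisect.bisect_left to jump straight to the best possible kth value. There is a slight complication, because you have to double-check that the kth value has not already been chosen as one of the (k - 1) values. You cannot use the built-in itertools.combinations function directly, but you can use a modified copy of the itertools.combinations code (i.e. test that the index returned by bisect_left is higher than the last index currently in use).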
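
A simplified sketch of this third idea is shown below. It is not the modified itertools code described above. Instead it runs itertools.combinations over index tuples and uses the lo argument of bisect_right to restrict the search to positions after the last chosen index. The name best_sum_bisect is hypothetical.

```python
from bisect import bisect_right
from itertools import combinations

def best_sum_bisect(t, k, ls):
    """Largest sum of k values from ls that is <= t, or None."""
    ls = sorted(ls)
    n = len(ls)
    if k < 1 or n < k:
        return None
    best = None
    for idx in combinations(range(n), k - 1):
        prefix = sum(ls[i] for i in idx)
        lo = idx[-1] + 1 if idx else 0  # the kth item must come after the prefix
        # Last position at or after lo whose value still fits under t.
        j = bisect_right(ls, t - prefix, lo) - 1
        if j >= lo:
            total = prefix + ls[j]
            if best is None or total > best:
                best = total
    return best

print(best_sum_bisect(163, 3, [50, 55, 56, 57, 58]))              # 163
print(best_sum_bisect(230, 3, [91, 74, 73, 85, 73, 81, 87]))      # 228
```

This enumerates (k - 1)-item combinations rather than k-item ones and replaces the innermost loop with a binary search. It does not include the early-exit shortcuts of a hand-rolled loop.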

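Taken together, these should speed up your code by a factor of roughly len(ls) * k * k! ... good luck!

Edit 2:

Let this be a lesson in the dangers of over-optimization :-)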

from bisect import bisect_right

def choose_best_sum(t, k, ls):
    """
    Find the highest sum of `k` values from `ls` such that sum <= `t`
    """
    # enough values passed?
    n = len(ls)
    if n < k:
        return None

    # remove unusable values from consideration
    ls = sorted(ls)
    max_valid_value = t - sum(ls[:k - 1])
    first_invalid_index = bisect_right(ls, max_valid_value)
    if first_invalid_index < n:
        ls = ls[:first_invalid_index]
        # enough valid values remaining?
        n = first_invalid_index   # == len(ls)
        if n < k:
            return None

    # can we still exceed t?
    highest_sum = sum(ls[-k:])
    if highest_sum <= t:
        return highest_sum

    # we have reduced the problem as much as possible
    #   and have not found a trivial solution;
    # we will now brute-force search combinations of (k - 1) values
    #   and binary-search for the best kth value
    best_found = 0
    # n = len(ls)      # already set above
    r = k - 1
    # itertools.combinations code copied from
    #   https://docs.python.org/3/library/itertools.html#itertools.combinations
    indices = list(range(r))
    # Inserted code - evaluate instead of yielding combo
    prefix_sum = sum(ls[i] for i in indices)          #
    kth_index = bisect_right(ls, t - prefix_sum) - 1  # location of largest possible kth value
    if kth_index > indices[-1]:                       # valid with rest of combination?
        total = prefix_sum + ls[kth_index]            #
        if total > best_found:                        #
            if total == t:                            #
                return t                              #
            else:                                     #
                best_found = total                    #
    x = n - r - 1    # set back by one to leave room for the kth item
    while True:
        for i in reversed(range(r)):
            if indices[i] != i + x:
                break
        else:
            break  # every (k - 1)-combination has been tried; fall through to return best_found
        indices[i] += 1
        for j in range(i+1, r):
            indices[j] = indices[j-1] + 1
        # Inserted code - evaluate instead of yielding combo
        prefix_sum = sum(ls[i] for i in indices)          #
        kth_index = bisect_right(ls, t - prefix_sum) - 1  # location of largest possible kth value
        if kth_index > indices[-1]:                       # valid with rest of combination?
            total = prefix_sum + ls[kth_index]            #
            if total > best_found:                        #
                if total == t:                            #
                    return t                              #
                else:                                     #
                    best_found = total                    #
        else:
            # short-circuit! skip ahead to next level of combinations
            indices[r - 1] = n - 2

    # highest sum found is < t
    return best_found
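
Hand-optimized code like this is easy to get subtly wrong, so it can help to compare it against a slow but obviously correct reference on random inputs. The sketch below is one way to do that. The names brute_force_best_sum and cross_check are hypothetical, and the input ranges are arbitrary.

```python
import itertools
import random

def brute_force_best_sum(t, k, ls):
    """Reference answer: try every k-item combination."""
    sums = (sum(c) for c in itertools.combinations(ls, k))
    return max((s for s in sums if s <= t), default=None)

def cross_check(candidate, trials=200, seed=0):
    """Return the first mismatching case as a tuple, or None if all trials agree."""
    rng = random.Random(seed)
    for _ in range(trials):
        ls = [rng.randint(1, 60) for _ in range(rng.randint(0, 8))]
        k = rng.randint(1, 4)
        t = rng.randint(1, 150)
        expected = brute_force_best_sum(t, k, ls)
        actual = candidate(t, k, list(ls))  # pass a copy in case candidate mutates it
        if actual != expected:
            return (t, k, ls, expected, actual)
    return None

print(cross_check(brute_force_best_sum))  # None: the reference agrees with itself
```

Calling cross_check(choose_best_sum) then either returns None or gives a small failing input to debug.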