为什么一种回溯方法比另一种更快?

Why is one approach for backtracking faster than the other?

所以我在 Word Breaking II 上研究 Leetcode,并提出了两个相似但记忆不同的回溯实现。但是,一个会通过规格,而另一个不会由于超过时间限制而无法通过。有人可以解释为什么方法 1 比方法 2 更快吗?

对于上下文,基本上问题给了我一个字符串和一个字典。如果字符串中的某些单词也在字典中,则使用 space 将单词分开并将结果字符串(单词)放入结果数组中。词典单词可以多次使用!

例如:

s = "pineapplepenapple"
wordDict = ["apple", "pen", "applepen", "pine", "pineapple"]
Output:
[
  "pine apple pen apple",
  "pineapple pen apple",
  "pine applepen apple"
]
Explanation: Note that you are allowed to reuse a dictionary word.

方法 1(有效!):

require 'set'

def word_break(s, word_dict)
    word_dict = Set.new(word_dict)

    bt(s, word_dict, 0, {})
end

def bt(s, word_dict, i, mem)
    return mem[i] if mem[i]
    return [''] if i == s.length

    res = []

    j = i

    while j < s.length
        word = s[i..j]

        if word_dict.include?(word)
            word_breaks = bt(s, word_dict, j + 1, mem)
 
            word_breaks.each do |words|
                new_combined_words = word
                new_combined_words += ' ' + words if words.length > 0

                res << new_combined_words
            end
        end

        j += 1
    end

    # Memoizing here makes it fast!
    mem[i] = res
end

方法 2(不够快):

require 'set'

def word_break(s, word_dict)
    word_dict = Set.new(word_dict)

    bt(s, word_dict, 0, {})
end

def bt(s, word_dict, i, mem)
    return mem[i] if mem[i]
    return [''] if i== s.length

    res = []

    j = i

    while j < s.length
        word = s[i..j]

        if word_dict.include?(word)
            word_breaks = bt(s, word_dict, j + 1, mem)
 
            word_breaks.each do |words|
                new_combined_words = word
                new_combined_words += ' ' + words if words.length > 0
                
                # Memoizing here but it's too slow
                (mem[i] ||= []) << new_combined_words
                
                res << new_combined_words
            end
        end

        j += 1
    end
    
    res
end

在方法 1 中,我最后使用 mem[i] = res 进行记忆,而在方法 2 中,我在生成新的单词组合时即时记忆。

如有任何帮助,我们将不胜感激,谢谢!

mem[i]恰好是空数组时,你永远不会在方法2中设置它。空值不会被记忆。

Cary 在他的评论中提出的建议应该可以解决这个问题。该代码仍然会比方法 1 慢一点,但它可能会通过 leetcode 上的测试。

UPD:即使按照建议的编辑,当 bt(s, word_dict, j + 1, mem) returns 是一个空数组时,我们也永远不会记忆 mem[i],这使得代码渐近指数。要解决此问题,请尝试以下操作。

         mem[i] ||= []
        
         if word_dict.include?(word)
            word_breaks = bt(s, word_dict, j + 1, mem)
 

            word_breaks.each do |words|
                new_combined_words = word
                new_combined_words += ' ' + words if words.length > 0

                mem[i] << new_combined_words
                
                res << new_combined_words
            end
        end

这不是答案,而是旨在阐明问题的扩展评论。我对这两种方法进行了基准测试,但无法重现 #1 比 #2 快的结果。 word_break1(调用 bt1)是#1; word_break2(调用 bt2)是 #2。

s = "nowisthetimetohavesomefun"
word_dict = %w|now is the time to have some fun no wist he so mefun ist he ti me|
require 'benchmark'
Benchmark.bm do |x|
  x.report("#1") { word_break1(s, word_dict) }
  x.report("#2") { word_break2(s, word_dict) }
end

以下结果是通过多次运行获得的。

       user     system      total        real
#1  0.000342   0.000083   0.000425 (  0.000312)
#2  0.000264   0.000066   0.000330 (  0.000242)*

#1  0.000315   0.000075   0.000390 (  0.000288)
#2  0.000230   0.000066   0.000296 (  0.000208)*

#1  0.000292   0.000079   0.000371 (  0.000268)
#2  0.000255   0.000065   0.000320 (  0.000253)*

#1  0.000292   0.000090   0.000382 (  0.000261)*
#2  0.000337   0.000121   0.000458 (  0.000349)

#1  0.000301   0.000063   0.000364 (  0.000291)*
#2  0.000413   0.000134   0.000547 (  0.000385)

#1  0.000306   0.000079   0.000385 (  0.000280)
#2  0.000281   0.000082   0.000363 (  0.000255)*

两种方法return下面的数组。

["no wist he ti me to have so me fun", "no wist he ti me to have so mefun",
 "no wist he ti me to have some fun", "no wist he time to have so me fun",
 "no wist he time to have so mefun", "no wist he time to have some fun",
 "now is the ti me to have so me fun", "now is the ti me to have so mefun",
 "now is the ti me to have some fun", "now is the time to have so me fun",
 "now is the time to have so mefun", "now is the time to have some fun",
 "now ist he ti me to have so me fun", "now ist he ti me to have so mefun",
 "now ist he ti me to have some fun", "now ist he time to have so me fun",
 "now ist he time to have so mefun", "now ist he time to have some fun"]