混乱的数字,找到 select 数字的最佳顺序,以便使用每个字符

Jumbled numerals, find the optimal order to select numerals so that every character is used

我最近在面试中被问到一个问题,问题陈述大致说,

给定一个包含随机顺序数字的混乱单词,找到按排序顺序排列的数字的整数形式。

数字 = zero, one, .... nine

输入:一个字符串:enenoin

输出:19

解释:在输入字符串中,onenine的顺序是随机的,所以我们return19.

另一个例子:

输入:一个字符串:enenoinone

输出:19

解释:我们只考虑每个数字 1 个实例,整数按排序顺序显示。

遇到问题后,想都没想,就开始写下面贪心的解决方案:

import sys

numerals = ["zero", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine"]

for line in sys.stdin:
    # take the input string
    # idea: we store each numeral strings zero to nine in a list
    # then, we check the input string
    # as, the string is in random order, we only have to find the count of each character
    # for example, if a string contains 2 z, 4 e, 2 r and 5 o, how many zero can we find in that string?
    # answer is min(2, 4, 2, 5) = 2, we can make at most 2 zero, as we have only 2 z and 2 r.
    # now, we know there are 2 zero in the string, so we append 0, 0 to answer and continue

    # step 1: count the frequency of each character
    freq = {} # hashmap/dict

    for char in line:
        freq[char] = freq.get(char, 0) + 1 # we use get(char, 0) because if char is not in freq, return default value 0
    
    # step 2: for each numeral, we try to count, how many instances of that numeral exist in the string

    numeral_count = {} # another dictionary to store the count of each numeral in string

    for numeral in numerals[::-1]: # traversing in reverse order
        count_numeral = 9999999 # we initialize with a large number, as we will apply minimum, min of any value and a large number is that value, so it's a initialization placeholder
        for char in numeral:
            count_numeral = min(count_numeral, freq.get(char, 0)) # we take the min of count of each character of that numeral

        for char in numeral:
            if char in freq.keys():
                freq[char] -= count_numeral # updating the frequency, as we have already used this amount of characters for current numeral     
        numeral_count[numeral] = count_numeral
    
    print(numeral_count)
        

    # step 3: now, we just add the count of each numeral to a list and make that list to a string

    ans_list = []

    for i, numeral in enumerate(numerals):
        if numeral_count[numeral] > 0:
            ans_list.append(i)

    ans_str = "".join(str(a) for a in ans_list)
    print(ans_str)

中途我发现贪心法在某些情况下肯定会失败,我们选择了一些数字但后来发现有一些未选择的字符意味着选择错误。

一个这样的案例:

fonineur

一开始我们可以贪婪地选一个作为数字之一,但后来我们会发现,漏掉的字符很多。

我时间紧迫,测试集上的hard case不多,所以我就试验了数字的顺序,结果后面的顺序通过了测试用例(毫无疑问测试用例很漂亮弱)。

numerals = ["eight", "seven", "six", "three", "zero", "four", "five", "nine", "two", "one"]

但我仍然不确定如何为这个问题找到算法上最有效(时间复杂度)的算法,可能在 O(n) 中,n 是字符串的长度。

先看偶数:零、二、四、六、八。这些数字中的每一个都有一个独特的字母,没有出现在任何其他数字中(这些字母分别是:z、w、u、x、g)。因此,如果我们在字符串中看到其中一个字母,我们就可以确定它属于哪个数字,并且可以从字符串中删除它的所有字母。我们重复此操作,直到删除所有偶数。

接下来看数字一、三、五、七。现在偶数已被删除,这些数字也可以被唯一识别,分别为字母:o、t、f、v。因此我们可以删除所有出现的这些数字。

最后,剩余的字符串(如果有的话)必须由一个或多个九个实例组成。

现在让我们尝试通过一次输入字符串来实现这个想法:

遍历字符串并计算字母 z、w、u、x、g、o、t、f、v 的出现次数。

z、w、u、x、g出现的次数分别是数字零、二、四、六、八的出现次数。

对于o,t,f,v出现的次数,我们要进行简单的计算。例如,我们想表示数字一的字母 o 也出现在零、二和四中,因此一的出现次数为 o-z-w-u。同样我们可以计算出三、五、七出现的次数。

最后,我们可以计算字符串中剩余的字符数,如果有,我们就知道其中也出现了数字九。