Python 字典排序字符串的 Anagram

Python dictionary sorting Anagram of a string

我有以下问题,我发现了这个 Permutation of string as substring of another,但这是使用 C++,我对 python 有点困惑。

Given two strings s and t, determine whether some anagram of t is a substring of s. For example: if s = "udacity" and t = "ad", then the function returns True. Your function definition should look like: question1(s, t) and return a boolean True or False.

所以我回答了这个问题,但他们希望我使用字典而不是对字符串进行排序。审稿人这样说;

We can first compile a dictionary of counts for t and check with every possible consecutive substring sets in s. If any set is anagram of t, then we return True, else False. Comparing counts of all characters will can be done in constant time since there are only limited amount of characters to check. Looping through all possible consecutive substrings will take worst case O(len(s)). Therefore, the time complexity of this algorithm is O(len(s)). space complexity is O(1) although we are creating a dictionary because we can have at most 26 characters and thus it is bounded.

你们能否帮助我如何在我的解决方案中使用字典。

这是我的解决方案;

# Check if s1 and s2 are anagram to each other
def anagram_check(s1, s2):
    # sorted returns a new list and compare
    return sorted(s1) == sorted(s2)

# Check if anagram of t is a substring of s
def question1(s, t):
    for i in range(len(s) - len(t) + 1):
        if anagram_check(s[i: i+len(t)], t):
            return True
    return False

def main():
    print question1("udacity", "city")

if __name__ == '__main__':
    main()  

'''
Test Case 1: question1("udacity", "city") -- True
Test Case 2: question1("udacity", "ud") -- True
Test Case 3: question1("udacity", "ljljl") -- False
'''

感谢任何帮助。谢谢,

import collections
print collections.Counter("google")

Counter({'o': 2, 'g': 2, 'e': 1, 'l': 1})

一个纯粹的 python 解决方案,用于获取一个对象,该对象对应于字母表中的那个字符在字符串 (t) 中的数量

使用函数 chr(),您可以将 int 转换为其对应的 ascii 值,因此您可以轻松地将 97 转换为 123 并且使用 chr() 获取字母表的值。

所以如果你有一个字符串说:

t = "abracadabra"

然后你可以for-loop像:

dt = {}
for c in range(97, 123):
   dt[chr(c)] = t.count(chr(c))

这适用于解决方案的这一部分,返回以下结果:

{'k': 0, 'v': 0, 'a': 5, 'z': 0, 'n': 0, 't': 0, 'm': 0, 'q': 0, 'f': 0, 'x': 0, 'e': 0, 'r': 2, 'b': 2, 'i': 0, 'l': 0, 'h': 0, 'c': 1, 'u': 0, 'j': 0, 'p': 0, 's': 0, 'y': 0, 'o': 0, 'd': 1, 'w': 0, 'g': 0}

不同的解决方案?

欢迎发表评论,但为什么需要存储在 dict 中?使用 count(),您不能简单地将 t 中每个字符的计数与 s 中该字符的计数进行比较吗?如果 tchar 的计数大于 s return False else True.

大致如下:

def question1(s, t):
   for c in range(97, 123):
      if t.count(chr(c)) > s.count(chr(c)):
         return False
   return True

给出结果:

>>> question1("udacity", "city")
True
>>> question1("udacity", "ud")
True
>>> question1("udacity", "ljljl")
False

如果需要 dict...

如果是,那么只需像上面那样创建两个并遍历每个键...

def question1(s, t):
   ds = {}
   dt = {}
   for c in range(97, 123):
      ds[chr(c)] = s.count(chr(c))
      dt[chr(c)] = t.count(chr(c))
   for c in range(97, 123):
      if dt[chr(c)] > ds[chr(c)]:
         return False
   return True

更新

以上答案仅检查子序列而不是子字符串 变位词。正如马拉卡在评论中向我解释的那样,两者之间存在区别,您的示例清楚地表明了这一点。

使用滑动 window 的想法(通过切片字符串),下面的代码应该适用于 substrings:

def question1(s, t):
   dt = {}
   for c in range(97, 123):
      dt[chr(c)] = t.count(chr(c))
   for i in range(len(s) - len(t) + 1):
      contains = True
      for c in range(97, 123):
         if dt[chr(c)] > s[i:i+len(t)].count(chr(c)):
            contains = False
            break
      if contains:
         return True
   return False

上面的代码适用于所有情况,并利用字典正确加速计算:)