传递要放入 trie 中的字符串列表
Passing a list of strings to be put into trie
我有代码可以在给定一个字符串时构建一个 trie 数据结构。当我试图传递一个字符串列表时,它将单词组合成一个
class TrieNode:
def __init__(self):
self.end = False
self.children = {}
def all_words(self, prefix):
if self.end:
yield prefix
for letter, child in self.children.items():
yield from child.all_words(prefix + letter)
class Trie:
def __init__(self):
self.root = TrieNode()
def __init__(self):
self.root = TrieNode()
def insert(self, words):
curr = self.root
#the line I added to read the words from a list is below
for word in words:
for letter in word:
node = curr.children.get(letter)
if not node:
node = TrieNode()
curr.children[letter] = node
curr = node
curr.end = True
def all_words_beginning_with_prefix(self, prefix):
cur = self.root
for c in prefix:
cur = cur.children.get(c)
if cur is None:
return # No words with given prefix
yield from cur.all_words(prefix)
这是我用来将所有内容插入树中的代码:
lst = ['foo', 'foob', 'foobar', 'foof']
trie = Trie()
trie.insert(lst)
我得到的输出是
['foo', 'foofoob', 'foofoobfoobar', 'foofoobfoobarfoof']
我想要得到的输出是
['foo', 'foob', 'foobar', 'foof']
这是我用来获取输出的行(为了可重复性,以防您需要 运行 代码)- 它 returns 所有以特定前缀开头的单词:
print(list(trie.all_words_beginning_with_prefix('foo')))
我该如何解决?
您不会在每次插入后将 curr
重置回词根,因此您是在上一个单词停止的地方插入下一个单词。你想要这样的东西:
def insert(self, words):
curr = self.root
for word in words:
for letter in word:
node = curr.children.get(letter)
if not node:
node = TrieNode()
curr.children[letter] = node
curr = node
curr.end = True
curr = self.root # Reset back to the root
虽然我会打破它。我认为您的 insert
函数做的太多了,不应该处理多个字符串。我会将其更改为:
def insert(self, word):
curr = self.root
for letter in word:
node = curr.children.get(letter)
if not node:
node = TrieNode()
curr.children[letter] = node
curr = node
curr.end = True
def insert_many(self, words):
for word in words:
self.insert(word) # Just loop over self.insert
现在这是一个 non-problem,因为每个 insert
都是一个独立的调用,您不能忘记重置 curr
。
我有代码可以在给定一个字符串时构建一个 trie 数据结构。当我试图传递一个字符串列表时,它将单词组合成一个
class TrieNode:
def __init__(self):
self.end = False
self.children = {}
def all_words(self, prefix):
if self.end:
yield prefix
for letter, child in self.children.items():
yield from child.all_words(prefix + letter)
class Trie:
def __init__(self):
self.root = TrieNode()
def __init__(self):
self.root = TrieNode()
def insert(self, words):
curr = self.root
#the line I added to read the words from a list is below
for word in words:
for letter in word:
node = curr.children.get(letter)
if not node:
node = TrieNode()
curr.children[letter] = node
curr = node
curr.end = True
def all_words_beginning_with_prefix(self, prefix):
cur = self.root
for c in prefix:
cur = cur.children.get(c)
if cur is None:
return # No words with given prefix
yield from cur.all_words(prefix)
这是我用来将所有内容插入树中的代码:
lst = ['foo', 'foob', 'foobar', 'foof']
trie = Trie()
trie.insert(lst)
我得到的输出是
['foo', 'foofoob', 'foofoobfoobar', 'foofoobfoobarfoof']
我想要得到的输出是
['foo', 'foob', 'foobar', 'foof']
这是我用来获取输出的行(为了可重复性,以防您需要 运行 代码)- 它 returns 所有以特定前缀开头的单词:
print(list(trie.all_words_beginning_with_prefix('foo')))
我该如何解决?
您不会在每次插入后将 curr
重置回词根,因此您是在上一个单词停止的地方插入下一个单词。你想要这样的东西:
def insert(self, words):
curr = self.root
for word in words:
for letter in word:
node = curr.children.get(letter)
if not node:
node = TrieNode()
curr.children[letter] = node
curr = node
curr.end = True
curr = self.root # Reset back to the root
虽然我会打破它。我认为您的 insert
函数做的太多了,不应该处理多个字符串。我会将其更改为:
def insert(self, word):
curr = self.root
for letter in word:
node = curr.children.get(letter)
if not node:
node = TrieNode()
curr.children[letter] = node
curr = node
curr.end = True
def insert_many(self, words):
for word in words:
self.insert(word) # Just loop over self.insert
现在这是一个 non-problem,因为每个 insert
都是一个独立的调用,您不能忘记重置 curr
。