How do we compare two tries for similarity?
I'm just curious whether there is a way to compare two trie data structures for similarity.
   trie1            trie2
    root             root
   /   |            /   |
  m    b           m    b
  |    |           |    |
  a    o           a    o
  | \  |           |    |
  t  x b           x    b
def compare_trie(trie1, trie2):
    pass

Output: ["max", "bob"]
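Edit: So far I have tried to implement a DFS algorithm, but I am stuck on how to manage two stacks for the two different tries.
Here is the code I tried; it still gets stuck on managing two stacks for the two tries: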
def compareTrie(trie1, trie2):
    dfsStack = []
    result = []
    stack1 = [x for x in trie1.keys()]
    stack2 = [y for y in trie2.keys()]
    similar = list(set(stack1) & set(stack2))
    dfsStack.append((similar, result))
    while (dfsStack):
        current, result = dfsStack.pop()
        print(current, result)
        result.append(current)
        for c in current:
            trie1 = trie1[c]
            trie2 = trie2[c]
            st1 = [x for x in trie1.keys()]
            st2 = [x for x in trie2.keys()]
            simm = list(set(st1) & set(st2))
            dfsStack.append((simm, result))
    print(result)
Trie implementation:
def create_trie(words):
    trie = {}
    for word in words:
        curr = trie
        for c in word:
            if c not in curr:
                curr[c] = {}
            curr = curr[c]
        # Mark the end of a word
        curr['#'] = True
    return trie

s1 = "mat max bob"
s2 = "max bob"
words1 = s1.split()
words2 = s2.split()
t1 = create_trie(words1)
t2 = create_trie(words2)
Your idea of using DFS is right; however, you can opt for a simple recursive approach to solve the task at hand. Here is the recursive version:
def create_trie(words):
    trie = {}
    for word in words:
        curr = trie
        for c in word:
            if c not in curr:
                curr[c] = {}
            curr = curr[c]
        # Mark the end of a word
        curr['#'] = True
    return trie

def compare(trie1, trie2, curr):
    # Walk both tries in lockstep, following only keys present in both.
    for i in trie1.keys():
        if i in trie2:
            if i == "#":
                # Both tries end a word here, so curr is a shared word.
                result.append(curr)
            else:
                compare(trie1[i], trie2[i], curr + i)

s1 = "mat max bob temp2 fg f r"
s2 = "max bob temp fg r c"
words1 = s1.split()
words2 = s2.split()
t1 = create_trie(words1)
t2 = create_trie(words2)
result = []  # module-level list that compare() appends to
compare(t1, t2, "")
print(result)  # ['max', 'bob', 'fg', 'r']
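Yes, it is possible. Mostly because you are using a general-purpose language on a device that is only slightly less capable than a Turing machine.
The simple brute-force approach is to traverse each trie, generating the set of all its keys, and then take the intersection of the two sets.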
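A minimal sketch of that brute-force idea, assuming the same nested-dict trie from the question with '#' as the end-of-word marker. The helper names `all_words` and `common_words` are hypothetical and only for illustration:

```python
def create_trie(words):
    """Build a nested-dict trie; '#' marks the end of a word (as in the question)."""
    trie = {}
    for word in words:
        curr = trie
        for c in word:
            curr = curr.setdefault(c, {})
        curr['#'] = True
    return trie


def all_words(trie, prefix=""):
    """Yield every word stored in the trie (hypothetical helper)."""
    for key, child in trie.items():
        if key == '#':
            yield prefix
        else:
            yield from all_words(child, prefix + key)


def common_words(trie1, trie2):
    """Brute force: enumerate both tries fully, then intersect the word sets."""
    return set(all_words(trie1)) & set(all_words(trie2))


t1 = create_trie("mat max bob".split())
t2 = create_trie("max bob".split())
print(sorted(common_words(t1, t2)))  # ['bob', 'max']
```

This visits every node of both tries, even branches that only one trie has, so it does more work than walking the two tries in lockstep. The result is a set, so it is sorted here to get a stable order for printing.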
You can replace the recursion with a single stack holding the current state, and create the result array inside the compare method.
def compare(trie1, trie2):
    result = []
    stack = [(trie1, trie2, "")]
    while stack:
        t1, t2, curr = stack.pop()
        for i in t1:
            if i not in t2:
                continue
            if i == "#":
                result.append(curr)
            else:
                stack.append((t1[i], t2[i], curr + i))
    return result
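As a usage sketch, here is the single-stack version run on the longer word lists from the recursive answer. It assumes Python 3.7+ (dicts keep insertion order) and repeats both functions only so the snippet runs on its own. Because the stack is last-in, first-out, branches are explored in the opposite order to the recursive version, so the words come out reversed:

```python
def create_trie(words):
    """Build a nested-dict trie; '#' marks the end of a word."""
    trie = {}
    for word in words:
        curr = trie
        for c in word:
            curr = curr.setdefault(c, {})
        curr['#'] = True
    return trie


def compare(trie1, trie2):
    """Collect the words present in both tries using one explicit stack."""
    result = []
    stack = [(trie1, trie2, "")]
    while stack:
        t1, t2, curr = stack.pop()
        for i in t1:
            if i not in t2:
                continue  # branch exists only in trie1, skip it
            if i == "#":
                result.append(curr)  # both tries end a word here
            else:
                stack.append((t1[i], t2[i], curr + i))
    return result


t1 = create_trie("mat max bob temp2 fg f r".split())
t2 = create_trie("max bob temp fg r c".split())
found = compare(t1, t2)
print(found)          # ['r', 'fg', 'bob', 'max']
print(sorted(found))  # ['bob', 'fg', 'max', 'r']
```

If the order matters, sorting the result (or comparing it as a set) is one simple way to make the two versions agree.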