How do we compare two tries for similarity?

Question

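I'm just curious: is there a way to compare two trie data structures for similarity, that is, to find the words they have in common?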

trie1                      trie2

   root                     root 
/     |                   /   |
m     b                   m   b
|     |                   |   |
a     o                   a   o
| \   |                   |   |
t  x  b                   x   b

def compare_trie(trie1, trie2):
    pass

Output: ["max", "bob"]

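Edit: so far I have tried to implement a DFS algorithm, but I am stuck on how to manage two stacks, one for each trie.

The code I have tried, which is still stuck on managing the two stacks: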

def compareTrie(trie1, trie2):
    dfsStack = []
    result = []
    stack1 = [x for x in trie1.keys()]
    stack2 = [y for y in trie2.keys()]
    similar = list(set(stack1) & set(stack2))
    dfsStack.append((similar, result))
    while (dfsStack):
        current, result = dfsStack.pop()
        print(current, result)
        result.append(current)
        for c in current:
            trie1 = trie1[c]
            trie2 = trie2[c]
            st1 = [x for x in trie1.keys()]
            st2 = [x for x in trie2.keys()]
            simm = list(set(st1) & set(st2))
            dfsStack.append((simm, result))

    print(result)

Trie implementation:

def create_trie(words):
    trie = {}
    for word in words:
        curr = trie
        for c in word:
            if c not in curr:
                curr[c] = {}
            curr = curr[c]
        # Mark the end of a word
        curr['#'] = True
    return trie


s1 = "mat max bob"
s2 = "max bob"

words1 = s1.split()
words2 = s2.split()

t1 = create_trie(words1)
t2 = create_trie(words2)

Answer 1

Your idea of using DFS is correct; however, a simple recursive approach is enough to solve the task at hand. Here is the recursive version:

def create_trie(words):
    trie = {}
    for word in words:
        curr = trie
        for c in word:
            if c not in curr:
                curr[c] = {}
            curr = curr[c]
        # Mark the end of a word
        curr['#'] = True
    return trie

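def compare(trie1, trie2, curr):
    # Walk both tries in lockstep, following only the keys they share.
    # Matches are appended to the module-level `result` list defined below.
    for i in trie1:
        if i in trie2:
            if i == "#":
                # Both tries end a word here, so `curr` is a common word.
                result.append(curr)
            else:
                compare(trie1[i], trie2[i], curr + i)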

s1 = "mat max bob temp2 fg f r"
s2 = "max bob temp fg r c"

words1 = s1.split()
words2 = s2.split()

t1 = create_trie(words1)
t2 = create_trie(words2)
result = []
compare(t1, t2, "")
print(result)  # ['max', 'bob', 'fg', 'r']

Answer 2

Yes, it is possible. Mostly because you are using a general-purpose language on a device that is only slightly less capable than a Turing machine.

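The simple brute-force approach is to traverse each trie, generating the set of all keys it stores, and then take the intersection of the two sets.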
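A minimal sketch of that brute-force idea, assuming the nested-dict trie from the question where '#' marks the end of a word. The helper names `all_words` and `common_words` are made up for illustration, and the sketch assumes no stored word contains '#' itself.

```python
def create_trie(words):
    """Build a nested-dict trie; '#' marks the end of a word (as in the question)."""
    trie = {}
    for word in words:
        curr = trie
        for c in word:
            curr = curr.setdefault(c, {})
        curr['#'] = True
    return trie


def all_words(trie, prefix=""):
    """Collect every word stored in the trie into a set (hypothetical helper)."""
    words = set()
    for key, child in trie.items():
        if key == '#':
            # End-of-word marker: the path walked so far is a stored word.
            words.add(prefix)
        else:
            words |= all_words(child, prefix + key)
    return words


def common_words(trie1, trie2):
    """Intersect the word sets of both tries (hypothetical helper)."""
    return all_words(trie1) & all_words(trie2)


t1 = create_trie("mat max bob".split())
t2 = create_trie("max bob".split())
print(sorted(common_words(t1, t2)))  # ['bob', 'max']
```

This visits every node of both tries, even in branches that only one of them has, so it does more work than walking the two tries in lockstep. In exchange, the two steps (enumerate, then intersect) are easy to check separately.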

Answer 3

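You can replace the recursion with a single stack that holds the current state for both tries, and create the result list inside the compare function: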

def compare(trie1, trie2):
    result = []
    stack = [(trie1, trie2, "")]
    while stack:
        t1, t2, curr = stack.pop()
        for i in t1:
            if i not in t2:
                continue
            if i == "#":
                result.append(curr)
            else:
                stack.append((t1[i], t2[i], curr + i))
    return result
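As a usage sketch, here is the stack-based version applied to the tries from the question, assuming the same dict-based `create_trie`. One detail worth noting: the stack is last-in, first-out, so branches are visited in reverse insertion order and the words can come out in a different order than with the recursive version. Sorting the result is one way to get a stable order.

```python
def create_trie(words):
    """Nested-dict trie from the question; '#' marks the end of a word."""
    trie = {}
    for word in words:
        curr = trie
        for c in word:
            if c not in curr:
                curr[c] = {}
            curr = curr[c]
        curr['#'] = True
    return trie


def compare(trie1, trie2):
    """Iterative DFS over both tries at once, following only shared keys."""
    result = []
    stack = [(trie1, trie2, "")]
    while stack:
        t1, t2, curr = stack.pop()
        for i in t1:
            if i not in t2:
                continue
            if i == "#":
                result.append(curr)
            else:
                stack.append((t1[i], t2[i], curr + i))
    return result


t1 = create_trie("mat max bob".split())
t2 = create_trie("max bob".split())

common = compare(t1, t2)
print(common)          # ['bob', 'max'] (the 'b' branch is popped first)
print(sorted(common))  # ['bob', 'max']
```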
