Can you use the same decoder in Pocketsphinx for multiple files?

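Is it possible to use the same decoder for multiple wav files in Pocketsphinx (Python)? I have the following code snippet, which is very standard, except that I call the decoder twice on the same file. The outputs are not the same, however. I've also tried using the decoder twice on different files, and the outputs differ depending on the order in which I call the files - the first file decodes correctly, but the second file does not. Furthermore, this only happens if there is some output from the first file - if the first file doesn't have any words, then the second file decodes fine. This leads me to believe that the decoder is somehow modified after decoding a file. Am I correct in this? Is there any way to reset the decoder, or in general make it work for multiple files? It seems like an example should be given here: https://github.com/cmusphinx/pocketsphinx/blob/master/swig/python/test/decoder_test.py.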

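import os
import pocketsphinx as ps  # assumed import; the snippet only shows the alias `ps`

# MODELDIR and DATADIR are assumed to be defined elsewhere (model and test-data directories).
config = ps.Decoder.default_config()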
config.set_string('-hmm', os.path.join(MODELDIR, 'en-US/acoustic-model'))
config.set_string('-lm', os.path.join(MODELDIR, 'en-US/language-model.lm.bin'))
config.set_string('-dict', os.path.join(MODELDIR, 'en-US/pronounciation-dictionary.dict'))
config.set_string('-logfn', 'pocketsphinxlog')
decoder = ps.Decoder(config)

wavname16_1 =  os.path.join(DATADIR, 'arctic_a0001.wav')
# Decode streaming data.
decoder.start_utt()
stream = open(wavname16_1, 'rb')
while True:
    buf = stream.read(1024)
    if buf:
        decoder.process_raw(buf, False, False)
    else:
        break
decoder.end_utt()
stream.close()
words = [(seg.word, seg.prob) for seg in decoder.seg()]
print("arctic1: " + str(words))

wavname16_2 =  os.path.join(DATADIR, 'arctic_a0002.wav')
decoder.start_utt()
stream = open(wavname16_2, 'rb')
while True:
    buf = stream.read(1024)
    if buf:
        decoder.process_raw(buf, False, False)
    else:
        break
decoder.end_utt()
stream.close()
words = [(seg.word, seg.prob) for seg in decoder.seg()]
print("arctic2: " + str(words))  # str() is needed: a str and a list cannot be concatenated

Edit - some further information:

If arctic_a0001.wav is http://festvox.org/cmu_arctic/cmu_arctic/cmu_us_bdl_arctic/wav/arctic_a0001.wav, arctic_a0002.wav is http://festvox.org/cmu_arctic/cmu_arctic/cmu_us_bdl_arctic/wav/arctic_a0002.wav, and the dictionary is the single line:

of AH V

then the current output is:

arctic1: [('<s>', 1), ('of', 1), ('of', -12001), ('<sil>', 0), ('of', -16211), ('<sil>', -1205), ('of', -13991), ('of', 0), ('<sil>', 0), ('of', -31232), ('</s>', 0)]
arctic2: [('<s>', -3), ('[SPEECH]', -725), ('<sil>', -1), ('[SPEECH]', -6), ('<sil>', -20), ('of', -6162), ('[SPEECH]', -397), ('</s>', 0)]

But if we switch them, the output becomes

arctic2: [('<s>', 0), ('of', 0), ('<sil>', 0), ('of', -29945), ('<sil>', -20), ('of', -26004), ('of', 0), ('of', 0), ('<sil>', 0), ('of', -84868), ('of', -35690), ('</s>', 0)]
arctic1: [('<s>', -3), ('of', -14886), ('of', -30237), ('<sil>', 0), ('of', -22103), ('of', 1), ('<sil>', 0), ('of', -30795), ('of', -65040), ('</s>', 0)]

So the outputs for arctic1 and arctic2 depend on the order. Furthermore, if we use arctic1 twice, the output is

[('<s>', 1), ('of', 1), ('of', -12001), ('<sil>', 0), ('of', -16211), ('<sil>', -1205), ('of', -13991), ('of', 0), ('<sil>', 0), ('of', -31232), ('</s>', 0)]
[('<s>', 1), ('of', -24424), ('of', -24554), ('<sil>', 2), ('[SPEECH]', -37257), ('of', -37008), ('<sil>', -461), ('of', -20422), ('of', 0), ('<sil>', 0), ('of', -3570), ('[SPEECH]', -42), ('</s>', 0)]

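Could the problem be that I'm not using start_stream()? I'm not sure how I'm supposed to use it. Even if I call decoder.start_stream() (directly before decoder.start_utt()), the second output still differs from the first - it becomes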

[('<s>', 1), ('of', 1), ('of', -12001), ('<sil>', 0), ('of', -16211), ('<sil>', -1205), ('of', -13991), ('of', 0), ('<sil>', 0), ('of', -31232), ('</s>', 0)]
[('<s>', -2), ('of', -33113), ('of', -29715), ('<sil>', 1), ('[SPEECH]', -37258), ('of', -37009), ('<sil>', -461), ('of', -20422), ('of', 0), ('<sil>', 0), ('of', -3570), ('[SPEECH]', -42), ('</s>', 0)]

If you want the full logs: here (http://pastebin.com/2dNeyS1x) is the log for arctic1 before arctic2, and here (http://pastebin.com/Nkvj2G0g) is the log for arctic2 before arctic1, while here is the log for arctic1 two times in a row with start_stream (http://pastebin.com/HWq6j7X2), and here is the log for arctic1 two times in a row without start_stream (http://pastebin.com/MsadW4nh).

Is it possible to use the same decoder for multiple wav files in Pocketsphinx (Python)?

I have the following code snippet, which is very standard, except that I call the decoder twice on the same file. The outputs are not the same, however.

You need to call decoder.start_stream() for the second file to reset the decoder timings.
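As a rough illustration of that suggestion, the loop from the question can be wrapped in a helper that calls start_stream() before every utterance. This is only a sketch: the helper name decode_files and its chunk_size parameter are made up for illustration, and it assumes the decoder exposes the same methods the question already uses (start_stream(), start_utt(), process_raw(), end_utt(), seg()). Calling start_stream() before the first file as well is assumed to be harmless.

```python
def decode_files(decoder, paths, chunk_size=1024):
    """Decode each file in `paths` with one shared decoder.

    `decoder` is assumed to provide the methods used in the question:
    start_stream(), start_utt(), process_raw(), end_utt() and seg().
    Returns a list of (path, [(word, prob), ...]) tuples in input order.
    """
    results = []
    for path in paths:
        # Reset the stream state before each file, as the answer suggests.
        decoder.start_stream()
        decoder.start_utt()
        with open(path, 'rb') as stream:
            while True:
                buf = stream.read(chunk_size)
                if not buf:
                    break
                # Same flags as in the question: no_search=False, full_utt=False.
                decoder.process_raw(buf, False, False)
        decoder.end_utt()
        words = [(seg.word, seg.prob) for seg in decoder.seg()]
        results.append((path, words))
    return results
```

With the decoder from the question, this would be called as decode_files(decoder, [wavname16_1, wavname16_2]). It only covers the stream reset; the edit to the question reports that the second result still differed after adding start_stream(), so other state may be involved.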

I've also tried using the decoder twice on different files, and the outputs are different depending on the order in which I call the files - the first file decodes correctly, but the second file does not decode correctly. Furthermore, this only happens if there is some output from the first file - if the first file doesn't have any words, then the second file decodes fine.

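Well, there could be different factors affecting the result. It is hard to say without an example. You had better provide the sample files and the problematic output to get an answer to this question.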