Why is pocketsphinx returning a null hypothesis via Java with kws? Works via the command line, not via code

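I have been working with pocketsphinx in Java. I pieced this together from various sources.

I am trying to do keyword spotting with pocketsphinx.

As I said, it works via the command line: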

pocketsphinx_continuous -inmic  yes -kws keyphrase.list

where the keyphrase.list file contains:

abomination /le-20/

I get a hit every time.

Here is my Java code:

(I have tried le-1 through le-40.)

import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.DataLine;
import javax.sound.sampled.TargetDataLine;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

import edu.cmu.pocketsphinx.Decoder;
import edu.cmu.pocketsphinx.Config;
import edu.cmu.pocketsphinx.Hypothesis;

public class Controller {
    static {
        System.loadLibrary("pocketsphinx_jni");
    }

    private static ByteArrayOutputStream out;

    public static void main(String args[]) {

        AudioFormat format = new AudioFormat(44100, 16, 1, true, true);
        TargetDataLine targetLine = null;
        DataLine.Info targetInfo = new DataLine.Info(TargetDataLine.class, format);
        boolean running = true;


        try {

            targetLine = AudioSystem.getTargetDataLine(format);
            targetLine.open();
            out = new ByteArrayOutputStream();
            int numBytesRead;
            byte[] data = new byte[targetLine.getBufferSize() / 5];


            Config c = Decoder.defaultConfig();
            c.setString("-hmm", "/usr/local/share/pocketsphinx/model/en-us/en-us/");
            //c.setString("-lm", "/usr/local/share/pocketsphinx/model/en-us/en-us.lm.bin");
            c.setString("-dict", "/usr/local/share/pocketsphinx/model/en-us/cmudict-en-us.dict");
            c.setString("-keyphrase", "abomination");
            c.setFloat("-kws_threshold", 1e-1);

            Decoder d = new Decoder(c);
            d.setRawdataSize(300000);

            targetLine.start();
            System.out.println("Recorder started");

            byte[] b = new byte[4096];

            d.startUtt();

            System.out.println("Decoder started");

            while ((running)) {
                int nbytes;
                short[] s = null;
                nbytes = targetLine.read(b,0,b.length);

                ByteBuffer bb = ByteBuffer.wrap(b, 0, nbytes);
                s = new short[nbytes/2];

                bb.asShortBuffer().get(s);

                d.processRaw(s, nbytes/2, false, false);
                d.setKws("abomination", );

                if (nbytes > 0) {

                    Hypothesis hypothesis = d.hyp();
                    if (hypothesis != null) {
                        System.out.println("------------------------------------------------------");
                        System.out.println(hypothesis.getHypstr());
                        System.out.println("------------------------------------------------------");

                        d.endUtt();
                        d.startUtt();
                    }
                }
            }

        }
        catch (Exception e) {
            System.err.println(e);
        }
    }
}

The code runs fine, but for some reason the if (hypothesis != null) branch is never entered.

Here is what gets logged:

INFO: cmn_prior.c(99): cmn_prior_update: from < 40.00  3.00 -1.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00 >
INFO: cmn_prior.c(116): cmn_prior_update: to   < 51.22 14.61 -8.72 -0.31 -3.49  0.18 -7.35  8.43 -0.77  7.64  1.41  0.27 -1.82 >
INFO: cmn_prior.c(99): cmn_prior_update: from < 51.22 14.61 -8.72 -0.31 -3.49  0.18 -7.35  8.43 -0.77  7.64  1.41  0.27 -1.82 >
INFO: cmn_prior.c(116): cmn_prior_update: to   < 51.80 15.37 -8.77 -0.62 -2.74  0.09 -6.18 10.24  0.14  7.79  2.59  1.86 -3.22 >

Update: more information.

Here is the output at startup. -kws is not set.

INFO: pocketsphinx.c(152): Parsed model-specific feature parameters from /usr/local/share/pocketsphinx/model/en-us/en-us//feat.params
Current configuration:
[NAME]          [DEFLT]     [VALUE]
-agc            none        none
-agcthresh      2.0     2.000000e+00
-allphone               
-allphone_ci        no      no
-alpha          0.97        9.700000e-01
-ascale         20.0        2.000000e+01
-aw         1       1
-backtrace      no      no
-beam           1e-48       1.000000e-48
-bestpath       yes     yes
-bestpathlw     9.5     9.500000e+00
-ceplen         13      13
-cmn            current     current
-cmninit        8.0     40,3,-1
-compallsen     no      no
-debug                  0
-dict                   /usr/local/share/pocketsphinx/model/en-us/cmudict-en-us.dict
-dictcase       no      no
-dither         no      no
-doublebw       no      no
-ds         1       1
-fdict                  
-feat           1s_c_d_dd   1s_c_d_dd
-featparams             
-fillprob       1e-8        1.000000e-08
-frate          100     100
-fsg                    
-fsgusealtpron      yes     yes
-fsgusefiller       yes     yes
-fwdflat        yes     yes
-fwdflatbeam        1e-64       1.000000e-64
-fwdflatefwid       4       4
-fwdflatlw      8.5     8.500000e+00
-fwdflatsfwin       25      25
-fwdflatwbeam       7e-29       7.000000e-29
-fwdtree        yes     yes
-hmm                    /usr/local/share/pocketsphinx/model/en-us/en-us/
-input_endian       little      little
-jsgf                   
-keyphrase              abomination
-kws                    
-kws_delay      10      10
-kws_plp        1e-1        1.000000e-01
-kws_threshold      1       1.000000e-20
-latsize        5000        5000
-lda                    
-ldadim         0       0
-lifter         0       22
-lm                 
-lmctl                  
-lmname                 
-logbase        1.0001      1.000100e+00
-logfn                  
-logspec        no      no
-lowerf         133.33334   1.300000e+02
-lpbeam         1e-40       1.000000e-40
-lponlybeam     7e-29       7.000000e-29
-lw         6.5     6.500000e+00
-maxhmmpf       30000       30000
-maxwpf         -1      -1
-mdef                   
-mean                   
-mfclogdir              
-min_endfr      0       0
-mixw                   
-mixwfloor      0.0000001   1.000000e-07
-mllr                   
-mmap           yes     yes
-ncep           13      13
-nfft           512     512
-nfilt          40      25
-nwpen          1.0     1.000000e+00
-pbeam          1e-48       1.000000e-48
-pip            1.0     1.000000e+00
-pl_beam        1e-10       1.000000e-10
-pl_pbeam       1e-10       1.000000e-10
-pl_pip         1.0     1.000000e+00
-pl_weight      3.0     3.000000e+00
-pl_window      5       5
-rawlogdir              
-remove_dc      no      no
-remove_noise       yes     yes
-remove_silence     yes     yes
-round_filters      yes     yes
-samprate       16000       1.600000e+04
-seed           -1      -1
-sendump                
-senlogdir              
-senmgau                
-silprob        0.005       5.000000e-03
-smoothspec     no      no
-svspec                 0-12/13-25/26-38
-tmat                   
-tmatfloor      0.0001      1.000000e-04
-topn           4       4
-topn_beam      0       0
-toprule                
-transform      legacy      dct
-unit_area      yes     yes
-upperf         6855.4976   6.800000e+03
-uw         1.0     1.000000e+00
-vad_postspeech     50      50
-vad_prespeech      20      20
-vad_startspeech    10      10
-vad_threshold      2.0     2.000000e+00
-var                    
-varfloor       0.0001      1.000000e-04
-varnorm        no      no
-verbose        no      no
-warp_params                
-warp_type      inverse_linear  inverse_linear
-wbeam          7e-29       7.000000e-29
-wip            0.65        6.500000e-01
-wlen           0.025625    2.562500e-02

INFO: feat.c(715): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='current', VARNORM='no', AGC='none'
INFO: cmn.c(143): mean[0]= 12.00, mean[1..12]= 0.0
INFO: acmod.c(164): Using subvector specification 0-12/13-25/26-38
INFO: mdef.c(518): Reading model definition: /usr/local/share/pocketsphinx/model/en-us/en-us//mdef
INFO: mdef.c(531): Found byte-order mark BMDF, assuming this is a binary mdef file
INFO: bin_mdef.c(336): Reading binary model definition: /usr/local/share/pocketsphinx/model/en-us/en-us//mdef
INFO: bin_mdef.c(516): 42 CI-phone, 137053 CD-phone, 3 emitstate/phone, 126 CI-sen, 5126 Sen, 29324 Sen-Seq
INFO: tmat.c(206): Reading HMM transition probability matrices: /usr/local/share/pocketsphinx/model/en-us/en-us//transition_matrices
INFO: acmod.c(117): Attempting to use PTM computation module
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /usr/local/share/pocketsphinx/model/en-us/en-us//means
INFO: ms_gauden.c(292): 42 codebook, 3 feature, size: 
INFO: ms_gauden.c(294):  128x13
INFO: ms_gauden.c(294):  128x13
INFO: ms_gauden.c(294):  128x13
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /usr/local/share/pocketsphinx/model/en-us/en-us//variances
INFO: ms_gauden.c(292): 42 codebook, 3 feature, size: 
INFO: ms_gauden.c(294):  128x13
INFO: ms_gauden.c(294):  128x13
INFO: ms_gauden.c(294):  128x13
INFO: ms_gauden.c(354): 222 variance values floored
INFO: ptm_mgau.c(476): Loading senones from dump file /usr/local/share/pocketsphinx/model/en-us/en-us//sendump
INFO: ptm_mgau.c(500): BEGIN FILE FORMAT DESCRIPTION
INFO: ptm_mgau.c(563): Rows: 128, Columns: 5126
INFO: ptm_mgau.c(595): Using memory-mapped I/O for senones
INFO: ptm_mgau.c(835): Maximum top-N: 4
INFO: phone_loop_search.c(114): State beam -225 Phone exit beam -225 Insertion penalty 0
INFO: dict.c(320): Allocating 138623 * 32 bytes (4331 KiB) for word entries
INFO: dict.c(333): Reading main dictionary: /usr/local/share/pocketsphinx/model/en-us/cmudict-en-us.dict
INFO: dict.c(213): Allocated 1014 KiB for strings, 1677 KiB for phones
INFO: dict.c(336): 134522 words read
INFO: dict.c(358): Reading filler dictionary: /usr/local/share/pocketsphinx/model/en-us/en-us//noisedict
INFO: dict.c(213): Allocated 0 KiB for strings, 0 KiB for phones
INFO: dict.c(361): 5 words read
INFO: dict2pid.c(396): Building PID tables for dictionary
INFO: dict2pid.c(406): Allocating 42^3 * 2 bytes (144 KiB) for word-initial triphones
INFO: dict2pid.c(132): Allocated 42672 bytes (41 KiB) for word-final triphones
INFO: dict2pid.c(196): Allocated 42672 bytes (41 KiB) for single-phone word triphones
INFO: kws_search.c(420): KWS(beam: -1080, plp: -23, default threshold -450, delay 10)
Recorder started
Decoder started

I found a reference here explaining why you don't need the -lm line.

Also if you intend to use kws there is no need to use -lm in arguments. You need to remove:

"-lm", ".../model/hub4wsj_sc_8k_adapt/etc/hub4.5000.DMP",

That was the answer given there.

If I change the code above and remove:

c.setString("-keyphrase", "abomination");

and add:

c.setString("-kws", "/home/pennyworth/keyphrase.list");

the output now shows -kws as set:

-kws                    /home/pennyworth/keyphrase.list

I get this in the output:

INFO: kws_search.c(420): KWS(beam: -1080, plp: -23, default threshold -450, delay 10)

Still, the result is null.

INFO: cmn_prior.c(99): cmn_prior_update: from < 73.10 11.10 -10.49  1.23  0.67 -1.37 -5.29  5.17 -0.62  3.91 -0.28  2.56 -2.14 >
INFO: cmn_prior.c(116): cmn_prior_update: to   < 73.59 11.36 -9.61  2.41  2.13  0.15 -5.09  4.30  1.03  4.46 -0.13  3.36 -0.62 >
INFO: cmn_prior.c(99): cmn_prior_update: from < 73.59 11.36 -9.61  2.41  2.13  0.15 -5.09  4.30  1.03  4.46 -0.13  3.36 -0.62 >
INFO: cmn_prior.c(116): cmn_prior_update: to   < 74.65 10.58 -9.76  4.47  3.63  1.19 -5.20  3.74  2.33  4.75 -0.11  3.06 -0.32 >
INFO: cmn_prior.c(99): cmn_prior_update: from < 74.65 10.58 -9.76  4.47  3.63  1.19 -5.20  3.74  2.33  4.75 -0.11  3.06 -0.32 >
INFO: cmn_prior.c(116): cmn_prior_update: to   < 77.49 10.99 -8.80  5.45  4.37  2.83 -4.14  4.06  3.48  5.07 -0.41  2.82 -0.35 >
INFO: cmn_prior.c(99): cmn_prior_update: from < 77.49 10.99 -8.80  5.45  4.37  2.83 -4.14  4.06  3.48  5.07 -0.41  2.82 -0.35 >
INFO: cmn_prior.c(116): cmn_prior_update: to   < 73.54  9.62 -11.34  3.19  3.30  2.24 -6.61  4.52  1.31  5.99 -1.28  2.24 -0.96 >

Am I wrong to assume that kws doesn't return a hyp? There is a great Python example, but nothing useful for Java regarding kws.

https://github.com/cmusphinx/pocketsphinx/blob/master/swig/python/test/kws_test.py

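Here is the pocketsphinx API documentation: http://cmusphinx.sourceforge.net/doc/pocketsphinx/pocketsphinx_8c_source.html

I don't know how to move forward with this. Either I have not set up the decoder correctly, or something else is going on, and that is why I get a null return.

I'm not clear on the use of -kws vs -keyphrase vs -kws_threshold. Does using -kws mean you don't need the other two, since it effectively sets both the phrase and the threshold?

Updated code. Added the byte order. (And made sure my 1s are not l's.)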

import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.DataLine;
import javax.sound.sampled.TargetDataLine;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

import edu.cmu.pocketsphinx.Decoder;
import edu.cmu.pocketsphinx.Config;
import edu.cmu.pocketsphinx.Hypothesis;

public class Controller {
    static {
        System.loadLibrary("pocketsphinx_jni");
    }

    private static ByteArrayOutputStream out;

    public static void main(String args[]) {

        AudioFormat format = new AudioFormat(44100, 16, 1, true, true);
        TargetDataLine targetLine = null;
        DataLine.Info targetInfo = new DataLine.Info(TargetDataLine.class, format);
        boolean running = true;


        try {

            targetLine = AudioSystem.getTargetDataLine(format);
            targetLine.open();
            out = new ByteArrayOutputStream();
            int numBytesRead;
            byte[] data = new byte[targetLine.getBufferSize() / 5];


            Config c = Decoder.defaultConfig();
            c.setString("-hmm", "/usr/local/share/pocketsphinx/model/en-us/en-us/");
            c.setString("-dict", "/usr/local/share/pocketsphinx/model/en-us/cmudict-en-us.dict");
            c.setString("-keyphrase", "abomination");
            c.setFloat("-kws_threshold", 1e-20);
            //c.setString("-kws", "/home/bruce/keyphrase.list");


            Decoder d = new Decoder(c);
            d.setRawdataSize(300000);

            targetLine.start();
            System.out.println("Recorder started");

            byte[] b = new byte[4096];

            d.startUtt();

            System.out.println("Decoder started");

            while ((running)) {
                int nbytes;
                short[] s = null;
                nbytes = targetLine.read(b,0,b.length);

                ByteBuffer bb = ByteBuffer.wrap(b, 0, nbytes);
                s = new short[nbytes/2];

                bb.asShortBuffer().get(s);
                bb.order(ByteOrder.LITTLE_ENDIAN);
                d.processRaw(s, nbytes/2, false, false);


                if (nbytes > 0) {

                    Hypothesis hypothesis = d.hyp();
                    if (hypothesis != null) {
                        System.out.println("------------------------------------------------------");
                        System.out.println(hypothesis.getHypstr());
                        System.out.println("------------------------------------------------------");

                        d.endUtt();
                        d.startUtt();
                    }
                }
            }

        }
        catch (Exception e) {
            System.err.println(e);
        }
    }
}

Here is the updated output:

INFO: cmn_prior.c(99): cmn_prior_update: from < 67.54 12.23 -8.09 -0.29  0.56 -0.37 -3.25  7.00 -1.97  3.98 -1.87  3.63 -1.49 >
INFO: cmn_prior.c(116): cmn_prior_update: to   < 66.41 12.98 -8.67 -0.63  1.35 -0.13 -3.16  7.97 -3.11  3.57 -0.74  3.91 -1.79 >
INFO: cmn_prior.c(99): cmn_prior_update: from < 66.41 12.98 -8.67 -0.63  1.35 -0.13 -3.16  7.97 -3.11  3.57 -0.74  3.91 -1.79 >
INFO: cmn_prior.c(116): cmn_prior_update: to   < 66.66 12.12 -10.32 -1.55  0.57 -0.01 -3.20  8.83 -2.47  4.65  0.07  4.54 -2.35 >
INFO: cmn_prior.c(99): cmn_prior_update: from < 66.66 12.12 -10.32 -1.55  0.57 -0.01 -3.20  8.83 -2.47  4.65  0.07  4.54 -2.35 >
INFO: cmn_prior.c(116): cmn_prior_update: to   < 68.35 12.82 -9.91 -1.85  0.77  0.18 -2.25  9.05 -1.75  3.84  0.66  5.82 -2.50 >
INFO: cmn_prior.c(99): cmn_prior_update: from < 68.35 12.82 -9.91 -1.85  0.77  0.18 -2.25  9.05 -1.75  3.84  0.66  5.82 -2.50 >
INFO: cmn_prior.c(116): cmn_prior_update: to   < 64.05 14.55 -8.14 -0.36  0.64  0.75 -1.96  9.76 -0.03  5.26  1.16  5.03 -1.68 >
INFO: cmn_prior.c(99): cmn_prior_update: from < 64.05 14.55 -8.14 -0.36  0.64  0.75 -1.96  9.76 -0.03  5.26  1.16  5.03 -1.68 >
INFO: cmn_prior.c(116): cmn_prior_update: to   < 63.08 15.30 -8.96  0.76  1.05  0.83 -1.40 10.94 -0.69  4.52 -0.80  3.58 -3.18 >
INFO: cmn_prior.c(99): cmn_prior_update: from < 63.08 15.30 -8.96  0.76  1.05  0.83 -1.40 10.94 -0.69  4.52 -0.80  3.58 -3.18 >
INFO: cmn_prior.c(116): cmn_prior_update: to   < 62.15 16.49 -10.32 -0.25  1.14 -0.32 -2.32 10.95 -2.12  2.91 -1.31  2.57 -4.05

Update:

Changed the order of the following two lines to mirror https://github.com/cmusphinx/pocketsphinx/blob/master/swig/java/test/DecoderTest.java

        bb.order(ByteOrder.LITTLE_ENDIAN);
        bb.asShortBuffer().get(s);

Once started, it behaves as if there were no input: no output at all.

INFO: pocketsphinx.c(152): Parsed model-specific feature parameters from /usr/local/share/pocketsphinx/model/en-us/en-us//feat.params
Current configuration:
[NAME]          [DEFLT]     [VALUE]
-agc            none        none
-agcthresh      2.0     2.000000e+00
-allphone               
-allphone_ci        no      no
-alpha          0.97        9.700000e-01
-ascale         20.0        2.000000e+01
-aw         1       1
-backtrace      no      no
-beam           1e-48       1.000000e-48
-bestpath       yes     yes
-bestpathlw     9.5     9.500000e+00
-ceplen         13      13
-cmn            current     current
-cmninit        8.0     40,3,-1
-compallsen     no      no
-debug                  0
-dict                   /usr/local/share/pocketsphinx/model/en-us/cmudict-en-us.dict
-dictcase       no      no
-dither         no      no
-doublebw       no      no
-ds         1       1
-fdict                  
-feat           1s_c_d_dd   1s_c_d_dd
-featparams             
-fillprob       1e-8        1.000000e-08
-frate          100     100
-fsg                    
-fsgusealtpron      yes     yes
-fsgusefiller       yes     yes
-fwdflat        yes     yes
-fwdflatbeam        1e-64       1.000000e-64
-fwdflatefwid       4       4
-fwdflatlw      8.5     8.500000e+00
-fwdflatsfwin       25      25
-fwdflatwbeam       7e-29       7.000000e-29
-fwdtree        yes     yes
-hmm                    /usr/local/share/pocketsphinx/model/en-us/en-us/
-input_endian       little      little
-jsgf                   
-keyphrase              abomination
-kws                    
-kws_delay      10      10
-kws_plp        1e-1        1.000000e-01
-kws_threshold      1       1.000000e-20
-latsize        5000        5000
-lda                    
-ldadim         0       0
-lifter         0       22
-lm                 
-lmctl                  
-lmname                 
-logbase        1.0001      1.000100e+00
-logfn                  
-logspec        no      no
-lowerf         133.33334   1.300000e+02
-lpbeam         1e-40       1.000000e-40
-lponlybeam     7e-29       7.000000e-29
-lw         6.5     6.500000e+00
-maxhmmpf       30000       30000
-maxwpf         -1      -1
-mdef                   
-mean                   
-mfclogdir              
-min_endfr      0       0
-mixw                   
-mixwfloor      0.0000001   1.000000e-07
-mllr                   
-mmap           yes     yes
-ncep           13      13
-nfft           512     512
-nfilt          40      25
-nwpen          1.0     1.000000e+00
-pbeam          1e-48       1.000000e-48
-pip            1.0     1.000000e+00
-pl_beam        1e-10       1.000000e-10
-pl_pbeam       1e-10       1.000000e-10
-pl_pip         1.0     1.000000e+00
-pl_weight      3.0     3.000000e+00
-pl_window      5       5
-rawlogdir              
-remove_dc      no      no
-remove_noise       yes     yes
-remove_silence     yes     yes
-round_filters      yes     yes
-samprate       16000       1.600000e+04
-seed           -1      -1
-sendump                
-senlogdir              
-senmgau                
-silprob        0.005       5.000000e-03
-smoothspec     no      no
-svspec                 0-12/13-25/26-38
-tmat                   
-tmatfloor      0.0001      1.000000e-04
-topn           4       4
-topn_beam      0       0
-toprule                
-transform      legacy      dct
-unit_area      yes     yes
-upperf         6855.4976   6.800000e+03
-uw         1.0     1.000000e+00
-vad_postspeech     50      50
-vad_prespeech      20      20
-vad_startspeech    10      10
-vad_threshold      2.0     2.000000e+00
-var                    
-varfloor       0.0001      1.000000e-04
-varnorm        no      no
-verbose        no      no
-warp_params                
-warp_type      inverse_linear  inverse_linear
-wbeam          7e-29       7.000000e-29
-wip            0.65        6.500000e-01
-wlen           0.025625    2.562500e-02

INFO: feat.c(715): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='current', VARNORM='no', AGC='none'
INFO: cmn.c(143): mean[0]= 12.00, mean[1..12]= 0.0
INFO: acmod.c(164): Using subvector specification 0-12/13-25/26-38
INFO: mdef.c(518): Reading model definition: /usr/local/share/pocketsphinx/model/en-us/en-us//mdef
INFO: mdef.c(531): Found byte-order mark BMDF, assuming this is a binary mdef file
INFO: bin_mdef.c(336): Reading binary model definition: /usr/local/share/pocketsphinx/model/en-us/en-us//mdef
INFO: bin_mdef.c(516): 42 CI-phone, 137053 CD-phone, 3 emitstate/phone, 126 CI-sen, 5126 Sen, 29324 Sen-Seq
INFO: tmat.c(206): Reading HMM transition probability matrices: /usr/local/share/pocketsphinx/model/en-us/en-us//transition_matrices
INFO: acmod.c(117): Attempting to use PTM computation module
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /usr/local/share/pocketsphinx/model/en-us/en-us//means
INFO: ms_gauden.c(292): 42 codebook, 3 feature, size: 
INFO: ms_gauden.c(294):  128x13
INFO: ms_gauden.c(294):  128x13
INFO: ms_gauden.c(294):  128x13
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: /usr/local/share/pocketsphinx/model/en-us/en-us//variances
INFO: ms_gauden.c(292): 42 codebook, 3 feature, size: 
INFO: ms_gauden.c(294):  128x13
INFO: ms_gauden.c(294):  128x13
INFO: ms_gauden.c(294):  128x13
INFO: ms_gauden.c(354): 222 variance values floored
INFO: ptm_mgau.c(476): Loading senones from dump file /usr/local/share/pocketsphinx/model/en-us/en-us//sendump
INFO: ptm_mgau.c(500): BEGIN FILE FORMAT DESCRIPTION
INFO: ptm_mgau.c(563): Rows: 128, Columns: 5126
INFO: ptm_mgau.c(595): Using memory-mapped I/O for senones
INFO: ptm_mgau.c(835): Maximum top-N: 4
INFO: phone_loop_search.c(114): State beam -225 Phone exit beam -225 Insertion penalty 0
INFO: dict.c(320): Allocating 138623 * 32 bytes (4331 KiB) for word entries
INFO: dict.c(333): Reading main dictionary: /usr/local/share/pocketsphinx/model/en-us/cmudict-en-us.dict
INFO: dict.c(213): Allocated 1014 KiB for strings, 1677 KiB for phones
INFO: dict.c(336): 134522 words read
INFO: dict.c(358): Reading filler dictionary: /usr/local/share/pocketsphinx/model/en-us/en-us//noisedict
INFO: dict.c(213): Allocated 0 KiB for strings, 0 KiB for phones
INFO: dict.c(361): 5 words read
INFO: dict2pid.c(396): Building PID tables for dictionary
INFO: dict2pid.c(406): Allocating 42^3 * 2 bytes (144 KiB) for word-initial triphones
Recorder started
Decoder started
INFO: dict2pid.c(132): Allocated 42672 bytes (41 KiB) for word-final triphones
INFO: dict2pid.c(196): Allocated 42672 bytes (41 KiB) for single-phone word triphones
INFO: kws_search.c(420): KWS(beam: -1080, plp: -23, default threshold -450, delay 10)
abomination /le-20/

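Every developer should know the floating-point format: it starts with a digit. 1e-1 is the same as 0.1, and 5e-3 is the same as 0.005. le-20, written with the letter l, is not a floating-point number.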
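To make the difference between 1e-20 and le-20 concrete, here is a minimal Java sketch that checks whether a threshold string parses as a number. The class and method names are hypothetical, and it uses Java's own Double.parseDouble rather than whatever parser pocketsphinx applies to the keyphrase list, so treat it as an illustration of the notation only.

```java
final class ThresholdCheck {
    /** Returns true if the text parses as a floating-point number in Java. */
    static boolean isValidThreshold(String text) {
        try {
            Double.parseDouble(text);
            return true;
        } catch (NumberFormatException e) {
            // "le-20" lands here: it starts with the letter l, not the digit 1.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("1e-20 valid? " + isValidThreshold("1e-20"));
        System.out.println("le-20 valid? " + isValidThreshold("le-20"));
    }
}
```

A check like this can be run over a threshold before it is written into keyphrase.list.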

            d.processRaw(s, nbytes/2, false, false);
            d.setKws("abomination", );

This code should not compile. Also, you should not set the search while processing is in progress.

I'm not clear on the use of -kws vs -keyphrase vs -kws_threshold. Does using -kws mean you don't need the other two since it effectively sets both the phrase and threshold?

Yes, the keyword list replaces the command-line options.
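Since the list file carries both the phrase and its threshold, it can be generated from code. The sketch below writes one entry in the "phrase /threshold/" form shown in the question. The class name, method names, and the temporary file location are assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;
import java.util.List;

final class KeyphraseList {
    /** Formats one list entry, e.g. "abomination /1e-20/". */
    static String formatEntry(String phrase, String threshold) {
        // Fail early on typos such as "le-20" (letter l instead of digit 1).
        Double.parseDouble(threshold);
        return phrase + " /" + threshold + "/";
    }

    /** Writes the entries, one per line, and returns the file path. */
    static Path write(Path file, List<String> entries) throws IOException {
        return Files.write(file, entries, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical location; any readable path works.
        Path file = Files.createTempFile("keyphrase", ".list");
        write(file, Collections.singletonList(formatEntry("abomination", "1e-20")));
        System.out.println("Wrote " + file);
    }
}
```

The resulting path would then be passed with c.setString("-kws", ...) as in the question, in place of -keyphrase and -kws_threshold.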

Am I assuming wrong that kws doesn't return a hyp?

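It does not return a hypothesis because you are not passing correct audio into the decoder. You need

bb.order(ByteOrder.LITTLE_ENDIAN);

Without it, big-endian data is passed (the Java default). The SWIG example in the sources has it.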
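The position of that call matters: a view created by asShortBuffer() takes the byte order the ByteBuffer has at that moment, so the order has to be set first. A minimal, self-contained sketch of the conversion follows (the class and method names are hypothetical):

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class PcmBytes {
    /** Converts the first nbytes of raw 16-bit PCM into samples using the given byte order. */
    static short[] toShorts(byte[] raw, int nbytes, ByteOrder order) {
        ByteBuffer bb = ByteBuffer.wrap(raw, 0, nbytes);
        bb.order(order); // must happen BEFORE asShortBuffer()
        short[] samples = new short[nbytes / 2];
        bb.asShortBuffer().get(samples);
        return samples;
    }

    public static void main(String[] args) {
        byte[] raw = {0x01, 0x00};
        // The same two bytes give different samples depending on byte order.
        System.out.println(toShorts(raw, 2, ByteOrder.LITTLE_ENDIAN)[0]);
        System.out.println(toShorts(raw, 2, ByteOrder.BIG_ENDIAN)[0]);
    }
}
```

In the question's loop, the returned array would be what is handed to d.processRaw(s, s.length, false, false).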

INFO: cmn_prior.c(99): cmn_prior_update: from < 73.10 11.10 -10.49 1.23 0.67 -1.37 -5.29 5.17 -0.62 3.91 -0.28 2.56 -2.14 >

A CMN value in the 70s indicates an endianness problem. With the correct byte order, the first CMN value should be between 30 and 60.
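One related thing that may be worth checking, although it is not raised above: the configuration dump in the question shows -samprate 16000 and -input_endian little, while the code opens the line with new AudioFormat(44100, 16, 1, true, true), where the last argument is the bigEndian flag. The sketch below builds a capture format that agrees with those decoder settings and derives the ByteBuffer order from the format instead of hard-coding it. It assumes the capture device can deliver 16 kHz directly, which is not guaranteed on every system, and the class and method names are hypothetical.

```java
import java.nio.ByteOrder;
import javax.sound.sampled.AudioFormat;

final class CaptureFormat {
    /** 16 kHz, 16-bit, mono, signed, little-endian PCM. */
    static AudioFormat forDecoder() {
        return new AudioFormat(16000f, 16, 1, true, false);
    }

    /** Byte order that matches the bytes delivered by a line opened with this format. */
    static ByteOrder byteOrderOf(AudioFormat format) {
        return format.isBigEndian() ? ByteOrder.BIG_ENDIAN : ByteOrder.LITTLE_ENDIAN;
    }

    public static void main(String[] args) {
        AudioFormat format = forDecoder();
        System.out.println(format + " -> " + byteOrderOf(format));
    }
}
```

Passing byteOrderOf(format) to bb.order(...) keeps the byte-to-short conversion consistent with whatever format the line was actually opened with.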