Look for multiple words in a Praat script

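I am writing a Praat script that searches for a list of words across multiple files. This is what I have so far. It only zooms in on the first word in the procedure and does not loop through the rest. I think this has something to do with what is selected: in the for i to n loop only the TextGrid is selected, but in the editor both the TextGrid and the Sound are selected. I need the script to keep searching through every interval so that the other words in the procedure are found as well.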

directory$ = "directory"
listfile$ = "test.txt"

Read Strings from raw text file... 'directory$'/'listfile$'
last = Get number of strings

# loop through each file
for a from 1 to last
    listfile2$ = listfile$ - ".txt"
    select Strings 'listfile2$'
    textgrid$ = Get string... 'a'
    Read from file... 'directory$'/'textgrid$'
    object_name$ = selected$("TextGrid")

    Read from file... 'directory$'/'object_name$'.wav

    # rearrange tiers 
    select TextGrid 'object_name$'
    Duplicate tier: 3, 1, "MAU"
    Remove tier: 4
    Insert interval tier: 1, "subphone"

    # find target word
    n = Get number of intervals: 3  
    for i to n

@instance: "strikes"
@instance: "raindrops"
@instance: "and"
@instance: "rainbow"
@instance: "into"
@instance: "round"
@instance: "its"
@instance: "its"

procedure instance: .target_word$

    label$ = Get label of interval: 3, i
        if label$ == .target_word$
        index = i
        i += n

# get the start and end point of the word
startpoint = Get starting point... 3 index
endpoint = Get end point... 3 index

        select TextGrid 'object_name$'
        plus Sound 'object_name$'
        View & Edit
        editor TextGrid 'object_name$'

# annotation
Select... startpoint endpoint
Zoom to selection
pause Annotate stops then continue
Close
endeditor

        endif # if the label = target word
    endfor # for number of intervals



select TextGrid 'object_name$'
Write to text file: directory$ + "/" + object_name$ + "_editedtext.TextGrid"

select all
minus Strings 'listfile2$'
Remove

endproc

#writeInfoLine: "done!"
#select Strings 'listfile2$'
endfor # for each of the files
clearinfo
print That's it!

Edit: here is the script, revised according to the answers.

directory$ = "/Users/directorypath"
listfile$ = "test.txt"

Read Strings from raw text file... 'directory$'/'listfile$'
last = Get number of strings
listfile2$ = listfile$ - ".txt"

# loop through each file
for a from 1 to last
    select Strings 'listfile2$'
    textgrid$ = Get string... 'a'
    Read from file... 'directory$'/'textgrid$'
    object_name$ = selected$("TextGrid")
    Read from file... 'directory$'/'object_name$'.wav

    # rearrange tiers
    select TextGrid 'object_name$'
    Duplicate tier: 3, 1, "MAU"
    Remove tier: 4
    Insert interval tier: 1, "subphone"

    n = Get number of intervals: 3

    for i to n
        @instance: "strikes"
        @instance: "raindrops"
        @instance: "and"
        @instance: "rainbow"
        @instance: "into"
        @instance: "round"
        @instance: "its"
        @instance: "its"

    endfor
endfor

procedure instance: .target_word$

label$ = Get label of interval: 3, i
if label$ == .target_word$
    index = i
    i += n

    # get the start and end point of the word
    startpoint = Get starting point... 3 index
    endpoint = Get end point... 3 index

    select TextGrid 'object_name$'
    plus Sound 'object_name$'
    View & Edit
    editor TextGrid 'object_name$'

    # annotation
    Select... startpoint endpoint
    Zoom to selection
    pause Annotate stops then continue
    Close
    endeditor

    endif

endproc

Try adding select TextGrid 'object_name$' right after the opening line of the procedure:

procedure instance: .target_word$

    select TextGrid 'object_name$'

    label$ = Get label of interval: 3, i
    if label$ == .target_word$
         index = i
         i += n

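The script you wrote was not carefully indented, so I tried formatting it to make it easier to see what was going on. And in a sense, it did. But what surfaced still takes some effort to understand.

Here is a step-by-step walkthrough of what happens, as Praat sees it: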

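  1. On line 8, you start a for loop: for a from 1 to last

  2. Inside that loop, on line 25, you start a second loop: for i to n

  3. Inside that second loop, on line 27, you call a procedure named instance.

    At this point, Praat jumps to the last line where that procedure is defined (so if you define it more than once, you only get the last definition). Since there is only one, Praat jumps to line 36: procedure instance: .target_word$

  4. Inside that procedure (which, by the way, is defined inside a for loop, which is... unusual) you have an if block: if label$ == .target_word$

  5. Right after that block, an endfor increments the control variable (in this case i) and closes the for loop. But which one?

    You might expect it to close the last for loop we entered (I did). But Praat actually seems to keep track of opening for and closing endfor statements and to pair them by their vertical position in the script.

    I have not looked at the interpreter in enough detail to work out exactly what happens, but in this case the result is the same as mapping the lowest endfor (the one closest to the bottom of the script) to the highest for, and so on.

    (This is very likely not what really happens, otherwise multiple non-overlapping loops would not work, but that does not matter much. What matters is that an endfor closes exactly one for, regardless of where it sits in the script or when Praat gets to it. As an aside, this is not what happens with endproc.)

    Whatever the exact rule, this endfor maps to the second for, the one we entered in point 2 (on line 25). So we go back to the first line of that loop (line 26).

  6. Now we reach line 27 for the second time (this time for the second interval), and we call @instance: "strikes" again. We have not reached @instance: "raindrops" yet!

  7. This repeats for every interval, incrementing i each time (whenever we hit the endfor), until i becomes n. This time, when we call @instance, we pass the if block and once again reach the endfor from point 5.

    Praat dutifully increments the control variable (so now i = n + 1) and checks the end condition that was set when the for loop started. Here Praat knows the loop was meant to stop after i == n, and since i = n + 1 it does not jump back to the top but carries on.

    Only now, after going through all the intervals of the first file, do we actually reach the end of the procedure!

  8. The procedure finally ends. Praat remembers that we entered it back in point 3, when we were reading line 27. So it dutifully reads line 28, which is another call to the same procedure: @instance: "raindrops".

  9. And here (finally!) it dies.

    It dies because the control variable i is now n + 1 (it became that in point 7). This does not normally happen, because you do not normally hit the endfor of a loop that has already finished. But here we do. So when Praat tries to read the label of interval i in a TextGrid that only has i - 1 intervals... it complains that the interval number is too large. Because it is.

    I could not reproduce your original problem (where it did only part of the job and did not actually die), because inside the if block you manually change the value of i (which is risky), and the if block only runs when a label matches early enough in your TextGrid (early enough that it happens before the script blows up).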
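
The two loop behaviours this walkthrough relies on (the control variable ending up past n once the loop is done, and the effect of changing i by hand inside the loop) can be checked in isolation. This is only a minimal sketch: the value of n and the i == 2 condition are made up for illustration and have nothing to do with the TextGrid in the question.

```praat
# Hypothetical values, chosen only to make the output easy to read
n = 5
writeInfoLine: "n = ", n

for i to n
    appendInfoLine: "visiting i = ", i
    if i == 2
        # Same trick as in the question: push i past n to leave the loop early
        i += n
    endif
endfor

# endfor increments i once more before the end condition is checked,
# so i is already beyond n by the time this line runs
appendInfoLine: "after the loop, i = ", i
```

Running it shows that the loop stops visiting values after the manual change, and that any code using i after the loop (for example as an interval number) gets an index that no longer exists.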

You can see the whole mess with this simplified version of your script's structure:

last = 10
for a from 1 to last
  appendInfoLine: "First line of main loop; a = ", a

  n = 5 
  for i to n
    appendInfoLine: "First line of second loop; i = ", i

    @instance: "strikes"
    @instance: "raindrops"
    @instance: "and"

    procedure instance: .target_word$
      appendInfoLine: "Called @instance: " + .target_word$
      appendInfoLine: "a=", a, " i=", i

  endfor # for number of intervals

      appendInfoLine: "End of @instance"
    endproc

  appendInfoLine: "Script claims we are done!"
endfor # for each of the files

To fix this, you should probably restructure your code so that it follows more or less this pattern:

for a to last
  for i to n
    @instance: "word"
  endfor
endfor

procedure instance: .word$
  # do things
endproc
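
A runnable sketch of that pattern, with the procedure defined after the main part of the script and doing its own search over the intervals. Everything here is an assumption made for the demo: the three-interval TextGrid, its labels, and the procedure name findWord are invented, and the sketch prints to the Info window instead of opening an editor.

```praat
# Demo data (hypothetical): a one-tier TextGrid with three labelled intervals
textgrid = Create TextGrid: 0, 3, "words", ""
Insert boundary: 1, 1
Insert boundary: 1, 2
Set interval text: 1, 1, "the"
Set interval text: 1, 2, "rainbow"
Set interval text: 1, 3, "strikes"

writeInfoLine: "Searching tier 1"

# One call per target word; each call scans every interval on its own
@findWord: "strikes"
@findWord: "rainbow"
@findWord: "missing"

removeObject: textgrid

# The procedure sits outside every loop, so its for/endfor pair is self-contained
procedure findWord: .target$
    selectObject: textgrid
    .n = Get number of intervals: 1
    .hits = 0
    for .i to .n
        .label$ = Get label of interval: 1, .i
        if .label$ == .target$
            .start = Get start time of interval: 1, .i
            .end = Get end time of interval: 1, .i
            appendInfoLine: .target$, " found in interval ", .i, " (", .start, " to ", .end, " s)"
            .hits += 1
        endif
    endfor
    if .hits == 0
        appendInfoLine: .target$, " not found"
    endif
endproc
```

Using local variables (.i, .n) inside the procedure keeps its loop counter from colliding with any i used by the calling code, and re-selecting the TextGrid at the top of the procedure means each call starts from a known selection.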

That was educational. :)

I got it working! The manual page on for loops was helpful. Here is the code, in case anyone finds it useful.

#############################################################
#
# This script requires a file listing the .TextGrid files
# in the directory containing the .wav files and .Textgrid files.
# Using the command line, you can make this by navigating to the directory,
# and typing ls *.TextGrid > contents.txt
#
# This script reads in files and textgrids from a directory,
# rearranges the tiers that were output by MAUS, - http://www.bas.uni-muenchen.de/Bas/BasMAUS.html
# provides a subphone tier, searches for words specified in the procedure
# so that they can be annotated.
#
# Written using code from Bert Remijsen that was rewritten by Peggy Renwick
#
#############################################################

# enter your directory here
directory$ = "/Users/lisalipani/Documents/School/Graduate/Research/NSP/CorpusFiles/Test"

# enter your list file here
listfile$ = "test.txt"

Read Strings from raw text file... 'directory$'/'listfile$'
last = Get number of strings
listfile2$ = listfile$ - ".txt"

# loop through each file
for a from 1 to last
    select Strings 'listfile2$'
    textgrid$ = Get string... 'a'
    Read from file... 'directory$'/'textgrid$'
    object_name$ = selected$("TextGrid")
    Read from file... 'directory$'/'object_name$'.wav

    # rearrange tiers
    select TextGrid 'object_name$'
    Duplicate tier: 3, 1, "MAU"
    Remove tier: 4
    Insert interval tier: 1, "subphone"

    # input target words here
    @instance: "strikes"
    @instance: "friends"

    # start the procedure
    procedure instance: .target_word$

    select TextGrid 'object_name$'

    numberOfIntervals = Get number of intervals: 3

    for intervalNumber from 1 to numberOfIntervals
        label$ = Get label of interval: 3, intervalNumber
        if label$ == .target_word$

            # get the start and end point of the word
            startpoint = Get starting point... 3 intervalNumber
            endpoint = Get end point... 3 intervalNumber

            select TextGrid 'object_name$'
            plus Sound 'object_name$'
            View & Edit
            editor TextGrid 'object_name$'

            # annotation
            Select... startpoint endpoint
            Zoom to selection
            pause Annotate stops then continue
            Close
            endeditor

            select TextGrid 'object_name$'

        endif
    endfor
    endproc

    select TextGrid 'object_name$'
    Save as text file... 'directory$'/'object_name$'_annotated.TextGrid

endfor
appendInfoLine: "all done!"
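
The header of the script builds the file list on the command line with ls. As an alternative, the list can probably be built inside Praat itself. The sketch below assumes that directory$ points at a folder containing the .TextGrid files. The path shown is a placeholder, not the one from the script above, and the loop only prints the names rather than opening anything.

```praat
# Placeholder path (assumption): replace with the folder that holds your files
directory$ = "/path/to/your/files"

# Build the list in memory instead of reading it from test.txt
fileList = Create Strings as file list: "fileList", directory$ + "/*.TextGrid"
last = Get number of strings

writeInfoLine: "Found ", last, " TextGrid file(s)"
for a to last
    selectObject: fileList
    textgrid$ = Get string: a
    appendInfoLine: a, ": ", textgrid$
endfor

removeObject: fileList
```

Keeping the Strings object's ID in a variable (fileList) avoids having to re-select it by name, which is what the listfile2$ = listfile$ - ".txt" line in the script is there for.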