Look for multiple words in praat script
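I'm writing a Praat script that searches through multiple files for a list of words. This is what I have so far. It only zooms in on the first word in the procedure and never goes through the rest. I think this has to do with what is selected: in the "for i to n" loop only the TextGrid is selected, but in the editor both the TextGrid and the Sound are selected. I need the script to keep searching every interval so that the other words in the procedure are found as well.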
directory$ = "directory"
listfile$ = "test.txt"
Read Strings from raw text file... 'directory$'/'listfile$'
last = Get number of strings
# loop through each file
for a from 1 to last
listfile2$ = listfile$ - ".txt"
select Strings 'listfile2$'
textgrid$ = Get string... 'a'
Read from file... 'directory$'/'textgrid$'
object_name$ = selected$("TextGrid")
Read from file... 'directory$'/'object_name$'.wav
# rearrange tiers
select TextGrid 'object_name$'
Duplicate tier: 3, 1, "MAU"
Remove tier: 4
Insert interval tier: 1, "subphone"
# find target word
n = Get number of intervals: 3
for i to n
@instance: "strikes"
@instance: "raindrops"
@instance: "and"
@instance: "rainbow"
@instance: "into"
@instance: "round"
@instance: "its"
@instance: "its"
procedure instance: .target_word$
label$ = Get label of interval: 3, i
if label$ == .target_word$
index = i
i += n
# get the start and end point of the word
startpoint = Get starting point... 3 index
endpoint = Get end point... 3 index
select TextGrid 'object_name$'
plus Sound 'object_name$'
View & Edit
editor TextGrid 'object_name$'
# annotation
Select... startpoint endpoint
Zoom to selection
pause Annotate stops then continue
Close
endeditor
endif # if the label = target word
endfor # for number of intervals
select TextGrid 'object_name$'
Write to text file: directory$ + "/" + object_name$ + "_editedtext.TextGrid"
select all
minus Strings 'listfile2$'
Remove
endproc
#writeInfoLine: "done!"
#select Strings 'listfile2$'
endfor # for each of the files
clearinfo
print That's it!
Edit: here is the script, revised according to the answers.
directory$ = "/Users/directorypath"
listfile$ = "test.txt"
Read Strings from raw text file... 'directory$'/'listfile$'
last = Get number of strings
listfile2$ = listfile$ - ".txt"
# loop through each file
for a from 1 to last
select Strings 'listfile2$'
textgrid$ = Get string... 'a'
Read from file... 'directory$'/'textgrid$'
object_name$ = selected$("TextGrid")
Read from file... 'directory$'/'object_name$'.wav
# rearrange tiers
select TextGrid 'object_name$'
Duplicate tier: 3, 1, "MAU"
Remove tier: 4
Insert interval tier: 1, "subphone"
n = Get number of intervals: 3
for i to n
@instance: "strikes"
@instance: "raindrops"
@instance: "and"
@instance: "rainbow"
@instance: "into"
@instance: "round"
@instance: "its"
@instance: "its"
endfor
endfor
procedure instance: .target_word$
label$ = Get label of interval: 3, i
if label$ == .target_word$
index = i
i += n
# get the start and end point of the word
startpoint = Get starting point... 3 index
endpoint = Get end point... 3 index
select TextGrid 'object_name$'
plus Sound 'object_name$'
View & Edit
editor TextGrid 'object_name$'
# annotation
Select... startpoint endpoint
Zoom to selection
pause Annotate stops then continue
Close
endeditor
endif
endproc
Try adding select TextGrid 'object_name$' right at the start of the procedure:
procedure instance: .target_word$
select TextGrid 'object_name$'
label$ = Get label of interval: 3, i
if label$ == .target_word$
index = i
i += n
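The selection issue can be reproduced in isolation. The sketch below is only an illustration under stated assumptions: it builds a throwaway Sound and TextGrid (the names demo and words are made up) instead of reading the asker's files, and it refers to objects by ID with selectObject rather than by name. It shows that the editor wants both objects selected, while a TextGrid query wants the TextGrid selected on its own, which is why reselecting it at the top of the procedure helps.

```praat
# Hypothetical stand-ins for the Sound and TextGrid read from disk
sound = Create Sound from formula: "demo", 1, 0, 1, 44100, "0"
textgrid = To TextGrid: "words", ""

# This is the selection the script sets up before View & Edit
selectObject: sound
plusObject: textgrid
writeInfoLine: "TextGrids selected: ", numberOfSelected ("TextGrid")
appendInfoLine: "Sounds selected: ", numberOfSelected ("Sound")

# TextGrid queries expect the TextGrid alone, so reselect it first
selectObject: textgrid
n = Get number of intervals: 1
appendInfoLine: "Intervals on tier 1: ", n

# Clean up the demo objects
removeObject: sound
removeObject: textgrid
```

Storing the IDs returned by the creation commands avoids relying on object names, which can clash when several files share a name.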
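The script you wrote was not carefully indented, so I tried formatting it to make it easier to see what was going on. In a sense, that worked. But what surfaced still takes some effort to understand.
Here is a step-by-step account of what is happening, as Praat sees it: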
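1. You start a for loop on line 8: for a from 1 to last
2. Inside that loop, on line 25, you start a second loop: for i to n
3. Inside the second loop, on line 27, you call a procedure named instance. At this point Praat jumps to the last line where that procedure is defined (so if you define it more than once, you only get the last definition). Since there is only one, Praat jumps to line 36: procedure instance: .target_word$
4. Inside that procedure (which, by the way, is defined inside a for loop, which is... unusual) you have an if block: if label$ == .target_word$
5. At the end of that block, an endfor increments the control variable (in this case i) and closes a for loop. But which one? You might expect it to close the last for loop we entered (I did). In fact, Praat seems to keep track of the opening for and closing endfor statements and pairs them up by their position in the script. I have not looked at the interpreter in enough detail to work out exactly what happens, but in this case the result is the same as mapping the lowest endfor (the one closest to the bottom of the script) to the highest for, and so on. (This is most likely not what really happens, otherwise multiple non-overlapping loops would not work, but that does not matter: what matters is that an endfor closes only a single for, regardless of where it sits in the script or when Praat sees it. As an aside, this is not what happens with endproc.) Whatever the exact rule, this endfor maps to the second for, the one we entered in point 2 (on line 25). So we go back to the first line of that loop (line 26).
6. Now we reach line 27 a second time (this time for the second interval), and we call @instance: "strikes" again. We still have not reached @instance: "raindrops"!
7. This repeats for every interval, incrementing i each time (whenever we hit the endfor), until i becomes n. This time, when we call @instance, we pass the if block and reach the endfor from point 5 again. Praat dutifully increments the control variable (so now i = n + 1) and checks the end condition set at the start of the for loop. Praat knows that the loop's last pass is the one with i == n, and since i = n + 1 it does not jump back to the top but carries on. Only now, after going through every interval of the first file, do we actually reach the end of the procedure!
8. The procedure finally ends. Praat remembers that we entered it back in point 3, while reading line 27. So it dutifully reads line 28, which is another call to the same procedure: @instance: "raindrops"
9. Here (finally!) it dies. It dies because the control variable i is now n + 1 (it became that in point 7). This does not normally happen, because you do not usually hit the endfor of a loop that has already finished. But in this case we do. So when Praat tries to read the label of interval i in a TextGrid that has i-1 intervals... it complains that the interval number is too large. Because it is.

I could not reproduce your original problem (where the script did only part of the job and did not actually die), since inside the if block you manually change the value of i (which is risky), and the if block only runs if a label matches early enough in your TextGrid (early enough that it happens before the script blows up).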
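Point 7 says the control variable is left at n + 1 once the loop has finished. A minimal sketch for checking that on your own Praat version, with made-up values and no objects involved:

```praat
# Illustrative value only; any positive whole number will do
n = 3

writeInfoLine: "Counting to ", n
for i to n
    appendInfoLine: "inside the loop, i = ", i
endfor

# Per the walkthrough, the final endfor has incremented i once more
appendInfoLine: "after the loop, i = ", i, " and n = ", n
```

If the last line reports a value larger than n, then using i as an interval index after the loop has ended will fail in the way described in point 9.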
You can see the whole mess with this simplified version of your script's structure:
last = 10
for a from 1 to last
appendInfoLine: "First line of main loop; a = ", a
n = 5
for i to n
appendInfoLine: "First line of second loop; i = ", i
@instance: "strikes"
@instance: "raindrops"
@instance: "and"
procedure instance: .target_word$
appendInfoLine: "Called @instance: " + .target_word$
appendInfoLine: "a=", a, " i=", i
endfor # for number of intervals
appendInfoLine: "End of @instance"
endproc
appendInfoLine: "Script claims we are done!"
endfor # for each of the files
To fix this, you should probably restructure your code so that it more or less follows this pattern:
for a to last
    for i to n
        @instance: "word"
    endfor
endfor

procedure instance: .word$
    # do things
endproc
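As a slightly fuller sketch of that pattern, the example below keeps the procedure outside every loop and gives it its own loop over the intervals, so each target word is searched across the whole tier. Everything in it is illustrative: the TextGrid is built on the fly rather than read from a file, the procedure name findWord, its parameters, and the word$ array are hypothetical, and the editor steps are replaced by a line in the Info window. The query command names mirror the ones used in the question.

```praat
# Hypothetical stand-in for a TextGrid read from disk: one tier, three words
textgrid = Create TextGrid: 0, 3, "words", ""
Insert boundary: 1, 1
Insert boundary: 1, 2
Set interval text: 1, 1, "strikes"
Set interval text: 1, 2, "and"
Set interval text: 1, 3, "raindrops"

# Target words kept in an indexed variable instead of repeated calls
word$ [1] = "strikes"
word$ [2] = "raindrops"
numberOfWords = 2

writeInfoLine: "Searching tier 1"
for w to numberOfWords
    @findWord: textgrid, 1, word$ [w]
endfor
removeObject: textgrid

# Defined once, outside every loop; local variables start with a dot
procedure findWord: .textgrid, .tier, .target$
    selectObject: .textgrid
    .matches = 0
    .n = Get number of intervals: .tier
    for .i to .n
        .label$ = Get label of interval: .tier, .i
        if .label$ == .target$
            .matches += 1
            .startTime = Get starting point: .tier, .i
            .endTime = Get end point: .tier, .i
            appendInfoLine: .target$, " in interval ", .i, ": ", .startTime, " to ", .endTime
        endif
    endfor
endproc
```

Because the loop variable .i belongs to the procedure, the caller's loops are never touched, and nothing needs to change i by hand to break out early.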
That was educational. :)
I got it working! The manual page on for loops was helpful. Here is the code, in case anyone finds it useful.
#############################################################
#
# This script requires a file listing the .Textgrid files
# in the directory containing the .wav files and .Textgrid files.
# Using the command line, you can make this by navigating to the directory,
# and typing ls *.TextGrid > contents.txt
#
# This script reads in files and textgrids from a directory,
# rearranges the tiers that were output by MAUS, - http://www.bas.uni-muenchen.de/Bas/BasMAUS.html
# provides a subphone tier, searches for words specified in the procedure
# so that they can be annotated.
#
# Written using code from Bert Remijsen that was rewritten by Peggy Renwick
#
#############################################################
# enter your directory here
directory$ = "/Users/lisalipani/Documents/School/Graduate/Research/NSP/CorpusFiles/Test"
# enter your list file here
listfile$ = "test.txt"
Read Strings from raw text file... 'directory$'/'listfile$'
last = Get number of strings
listfile2$ = listfile$ - ".txt"
# loop through each file
for a from 1 to last
    select Strings 'listfile2$'
    textgrid$ = Get string... 'a'
    Read from file... 'directory$'/'textgrid$'
    object_name$ = selected$("TextGrid")
    Read from file... 'directory$'/'object_name$'.wav
    # rearrange tiers
    select TextGrid 'object_name$'
    Duplicate tier: 3, 1, "MAU"
    Remove tier: 4
    Insert interval tier: 1, "subphone"
    # input target words here
    @instance: "strikes"
    @instance: "friends"
    # start the procedure
    procedure instance: .target_word$
        select TextGrid 'object_name$'
        numberOfIntervals = Get number of intervals: 3
        for intervalNumber from 1 to numberOfIntervals
            label$ = Get label of interval: 3, intervalNumber
            if label$ == .target_word$
                # get the start and end point of the word
                startpoint = Get starting point... 3 intervalNumber
                endpoint = Get end point... 3 intervalNumber
                select TextGrid 'object_name$'
                plus Sound 'object_name$'
                View & Edit
                editor TextGrid 'object_name$'
                    # annotation
                    Select... startpoint endpoint
                    Zoom to selection
                    pause Annotate stops then continue
                    Close
                endeditor
                select TextGrid 'object_name$'
            endif
        endfor
    endproc
    select TextGrid 'object_name$'
    Save as text file... 'directory$'/'object_name$'_annotated.TextGrid
endfor
appendInfoLine: "all done!"
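The header comment builds the file list with ls on the command line. As an alternative sketch, Praat's own Create Strings as file list command can produce the same kind of list from inside the script. The directory value and the object name fileList below are placeholders, and the loop only prints each base name rather than opening the files.

```praat
# Placeholder: point this at the folder holding the .TextGrid and .wav files
directory$ = "."

strings = Create Strings as file list: "fileList", directory$ + "/*.TextGrid"
numberOfFiles = Get number of strings

writeInfoLine: "Found ", numberOfFiles, " TextGrid files"
for a to numberOfFiles
    selectObject: strings
    fileName$ = Get string: a
    # Strip the extension to get the name shared with the .wav file
    baseName$ = fileName$ - ".TextGrid"
    appendInfoLine: baseName$
endfor

removeObject: strings
```

Keeping the ID of the Strings object in a variable also removes the need to derive its name from the list file, as listfile2$ does above.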