Why does OpenFST not seem to have a 'run', 'accept', or 'transduce' command?
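I have heard a lot of good things about OpenFST, but I am struggling to make it work. I am building an FST automaton (fstcompile) that I want to use as an acceptor to check whether a set of strings matches (much like a regular expression, but with the benefits of the automaton optimizations that OpenFST provides). And here is the thing:
How do I check whether the resulting automaton accepts a string?
I found a suggestion that the input string should be turned into a simple automaton and composed with the accepting automaton to obtain the result. I find that very cumbersome and odd. Is there a simpler way (via the command line or Python/C++)?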
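Here is a quick example of how to test whether an automaton accepts a string using OpenFST's Python wrapper. You do indeed have to turn your input into an automaton, and OpenFST does not even build this "linear chain automaton" for you! Fortunately, automating the process is simple, as shown below: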

    import pywrapfst as fst

    def linear_fst(elements, automata_op, keep_isymbols=True, **kwargs):
        """Produce a linear-chain automaton that accepts exactly `elements`."""
        # acceptor=True because each arc line below carries a single label.
        compiler = fst.Compiler(isymbols=automata_op.input_symbols().copy(),
                                acceptor=True,
                                keep_isymbols=keep_isymbols,
                                **kwargs)
        # One arc per symbol: state i --el--> state i+1.
        for i, el in enumerate(elements):
            print("{} {} {}".format(i, i + 1, el), file=compiler)
        # The last state is final; len(elements) also covers empty input.
        print(str(len(elements)), file=compiler)
        return compiler.compile()

    def apply_fst(elements, automata_op, is_project=True, **kwargs):
        """Compose a linear automaton built from `elements` with `automata_op`.

        Args:
            elements (list): ordered list of edge symbols for the linear automaton.
            automata_op (Fst): automaton that will be applied.
            is_project (bool, optional): whether to keep only the output labels.
            kwargs: additional arguments for the compiler of the linear automaton.
        """
        linear_automata = linear_fst(elements, automata_op, **kwargs)
        out = fst.compose(linear_automata, automata_op)
        if is_project:
            # Some newer OpenFST releases expect out.project("output") instead.
            out.project(project_output=True)
        return out

    def accepted(output_apply):
        """Given the output of `apply_fst` for an acceptor, return True if the string was accepted."""
        # The composition is empty (no states) when no accepting path exists.
        return output_apply.num_states() != 0

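For reference, the text that `linear_fst` feeds to the compiler is just an arc list: one `source destination label` line per symbol, followed by a line naming the final state. The sketch below reproduces that text without requiring OpenFST; `linear_chain_text` is a hypothetical helper name used only for illustration, not part of the library.

```python
def linear_chain_text(elements):
    """Return the acceptor text description of a linear chain (hypothetical helper)."""
    # One arc per symbol: "<source> <destination> <label>".
    lines = ["{} {} {}".format(i, i + 1, el) for i, el in enumerate(elements)]
    # A line holding only a state number marks that state as final.
    lines.append(str(len(elements)))
    return "\n".join(lines)


print(linear_chain_text("abab"))
# 0 1 a
# 1 2 b
# 2 3 a
# 3 4 b
# 4
```

Building the text separately like this makes it easy to inspect what the compiler receives before any composition takes place.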
Let's define a simple acceptor that accepts only one or more repetitions of "ab":

    f_ST = fst.SymbolTable()
    f_ST.add_symbol("<eps>", 0)
    f_ST.add_symbol("a", 1)
    f_ST.add_symbol("b", 2)
    compiler = fst.Compiler(isymbols=f_ST, osymbols=f_ST, keep_isymbols=True, keep_osymbols=True, acceptor=True)
    print("0 1 a", file=compiler)
    print("1 2 b", file=compiler)
    print("2 0 <eps>", file=compiler)  # loop back so that "ab" can repeat
    print("2", file=compiler)          # state 2 is the final state
    fsa_abs = compiler.compile()
    fsa_abs  # in a notebook, this displays the automaton

Now we can apply the acceptor simply with:

    accepted(apply_fst(list("abab"), fsa_abs))
    # True
    accepted(apply_fst(list("ba"), fsa_abs))
    # False

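To make clear what the composition is deciding, here is a library-free sketch that walks the same "ab" automaton directly, following `<eps>` arcs without consuming input. It does not use OpenFST at all; the names `ARCS`, `START`, `FINAL`, `eps_closure`, and `accepts` are assumptions made up for this illustration.

```python
EPS = "<eps>"
# Same automaton as fsa_abs: 0 -a-> 1 -b-> 2, 2 -<eps>-> 0, final state 2.
ARCS = {
    0: [("a", 1)],
    1: [("b", 2)],
    2: [(EPS, 0)],
}
START = 0
FINAL = {2}


def eps_closure(states):
    """Return all states reachable from `states` using only <eps> arcs."""
    stack = list(states)
    closure = set(states)
    while stack:
        state = stack.pop()
        for label, dest in ARCS.get(state, []):
            if label == EPS and dest not in closure:
                closure.add(dest)
                stack.append(dest)
    return closure


def accepts(symbols):
    """Return True if some path spelling `symbols` ends in a final state."""
    current = eps_closure({START})
    for symbol in symbols:
        # Follow every arc labelled with the current symbol.
        moved = {dest
                 for state in current
                 for label, dest in ARCS.get(state, [])
                 if label == symbol}
        current = eps_closure(moved)
        if not current:
            return False
    return bool(current & FINAL)


print(accepts("abab"))  # True
print(accepts("ba"))    # False
```

Composing with a linear chain reaches the same verdict: the composed machine keeps only paths that spell the input and end in a final state, so it is non-empty exactly when a walk like this succeeds.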
To see how to use transducers, have a look at my other answer.