Stanford Parser - Factored model and PCFG

What is the difference between the Stanford Parser's factored model and its PCFG model, in terms of the underlying theory and the mathematics?

This FAQ answer explains the difference at length. The relevant parts are quoted below:

Can you explain the different parsers?

This answer is specific to English. It mostly applies to other languages although some components are missing in some languages. The file englishPCFG.ser.gz comprises just an unlexicalized PCFG grammar. It is basically the parser described in the ACL 2003 Accurate Unlexicalized Parsing paper.

… The file englishFactored.ser.gz contains two grammars and leads the system to run three parsers. It first runs a (simpler) PCFG parser and then an untyped dependency parser, and then runs a third parser which finds the parse with the best joint score across the two other parsers via a product model. This is described in the NIPS Fast Exact Inference paper.
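
To make the "product model" concrete: the joint score of a candidate analysis is the product of the PCFG probability of its tree and the dependency-model probability of its dependency structure, or equivalently the sum of their logs. Below is a minimal sketch of that scoring rule only. The candidate names and probabilities are made up for illustration, and the real parser searches for the best joint parse rather than enumerating a candidate list like this.

```python
import math

# Hypothetical candidate parses for one sentence with invented probabilities.
# "pcfg" stands for the PCFG score of the tree, "dep" for the score the
# dependency model assigns to the matching dependency structure.
candidates = {
    "attach_pp_to_verb": {"pcfg": 0.020, "dep": 0.30},
    "attach_pp_to_noun": {"pcfg": 0.025, "dep": 0.10},
}


def pcfg_log_score(scores):
    """Score used by a PCFG-only parser: log P_pcfg(tree)."""
    return math.log(scores["pcfg"])


def joint_log_score(scores):
    """Product-model score: log P_pcfg(tree) + log P_dep(dependencies)."""
    return math.log(scores["pcfg"]) + math.log(scores["dep"])


def best_by(cands, score_fn):
    """Return the name of the candidate that maximizes score_fn."""
    return max(cands, key=lambda name: score_fn(cands[name]))


pcfg_only_best = best_by(candidates, pcfg_log_score)
factored_best = best_by(candidates, joint_log_score)

print("PCFG only:", pcfg_only_best)
print("Factored :", factored_best)
```

With these toy numbers the two models disagree: the PCFG alone slightly prefers one attachment, while the dependency score outweighs that preference in the joint model. That is the kind of case where the factored model's lexical information can change the result.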

… For English, although the grammars and parsing methods differ, the average quality of englishPCFG.ser.gz and englishFactored.ser.gz is similar, and so many people opt for the faster englishPCFG.ser.gz, though englishFactored.ser.gz sometimes does better because it does include lexicalization. For other languages, the factored models are considerably better than the PCFG models, and are what people generally use.

Links to the cited papers are available on the main parser page.