无法计算得到跟随集
Unable calculate to get follow set
一段时间以来,我一直在尝试计算语法的后续集,并且 运行 遇到了另一个问题。这是我的跟随集计算器:
def gen_follow_set(grammar, start_sym, first_sets):
follow_sets = {nterm: set() for nterm in grammar}
follow_sets[start_sym].add("$")
for _, prods in grammar.items():
for alt in prods:
for item in alt:
if item.isupper():
follow_sets[item] = set()
while True:
changes = copy.deepcopy(follow_sets)
for nterm, prods in grammar.items():
for alt in prods:
for i, item in enumerate(alt):
la = alt[i + 1] if i + 1 != len(alt) else nterm
if i == len(alt) - 1 and item != "":
follow_sets[item] |= follow_sets[nterm]
elif item != "":
if "" in first_sets[la]:
follow_sets[item] |= first_sets[la].union(
first_sets[alt[i + 2] if i + 2 <= len(alt) -
1 else nterm]) - {""}
else:
follow_sets[item] |= first_sets[la]
if changes == follow_sets:
return follow_sets
这是这样称呼的:
grammar = {
"expr": [["term", "etail"]],
"term": [["LPAREN", "expr", "RPAREN"], ["INT", "ttail"]],
"etail": [["PLUS", "expr"], [""]],
"ttail": [["TIMES", "term"], [""]]
}
first = calc_first_set(...)
pprint.pprint(gen_follow_set(grammar, "expr", first))
这输出:
Working on: term ; la has epsilon; la: etail
Working on: etail ; it is at the end of the production: expr
Working on: LPAREN ; la doesn't have epsilon; la: expr
Working on: expr ; la doesn't have epsilon; la: RPAREN
Working on: RPAREN ; it is at the end of the production: term
Working on: INT ; la has epsilon; la: ttail
Working on: ttail ; it is at the end of the production: term
Working on: PLUS ; la doesn't have epsilon; la: expr
Working on: expr ; it is at the end of the production: etail
Working on: TIMES ; la doesn't have epsilon; la: term
Working on: term ; it is at the end of the production: ttail
Working on: term ; la has epsilon; la: etail
Working on: etail ; it is at the end of the production: expr
Working on: LPAREN ; la doesn't have epsilon; la: expr
Working on: expr ; la doesn't have epsilon; la: RPAREN
Working on: RPAREN ; it is at the end of the production: term
Working on: INT ; la has epsilon; la: ttail
Working on: ttail ; it is at the end of the production: term
Working on: PLUS ; la doesn't have epsilon; la: expr
Working on: expr ; it is at the end of the production: etail
Working on: TIMES ; la doesn't have epsilon; la: term
Working on: term ; it is at the end of the production: ttail
Working on: term ; la has epsilon; la: etail
Working on: etail ; it is at the end of the production: expr
Working on: LPAREN ; la doesn't have epsilon; la: expr
Working on: expr ; la doesn't have epsilon; la: RPAREN
Working on: RPAREN ; it is at the end of the production: term
Working on: INT ; la has epsilon; la: ttail
Working on: ttail ; it is at the end of the production: term
Working on: PLUS ; la doesn't have epsilon; la: expr
Working on: expr ; it is at the end of the production: etail
Working on: TIMES ; la doesn't have epsilon; la: term
Working on: term ; it is at the end of the production: ttail
{'INT': {'INT', 'TIMES', 'LPAREN'},
'LPAREN': {'INT', 'LPAREN'},
'PLUS': {'INT', 'LPAREN'},
'RPAREN': {'INT', 'LPAREN', 'PLUS'},
'TIMES': {'INT', 'LPAREN'},
'etail': {'$', 'RPAREN'},
'expr': {'$', 'RPAREN'},
'term': {'INT', 'LPAREN', 'PLUS'},
'ttail': {'INT', 'LPAREN', 'PLUS'}}
etail
和 expr
是正确的,但是 term
和 ttail
是不正确的。我怎样才能得到正确的答案?
每当非终结符 N
出现在产生式中时
M → α N β
我们有
FIRST(α) ⊂ FOLLOW(N)
如果 β
可以为空,则 FOLLOW(M) ⊂ FOLLOW(N)
如果 β 为空(即 N
在生产结束时)或如果 β 中的第一个符号不可为空,则您的代码可以正常工作。在其余情况下,您的代码有错误:
如果 β 中的第一个符号可以为空,则将 FIRST(β) 计算为 β 中前两个符号的 FIRST 集合的并集。由于您从不检查第二个(或后续)符号是否可为空,因此您可能会错过 FIRST(β) 中的符号。
仅测试下一个符号的可空性的另一个结果是您不计算 NULLABLE(β);相反,您使用 β 中第一个符号的可空性。所以你可能会错过 FOLLOW(M)
中的符号。
我认为这些错误中的任何一个都不是由您的实际语法触发的。但下一个是;
如果您的(不充分的)测试表明 β 可以为空,则使用 FIRST(M)
而不是 FOLLOW(M)
。
一个密切相关的问题是 la
的计算,如果已经达到生产结束,它会建议 term
作为下一个符号。这将导致使用 FIRST(term)
而不是 FOLLOW(term)
,但当然这永远不会发生,因为如果 N
在生产结束。既然如此,la
其实是没有必要的。
一段时间以来,我一直在尝试计算语法的后续集,并且 运行 遇到了另一个问题。这是我的跟随集计算器:
def gen_follow_set(grammar, start_sym, first_sets):
follow_sets = {nterm: set() for nterm in grammar}
follow_sets[start_sym].add("$")
for _, prods in grammar.items():
for alt in prods:
for item in alt:
if item.isupper():
follow_sets[item] = set()
while True:
changes = copy.deepcopy(follow_sets)
for nterm, prods in grammar.items():
for alt in prods:
for i, item in enumerate(alt):
la = alt[i + 1] if i + 1 != len(alt) else nterm
if i == len(alt) - 1 and item != "":
follow_sets[item] |= follow_sets[nterm]
elif item != "":
if "" in first_sets[la]:
follow_sets[item] |= first_sets[la].union(
first_sets[alt[i + 2] if i + 2 <= len(alt) -
1 else nterm]) - {""}
else:
follow_sets[item] |= first_sets[la]
if changes == follow_sets:
return follow_sets
这是这样称呼的:
grammar = {
"expr": [["term", "etail"]],
"term": [["LPAREN", "expr", "RPAREN"], ["INT", "ttail"]],
"etail": [["PLUS", "expr"], [""]],
"ttail": [["TIMES", "term"], [""]]
}
first = calc_first_set(...)
pprint.pprint(gen_follow_set(grammar, "expr", first))
这输出:
Working on: term ; la has epsilon; la: etail
Working on: etail ; it is at the end of the production: expr
Working on: LPAREN ; la doesn't have epsilon; la: expr
Working on: expr ; la doesn't have epsilon; la: RPAREN
Working on: RPAREN ; it is at the end of the production: term
Working on: INT ; la has epsilon; la: ttail
Working on: ttail ; it is at the end of the production: term
Working on: PLUS ; la doesn't have epsilon; la: expr
Working on: expr ; it is at the end of the production: etail
Working on: TIMES ; la doesn't have epsilon; la: term
Working on: term ; it is at the end of the production: ttail
Working on: term ; la has epsilon; la: etail
Working on: etail ; it is at the end of the production: expr
Working on: LPAREN ; la doesn't have epsilon; la: expr
Working on: expr ; la doesn't have epsilon; la: RPAREN
Working on: RPAREN ; it is at the end of the production: term
Working on: INT ; la has epsilon; la: ttail
Working on: ttail ; it is at the end of the production: term
Working on: PLUS ; la doesn't have epsilon; la: expr
Working on: expr ; it is at the end of the production: etail
Working on: TIMES ; la doesn't have epsilon; la: term
Working on: term ; it is at the end of the production: ttail
Working on: term ; la has epsilon; la: etail
Working on: etail ; it is at the end of the production: expr
Working on: LPAREN ; la doesn't have epsilon; la: expr
Working on: expr ; la doesn't have epsilon; la: RPAREN
Working on: RPAREN ; it is at the end of the production: term
Working on: INT ; la has epsilon; la: ttail
Working on: ttail ; it is at the end of the production: term
Working on: PLUS ; la doesn't have epsilon; la: expr
Working on: expr ; it is at the end of the production: etail
Working on: TIMES ; la doesn't have epsilon; la: term
Working on: term ; it is at the end of the production: ttail
{'INT': {'INT', 'TIMES', 'LPAREN'},
'LPAREN': {'INT', 'LPAREN'},
'PLUS': {'INT', 'LPAREN'},
'RPAREN': {'INT', 'LPAREN', 'PLUS'},
'TIMES': {'INT', 'LPAREN'},
'etail': {'$', 'RPAREN'},
'expr': {'$', 'RPAREN'},
'term': {'INT', 'LPAREN', 'PLUS'},
'ttail': {'INT', 'LPAREN', 'PLUS'}}
etail
和 expr
是正确的,但是 term
和 ttail
是不正确的。我怎样才能得到正确的答案?
每当非终结符 N
出现在产生式中时
M → α N β
我们有
FIRST(α) ⊂ FOLLOW(N)
如果
β
可以为空,则FOLLOW(M) ⊂ FOLLOW(N)
如果 β 为空(即 N
在生产结束时)或如果 β 中的第一个符号不可为空,则您的代码可以正常工作。在其余情况下,您的代码有错误:
如果 β 中的第一个符号可以为空,则将 FIRST(β) 计算为 β 中前两个符号的 FIRST 集合的并集。由于您从不检查第二个(或后续)符号是否可为空,因此您可能会错过 FIRST(β) 中的符号。
仅测试下一个符号的可空性的另一个结果是您不计算 NULLABLE(β);相反,您使用 β 中第一个符号的可空性。所以你可能会错过
FOLLOW(M)
中的符号。
我认为这些错误中的任何一个都不是由您的实际语法触发的。但下一个是;
如果您的(不充分的)测试表明 β 可以为空,则使用
FIRST(M)
而不是FOLLOW(M)
。一个密切相关的问题是
la
的计算,如果已经达到生产结束,它会建议term
作为下一个符号。这将导致使用FIRST(term)
而不是FOLLOW(term)
,但当然这永远不会发生,因为如果N
在生产结束。既然如此,la
其实是没有必要的。