Lex of ply is not counting enters

Question

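I am trying to write a program that counts certain things in C programs. The problem is that I am trying to count the number of lines with the following code: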

def t_newline(t):
    r'\n+'
    t.lexer.lineno += len(t.value)

It does not count my lines. Here is a sample of the input and output:

for
if
else
switch
exit
Number of if´s: 1
Number of for´s: 1
Number of While´s: 0
Number of else´s: 1
Number of switche´s: 1
Number of lines: 1

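Every time I press Enter to write a new line of code, the line is not counted. Also, if I press Enter without typing anything, I get this error: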

Traceback (most recent call last):
  File "C:/Users/User/PycharmProjects/practicas/firma_digital.py", line 80, in <module>
    if tok.type is not None:
AttributeError: 'NoneType' object has no attribute 'type'

Here is my full code:

import ply.lex as lex
import ply.yacc as yacc
FinishProgram=0
Enters=0
Fors=0
Whiles=0
ifs=0
elses=0
Switches=0

reserved = {
   'if' : 'IF',
   'for' : 'FOR',
   'while': 'WHILE',
   'else': 'ELSE',
   'switch': 'SWITCH'
}
tokens = [
    'ID',
    'COLON',
    'SEMICOLON',

    ]+ list(reserved.values()) # Reserved words

t_COLON= r','
t_SEMICOLON=r';'


def t_ID(t):
    r'[a-zA-Z_][a-zA-Z0-9_]*'
    t.type = reserved.get(t.value, 'ID')
    return t

t_ignore=r' '

def t_newline(t):
    r'\n+'
    t.lexer.lineno += len(t.value)

def t_error(t):
    print("This thing failed")
    t.lexer.skip(1)

lexer=lex.lex()


#def p_gram_sets(p):
 #   '''

  #  gram : SETS SEMICOLON
   #      | empty
    #'''
    #if p[1]:
     #   print(p[1])
      #  print("SETS")



def p_empty(p):
    '''
    empty :
    '''
    p[0]=None





def p_error(p):
    print("Syntax error in input!")


parser=yacc.yacc()

while FinishProgram==0:
    s=input('')
    lexer.input(s)
    tok = lexer.token()

    if tok.type is not None:
        if tok.type=='IF':
            ifs+=1
        elif tok.type=='FOR':
            Fors+=1
        elif tok.type=='WHILE':
            Whiles+=1
        elif tok.type=='ELSE':
            elses+=1
        elif tok.type=='SWITCH':
            Switches+=1

    #parser.parse(s)
    if "exit" in s:
        print("Number of if´s: "+ str(ifs) + "\n"+"Number of for´s: "+str(Fors)+"\n"+"Number of While´s: "+str(Whiles)+"\n"+"Number of else´s: "+str(elses)+"\n"+"Number of switche´s: "+str(Switches)+"\n"+"Number of lines: "+str(tok.lineno))
        FinishProgram=1

Answer 1

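It's not that ply isn't counting the newline characters. It never sees them, because you are calling the lexer repeatedly on strings returned by input().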

From the Python docs (note the part about stripping a trailing newline):

input([prompt])

If the prompt argument is present, it is written to standard output without a trailing newline. The function then reads a line from input, converts it to a string (stripping a trailing newline), and returns that.
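As a quick check of the behaviour described in that quote, here is a small sketch that feeds three lines through input(). It temporarily replaces sys.stdin with an io.StringIO, which is only a testing convenience and not something the question's program does, and the helper name read_with_input is made up for this illustration.

```python
import io
import sys


def read_with_input(text):
    """Read `text` one line at a time with input(), like the loop in the question."""
    old_stdin = sys.stdin
    sys.stdin = io.StringIO(text)  # pretend `text` was typed at the keyboard
    lines = []
    try:
        while True:
            lines.append(input())
    except EOFError:
        pass  # input() raises EOFError once the text is exhausted
    finally:
        sys.stdin = old_stdin
    return lines


sample = "for\nif\nelse\n"
lines = read_with_input(sample)

# Newline characters that survive input(), i.e. what a lexer could ever see.
newlines_seen = sum(line.count("\n") for line in lines)

print(lines)
print("newlines in the typed text:", sample.count("\n"))
print("newlines left after input():", newlines_seen)
```

None of the strings returned by input() contains a newline, so a rule such as t_newline has nothing to match when it is given those strings one at a time.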

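The normal usage of lex.lex is to build one lexer and hand it the whole input at once, rather than one line at a time.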

Also, you are printing

... + str(tok.lineno)

rather than

... + str(lexer.lineno)

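After the last token has been produced, lexer.token() returns None, so you can expect tok to be None when your loop terminates, and it is therefore an error to try to read its lineno attribute. (In your case that only happens when the line you just tried to tokenize is empty, because you only use the first token of each line. That is also where your traceback comes from: the check has to be on tok itself, not on tok.type.) What you want is the line count recorded in the lexer object, which is the count you update in your t_newline action.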
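To make those two points concrete without requiring ply to be installed, here is a minimal sketch built around a stand-in class. MiniLexer is hypothetical and is not ply's lexer; it only imitates the two behaviours discussed here, namely that token() returns None at the end of the input and that the line count lives on the lexer object.

```python
import re


class MiniLexer:
    """Hypothetical stand-in for a ply lexer (not ply itself)."""

    _pattern = re.compile(
        r"(?P<NEWLINE>\n+)|(?P<ID>[A-Za-z_][A-Za-z0-9_]*)|(?P<OTHER>.)"
    )

    def __init__(self):
        self.lineno = 1  # ply lexers also start counting at 1
        self._matches = iter(())

    def input(self, text):
        self._matches = self._pattern.finditer(text)

    def token(self):
        for match in self._matches:
            if match.lastgroup == "NEWLINE":
                self.lineno += len(match.group())  # same bookkeeping as t_newline
            elif match.lastgroup == "ID":
                return match.group()
            # any other character is silently skipped
        return None  # input exhausted


lexer = MiniLexer()

# An empty line, as returned by input() after a bare Enter.
lexer.input("")
empty_tok = lexer.token()
if empty_tok is not None:  # test the token itself, not empty_tok.type
    print("got", empty_tok)

# The whole text at once: drain tokens until None, then read lexer.lineno.
lexer.input("for\nif\n\nelse\n")
words = []
while True:
    tok = lexer.token()
    if tok is None:
        break
    words.append(tok)

print(words)
print("lexer.lineno:", lexer.lineno)
```

Because the count starts at 1, lexer.lineno ends up one higher than the number of newline characters seen. Note that tok is None by the time the loop has finished, which is why the final report has to use lexer.lineno.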

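If you want to process an entire file (which is the usual case for a parser, as opposed to a line-by-line calculator), you need to read the whole contents of the file (or of standard input, as the case may be). For non-interactive use, you would normally do that with the file object's read method. If you want to test your lexer, you can take advantage of the fact that the lexer object returned by lex.lex() implements Python's iteration protocol, so it works in a for statement. Your main loop would then look something like this: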

import sys
lexer.input(sys.stdin.read())
for tok in lexer:
  pass  # Update counts here

You can then terminate the input by typing the end-of-file character at the start of a line (Control-D on Linux, or Control-Z followed by Enter on Windows).
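If you would rather point the program at a file than type into standard input, something along these lines might do. This is only a sketch: read_source is a hypothetical helper, the UTF-8 encoding is an assumption, and the temporary file merely stands in for a real C source file.

```python
import os
import sys
import tempfile


def read_source(path=None):
    """Return the full text of `path`, or of standard input if no path is given."""
    if path is None:
        return sys.stdin.read()
    with open(path, encoding="utf-8") as source:  # encoding is an assumption
        return source.read()


# Demonstration with a throwaway file standing in for a real source file.
with tempfile.NamedTemporaryFile(
    "w", suffix=".c", delete=False, encoding="utf-8"
) as handle:
    handle.write("for\nif\nelse\n")

text = read_source(handle.name)
os.remove(handle.name)

print(repr(text))
print("newline characters:", text.count("\n"))
```

The string returned here still contains every newline, so it is what you would pass to lexer.input(...) in place of sys.stdin.read().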

Personally, I would implement the token-type counting with a defaultdict:

from collections import defaultdict
counts = defaultdict(int)
for tok in lexer:
  counts[tok.type] += 1
# print() already appends a newline, so the format strings do not need one
for tok_type, count in counts.items():
  print("Number of %s's: %d" % (tok_type, count))
# Or: print('\n'.join("Number of %s's: %d" % (tok_type, count) for tok_type, count in counts.items()))
print("Number of lines: %d" % lexer.lineno)
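One thing to be aware of with this approach is that a token type that never occurs has no entry in counts, so it would not be printed at all, whereas the output in the question shows a line such as "Number of While´s: 0". A possible way around that, sketched below, is to iterate over the reserved table instead and let the defaultdict supply the zeros. The token_types list is a made-up stand-in for what the lexer would yield for the sample input.

```python
from collections import defaultdict

# Same keyword table as in the question.
reserved = {
    'if': 'IF',
    'for': 'FOR',
    'while': 'WHILE',
    'else': 'ELSE',
    'switch': 'SWITCH',
}

# Stand-in for the token types produced for "for if else switch exit".
token_types = ['FOR', 'IF', 'ELSE', 'SWITCH', 'ID']

counts = defaultdict(int)
for tok_type in token_types:
    counts[tok_type] += 1

# Looking up a missing key in a defaultdict(int) yields 0,
# so keywords that never appeared are still reported.
report = [
    "Number of %s's: %d" % (word, counts[tok_type])
    for word, tok_type in reserved.items()
]
print("\n".join(report))
```

With the real lexer you would fill counts from `for tok in lexer:` exactly as above and keep the reporting loop unchanged.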

