Using PrettyTable to print out details of Python files in directory
I am trying to write a FileAnalyzer class that searches a directory for Python files and reports details of each one in the form of a PrettyTable. I am interested in the number of classes, functions, lines, and characters in each Python file.
I'm still getting the hang of OOP... here is the code I have so far:
class FileAnalyzer:
    def __init__(self, directory: str) -> None:
        """
        The files_summary attribute stores the summarized data for each Python file in the specified directory.
        """
        self.directory: str = os.listdir(directory) #Directory to be scanned
        self.analyze_files() # summarize the python files data
        self.files_summary: Dict[str, Dict[str, int]] = {
            dir: {
                'Number of Classes': cls,
                'Number of Functions': funccount,
                'Number of Lines of Code': codelines,
                'Number of Characters': characters
            }
        }

    def analyze_files(self) -> None:
        """
        This method scans a directory for python files. For every python file, it determines the number of classes,
        functions, lines of code, and characters. The count for each one is returned in a tuple.
        """
        for dir in self.directory:
            if dir.endswith('.py'): # Check for python files
                with open(dir, "r") as pyfile:
                    cls = 0 # Initialize classes count
                    for line in pyfile:
                        if line.startswith('Class'):
                            cls += 1
                    funccount = 0 # Initialize function count
                    for line in pyfile:
                        if line.startswith('def'):
                            funccount += 1
                    #Get number of lines of code
                    i = -1 #Account for empty files
                    for i, line in enumerate(pyfile):
                        pass
                    codelines = i + 1
                    #Get number of characters
                    characters = 0
                    characters += sum(len(line) for line in pyfile)
        return [cls, funccount, codelines, characters]

    def pretty_print(self) -> None:
        """
        This method creates a table with the desired counts from the Python files using the PrettyTable module.
        """
        pt: PrettyTable = PrettyTable(field_names=['# of Classes', '# of Functions', '# Lines of Code (Excluding Comments)',
                                                   '# of characters in file (Including Comments)'])
        for cls, funccount, codelines, characters in self.files_summary():
            pt.add_row([cls, funccount, codelines, characters])
        print(pt)

FileAnalyzer('/path/to/directory/withpythonfiles')
When I try to run the code, I currently get the error NameError: name 'cls' is not defined. Is calling self.analyze_files() inside __init__ not enough to pass the returned values to __init__? Ideally, for a Python file containing
def func1():
    pass
def func2():
    pass
class Foo:
    def __init__(self):
        pass
class Bar:
    def __init__(self):
        pass
if __name__ == "__main__":
    main()
PrettyTable would tell me that there are 2 classes, 4 functions, 25 lines, and 270 characters. For the following file:
definitely not function
This is def not a function def
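PrettyTable would tell me that the file has 0 functions. I would like self.analyze_files() to populate self.files_summary with the summarized data without passing any other arguments to analyze_files(), and likewise to pass the data from files_summary to pretty_print without passing a separate argument to pretty_print.
Edit: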
self.files_summary: Dict[str, Dict[str, int]] = {
    dir: {
        'Number of Classes': self.analyze_files()[0],
        'Number of Functions': self.analyze_files()[1],
        'Number of Lines of Code': self.analyze_files()[2],
        'Number of Characters': self.analyze_files()[3]
    }
}
suppresses the error, but
for self.analyze_files()[0], self.analyze_files()[1], self.analyze_files()[2], self.analyze_files()[3] in self.files_summary():
    pt.add_row([self.analyze_files()[0], self.analyze_files()[1], self.analyze_files()[2], self.analyze_files()[3]])
return pt
in pretty_print does nothing when I call FileAnalyzer...
This question is a bit broad, so it is hard to give a concise answer. You say in a comment:
If I do something like [cls, funccount, codelines, characters] = self.analyze_files()
within init, doesn't seem to reference the returned values properly either
While a little odd stylistically, that is actually perfectly valid syntax. If your __init__ method looks like this, it runs without error:
def __init__(self, directory: str) -> None:
    """
    The files_summary attribute stores the summarized data for each Python file in the specified directory.
    """
    self.directory: str = os.listdir(directory) #Directory to be scanned
    [cls, funccount, codelines, characters] = self.analyze_files()
    self.files_summary: Dict[str, Dict[str, int]] = {
        dir: {
            'Number of Classes': cls,
            'Number of Functions': funccount,
            'Number of Lines of Code': codelines,
            'Number of Characters': characters
        }
    }
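As a side illustration of why the bare self.analyze_files() call led to a NameError while the assignment form does not, here is a minimal sketch. The Demo class and its method names are hypothetical and exist only to show that names assigned inside one method are local to it, so the caller has to bind the return value itself:

```python
class Demo:
    """Hypothetical class, used only to illustrate scoping."""

    def compute(self):
        count = 3    # local to compute(); not visible in other methods
        total = 10
        return [count, total]

    def without_assignment(self):
        self.compute()    # the returned list is discarded
        try:
            return count  # no such name in this scope, so NameError is raised
        except NameError as exc:
            return str(exc)

    def with_assignment(self):
        # Unpack the returned list into names local to this method
        count, total = self.compute()
        return count, total


demo = Demo()
print(demo.without_assignment())
print(demo.with_assignment())
```

The first call reports the same kind of NameError as in the question; the second returns the values because they were explicitly bound.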
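However, there are a number of problems. First, in the __init__ method above you use the variable name dir, but there is no such variable in scope. Unfortunately, dir is also the name of a Python built-in function. If you insert a breakpoint() after this code and print the value of self.files_summary, you will see that it looks like this: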
{<built-in function dir>: {'Number of Classes': 0, 'Number of Functions': 0, 'Number of Lines of Code': 0, 'Number of Characters': 0}}
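In general, never pick variable names that shadow Python built-ins, because doing so leads to unexpected and hard-to-debug problems. If you use a decent editor with Python syntax highlighting, you will see these built-ins called out, which helps you avoid this mistake.
I think that instead of dir you meant self.directory (or just directory, since that variable is in scope at that point). Note that self.directory holds the list returned by os.listdir(), and a list cannot be used as a dictionary key, so the plain directory string is the one that would work here.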
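A small sketch of how the built-in ends up as the dictionary key. The path string and the single count are placeholders, not values from a real run:

```python
import builtins

# No variable named dir has been defined here, so the name resolves to the built-in.
summary = {dir: {'Number of Classes': 0}}
key = next(iter(summary))
print(key is builtins.dir)  # the key is the built-in function, not a path

# A descriptive name that does not shadow a built-in (placeholder path) avoids this.
directory = '/path/to/directory/withpythonfiles'
summary = {directory: {'Number of Classes': 0}}
print(list(summary))
```

Python raises no error in the first case because functions are hashable and therefore valid dictionary keys, which is why the mistake is easy to miss.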
But there is another problem.
In your pretty_print method, you are trying to call self.files_summary, like this:
for cls, funccount, codelines, characters in self.files_summary():
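But self.files_summary is not a function and is not callable. It is a dictionary, which also means that using it in a for loop like this does not make much sense. Because of the way you set it up in __init__, it will only ever have a single key.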
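To make the difference concrete, here is a minimal sketch with a hypothetical one-file summary (the file name and counts are invented). Calling the dictionary fails, while .items() gives one (key, value) pair per file that can be turned into a table row:

```python
# Hypothetical summary in the shape the question describes
files_summary = {
    'example.py': {
        'Number of Classes': 2,
        'Number of Functions': 4,
    }
}

try:
    files_summary()  # a dict cannot be called like a function
except TypeError as exc:
    error = str(exc)
print(error)

# Iterate over (filename, counts) pairs instead of calling the dict
rows = []
for filename, counts in files_summary.items():
    rows.append([filename, counts['Number of Classes'], counts['Number of Functions']])
print(rows)
```

Iterating over the dictionary directly would yield only its keys, so unpacking four values from each item, as pretty_print attempts, could not work even without the call.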
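If I were you, I would break this program down into separate pieces and get each piece working before trying to put them together. Make good use of the interactive Python prompt and the debugger; use breakpoint() statements in your code to inspect the contents of variables before you use them.
If I were to rewrite your code, I might do it like this: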
import os
import re

from prettytable import PrettyTable

# Require a name after the keyword so that text such as "definitely"
# or "classify" is not counted as a definition.
re_class = re.compile(r'class\s+\w+')
re_def = re.compile(r'\s*def\s+\w+\s*\(')


class FileAnalyzer:
    def __init__(self, path: str) -> None:
        self.path = path
        self.analyze_files()

    def analyze_files(self) -> None:
        # One (filename, classes, functions, lines, characters) tuple per file
        self.files = []

        for entry in os.listdir(self.path):
            if not entry.endswith('.py'):
                continue

            # Join with the directory so this also works when path is not '.'
            with open(os.path.join(self.path, entry), "r") as pyfile:
                cls = 0
                funccount = 0
                codelines = 0
                characters = 0

                # A single pass over the file gathers all four counts;
                # every physical line is counted, comments and blanks included.
                for line in pyfile:
                    codelines += 1
                    characters += len(line)

                    if re_class.match(line):
                        cls += 1
                    elif re_def.match(line):
                        funccount += 1

            self.files.append((entry, cls, funccount, codelines, characters))

    def pretty_print(self) -> None:
        pt: PrettyTable = PrettyTable(
            field_names=['Filename',
                         '# of Classes', '# of Functions',
                         '# Lines of Code (Excluding Comments)',
                         '# of characters in file (Including Comments)'])

        for path, cls, funccount, codelines, characters in self.files:
            pt.add_row([path, cls, funccount, codelines, characters])

        print(pt)


x = FileAnalyzer('.')
x.pretty_print()
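Note that I removed the multiple for loops in your analyze_files function; there is no reason to iterate over each file more than once (and after the first loop the file object is exhausted, so the later loops would not see any lines). This builds an instance variable named files, which is a list of results. The pretty_print method simply iterates over this list.
If I run the FileAnalyzer code above in my Python scratch directory, I get: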
+--------------------------+--------------+----------------+--------------------------------------+----------------------------------------------+
|         Filename         | # of Classes | # of Functions | # Lines of Code (Excluding Comments) | # of characters in file (Including Comments) |
+--------------------------+--------------+----------------+--------------------------------------+----------------------------------------------+
|       yamltest.py        |      0       |       0        |                  30                  |                     605                      |
|       analyzer.py        |      1       |       3        |                  53                  |                     1467                     |
|         quake.py         |      0       |       0        |                  37                  |                     1035                     |
| test_compute_examples.py |      1       |       1        |                  10                  |                     264                      |
|   compute_examples.py    |      1       |       1        |                  4                   |                      82                      |
+--------------------------+--------------+----------------+--------------------------------------+----------------------------------------------+
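On the point about looping over each file only once: a file object is an iterator, so once a for loop has consumed it, a second loop over the same object sees nothing unless the file is rewound. A minimal sketch, using io.StringIO as a stand-in for an open file (an assumption made so the example needs no file on disk):

```python
import io

# Stand-in for an open Python source file
pyfile = io.StringIO("class Foo:\n    def method(self):\n        pass\n")

first_pass = sum(1 for line in pyfile)    # consumes all three lines
second_pass = sum(1 for line in pyfile)   # nothing left to read
pyfile.seek(0)                            # rewind to the start
after_rewind = sum(1 for line in pyfile)  # the lines are available again

print(first_pass, second_pass, after_rewind)
```

This is why, in the original analyze_files, only the first loop could ever count anything; gathering every count in one pass avoids the issue without needing seek(0).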