How to output all unassigned strings in a Python file

Question

I have a Python file (a script) that looks like this:

script.py

"""
Multiline comment with unique
text pertaining to the Foo class
"""
class Foo():
    pass


"""
Multiline comment with unique
text pertaining to the Bar class
"""
class Bar():
    pass


"""
Multiline comment with unique
text pertaining to the FooBar class
"""
class FooBar():
    pass


def print_comments():
    raise NotImplementedError

Is there some way for print_comments to detect and output all of the unassigned strings, so that I see this:

Multiline comment with unique text pertaining to the Foo class

Multiline comment with unique text pertaining to the Bar class

Multiline comment with unique text pertaining to the FooBar class

Answer 1

Assuming the format you showed in your question, something like this should do it:

import os

class Show_Script():
    def construct(self):
        with open(os.path.abspath(__file__)) as f:
            my_lines = f.readlines()

        comments = []
        in_comment = 0

        for line in my_lines:
            # detected the start of a comment
            if line.strip().startswith('"""') and in_comment == 0:
                in_comment = 1
                comments.append('')
            # detected the end of a comment
            elif line.strip().endswith('"""') and in_comment == 1:
                in_comment = 0
            # the contents of a comment
            elif in_comment == 1:
                comments[-1] += line

        print('\n'.join(comments))

Answer 2

Using a regular expression:

$ cat script.py
from __future__ import print_function
import sys, re

"""
Multiline comment with unique
text pertaining to the Foo class
"""
class Foo():
    pass


"""
Multiline comment with unique
text pertaining to the Bar class
"""
class Bar():
    pass


"""
Multiline comment with unique
text pertaining to the FooBar class
"""
class FooBar():
    pass

def print_comments():
    with open(sys.argv[0]) as f:
        file_contents = f.read()

    # Loop explicitly: map() is lazy in Python 3 and would print nothing
    for comment in re.findall(r'"""\n([^"""]*)"""', file_contents, re.S):
        print(comment)

print_comments()
$ python script.py
Multiline comment with unique
text pertaining to the Foo class

Multiline comment with unique
text pertaining to the Bar class

Multiline comment with unique
text pertaining to the FooBar class

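Regular expression explanation:

"""\n([^"""]*)"""

The pattern matches an opening """ followed by a newline, captures everything up to the next double quote, and then requires a closing """. Note that [^"""] is a character class, so it is equivalent to [^"]: a comment that itself contains a double quote will not be matched.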
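To see what the pattern above does and does not match, here is a small sketch. The sample strings and the variable names (plain, quoted, non_greedy) are made up for illustration, and the non-greedy variant is an alternative I am suggesting, not part of the original answer.

```python
import re

# Hypothetical sample inputs: one plain comment, one containing double quotes
plain = '"""\nabout Foo\n"""\nclass Foo():\n    pass\n'
quoted = '"""\nsays "hi" to Foo\n"""\nclass Foo():\n    pass\n'

original = r'"""\n([^"""]*)"""'   # pattern from the answer
simplified = r'"""\n([^"]*)"""'   # same character class written once
non_greedy = r'"""\n(.*?)"""'     # alternative: lazy match, needs re.S

# Both spellings of the character class behave identically
print(re.findall(original, plain))
print(re.findall(simplified, plain))

# A double quote inside the comment defeats the character-class pattern
print(re.findall(original, quoted))

# The lazy variant stops only at the closing triple quote
print(re.findall(non_greedy, quoted, re.S))
```

The lazy variant relies on re.S so that . also matches newlines; the character-class pattern does not need that flag, because a negated class already matches newlines.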

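The ideal way to do this would be to use the ast module: parse the whole document, then print the result of calling ast.get_docstring on every node of type ast.FunctionDef, ast.ClassDef, or ast.Module. However, your comments are not docstrings. If the file were like this instead: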

$ cat script.py

import sys, re, ast

class Foo():
    """
    Multiline comment with unique
    text pertaining to the Foo class
    """
    pass


class Bar():
    """
    Multiline comment with unique
    text pertaining to the Bar class
    """
    pass


class FooBar():
    """
    Multiline comment with unique
    text pertaining to the FooBar class
    """
    pass

def print_docstrings():
    with open(sys.argv[0]) as f:
        file_contents = f.read()

    tree = ast.parse(file_contents)
    class_nodes = filter((lambda x: type(x) in [ast.ClassDef, ast.FunctionDef, ast.Module]), ast.walk(tree))
    for node in class_nodes:
        doc_str = ast.get_docstring(node)
        if doc_str:
            print(doc_str)

print_docstrings()

$ python script.py
Multiline comment with unique
text pertaining to the Foo class
Multiline comment with unique
text pertaining to the Bar class
Multiline comment with unique
text pertaining to the FooBar class
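The ast module can also be pointed at the original layout, where the strings sit above the classes instead of inside them. The following is a minimal sketch, not part of the answers above: it assumes Python 3.8 or later (where string literals parse as ast.Constant), and the function name unassigned_strings and the sample source are hypothetical. It collects every string literal that appears as a bare expression statement, which also includes real docstrings.

```python
import ast

def unassigned_strings(source):
    """Return the stripped text of every string used as a bare statement."""
    tree = ast.parse(source)
    found = []
    for node in ast.walk(tree):
        # ast.Expr is a statement whose value is computed and then discarded
        if (isinstance(node, ast.Expr)
                and isinstance(node.value, ast.Constant)
                and isinstance(node.value.value, str)):
            found.append((node.lineno, node.value.value.strip()))
    # ast.walk promises no particular order, so sort by line number
    return [text for _, text in sorted(found)]

# Hypothetical sample in the same layout as the question's script.py
sample = '''
"""
Multiline comment with unique
text pertaining to the Foo class
"""
class Foo():
    pass

"""
Multiline comment with unique
text pertaining to the Bar class
"""
class Bar():
    pass

name = "assigned, so not reported"
'''

for text in unassigned_strings(sample):
    print(text)
    print()
```

To run this against the script itself, the sample could be replaced by the contents of the file read with open(__file__). Unlike the regex, this approach does not depend on how the strings are quoted or laid out.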
