Extract positions of bold words with Python

Question

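I want to extract the positions of the bold words detected in a .docx file.

To do this, I used the docx library, which successfully detects words in bold. However, extracting only the words is not very useful, because the same word can appear elsewhere with different formatting.

For example: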

Suppose my file.docx contains: "My cat is not a normal cat" (only the first "cat" is bold)

from docx import Document

document = Document('/path/to/file.docx')

def bold(document):
    # Collect the text of every bold run across all paragraphs
    Listbolds = []
    for para in document.paragraphs:
        for run in para.runs:
            if run.bold:
                print(run.text)
                Listbolds.append(run.text)
    return Listbolds

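This function gives me the word "cat" as output. However, if I use that output to filter words out of my text, the second "cat", which is not bold, gets removed as well.

Any idea how to get only the position of the word? For example, getting 2 as the word position.

Thanks, everyone!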

Answer 1

I don't have the docx library installed, so this is only from reading your code, but maybe you could change it to return a list of booleans?

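from docx import Document

document = Document('/path/to/file.docx')

def get_bold_list(para):
    # One entry per run, in document order.
    # run.bold is True, False, or None (None means the setting is inherited).
    bold_list = []
    for run in para.runs:
        bold_list.append(run.bold)
    return bold_list

for para in document.paragraphs:
    bold_list = get_bold_list(para)
    # do something with bold_list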
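
Going one step further, the per-run flags could be turned into word positions. The following is only a sketch, not tested against a real .docx file: bold_word_positions is a hypothetical helper, it takes plain (text, is_bold) pairs so it runs without the docx library, and it assumes that words are separated by whitespace and that no word is split across two runs.

```python
def bold_word_positions(runs):
    """Return the 1-based positions of words that belong to bold runs.

    `runs` is an iterable of (text, is_bold) pairs, for example
    [(run.text, run.bold) for run in para.runs] with python-docx.
    """
    positions = []
    index = 0  # number of words seen so far in the paragraph
    for text, is_bold in runs:
        for _ in text.split():
            index += 1
            # None (inherited formatting) is treated as not bold here
            if is_bold:
                positions.append(index)
    return positions


# The example sentence from the question, with only the first "cat" in bold
runs = [("My ", False), ("cat", True), (" is not a normal cat", None)]
print(bold_word_positions(runs))  # [2]
```

Because the counter keeps running across runs, the second "cat" gets position 7 and is not reported. If Word splits a single word over several runs, that word would be counted more than once, so the positions would need adjusting in that case.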
