Python script to substitute text in HTML file with user-supplied values

Question

I have an HTML file that looks like this:

...
<!-- Special_ID -->
<p> stuff1 </p>
<p> stuff2 </p>
<!-- /Special_ID -->
...

I have an INI file:

[general]
param=stuff1
 stuff2

If the user edits the file and changes the param value to test, I want the HTML file to become:

...
<!-- Special_ID -->
<p> test </p>
<!-- /Special_ID -->
...

Currently, I parse the INI file (with Python's ConfigParser) and then turn the section ("general") and the option ("param") into a start and a stop special ID, as in the examples above.

Then:

while we haven't found the start id:
    just write a line to some temporary file

write our start id to the temp file
write out new value ("test") to the temp file # surround with <p>

loop through original file until we find the stop id
then write the stop id and the rest of the file to temp

replace original file with tmp file
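
The steps above can be sketched in runnable form as follows. This is a minimal sketch rather than the asker's actual code: it assumes Python 3, assumes each marker comment sits on its own line, and the function name replace_between_markers and its parameters are hypothetical.

```python
import os
import tempfile
from html import escape


def replace_between_markers(path, start_id, stop_id, values):
    """Rewrite `path`, replacing the lines between the two marker comments.

    Hypothetical helper: `values` is a list of strings, one per new <p>.
    """
    start_marker = "<!-- %s -->" % start_id
    stop_marker = "<!-- %s -->" % stop_id

    # Create the temporary file next to the original so the final
    # replace stays on the same filesystem.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")

    skipping = False  # True while we are inside the old marked block
    with os.fdopen(fd, "w") as tmp, open(path) as src:
        for line in src:
            stripped = line.strip()
            if not skipping and stripped == start_marker:
                tmp.write(line)
                # Write one <p> per new value, escaping HTML metacharacters.
                for value in values:
                    tmp.write("<p> %s </p>\n" % escape(value))
                skipping = True
            elif skipping and stripped == stop_marker:
                tmp.write(line)
                skipping = False
            elif not skipping:
                tmp.write(line)

    # Swap the files only after the temporary file is fully written.
    os.replace(tmp_path, path)
```

With a hypothetical file name, a call would look like replace_between_markers("page.html", "Special_ID", "/Special_ID", ["test"]). If the stop marker is missing, everything after the start marker is dropped, so a real script would probably want to check for that before replacing the file.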

Is there a smarter way?

Perhaps a Python module already does this.

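I'm also not particularly fond of requiring the special ID comments, but I'm not using a web framework (this is just a simple app), so I can't just do something fancy like <p py:for ...>... as in TurboGears.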

Answer 1

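I'm not sure about your current approach overall, but here is how you can remove all p elements after a specific comment and insert a new p element, using the BeautifulSoup HTML parser. The idea is:

find the Special_ID comment in the HTML
iterate over all the p sibling elements that follow it
remove each p element with .extract()
insert a new p element after the comment with .insert_after()

Working code: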

from bs4 import BeautifulSoup, Comment

data = """
<!-- Special_ID -->
<p> stuff1 </p>
<p> stuff2 </p>
<!-- /Special_ID -->
"""
soup = BeautifulSoup(data, "html.parser")

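# find the "Special_ID" comment (string= is the current name of the older text= argument)
special_id = soup.find(string=lambda text: isinstance(text, Comment) and "Special_ID" in text)

# remove all following sibling "p" elements
# (this is not limited to the ones before the closing comment)
for p in special_id.find_next_siblings("p"):
    p.extract()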

# create new "p" element
tag = soup.new_tag("p")
tag.string = "test"

# insert the new "p" element after the comment
special_id.insert_after(tag)

print(soup.prettify())

Prints:

<!-- Special_ID -->
<p>
 test
</p>
<!-- /Special_ID -->
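
The working code above removes every following p sibling, including any that come after the closing comment, and it inserts a single hard-coded value. A possible variation is sketched below under a few assumptions: Python 3 (configparser rather than the Python 2 ConfigParser module), a marker name passed in explicitly because the mapping from section and option to the special ID is not specified in the question, and one p element per line of the INI value. The helper names is_comment and replace_block are hypothetical, and the extra "untouched" paragraph in the sample HTML is only there to show the boundary.

```python
import configparser

from bs4 import BeautifulSoup, Comment

INI_TEXT = """
[general]
param=stuff1
 stuff2
"""

HTML_TEXT = """
<!-- Special_ID -->
<p> old1 </p>
<p> old2 </p>
<!-- /Special_ID -->
<p> untouched </p>
"""


def is_comment(node, name):
    """True if `node` is an HTML comment whose text is exactly `name`."""
    return isinstance(node, Comment) and node.strip() == name


def replace_block(soup, marker, values):
    """Replace everything between <!-- marker --> and <!-- /marker -->."""
    start = soup.find(string=lambda s: is_comment(s, marker))
    if start is None:
        raise ValueError("start comment not found: %s" % marker)

    # Collect the siblings up to the stop comment first and remove them
    # afterwards, so the tree is not modified while it is being walked.
    to_remove = []
    for node in start.next_siblings:
        if is_comment(node, "/" + marker):
            break
        to_remove.append(node)
    else:
        raise ValueError("stop comment not found: /%s" % marker)
    for node in to_remove:
        node.extract()

    # Insert the new paragraphs in order, each one after the previous.
    anchor = start
    for value in values:
        tag = soup.new_tag("p")
        tag.string = value
        anchor.insert_after(tag)
        anchor = tag


# A multi-line INI value comes back as one string with embedded newlines.
config = configparser.ConfigParser()
config.read_string(INI_TEXT)
values = [v for v in config.get("general", "param").splitlines() if v]

soup = BeautifulSoup(HTML_TEXT, "html.parser")
replace_block(soup, "Special_ID", values)
print(soup.prettify())
```

Stopping at the closing comment keeps unrelated paragraphs later in the document intact, and raising when a marker is missing avoids silently rewriting the wrong part of the file.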

Tags: html, python, substitution, python-2.7