Python substitute with numbering inside patterns

I'm trying to come up with a Python script that automatically numbers the footnotes in pandoc Markdown. Given input like this:

This is a testing document for testing purposes only.[^0] This is a testing document for testing purposes only. This is a testing document for testing purposes only.[^121][^5] This is a testing document for testing purposes only.

[^0]: Footnote contents.

[^0]: Footnote contents.

[^0]: Footnote contents.

it should produce output like this:

This is a testing document for testing purposes only.[^1] This is a testing document for testing purposes only. This is a testing document for testing purposes only.[^2][^3] This is a testing document for testing purposes only.

[^1]: Footnote contents.

[^2]: Footnote contents.

[^3]: Footnote contents.

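I've got the basic functionality working, but I still can't work out how to handle two footnotes on the same line. Maybe the loop shouldn't be line-based? Or should I go for some kind of nested loop that replaces the nth occurrence of the pattern (which, from what I understand from this question, is not trivial)?

Since I'm trying to learn as much as I can from this, feel free to leave any comments or pointers for further improvement. Thanks!

Here is my script so far: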

import re
from sys import argv

script, first = argv

i=0
n=0
buff = ''

# open the file specified through the first argument
f = open(first, 'r')

for line in f:
    if re.search(r'\[\^[0-9]+\]:', line):
        i += 1
        line2 = re.sub(r'\[\^[0-9]+\]:', '[^' + str(i) + ']:', line)
        buff += line2

    elif re.search(r'\[\^[0-9]+\]', line):
        n += 1
        line3 = re.sub(r'\[\^[0-9]+\]', '[^' + str(n) + ']', line)
        buff += line3

    else:
        buff += line

print(buff)

f.close()
import re

my_text = """This is a testing document for testing purposes only.[^0] This is a testing document for testing purposes only. This is a testing document for testing purposes only.[^121][^5] This is a testing document for testing purposes only.

[^0]: Footnote contents.

[^0]: Footnote contents.

[^0]: Footnote contents."""


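# Count every footnote marker, references and definitions together.
num_notes = len(re.findall(r"\[\^\d+\]", my_text))
i = 0

def do_sub(m):
    # Called once per match, in document order.
    global i
    i += 1
    # The first half of the matches are references, the second half are
    # definitions, so the numbering restarts at 1 for the second half.
    return "[^%d]" % (i if i <= num_notes // 2 else i - num_notes // 2)

print(re.sub(r"\[\^\d+\]", do_sub, my_text))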

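I think this will do what you want. Note that it assumes all references come before the definitions and that there are equally many of each.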
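If you'd rather not rely on the two halves being the same size, here is a minimal sketch of a variant that keeps two separate counters and tells definitions apart from references by the trailing colon. The function name `renumber_footnotes` and the two pattern names are just illustrative, and it assumes a definition always starts a line as `[^N]:` while a reference is never immediately followed by a colon.

```python
import re
from itertools import count

# Assumption: a definition is "[^N]:" at the start of a line;
# any other "[^N]" not directly followed by ":" is a reference.
DEFINITION = re.compile(r"^\[\^\d+\]:", re.MULTILINE)
REFERENCE = re.compile(r"\[\^\d+\](?!:)")

def renumber_footnotes(text):
    """Renumber references and definitions independently, each from 1."""
    ref_numbers = count(1)
    def_numbers = count(1)
    # re.sub calls the function once per match, in document order,
    # so several footnotes on one line each get their own number.
    text = REFERENCE.sub(lambda m: "[^%d]" % next(ref_numbers), text)
    text = DEFINITION.sub(lambda m: "[^%d]:" % next(def_numbers), text)
    return text
```

Because the substitution runs over the whole text rather than line by line, the two-footnotes-on-one-line case needs no special handling. To use it on a file, read the whole file into a string first and pass that string to `renumber_footnotes`.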