re.sub changes the direction of the replaced Persian/Arabic content

Here is my code:

import re

# Raw string, so regex escapes such as \s and \| are not read as string escapes.
CUSTOMIZED_SUB_PATTERN = r"\{{\{{(?:\s)*{tag_key}(?:\s)*\|(?:\s)*([^|}}]+)(?:\s)*\}}\}}"
pattern = re.compile(CUSTOMIZED_SUB_PATTERN.format(tag_key='name'))
title = "عزیز {{ name | default value 1}} سلام"
re.sub(pattern, "محمد", title)

Output:

'عزیز محمد سلام'

But what I want is:

'سلام محمد عزیز'

As you can see, the direction of the sentence has been changed by the replacement.

Question: how can I fix this?

This answer is not correct.

>>> x = 'walk down street'
>>> x.split(' ')
['walk', 'down', 'street']
>>> x.split(' ')[::-1]
['street', 'down', 'walk']
>>> ' '.join(x.split(' ')[::-1])
'street down walk'

Hope this helps!
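To see what the word reversal above actually does to the question's string, here is a small check that uses only the standard library. It is a sketch, not a fix: it assumes the pattern from the question with `tag_key` already filled in as `name`, and it compares the stored (logical) word order after `re.sub` with the order after reversing. How either string is drawn on screen still depends on the terminal or widget that displays it.

```python
import re

# Pattern from the question, written as a raw string with tag_key='name' filled in.
PATTERN = re.compile(r"\{\{\s*name\s*\|\s*([^|}]+)\s*\}\}")

title = "عزیز {{ name | default value 1}} سلام"
replaced = PATTERN.sub("محمد", title)

# Stored (logical) order of the words after the substitution.
logical_words = replaced.split()

# Reversing the list changes the stored order itself, not just the display.
reversed_words = logical_words[::-1]

print(logical_words)
print(reversed_words)
```

The first list keeps the words in the same stored order as the template, so `re.sub` itself does not reorder anything; the second list is a different string altogether.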

With these modules you can correct the shape and direction of your text. Just install them with pip and use them.

# install: pip install --upgrade arabic-reshaper
import arabic_reshaper

# install: pip install python-bidi
from bidi.algorithm import get_display

text = "ذهب الطالب الى المدرسة"
reshaped_text = arabic_reshaper.reshape(text)    # correct its shape
bidi_text = get_display(reshaped_text)           # correct its direction

You can use the bidi and arabic_reshaper libraries to reshape and reorder RTL text accordingly.

The get_display() function has a special option, base_dir, which takes 'L' or 'R' and overrides the calculated base level.
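The base level that base_dir overrides is normally derived from the text itself. As a rough illustration, assuming the first-strong-character rule of the Unicode Bidirectional Algorithm, the default direction could be guessed as below. This is a simplified sketch using only the standard library, not python-bidi's actual code, and `guess_base_dir` is a hypothetical name.

```python
import unicodedata


def guess_base_dir(text):
    """Guess 'R' or 'L' from the first strongly directional character.

    Hypothetical helper for illustration; it ignores isolates and other
    details of the full algorithm.
    """
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in ("R", "AL"):  # Hebrew-like or Arabic-letter classes
            return "R"
        if bidi_class == "L":  # Latin and other left-to-right letters
            return "L"
    # No strong character (only digits, spaces, punctuation): assume LTR.
    return "L"


print(guess_base_dir("عزیز {{ name | default value 1}} سلام"))
print(guess_base_dir("name عزیز"))
```

Because the template in the question starts with an Arabic-script letter, a rule like this picks a right-to-left base direction for it, which is what passing base_dir='L' would override.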

You can try:

import re
import arabic_reshaper
from bidi.algorithm import get_display

title = "عزیز {{ name | default value 1}} سلام"
substr = "محمد"
reshaped_text = arabic_reshaper.reshape(title)
# base_dir='L' makes the text run left to right; by default the direction is RTL for Arabic text.
new_title = get_display(reshaped_text, base_dir='L')
reshaped_text2 = arabic_reshaper.reshape(substr)
new_substr = get_display(reshaped_text2, base_dir='L')

# Raw string, so regex escapes such as \s and \| are not read as string escapes.
CUSTOMIZED_SUB_PATTERN = r"\{{\{{(?:\s)*{tag_key}(?:\s)*\|(?:\s)*([^|}}]+)(?:\s)*\}}\}}"
pattern = re.compile(CUSTOMIZED_SUB_PATTERN.format(tag_key='name'))
print(re.sub(pattern, new_substr, new_title))
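A possible variation, offered as a sketch rather than as the recommended approach: do the substitution first on the logical string, then reshape and reorder the result once. The helper name `fill_then_display` is hypothetical, the pattern is the one from the question with `tag_key` filled in as `name`, and the `try`/`except` fallback is only there so the snippet still runs where arabic_reshaper and python-bidi are not installed.

```python
import re

try:
    import arabic_reshaper
    from bidi.algorithm import get_display
except ImportError:
    # Libraries missing: the helper below returns the logical string unchanged.
    arabic_reshaper = None
    get_display = None

# Pattern from the question, as a raw string with tag_key='name' filled in.
PATTERN = re.compile(r"\{\{\s*name\s*\|\s*([^|}]+)\s*\}\}")


def fill_then_display(title, value):
    """Substitute in logical order, then convert to display order once."""
    logical = PATTERN.sub(value, title)
    if arabic_reshaper is None or get_display is None:
        return logical
    return get_display(arabic_reshaper.reshape(logical))


result = fill_then_display("عزیز {{ name | default value 1}} سلام", "محمد")
print(result)
```

Substituting before reordering keeps the regular expression working on the text as it was typed, so the placeholder is matched regardless of how the display step rearranges characters.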

You can find a sample run of the above implementation here.

from reportlab.lib.units import mm
from reportlab.lib.pagesizes import A4
from reportlab.pdfgen import canvas
from reportlab.pdfbase import pdfmetrics
from reportlab.pdfbase.ttfonts import TTFont
import arabic_reshaper
from bidi.algorithm import get_display

pdfmetrics.registerFont(TTFont('Scheherazade', 'Janna LT Bold.ttf'))

page = canvas.Canvas("test.pdf", pagesize=A4)
page.setFont('Scheherazade', 12)
text_to_be_reshaped = 'اللغة العربية رائعة'
reshaped_text = arabic_reshaper.reshape(text_to_be_reshaped)
bidi_text = get_display(reshaped_text)
page.drawString(10*mm, 267*mm, bidi_text) 
page.showPage()
page.save()