re.sub changes the direction of replaced Persian/Arabic content
Here is my code:
import re
# Raw string avoids invalid-escape warnings; doubled braces become single braces after str.format().
CUSTOMIZED_SUB_PATTERN = r"\{{\{{\s*{tag_key}\s*\|\s*([^|}}]+)\s*\}}\}}"
pattern = re.compile(CUSTOMIZED_SUB_PATTERN.format(tag_key='name'))
title = "عزیز {{ name | default value 1}} سلام"
re.sub(pattern, "محمد", title)
Output:
'عزیز محمد سلام'
But what I want is:
'سلام محمد عزیز'
So you can see that the direction of the sentence has been changed by the substitution.
Question:
How can I fix this?
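One way to check whether the substitution itself reorders anything is to inspect the result in logical (storage) order instead of relying on how a terminal renders it. The sketch below uses only the standard library; the hard-coded `name` key and the variable names are illustrative assumptions. It prints each word with its position, code points, and Unicode bidi class.

```python
import re
import unicodedata

# Same placeholder syntax as in the question, with the key "name" written out.
PATTERN = re.compile(r"\{\{\s*name\s*\|\s*([^|}]+)\s*\}\}")

title = "عزیز {{ name | default value 1}} سلام"
result = PATTERN.sub("محمد", title)

# Words in logical (storage) order, independent of on-screen rendering.
words = result.split()
for index, word in enumerate(words):
    code_points = [f"U+{ord(ch):04X}" for ch in word]
    # unicodedata.bidirectional() returns the bidi class, e.g. 'AL' for Arabic letters.
    bidi_classes = sorted({unicodedata.bidirectional(ch) for ch in word})
    print(index, code_points, bidi_classes)
```

If the first and last words keep the positions they had in `title`, the stored string was not reordered, and the difference seen on screen comes from how the display lays out right-to-left text.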
This answer is incorrect.
>>> x = 'walk down street'
>>> x.split(' ')
['walk', 'down', 'street']
>>> x.split(' ')[::-1]
['street', 'down', 'walk']
>>> ' '.join(x.split(' ')[::-1])
'street down walk'
Hope this helps!
With these modules you can correct the shape and direction of your text. Just install them with pip and use them.
# install: pip install --upgrade arabic-reshaper
import arabic_reshaper
# install: pip install python-bidi
from bidi.algorithm import get_display
text = "ذهب الطالب الى المدرسة"
reshaped_text = arabic_reshaper.reshape(text) # correct its shape
bidi_text = get_display(reshaped_text) # correct its direction
You can use the bidi and arabic_reshaper libraries to reshape and replace the RTL text accordingly.
The get_display() function has a special option, base_dir, which takes 'L' or 'R' and overrides the calculated base level.
You can try:
import re
import arabic_reshaper
from bidi.algorithm import get_display
title = "عزیز {{ name | default value 1}} سلام"
substr = "محمد"
reshaped_text = arabic_reshaper.reshape(title)
# base_dir='L' forces a left-to-right base direction; by default it is RTL for Arabic text.
new_title = get_display(reshaped_text, base_dir='L')
reshaped_text2 = arabic_reshaper.reshape(substr)
new_substr = get_display(reshaped_text2, base_dir = 'L')
# Raw string avoids invalid-escape warnings; doubled braces become single braces after str.format().
CUSTOMIZED_SUB_PATTERN = r"\{{\{{\s*{tag_key}\s*\|\s*([^|}}]+)\s*\}}\}}"
pattern = re.compile(CUSTOMIZED_SUB_PATTERN.format(tag_key='name'))
print(re.sub(pattern, new_substr, new_title))
You can find a sample run of the above implementation here.
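A variation on the snippet above is to substitute first, while the string is still in logical order, and convert for display once at the end. The following is only a sketch under assumptions: `fill_template`, `PLACEHOLDER`, and the `display` hook are hypothetical names, not part of any library, and the code uses only the standard library so that it runs without the bidi packages installed.

```python
import re
from typing import Callable, Mapping, Optional

# Same placeholder syntax as above: {{ key | default value }}
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\|\s*([^|}]+?)\s*\}\}")


def fill_template(
    template: str,
    values: Mapping[str, str],
    display: Optional[Callable[[str], str]] = None,
) -> str:
    """Fill placeholders in logical order, then optionally convert for display."""

    def replace(match: re.Match) -> str:
        key, default = match.group(1), match.group(2)
        # Fall back to the default written in the template when no value is given.
        return values.get(key, default)

    filled = PLACEHOLDER.sub(replace, template)
    # `display` is a hook for a reshape + bidi step; without it the
    # string is returned unchanged, still in logical order.
    return display(filled) if display is not None else filled


title = "عزیز {{ name | default value 1}} سلام"
print(fill_template(title, {"name": "محمد"}))
print(fill_template(title, {}))
```

With arabic_reshaper and python-bidi installed, the hook could be `lambda s: get_display(arabic_reshaper.reshape(s))`, which reuses the two calls shown in the answers above. Whether that step is needed depends on whether the output target handles right-to-left text by itself.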
from reportlab.lib.units import mm
from reportlab.lib.pagesizes import A4
from reportlab.pdfgen import canvas
from reportlab.pdfbase import pdfmetrics
from reportlab.pdfbase.ttfonts import TTFont
import arabic_reshaper
from bidi.algorithm import get_display
pdfmetrics.registerFont(TTFont('Scheherazade', 'Janna LT Bold.ttf'))
page = canvas.Canvas("test.pdf", pagesize=A4)
page.setFont('Scheherazade', 12)
text_to_be_reshaped = 'اللغة العربية رائعة'
reshaped_text = arabic_reshaper.reshape(text_to_be_reshaped)
bidi_text = get_display(reshaped_text)
page.drawString(10*mm, 267*mm, bidi_text)
page.showPage()
page.save()