Highlighting substrings in print statements

Question

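I have been trying to write a short script that runs directly in a Jupyter notebook. It simply steps through the texts in a pandas DataFrame (400 words on average) and asks the user for a label.

I am struggling to find an elegant way to highlight every occurrence of the substring 'eu' in the text being printed.

In another thread I found the printmd function, which I use to highlight the "eu" substring. However, this only works for the first occurrence, and it also breaks the lines.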

import sys
from IPython.display import clear_output
from IPython.display import Markdown, display

def printmd(string):
    display(Markdown(string))
printmd('**bold**')

labels = []

for i in range(0,len(SampleDf)):

    clear_output() # clear the output before displaying another article
    print(SampleDf.loc[i]['article_title'])

    lc = SampleDf.loc[i]['article_body'].lower() # lowercase copy, so the search ignores case
    pos = lc.find('eu') # index of the first 'eu' only (-1 if absent)

    print(SampleDf.loc[i]['article_body'][:pos])
    printmd('**eu**')

    print(SampleDf.loc[i]['article_body'][pos+2:])

    var = input("press y if the text is irrelevant")

    if var == 'y':
        label = 0   # 0 for trash
    else:
        label = 1   # 1 for relevant

    labels.append(label)

I would like to get rid of the line breaks introduced by the separate print statements, and to highlight every mention of "eu".

Answer 1

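This looks like a string-processing problem rather than an output problem. If I understand your requirement correctly, a simple call to replace handles it. Build the whole Markdown string first, then display it with a single printmd call, which also avoids the extra line breaks: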

new_text = old_text.replace("eu", "**eu**")
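Note that str.replace is case-sensitive, so it would skip "EU" or "Eu". If those should be highlighted too, a regular expression is one option. The following is a minimal sketch, not necessarily what the asker needs: the helper name highlight and the sample sentence are made up for illustration.

```python
import re

def highlight(text, term="eu"):
    """Wrap every occurrence of `term` (ignoring case) in Markdown bold markers."""
    # re.escape keeps the term literal even if it contains regex metacharacters.
    pattern = re.compile(re.escape(term), flags=re.IGNORECASE)
    # The lambda re-inserts the matched text itself, so the original casing survives.
    return pattern.sub(lambda m: f"**{m.group(0)}**", text)

sample = "The EU and the eu are both mentioned; Europe too."
print(highlight(sample))
# The **EU** and the **eu** are both mentioned; **Eu**rope too.
```

In the notebook, the result would be passed to printmd instead of print, for example printmd(highlight(SampleDf.loc[i]['article_body'])). As the sample shows, this also matches "eu" inside longer words such as "Europe"; if only the standalone token is wanted, the pattern could be wrapped in \b word boundaries.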

If you still need your single-token approach, suppressing the line break is easy; use print's end parameter for that:

print('**eu**', end='')
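As a rough illustration of the end parameter, the sketch below prints the three pieces of a text on one line. The helper print_inline and the sample sentence are hypothetical. Keep in mind that plain print does not render Markdown, so the asterisks appear literally here; bold rendering still requires display(Markdown(...)).

```python
def print_inline(parts):
    """Print each piece without a trailing newline, then end the line once."""
    for part in parts:
        print(part, end="")
    print()

body = "Talks with the eu continue."
pos = body.lower().find("eu")  # first occurrence only, as in the question
print_inline([body[:pos], "**eu**", body[pos + 2:]])
# Talks with the **eu** continue.
```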

python

regex

ipython

jupyter-notebook