Replace a sequence of words in a string with another string in Python

I can't find any way to solve this (the replace() method doesn't work).

I have a sentence like this:

sentence_noSlots = "Albania compared to other CountriesThe Internet users of Albania is similar to that of Poland , Portugal , Russia , Macedonia , Saudi Arabia , Argentina , Greece , Dominica , Azerbaijan , Italy with a respective Internet users of 62.8 , 62.1 , 61.4 , 61.2 , 60.5 , 59.9 , 59.9 , 59.0 , 58.7 , 58.5 -LRB- per 100 people -RRB- and a global rank of 62 , 63 , 64 , 65 , 66 , 68 , 69 , 70 , 71 , 72.10 years growthAlbania 's Internet users had a positive growth of 5,910 -LRB- % -RRB- in the last 10 years from -LRB- 2003 to 2013 -RRB- ."

Then I have strings like these:

extracted_country = Saudi Arabia 
extracted_value = 58.5

I need to replace Saudi Arabia in the string with <location>empty</location>, and 58.5 with <number>empty</number>. My current approach is:

sentence_noSlots.replace(str(extracted_country),"<location>empty</location>")
sentence_noSlots.replace(str(extracted_value),"<number>empty</number>")

But because Saudi Arabia is two words, a simple word replacement doesn't work. Tokenizing first and then replacing doesn't work either, for the same reason:

sentenceTokens = sentence_noSlots.split()
for i, token in enumerate(sentenceTokens):
    if token == extracted_country:
        sentenceTokens[i] = "<location>empty</location>"
    if token == extracted_value:
        sentenceTokens[i] = "<number>empty</number>"
sentence_noSlots = " ".join(sentenceTokens)

How can I achieve what I'm after?

string.replace() does not work in place: strings in Python are immutable, so it returns a new string.

From the Python docs:

string.replace(s, old, new[, maxreplace]) Return a copy of string s with all occurrences of substring old replaced by new. If the optional argument maxreplace is given, the first maxreplace occurrences are replaced.
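As a minimal sketch of what that means in practice (the shortened sentence below is my own stand-in for illustration, not the full text from the question): calling replace() without keeping its return value leaves the original string untouched, while assigning the result gives you the modified copy. Note that a multi-word substring is replaced just like a single word.

```python
# Shortened stand-in sentence (an assumption for illustration only).
sentence = "Macedonia , Saudi Arabia , Argentina"

# The return value is discarded here, so `sentence` itself does not change.
sentence.replace("Saudi Arabia", "<location>empty</location>")
unchanged = sentence

# Assigning the return value keeps the modified copy.
replaced = sentence.replace("Saudi Arabia", "<location>empty</location>")

print(unchanged)
print(replaced)
```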

Do this:

>>> sentence_noSlots = "Albania compared to other CountriesThe Internet users of Albania is similar to that of Poland , Portugal , Russia , Macedonia , Saudi Arabia , Argentina , Greece , Dominica , Azerbaijan , Italy with a respective Internet users of 62.8 , 62.1 , 61.4 , 61.2 , 60.5 , 59.9 , 59.9 , 59.0 , 58.7 , 58.5 -LRB- per 100 people -RRB- and a global rank of 62 , 63 , 64 , 65 , 66 , 68 , 69 , 70 , 71 , 72.10 years growthAlbania 's Internet users had a positive growth of 5,910 -LRB- % -RRB- in the last 10 years from -LRB- 2003 to 2013 -RRB- ."
>>> 
>>> extracted_country = "Saudi Arabia"
>>> extracted_value = 58.5
>>> s = sentence_noSlots.replace(str(extracted_country),"<location>empty</location>").replace(str(extracted_value),"<number>empty</number>")
>>> s
"Albania compared to other CountriesThe Internet users of Albania is similar to that of Poland , Portugal , Russia , Macedonia , <location>empty</location> , Argentina , Greece , Dominica , Azerbaijan , Italy with a respective Internet users of 62.8 , 62.1 , 61.4 , 61.2 , 60.5 , 59.9 , 59.9 , 59.0 , 58.7 , <number>empty</number> -LRB- per 100 people -RRB- and a global rank of 62 , 63 , 64 , 65 , 66 , 68 , 69 , 70 , 71 , 72.10 years growthAlbania 's Internet users had a positive growth of 5,910 -LRB- % -RRB- in the last 10 years from -LRB- 2003 to 2013 -RRB- ."

I assume you meant:

extracted_country = "Saudi Arabia"
extracted_value = "58.5"

Then the .replace method works as expected. Be careful, though: it does not modify the string in place; it returns a new, modified string. "sentence_noSlots" remains unchanged.

So, by chaining the two .replace calls, you can do it like this:

sentence_slots = sentence_noSlots.replace(str(extracted_country),"<location>empty</location>").replace(str(extracted_value),"<number>empty</number>")
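One caveat with plain substring replacement: it would also rewrite part of a longer token, for example turning 158.5 into 1<number>empty</number>. If that matters for your data, a possible variation is to match only whole whitespace-delimited tokens with the re module. This is only a sketch under that assumption; fill_slot and the shortened sample text are hypothetical names and data of mine, not something from the question.

```python
import re

def fill_slot(text, value, tag):
    """Replace whole whitespace-delimited occurrences of `value`
    with <tag>empty</tag>. Hypothetical helper for illustration."""
    # (?<!\S) and (?!\S) require whitespace (or the string edge) on both sides,
    # so "58.5" does not match inside "158.5".
    pattern = r"(?<!\S)" + re.escape(str(value)) + r"(?!\S)"
    replacement = "<{0}>empty</{0}>".format(tag)
    # A function replacement avoids any backslash processing of the tag text.
    return re.sub(pattern, lambda match: replacement, text)

# Shortened sample text (an assumption, not the full sentence from the question).
text = "users of 158.5 , 58.5 -LRB- per 100 people -RRB- in Saudi Arabia ."
result = fill_slot(fill_slot(text, "Saudi Arabia", "location"), 58.5, "number")
print(result)
```

Because the sentence in the question is already tokenized with spaces around punctuation, whitespace boundaries are enough here; text with punctuation attached to words would need a different boundary rule.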