Replace a sequence of words in a string with another string in Python
I can't find anything that solves this problem (the replace() method doesn't seem to work).
I have a sentence like this:
sentence_noSlots = "Albania compared to other CountriesThe Internet users of Albania is similar to that of Poland , Portugal , Russia , Macedonia , Saudi Arabia , Argentina , Greece , Dominica , Azerbaijan , Italy with a respective Internet users of 62.8 , 62.1 , 61.4 , 61.2 , 60.5 , 59.9 , 59.9 , 59.0 , 58.7 , 58.5 -LRB- per 100 people -RRB- and a global rank of 62 , 63 , 64 , 65 , 66 , 68 , 69 , 70 , 71 , 72.10 years growthAlbania 's Internet users had a positive growth of 5,910 -LRB- % -RRB- in the last 10 years from -LRB- 2003 to 2013 -RRB- ."
Then I have values like these:
extracted_country = Saudi Arabia
extracted_value = 58.5
I need to replace Saudi Arabia in the string with <location>empty</location>, and 58.5 with <number>empty</number>. My current approach is:
sentence_noSlots.replace(str(extracted_country),"<location>empty</location>")
sentence_noSlots.replace(str(extracted_value),"<number>empty</number>")
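But because Saudi Arabia is two words, a simple word replacement does not work. Tokenizing first and then replacing does not work either, for the same kind of reason: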
sentenceTokens = sentence_noSlots.split()
for i,token in enumerate(sentenceTokens):
    if token==extracted_country:
        sentenceTokens[i]="<location>empty</location>"
    if token==extracted_value:
        sentenceTokens[i]="<number>empty</number>"
sentence_noSlots = (" ").join(sentenceTokens)
How can I achieve what I want?
string.replace() does not work in place. Strings in Python are immutable.
From the Python docs:
string.replace(s, old, new[, maxreplace]) Return a copy of string s
with all occurrences of substring old replaced by new. If the optional
argument maxreplace is given, the first maxreplace occurrences are
replaced.
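A minimal sketch of what "return a copy" means in practice. The short sample string is made up for illustration; only the placeholder tag comes from the question.

```python
# str.replace returns a new string; the original object is left untouched.
original = "Saudi Arabia , Argentina"
result = original.replace("Saudi Arabia", "<location>empty</location>")

print(original)  # still the unmodified text
print(result)    # the copy with the replacement applied
```

Calling replace() without assigning its return value, as in the question, therefore discards the result.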
Do this:
>>> sentence_noSlots = "Albania compared to other CountriesThe Internet users of Albania is similar to that of Poland , Portugal , Russia , Macedonia , Saudi Arabia , Argentina , Greece , Dominica , Azerbaijan , Italy with a respective Internet users of 62.8 , 62.1 , 61.4 , 61.2 , 60.5 , 59.9 , 59.9 , 59.0 , 58.7 , 58.5 -LRB- per 100 people -RRB- and a global rank of 62 , 63 , 64 , 65 , 66 , 68 , 69 , 70 , 71 , 72.10 years growthAlbania 's Internet users had a positive growth of 5,910 -LRB- % -RRB- in the last 10 years from -LRB- 2003 to 2013 -RRB- ."
>>>
>>> extracted_country = "Saudi Arabia"
>>> extracted_value = 58.5
>>> s = sentence_noSlots.replace(str(extracted_country),"<location>empty</location>").replace(str(extracted_value),"<number>empty</number>")
>>> s
"Albania compared to other CountriesThe Internet users of Albania is similar to that of Poland , Portugal , Russia , Macedonia , <location>empty</location> , Argentina , Greece , Dominica , Azerbaijan , Italy with a respective Internet users of 62.8 , 62.1 , 61.4 , 61.2 , 60.5 , 59.9 , 59.9 , 59.0 , 58.7 , <number>empty</number> -LRB- per 100 people -RRB- and a global rank of 62 , 63 , 64 , 65 , 66 , 68 , 69 , 70 , 71 , 72.10 years growthAlbania 's Internet users had a positive growth of 5,910 -LRB- % -RRB- in the last 10 years from -LRB- 2003 to 2013 -RRB- ."
I assume you mean:
extracted_country = "Saudi Arabia"
extracted_value = "58.5"
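Then the .replace method works as expected. Be careful, though: it is not a modifier. It returns a new, modified string, and sentence_noSlots stays unchanged.
So, by chaining two .replace calls, you can do it like this: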
sentence_slots = sentence_noSlots.replace(str(extracted_country),"<location>empty</location>").replace(str(extracted_value),"<number>empty</number>")
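One caveat with plain .replace is that it matches substrings anywhere, so "58.5" would also be replaced inside a value such as "158.5". The sketch below is one possible way around that, not part of the answers above: fill_slot is a hypothetical helper name, and it assumes tokens are separated by whitespace, as they are in the sample sentence.

```python
import re

def fill_slot(text, target, placeholder):
    """Replace `target` only where it stands alone between whitespace
    (or at the string boundaries). `fill_slot` is a hypothetical helper."""
    # (?<!\S) / (?!\S): no non-whitespace character directly before / after.
    pattern = r"(?<!\S)" + re.escape(str(target)) + r"(?!\S)"
    # A function replacement keeps the placeholder text literal.
    return re.sub(pattern, lambda match: placeholder, text)

sample = "values 158.5 , 58.5 , 58.55 and Saudi Arabia , Saudi Arabian"
step1 = fill_slot(sample, 58.5, "<number>empty</number>")
step2 = fill_slot(step1, "Saudi Arabia", "<location>empty</location>")
print(step2)
```

Because the pattern is built from str(target), a float such as 58.5 and a multi-word name such as "Saudi Arabia" can be passed in the same way.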