Python 3: Split string under certain condition
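I am having trouble splitting a string into specific parts in Python 3.
The string is basically a list with a colon (:) as the delimiter.
Only when a colon (:) is prefixed with a backslash (\) does it not count
as a delimiter; it is then part of the list item.
Example: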
String --> I:would:like:to:find\:out:how:this\:works
Converted List --> ['I', 'would', 'like', 'to', 'find\:out', 'how', 'this\:works']
Any idea how to do this?
@Bertrand I tried to give you some code. I came up with a workaround, but it is probably not the prettiest solution.
# raw string, so the backslashes are kept literally
text = r"I:would:like:to:find\:out:how:this\:works"
values = text.split(":")
new = []
concat = False
temp = None
for element in values:
    # a pending element exists and this one also ends with \
    # keep concatenating before appending to the new list
    if element.endswith("\\") and temp is not None:
        temp = temp + ":" + element
        concat = True
    # when one element ends with \, hold it back
    elif element.endswith("\\"):
        temp = element
        concat = True
    # when the following element does not end with \
    # append and set concat to False and temp to None
    elif concat is True:
        new.append(temp + ":" + element)
        concat = False
        temp = None
    # append element to the new list
    else:
        new.append(element)
print(new)
Output (print shows the list's repr, so each backslash appears doubled):
['I', 'would', 'like', 'to', 'find\\:out', 'how', 'this\\:works']
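As an alternative to splitting first and re-joining afterwards, the same idea can be sketched as a single pass over the characters. This is only a sketch: the name split_scan and its parameters are made up for illustration. It assumes a backslash protects whatever character follows it, so an escaped backslash in front of a real delimiter (\\:) still splits.

```python
def split_scan(s, sep=":", esc="\\"):
    """Split s on sep, treating esc + any character as part of the item."""
    items, current = [], []
    i = 0
    while i < len(s):
        ch = s[i]
        if ch == esc and i + 1 < len(s):
            # keep the escape character together with the character it protects
            current.append(ch + s[i + 1])
            i += 2
        elif ch == sep:
            # unescaped separator: close the current item
            items.append("".join(current))
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    items.append("".join(current))
    return items


print(split_scan(r"I:would:like:to:find\:out:how:this\:works"))
```

The escape characters are kept in the items, matching the list asked for in the question.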
Use re.split.
For example:
import re
s = r"I:would:like:to:find\:out:how:this\:works"
# split on a colon only when the character before it is a word character
print(re.split(r"(?<=\w):", s))
Output:
['I', 'would', 'like', 'to', 'find\\:out', 'how', 'this\\:works']
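One thing to keep in mind with this pattern: it does not look for the backslash at all, it only requires a word character before the colon. A small sketch (the function name split_after_word_char is just for illustration) shows what that means for items ending in punctuation:

```python
import re


def split_after_word_char(s):
    # split on ":" only when the preceding character matches \w
    return re.split(r"(?<=\w):", s)


# the escaped colon is kept, because "\" is not a word character
print(split_after_word_char(r"find\:out:how"))
# the colon after "!" is NOT treated as a delimiter either
print(split_after_word_char("wow!:next"))
```

So this works for the sample string, but probably not for items that end with a non-word character.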
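You should use re.split with a negative lookbehind that checks for the backslash character.
import re
# (?<!\\) means: not preceded by a literal backslash
pattern = r'(?<!\\):'
s = r'I:would:like:to:find\:out:how:this\:works'
print(re.split(pattern, s))
Output:
['I', 'would', 'like', 'to', 'find\\:out', 'how', 'this\\:works']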
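If you need this for other delimiters as well, the lookbehind can be wrapped in a small helper. This is a sketch under assumptions: split_unescaped and unescape are hypothetical names, and the unescaping step is an optional extra that the question did not ask for.

```python
import re


def split_unescaped(s, sep=":"):
    # negative lookbehind: match sep only when no backslash precedes it
    pattern = r"(?<!\\)" + re.escape(sep)
    return re.split(pattern, s)


def unescape(item, sep=":"):
    # optional: turn "\:" back into a plain ":" inside an item
    return item.replace("\\" + sep, sep)


items = split_unescaped(r"I:would:like:to:find\:out:how:this\:works")
print(items)
print([unescape(i) for i in items])
```

Note that a lookbehind of one character cannot tell an escaped backslash from an escaping one, so an input like a\\:b would not be split by this helper.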
You can replace "\:" with a placeholder (just make sure it is something that does not occur anywhere else in the string; you could use a long token or similar), then split on ":" and replace the placeholder back.
[x.replace("$", "\\:") for x in str1.replace("\\:", "$").split(":")]
Explanation:
str1 = r'I:would:like:to:find\:out:how:this\:works'
Replace "\:" with "$" (or some other placeholder):
str1.replace("\\:", "$")
Out: 'I:would:like:to:find$out:how:this$works'
Now split on ":":
str1.replace("\\:", "$").split(":")
Out: ['I', 'would', 'like', 'to', 'find$out', 'how', 'this$works']
And replace "$" with "\:" again in every element:
[x.replace("$", "\\:") for x in str1.replace("\\:", "$").split(":")]
Out: ['I', 'would', 'like', 'to', 'find\\:out', 'how', 'this\\:works']
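The weak spot of this approach is the placeholder itself. A minimal sketch of a guarded version follows; the function name split_with_placeholder and the default NUL placeholder are assumptions for illustration, chosen because a NUL character is unlikely to appear in ordinary text.

```python
def split_with_placeholder(s, sep=":", placeholder="\x00"):
    # refuse to run if the placeholder could be confused with real data
    if placeholder in s:
        raise ValueError("placeholder already occurs in the input")
    escaped = "\\" + sep
    # hide escaped separators, split, then restore them in each part
    parts = s.replace(escaped, placeholder).split(sep)
    return [p.replace(placeholder, escaped) for p in parts]


print(split_with_placeholder(r"I:would:like:to:find\:out:how:this\:works"))
```

With "$" as the placeholder, an input such as a$b:c would be rejected instead of silently turning the "$" into "\:".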