Python: Find If Substring Exists in String Including Wildcard
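I want to create a function that takes two inputs, fullstring and substring. If the substring exists in the full string, the function should return True; otherwise it should return False. If the substring contains a wildcard (*), the wildcard can stand for any single character.
For example:
arg1: fullstring = "hitherehello"
arg2: substring = "the*e"
output: True
What I've tried:
Below is a function that matches a substring against a parent string, but I can't figure out how to incorporate the wildcard.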
def count_substring(string, sub_string):
    len_ss = len(sub_string)
    for i in range(len(string) - len_ss + 1):
        if string[i:i+len_ss] == sub_string:
            return True
    return False
Constraints:
I can't use regular expressions or built-in Python functions such as find.
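One way to read the requirement is as a small change to the function above: keep the outer loop over start positions, but compare character by character so that a wildcard in the pattern accepts anything. The sketch below is only an illustration under that reading; the name contains_with_wildcard and the wildcard parameter are hypothetical, and it uses nothing beyond len, range and indexing.

```python
def contains_with_wildcard(fullstring, substring, wildcard="*"):
    """Return True if substring occurs in fullstring.

    A wildcard character in substring matches exactly one arbitrary character.
    """
    n, m = len(fullstring), len(substring)
    # Try every start position at which the whole pattern still fits.
    for start in range(n - m + 1):
        matched = True
        for offset in range(m):
            pattern_char = substring[offset]
            # A wildcard accepts whatever character sits at this position.
            if pattern_char != wildcard and pattern_char != fullstring[start + offset]:
                matched = False
                break
        if matched:
            return True
    return False


print(contains_with_wildcard("hitherehello", "the*e"))  # True
print(contains_with_wildcard("hitherXhello", "the*e"))  # False
```

Because the wildcard always consumes exactly one character, the pattern has a fixed length, so a single pass over the start positions is enough and no backtracking is needed.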
You can look up each 'wildcard' part of the pattern in the string, return its position, and then check that the parts follow one another:
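def count_substring(string, sub_string, lastIndex):
    # Return the end index of the first occurrence of sub_string that ends
    # after lastIndex, or -1 if there is none.
    len_ss = len(sub_string)
    for i in range(len(string) - len_ss + 1):
        if string[i:i + len_ss] == sub_string and lastIndex < i + len_ss:
            return i + len_ss
    return -1


def count_wrapper(string, sub_string, index):
    parts = sub_string.split('*')
    positions = []
    for wild_sub_string in parts:
        index = count_substring(string, wild_sub_string, index)
        if index == -1:
            return False
        positions.append(index)
    # Check consecutive parts: each part must end exactly one wildcard
    # character plus its own length after the previous part ends.
    return all(prev + 1 + len(part) == end
               for prev, part, end in zip(positions, parts[1:], positions[1:]))


print(count_wrapper("there", "the*e", 0))
print(count_wrapper("thera", "the*e", 0))
print(count_wrapper("theresa", "the*e*a", 0))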
Output:
True
False
True
def search(fullstring, substring):
    def check(s1, s2):
        for a, b in zip(s1, s2):
            if a != b and b != "*":
                return False
        return True

    for i in range(len(fullstring) - len(substring) + 1):
        if check(fullstring[i : i + len(substring)], substring):
            return True
    return False


print(search("hitherehello", "the*e"))
Prints:
True
More tests:
print(search("hiXherehello", "*he*e")) # True
print(search("hitherXhello", "the*e")) # False
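The constraint rules out regular expressions in the solution itself, but they can still serve as an independent check in tests. The sketch below repeats search so the block runs on its own, then compares it against a regex in which each * becomes a dot, on random short strings. The helpers regex_oracle and random_mismatches are hypothetical names introduced only for this illustration.

```python
import random
import re


def search(fullstring, substring):
    """Wildcard-aware substring test, same idea as the function above."""
    def check(window, pattern):
        for a, b in zip(window, pattern):
            if a != b and b != "*":
                return False
        return True

    for i in range(len(fullstring) - len(substring) + 1):
        if check(fullstring[i:i + len(substring)], substring):
            return True
    return False


def regex_oracle(fullstring, substring):
    """Reference result: '*' becomes '.', every other character is literal."""
    pattern = "".join("." if ch == "*" else re.escape(ch) for ch in substring)
    return re.search(pattern, fullstring, flags=re.DOTALL) is not None


def random_mismatches(trials=2000, seed=0):
    """Return the (fullstring, substring) pairs on which the two disagree."""
    rng = random.Random(seed)
    mismatches = []
    for _ in range(trials):
        fullstring = "".join(rng.choice("ab") for _ in range(rng.randint(0, 8)))
        substring = "".join(rng.choice("ab*") for _ in range(rng.randint(0, 4)))
        if search(fullstring, substring) != regex_oracle(fullstring, substring):
            mismatches.append((fullstring, substring))
    return mismatches


print(random_mismatches())  # [] when both implementations agree
```

A two-letter alphabet and short strings keep matches frequent, so both the True and the False branches get exercised.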
This seems to work for me:
def check_all_substrings(index, string, substrings):
    for ss in substrings:
        if string[index:index+len(ss)] != ss:
            return False
        # If we matched, move the index along by the length of the substring + 1, so we skip a character
        index = index + len(ss) + 1
    return True


def match(string, substring):
    substrings = substring.split('*')
    # Try every start position where the whole pattern still fits,
    # including the last one (hence the + 1).
    for i in range(len(string) - len(substring) + 1):
        if check_all_substrings(i, string, substrings):
            return True
    return False


# match substring
assert match("hitherehello", "there")
# match substring with 1 wild card
assert match("hitherehello", "the*e")
# match substring with 2 wild cards
assert match("hithat", "h*t*a")
# wild cards should match exactly 1 character
assert not match("hitherrrrehello", "the*e")
# do not match invalid substrings
assert not match("hithat", "the*e")
assert not match("hithat", "there")