How to get the text before and after an image path in Python using regex?
Here are two scenarios.
Example 1
the Image Path is https://ictagrisindh.gov.pk/img/inauguration1.jpg the detail goes here
and the url was this and Click here to view the detail goes here
Example 2
https://ictagrisindh.gov.pk/img/inauguration1.jpg the detail goes here
Click here to view screenshot the detail goes here
My code is as follows:
import re
str_text = "the Image Path is https://ictagrisindh.gov.pk/img/inauguration1.jpg the detail goes here and the url was this and Click here to view the detail goes here"
urls = re.findall(r'http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+', str_text)
print("Urls: ",":".join(urls))
Result:
https://ictagrisindh.gov.pk/img/inauguration1.jpg
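I want to extract the text before the image path and the text after it, wherever the image path appears in the string.
Any help would be appreciated. Thanks in advance.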
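import re

# The two example strings from the question (the two halves are joined without a space).
e1 = 'the Image Path is https://ictagrisindh.gov.pk/img/inauguration1.jpg the detail goes here' + \
     'and the url was this and Click here to view the detail goes here'
e2 = 'https://ictagrisindh.gov.pk/img/inauguration1.jpg the detail goes here' + \
     'Click here to view screenshot the detail goes here'

# Raw strings, with the dot before "jpg" escaped so it matches a literal "." only.
start_pattern = r'(^.+)(?=http.+\.jpg)'   # text before the URL (requires at least one character)
image_url_pattern = r'(http.+\.jpg)'      # the URL itself, from "http" through ".jpg"
end_pattern = r'(?:^.+\.jpg)(.+$)'        # text after the URL

# Each pattern has one capturing group, so findall returns a list of that group's matches.
start = re.findall(start_pattern, e1)
url = re.findall(image_url_pattern, e1)
end = re.findall(end_pattern, e1)

print(f'start: {start}')
print(f'url: {url}')
print(f'end: {end}')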
Output for example 1 (e1):
start: ['the Image Path is ']
url: ['https://ictagrisindh.gov.pk/img/inauguration1.jpg']
end: [' the detail goes hereand the url was this and Click here to view the detail goes here']
Output for example 2 (the same code run with e2 in place of e1):
start: []
url: ['https://ictagrisindh.gov.pk/img/inauguration1.jpg']
end: [' the detail goes hereClick here to view screenshot the detail goes here']
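The three separate patterns above can also be folded into one pattern with named groups, so a single search yields all three pieces. The following is only a sketch under a few assumptions: the helper name `split_around_image_url` and the constant `IMAGE_URL_RE` are made up for illustration, the URL is assumed to contain no whitespace and to end in `.jpg`, and the line break in the question's examples is treated as a single space.

```python
import re

# Hypothetical helper, not part of the original answer.
# One pattern with three named groups:
#   start - everything before the URL (lazy .*? so it may be empty)
#   url   - non-whitespace run starting with http:// or https:// and ending in .jpg
#   end   - everything after the URL
IMAGE_URL_RE = re.compile(
    r'^(?P<start>.*?)(?P<url>https?://\S+?\.jpg)(?P<end>.*)$',
    re.IGNORECASE | re.DOTALL,
)


def split_around_image_url(text):
    """Return (start, url, end) for the first .jpg URL in text, or None if there is none."""
    match = IMAGE_URL_RE.search(text)
    if match is None:
        return None
    return match.group('start'), match.group('url'), match.group('end')


# Assumption: the two halves of each example are joined with a space.
e1 = ('the Image Path is https://ictagrisindh.gov.pk/img/inauguration1.jpg the detail goes here '
      'and the url was this and Click here to view the detail goes here')
e2 = ('https://ictagrisindh.gov.pk/img/inauguration1.jpg the detail goes here '
      'Click here to view screenshot the detail goes here')

for text in (e1, e2):
    start, url, end = split_around_image_url(text)
    print(f'start: {start!r}')
    print(f'url: {url!r}')
    print(f'end: {end!r}')
```

Because `start` uses `.*?` rather than `.+`, example 2 gives an empty string for `start` instead of no match at all. Matching the URL with `\S+?` keeps it from running across spaces into a later `.jpg` elsewhere in the text, which the greedy `http.+\.jpg` would do. Other extensions such as `.png` would need the pattern widened, for instance to `\.(?:jpg|png)`.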