Explanation of the use of this command (Beautiful Soup and re.compile)

Question

I have the following command, and for teaching purposes I am trying to break it down and explain it.

images = bs.find_all('img', {'src':re.compile('.jpg')})

The full code of the function is:

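import re
from urllib.request import urlopen

from bs4 import BeautifulSoup

def imagescrape():
    result_images = []
    html = urlopen('https://en.wikipedia.org/wiki/Rembrandt')
    bs = BeautifulSoup(html, 'html.parser')
    # keep only <img> tags whose src attribute matches the regular expression
    images = bs.find_all('img', {'src': re.compile('.jpg')})
    for image in images:
        result_images.append("https:" + image['src'] + '\n')  # concatenation!
    return result_images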

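Of course, we know that images is the variable that stores the result of the command. I also understand bs.find_all (bs is the object that was created, and .find_all is the method used to find all instances of the 'img' tag).

The find_all method takes two arguments here. The first is 'img' (presumably this could be any string; please correct me if I am wrong). The second argument appears to be a dictionary, and this is where I get a bit lost: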

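{'src':re.compile('.jpg')}

From my research, I have learned the following:

"src" is the key in the dictionary, and the next part:

re.compile('.jpg')

is the value part of the dictionary.

But why use a dictionary at all?

Also, and more importantly, what is re.compile('.jpg') actually doing in this specific case? What does it return, and why is it used inside a dictionary? I need an explanation that is very beginner-friendly for students, so please break it down.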

Answer 1

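Beautiful Soup answers this in its own documentation: https://www.crummy.com/software/BeautifulSoup/bs4/doc/#a-regular-expression

It boils down to find_all accepting a regular expression as a filter: when you pass a compiled pattern, Beautiful Soup tests each candidate value against it using the pattern's search() method.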
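
To make the "what does it return?" part concrete, here is a minimal sketch using only the standard re module. The src strings are made up for illustration and are not taken from the Wikipedia page; the list comprehension only imitates the kind of filtering find_all does with the pattern.

```python
import re

# Compile the pattern from the question once; the result is a reusable pattern object.
pattern = re.compile('.jpg')

# Hypothetical src values, invented for this example.
srcs = [
    '//upload.example.org/rembrandt.jpg',
    '//upload.example.org/logo.png',
    '//upload.example.org/thumb/rembrandt.jpg/220px-rembrandt.jpg',
]

# search() returns a match object if the pattern occurs anywhere in the string,
# and None otherwise, so it can be used directly as a yes/no test.
matching = [s for s in srcs if pattern.search(s)]
print(matching)
```

Only the two strings containing "jpg" preceded by some character survive the filter; the .png entry is dropped.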

Answer 2

You do not have to use a dictionary.
You can use any of the following options.

images = bs.find_all('img', {'src':re.compile('.jpg')})
is equivalent to the following, which uses a named argument.
images = bs.find_all('img', attrs={'src': re.compile('.jpg')})

images = bs.find_all('img', **{'src':re.compile('.jpg')})
is the same as the following.
images = bs.find_all('img', src=re.compile('.jpg'))
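
As a rough check of that equivalence, the sketch below runs all four call styles against a small hand-written HTML snippet. The snippet and the example.org URLs are assumptions standing in for the real Wikipedia page, and it requires the bs4 package to be installed.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical HTML, invented for illustration only.
html = '''
<img src="//example.org/a.jpg">
<img src="//example.org/b.png">
<a href="//example.org/c.jpg">link</a>
'''
bs = BeautifulSoup(html, 'html.parser')
pattern = re.compile('.jpg')

positional = bs.find_all('img', {'src': pattern})        # dict as 2nd positional argument
named = bs.find_all('img', attrs={'src': pattern})       # same dict, passed by name
unpacked = bs.find_all('img', **{'src': pattern})        # dict unpacked into keyword arguments
keyword = bs.find_all('img', src=pattern)                # plain keyword argument

# Collect the src values found by each call style for comparison.
results = [[img['src'] for img in found]
           for found in (positional, named, unpacked, keyword)]
for srcs in results:
    print(srcs)
```

Each call should select the same single img tag; the a tag is ignored because the first argument restricts the search to img tags, and the .png image is ignored because its src does not match the pattern.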

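The expression re.compile('.jpg') returns a compiled regular-expression object that can be reused as many times as you like. This is the modular way of working with regular expressions.
Keep in mind that some characters have a special meaning in regular expressions. In this case, that applies to the character ".". I think you should use '\.jpg' (ideally written as a raw string, r'\.jpg'). Otherwise the "." matches any character, so the pattern means "any character followed by jpg" rather than a literal ".jpg".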
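
A small sketch of the difference, using only the re module. The two src strings are hypothetical and chosen so that the unescaped pattern gives a false positive.

```python
import re

unescaped = re.compile('.jpg')    # "." matches any single character
escaped = re.compile(r'\.jpg')    # "\." matches only a literal dot

# Hypothetical src values, invented for this example.
real_jpg = '//example.org/images/portrait.jpg'
tricky_png = '//example.org/images/notajpg.png'

for src in (real_jpg, tricky_png):
    # bool() turns the match object (or None) into True/False for printing.
    print(src, bool(unescaped.search(src)), bool(escaped.search(src)))
```

The unescaped pattern accepts both strings, because "ajpg" inside "notajpg.png" counts as "any character followed by jpg". The escaped pattern accepts only the string that really contains ".jpg".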
