Get a picture - Python-pptx

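I am trying to read a .pptx file using python-pptx. I managed to get all of the content out of the presentation except the images. Below is the code I use to identify images, as opposed to text frames, in the presentation. When a shape is identified this way, I get its auto_shape_type as RECTANGLE (1), but nothing related to the image itself.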

from pptx import Presentation
from pptx.shapes.picture import Picture

def read_ppt(file):
    prs = Presentation(file)
    for slide_no, slide in enumerate(prs.slides):
        for shape in slide.shapes:
            if not shape.has_text_frame:
                print(shape.auto_shape_type)

Any help in understanding this problem is appreciated. Alternative approaches are welcome too.

Try querying shape.shape_type. By default, auto_shape_type returns RECTANGLE, as you have seen, although a picture can also be inserted into, and masked by, other shapes.

Note the default value for a newly-inserted picture is MSO_AUTO_SHAPE_TYPE.RECTANGLE, which performs no cropping because the extents of the rectangle exactly correspond to the extents of the picture.

For a picture, shape_type should return:

Unique integer identifying the type of this shape, unconditionally MSO_SHAPE_TYPE.PICTURE in this case.

You can extract the image content to a file by using its blob property and writing out the binary data:

from pptx import Presentation

pres = Presentation('ppt_image.pptx')
slide = pres.slides[0]
shape = slide.shapes[0]  # assumes the first shape on the slide is a picture
image = shape.image
blob = image.blob  # raw bytes of the image file
ext = image.ext  # file extension, e.g. 'png'
with open(f'image.{ext}', 'wb') as file:
    file.write(blob)