Is it possible to write an image to a CSV file?

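Hi everyone, this is my first post here. I'd like to know how to write the image files I scraped from a website into a CSV file or, if that isn't possible with CSV, how to write the header, description, time info and image to something like a Word file. Here is the code. Everything works; I just want to know how to write the images I download to disk into a CSV or Word file. Thanks for your help.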

import csv
import requests
from bs4 import BeautifulSoup
site_link = requests.get("websitenamehere").text
soup = BeautifulSoup(site_link,"lxml")

read_file = open("blogger.csv","w",encoding="UTF-8",newline="")  # newline="" is what the csv docs recommend for files passed to csv.writer
csv_writer = csv.writer(read_file)
csv_writer.writerow(["Header","links","Publish Time"])
counter = 0

for article in soup.find_all("article"):
    ###Counting lines
    counter += 1
    print(counter)

    #Article Headers
    headers = article.find("a")["title"]
    print(headers)

    #### Links
    links = article.find("a")["href"]
    print(links)

    #### Publish time
    publish_time = article.find("div",class_="mkdf-post-info-date entry-date published updated")
    publish_time = publish_time.a.text.strip()
    print(publish_time)

    ###image links
    images = article.find("img",class_="attachment-full size-full wp-post-image nitro-lazy")["nitro-lazy-src"]
    print(images)

    ###Download Article Pictures to disk
    pic_name = f"{counter}.jpg"
    with open(pic_name, 'wb') as handle:
        response = requests.get(images, stream=True)
        for block in response.iter_content(1024):
            handle.write(block)

    ###CSV Rows
    csv_writer.writerow([headers, links, publish_time])
    print()

read_file.close()

You can basically convert the image to Base64 and write it to the file as needed:

import base64

with open("image.png", "rb") as image_file:
    # Read the raw bytes and encode them as Base64
    encoded_string = base64.b64encode(image_file.read())
    print(encoded_string.decode('utf-8'))

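A CSV file should contain only text fields. Even though the csv module does its best to quote fields so that almost any character can appear in them, including the delimiter or a newline, it cannot handle the NULL characters that may be present in an image file.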

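This means that if you want to store an image in a CSV file, you have to encode the image bytes. Base64 is a well-known format that the Python standard library supports natively. So you could change your code to: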

import base64
...

    ###Download Article Pictures
    response = requests.get(images, stream=True)
    image = b''.join(block for block in response.iter_content(1024))  # raw image bytes
    image = base64.b64encode(image).decode("ascii")    # Base64-encoded text (str, not bytes)

    ###CSV Rows
    csv_writer.writerow([headers, links, publish_time, image])

Just keep in mind that the image has to be decoded before it can be used...
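
For the decoding side, here is a minimal round-trip sketch, assuming the image was stored as Base64 text in a fourth column as above. The helper names `encode_image` and `decode_image`, the sample row values and the `raw_image` bytes are made up for illustration (it is not a real image), and an in-memory `io.StringIO` stands in for `blogger.csv`. One thing to watch for: the csv module rejects fields longer than 131072 characters by default when reading, so the limit probably needs to be raised with `csv.field_size_limit` for anything but tiny images.

```python
import base64
import csv
import io

# Hypothetical stand-in for the downloaded image bytes (not a real image).
# It contains a NUL byte and is large enough to exceed the default field limit.
raw_image = b"\x89PNG\r\n\x1a\n\x00fake image payload" * 10000


def encode_image(data: bytes) -> str:
    """Turn raw bytes into Base64 text that is safe to store in a CSV field."""
    return base64.b64encode(data).decode("ascii")


def decode_image(text: str) -> bytes:
    """Turn the Base64 text read from the CSV back into the raw bytes."""
    return base64.b64decode(text)


# Write a header row and one data row; StringIO stands in for the real file.
buffer = io.StringIO(newline="")
writer = csv.writer(buffer)
writer.writerow(["Header", "links", "Publish Time", "Image"])
writer.writerow(["Some title", "https://example.com/post", "2020-01-01",
                 encode_image(raw_image)])

# Raise the reader's field size limit (default 131072 characters) to 10 MiB.
csv.field_size_limit(10 * 1024 * 1024)

# Read the rows back and decode the image column.
buffer.seek(0)
rows = list(csv.reader(buffer))
restored = decode_image(rows[1][3])

print(restored == raw_image)
```

To get an image file back, the decoded bytes can then be written to disk with a file opened in `'wb'` mode, the same way the original script saves the download.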