Is it possible to write an image to a CSV file?
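Hello everyone, this is my first post here. I would like to know how to write the image files I scraped from a website into a CSV file or, if that is not possible with CSV, how to write the header, description, time info and image to something like a Word file. Here is the code.
Everything works; I just want to know how to write the images I downloaded to disk into a CSV or Word file.
Thanks for your help.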
import csv
import requests
from bs4 import BeautifulSoup

site_link = requests.get("websitenamehere").text
soup = BeautifulSoup(site_link, "lxml")

# newline="" lets the csv module control line endings itself
read_file = open("blogger.csv", "w", encoding="UTF-8", newline="")
csv_writer = csv.writer(read_file)
csv_writer.writerow(["Header", "links", "Publish Time"])

counter = 0
for article in soup.find_all("article"):
    # Count the articles
    counter += 1
    print(counter)

    # Article header
    headers = article.find("a")["title"]
    print(headers)

    # Link
    links = article.find("a")["href"]
    print(links)

    # Publish time
    publish_time = article.find("div", class_="mkdf-post-info-date entry-date published updated")
    publish_time = publish_time.a.text.strip()
    print(publish_time)

    # Image link
    images = article.find("img", class_="attachment-full size-full wp-post-image nitro-lazy")["nitro-lazy-src"]
    print(images)

    # Download the article picture to disk
    pic_name = f"{counter}.jpg"
    with open(pic_name, 'wb') as handle:
        response = requests.get(images, stream=True)
        for block in response.iter_content(1024):
            handle.write(block)

    # CSV row
    csv_writer.writerow([headers, links, publish_time])
    print()

read_file.close()
You can basically convert the image to Base64 and write it to a file as needed:
import base64

# Read the image as raw bytes and encode them as Base64
with open("image.png", "rb") as image_file:
    encoded_string = base64.b64encode(image_file.read())
print(encoded_string.decode('utf-8'))
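As a quick check of why the Base64 form is safe to put in a text file, the sketch below encodes every possible byte value and confirms that the result uses only letters, digits, `+`, `/` and `=`. The variable names are illustrative only, and the input is synthetic data rather than a real image.

```python
import base64

# Synthetic input: every possible byte value, including NUL, comma and newline
raw = bytes(range(256))
encoded = base64.b64encode(raw).decode("ascii")

# The standard Base64 alphabet plus the "=" padding character
allowed = set("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/=")
only_safe_characters = set(encoded) <= allowed

# Base64 turns every 3 bytes into 4 characters, so the text is about a third larger
size_ratio = len(encoded) / len(raw)
print(only_safe_characters, round(size_ratio, 2))
```

The trade-off is size: the encoded text is roughly 4/3 of the original byte count, so a CSV holding many images grows quickly.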
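A CSV file should contain only text fields. Even though the csv module does its best to quote fields so that they can hold almost any character, including the delimiter or a newline, it cannot process the NULL characters that may be present in an image file.
That means you have to encode the image bytes if you want to store them in a CSV file. Base64 is a well-known format that is natively supported by the Python standard library. So you could change your code to: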
import base64
...
    # Download the article picture
    response = requests.get(images, stream=True)
    image = b''.join(response.iter_content(1024))  # raw image bytes
    # Decode to str: in Python 3 b64encode returns bytes, which csv would write as "b'...'"
    image = base64.b64encode(image).decode("ascii")  # Base64-encoded text string
    # CSV row
    csv_writer.writerow([headers, links, publish_time, image])
The image simply has to be decoded before it is used...
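A minimal sketch of that decoding step, assuming the image was stored as Base64 text in a column named `Image` (a hypothetical column name; the original header row has no image column). It uses an in-memory buffer and made-up bytes in place of a real file and a real download. Encoded images are long strings, and by default the csv module refuses very long fields when reading, so the sketch raises the limit with `csv.field_size_limit` first.

```python
import base64
import csv
import io

# Stand-in for downloaded image bytes (made-up data, includes NUL bytes)
fake_image = b"\x89PNG\r\n\x1a\n\x00\x00binary,data\n"

# Write: store the image as Base64 text in an extra column
buffer = io.StringIO(newline="")
writer = csv.writer(buffer)
writer.writerow(["Header", "links", "Publish Time", "Image"])
writer.writerow(["Some title", "https://example.com/post", "1 Jan 2020",
                 base64.b64encode(fake_image).decode("ascii")])

# Read: allow fields of up to 10 MB, then decode the column back to bytes
csv.field_size_limit(10 * 1024 * 1024)
buffer.seek(0)
rows = list(csv.DictReader(buffer))
restored = base64.b64decode(rows[0]["Image"])

print(restored == fake_image)
```

With a real file, `restored` could then be written back to disk by opening a file in `'wb'` mode, the same way the question's code saves the downloaded picture.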