How to send a PDF object from Databricks to SharePoint?

Introduction: I have a Databricks notebook in which I create a PDF file from some data. To generate the file, I use the fpdf library:

from fpdf import FPDF, HTMLMixin

With this library, I generate a PDF object of type <__main__.HTML2PDF at 0x7f3b73720fd0>. My goal now is to send this PDF to a SharePoint folder. To do that, I use the following code:

from office365.runtime.auth.user_credential import UserCredential
from office365.sharepoint.client_context import ClientContext
from office365.sharepoint.files.file import File

# paths
sharepoint_site = "MySharepointSite" 
sharepoint_folder = "Shared Documents/General/PDFs/" 
sharepoint_user = "aaa@bbb.onmicrosoft.com" 
sharepoint_user_pw = "xyz" 
sharepoint_folder = sharepoint_folder.strip("/")

# set environment variables
SITE_URL = f"https://sharepoint.com/sites/{sharepoint_site}"
RELATIVE_URL = f"/sites/{sharepoint_site}/{sharepoint_folder}"

# connect to sharepoint
ctx = ClientContext(SITE_URL).with_credentials(UserCredential(sharepoint_user, sharepoint_user_pw))
web = ctx.web
ctx.load(web).execute_query()

# Generate PDF
pdf = generate_pdf(ctx, row['ServerRelativeUrl'])

# HERE IS MY ISSUE!
ctx.web.get_folder_by_server_relative_url(sharepoint_folder).upload_file('test.pdf', pdf).execute_query()

Problem: when I reach the last line, I get the following error message:

TypeError: Object of type HTML2PDF is not JSON serializable

I believe the pdf object cannot be serialized to JSON, so I am stuck and do not know how to send the PDF to SharePoint.
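The error message itself can be reproduced without SharePoint. The sketch below assumes (the source does not confirm this) that the upload call tries to JSON-serialize any payload that is not already bytes. The HTML2PDF class here is an empty, hypothetical stand-in for the fpdf-based class, and try_serialize is a helper name invented for illustration.

```python
import json


class HTML2PDF:
    """Empty stand-in for the fpdf-based class from the question (hypothetical)."""


def try_serialize(obj):
    """Return the TypeError message json.dumps raises for obj, or None on success."""
    try:
        json.dumps(obj)
    except TypeError as exc:
        return str(exc)
    return None


# An arbitrary Python object has no JSON representation, so json.dumps fails.
message = try_serialize(HTML2PDF())
print(message)
```

If that assumption holds, the fix is to hand the upload call raw bytes rather than the PDF object itself.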

Question: could you suggest a neat and elegant way to achieve my goal of sending the PDF file to SharePoint?

I solved this by outputting the PDF as a string, encoding that string to bytes, and then pushing the bytes to SharePoint:

pdf_binary = pdf.output(dest='S').encode("latin1")
ctx.web.get_folder_by_server_relative_url(sharepoint_folder).upload_file("test.pdf", pdf_binary).execute_query()

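Note: if this still does not work, try a different encoding. With the newer fpdf2 package, output() already returns a bytearray, so the .encode step should not be needed there.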
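Since the return type of pdf.output(...) depends on which fpdf package is installed, a small normalizing helper can make the upload step less fragile. This is only a sketch: to_pdf_bytes is a hypothetical helper name, and the sample values are stand-ins for a real PDF payload.

```python
def to_pdf_bytes(output):
    """Normalize the return value of pdf.output(...) to bytes."""
    if isinstance(output, (bytes, bytearray)):
        # Already binary: just make an immutable copy.
        return bytes(output)
    if isinstance(output, str):
        # latin-1 maps code points 0-255 one-to-one to byte values,
        # so the binary content of the string is preserved.
        return output.encode("latin-1")
    raise TypeError(
        f"expected str, bytes or bytearray, got {type(output).__name__}"
    )


# Stand-in values; in the notebook this would be pdf.output(dest='S').
sample_str = "%PDF-1.3\n\xe2\xe3"
sample_bytes = bytearray(b"%PDF-1.3\n\xe2\xe3")

assert to_pdf_bytes(sample_str) == to_pdf_bytes(sample_bytes)
```

In the notebook, pdf_binary = to_pdf_bytes(pdf.output(dest='S')) could then be passed to upload_file("test.pdf", pdf_binary) exactly as in the answer above.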