Flask / postgres - display pdf with PDFJS
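I have a very simple app. A user uploads a pdf file to a postgres database through a web front end. That pdf should then be rendered in the browser with pdfjs.
I am fairly sure my problem is an encoding issue, but I don't understand encodings well enough to work it out myself.
My model: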
class Lesson(Base):
    __tablename__ = 'lessons'
    # Name of the lesson
    lesson_order = db.Column(db.Enum(LessonIndexes), nullable=False)
    name = db.Column(db.String(128), nullable=False)
    summary = db.Column(db.String(500))
    lesson_plan_id = db.Column(db.Integer(), ForeignKey('lesson_plans.id'), nullable=False)
    pdf = db.Column(db.LargeBinary())
My controller:
@mod_lp.route('/<lesson_plan_id>/create_lesson', methods=["POST"])
def create_lesson(lesson_plan_id):
    form = LessonForm()
    file = request.files['pdf']  # type: FileStorage
    if form.validate_on_submit():
        lesson = Lesson(form.lesson_order.data, form.name.data, form.summary.data, lesson_plan_id,
                        pdf=file.read()  # this line here
                        )
        db.session.add(lesson)
        db.session.commit()
    return redirect(url_for('lesson_plan.show', lesson_plan_id=lesson_plan_id))
This stores data that looks like:
%PDF-1.4
%����
1 0 obj
<</Creator (Mozilla/5.0 \(Macintosh; Intel Mac OS X 10_12_6\) AppleWebKit/537.36 \(KHTML, like Gecko\) Chrome/60.0.3112.113 Safari/537.36)
/Producer (Skia/PDF m60)
/CreationDate (D:20170916222407+00'00')
/ModDate (D:20170916222407+00'00')>>
endobj
2 0 obj
<</Filter /FlateDecode
/Length 1370>> stream
x���ݎ�4��<������� qq�@%`aB�H�_�����T�E���ړ�c'�t�Z��[������}�{�I���@���
(etc...)
My javascript (taken from the PDFJS hello world example):
var pdfString = "{{ pdf_data}}";
var pdfData = atob(pdfString);
if (pdfData) {
    var loadingTask = PDFJS.getDocument({data: pdfData});
    loadingTask.promise.then(function (pdf) {
        console.log('PDF loaded');
        // Fetch the first page
        var pageNumber = 1;
        pdf.getPage(pageNumber).then(function (page) {
            console.log('Page loaded');
            var scale = 1.5;
            var viewport = page.getViewport(scale);
            // Prepare canvas using PDF page dimensions
            var canvas = document.getElementById('pdf-canvas');
            var context = canvas.getContext('2d');
            canvas.height = viewport.height;
            canvas.width = viewport.width;
            // Render PDF page into canvas context
            var renderContext = {
                canvasContext: context,
                viewport: viewport
            };
            var renderTask = page.render(renderContext);
            renderTask.then(function () {
                console.log('Page rendered');
            });
        });
    }, function (reason) {
        // PDF loading error
        console.error(reason);
    });
}
My current error is:
6:108 Uncaught DOMException: Failed to execute 'atob' on 'Window': The string to be decoded is not correctly encoded.
Things I have tried:
file.stream.getvalue()
file.stream.getvalue().decode("latin-1") # for whatever reason, this was the only 'decode' that didn't throw an error
file.stream.getvalue().decode("latin-1").encode()
base64.b64encode(file.stream.getvalue().decode("latin-1").encode())
But all of these failed in various ways.
Update:
If I send the binary data from the database straight to my template:
pdf_data = lesson.pdf
and skip the call to atob:
var pdfData = pdfString;
if (pdfData) {
...
I get this error:
Error: Invalid XRef stream header
pdf.worker.js:340 at error (http://0.0.0.0:8080/static/js/pdfjs/build/pdf.worker.js:340:17)
at XRef_readXRef [as readXRef] (http://0.0.0.0:8080/static/js/pdfjs/build/pdf.worker.js:20943:13)
at XRef_parse [as parse] (http://0.0.0.0:8080/static/js/pdfjs/build/pdf.worker.js:20613:28)
at PDFDocument_setup [as setup] (http://0.0.0.0:8080/static/js/pdfjs/build/pdf.worker.js:26445:17)
at PDFDocument_parse [as parse] (http://0.0.0.0:8080/static/js/pdfjs/build/pdf.worker.js:26336:12)
at http://0.0.0.0:8080/static/js/pdfjs/build/pdf.worker.js:36120:28
at Promise (<anonymous>)
at LocalPdfManager_ensure [as ensure] (http://0.0.0.0:8080/static/js/pdfjs/build/pdf.worker.js:36115:14)
at LocalPdfManager.BasePdfManager_ensureDoc [as ensureDoc] (http://0.0.0.0:8080/static/js/pdfjs/build/pdf.worker.js:36067:19)
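atob expects a base64-encoded string, and I am fairly sure that is the problem you are seeing. Below is a basic example that at least calls atob successfully. You could probably just store the base64-encoded content in the postgres table so that you don't have to convert it on every request. 'source.pdf' is just a sample pdf I have on disk; you can swap it for the data from your postgres table.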
flask_app.py
from flask import Flask, request, render_template
import base64

app = Flask(__name__)

@app.route("/testing", methods=["GET"])
def get_test_file():
    with open("source.pdf", "rb") as data_file:
        data = data_file.read()
    encoded_data = base64.b64encode(data).decode('utf-8')
    return render_template("test.html", encoded_data=encoded_data)
test.html
<html>
<head>
</head>
<body>
    <script>
        var encoded_data = '{{ encoded_data }}';
        var pdf_data = atob(encoded_data);
    </script>
</body>
</html>
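To swap 'source.pdf' for the database value, the same encoding step can be applied to the bytes stored in the pdf column. The following is only a rough sketch: pdf_bytes_to_base64 is a hypothetical helper name, the sample bytes stand in for lesson.pdf, and the bytes() call assumes the driver may hand back a memoryview rather than bytes.

```python
import base64


def pdf_bytes_to_base64(pdf_bytes):
    """Return an ASCII base64 string for raw PDF bytes (hypothetical helper)."""
    # bytes() also accepts a memoryview, which some database drivers
    # return for binary columns (assumption: harmless if already bytes).
    raw = bytes(pdf_bytes)
    # b64encode returns bytes; decode to str so the template receives
    # plain text rather than a bytes object.
    return base64.b64encode(raw).decode("ascii")


# Stand-in for lesson.pdf; a real value would come from the database.
sample = b"%PDF-1.4\n%\xe2\xe3\xcf\xd3\n1 0 obj\n"
encoded = pdf_bytes_to_base64(sample)

# Decoding restores the original bytes, which is what atob does in the browser.
assert base64.b64decode(encoded) == sample
```

In the view you would then pass pdf_bytes_to_base64(lesson.pdf) to the template in place of the raw column value. If you store the encoded form in the table instead, keep in mind that base64 text is roughly a third larger than the original bytes.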