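How can I add all Wikipedia images to my docx file?
I'm using the Wikipedia API and I want to put all the photos from a page into a docx document. Right now I can only put one picture in the document, which isn't good enough. Some Wikipedia pages don't give me any photos, even though when I search on the internet I can see that the site has some. Here is my code: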
import wikipedia
import re
from docx import Document
from docx.enum.text import WD_ALIGN_PARAGRAPH
from docx.shared import Pt
from docx.shared import Mm
import requests
import io
from docx.shared import Inches
name = input("Introdu numele tau: ")
wikipedia.set_lang("ro")
hs = input("La ce liceu esti?\n")
cls = input("In ce clasa esti?\n")
date = input("Pe ce data trebuie facut proiectul?\n")
title = input("Despre ce vrei sa fie proiectul tau?\n")
while True:
    try:
        wiki = wikipedia.page(title)
        break
    except:
        print("Nume proiect invalid")
        title = input("Introdu alt nume de proiect: \n")
text = wiki.content
text = re.sub(r'==', '', text)
text = re.sub(r'=', '', text)
text = re.sub(r'\n', '\n ', text)
split = text.split('Vezi și', 1)
text = split[0]
print(text)
document = Document()
section = document.sections[0]
section.page_height = Mm(297)
section.page_width = Mm(210)
section.left_margin = Mm(25.4)
section.right_margin = Mm(25.4)
section.top_margin = Mm(25.4)
section.bottom_margin = Mm(25.4)
section.header_distance = Mm(12.7)
section.footer_distance = Mm(12.7)
style = document.styles['Normal']
font = style.font
font.name = 'Times New Roman'
font.size = Pt(12)
url = wiki.images[1]
response = requests.get(url, stream=True)
image = io.BytesIO(response.content)
try:
    document.add_picture(image, width=Inches(1.5))
except:
    pass
paragraph = document.add_paragraph(date)
paragraph.alignment = WD_ALIGN_PARAGRAPH.RIGHT
paragraph = document.add_paragraph(name)
paragraph.alignment = WD_ALIGN_PARAGRAPH.LEFT
paragraph = document.add_paragraph('Clasa '+cls)
paragraph.alignment = WD_ALIGN_PARAGRAPH.LEFT
paragraph = document.add_paragraph(hs)
paragraph.alignment = WD_ALIGN_PARAGRAPH.LEFT
paragraph = document.add_heading(title, 0)
paragraph.alignment = WD_ALIGN_PARAGRAPH.CENTER
paragraph = document.add_paragraph(' ' + text)
paragraph.style = document.styles['Normal']
paragraph.alignment = WD_ALIGN_PARAGRAPH.LEFT
document.save(title + ".docx")
input()
I think the problem is here:
url = wiki.images[1]
response = requests.get(url, stream=True)
image = io.BytesIO(response.content)
try:
    document.add_picture(image, width=Inches(1.5))
except:
    pass
because only one picture shows up in the docx document.
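I suggest you explore loops and functions in Python. A loop lets you execute some code zero or more times, while a function lets you group a chunk of code together and refer to it by name. In more advanced terms, this is called abstraction.
A loop for this Wikipedia task would look something like:
for image in wiki.images:
    document.add_picture(image, ...)
Then if wiki.images is empty, no picture is added. If it has 5 images, all 5 of them are added.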
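To see the zero-or-more behaviour in isolation, here is a minimal sketch that doesn't touch Wikipedia or docx at all. count_added is a hypothetical helper and the file names are made up purely for illustration:

```python
def count_added(image_urls):
    """Stand-in for the real loop: count how many times the body runs."""
    added = 0
    for image_url in image_urls:
        # In the real script, this is where the picture would be added.
        added += 1
    return added


print(count_added([]))  # empty list: the loop body never runs
print(count_added(["a.jpg", "b.jpg", "c.jpg", "d.jpg", "e.jpg"]))  # runs five times
```

The loop itself needs no special case for an empty list; it simply does nothing.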
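A function might look like this:
def add_wiki_image(document, image_url):
    # Download the image and hand it to python-docx as an in-memory file.
    response = requests.get(image_url, stream=True)
    image = io.BytesIO(response.content)
    document.add_picture(image, width=Inches(1.5))
It can be called like this:
for image_url in wiki.images:
    add_wiki_image(document, image_url)
Making add_wiki_image() a function lets that code be referenced ("called") concisely wherever it is needed, and the details of the image-adding operation are neatly encapsulated in the function definition.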
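One possible reason the bare try/except in the question hides failures is that some entries in wiki.images point to formats such as SVG, which python-docx's add_picture does not accept. That is an assumption about your pages, not something I have checked. A minimal sketch of filtering the URLs by file extension first; SUPPORTED_EXTENSIONS, is_supported_image, supported_images and the example.org URLs are all hypothetical names chosen for illustration:

```python
import os
from urllib.parse import urlparse

# Assumption: extensions of the raster formats python-docx can embed.
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".bmp", ".tif", ".tiff"}


def is_supported_image(image_url):
    """Return True if the URL path ends in one of the supported extensions."""
    path = urlparse(image_url).path
    extension = os.path.splitext(path)[1].lower()
    return extension in SUPPORTED_EXTENSIONS


def supported_images(image_urls):
    """Keep only the URLs whose extension is in SUPPORTED_EXTENSIONS, in order."""
    return [url for url in image_urls if is_supported_image(url)]


# Made-up URLs standing in for the contents of wiki.images.
urls = [
    "https://example.org/images/Photo.JPG",
    "https://example.org/images/Logo.svg",
    "https://example.org/images/Map.png",
]
print(supported_images(urls))
```

In the script you would then loop over supported_images(wiki.images) instead of wiki.images and call add_wiki_image(document, image_url) for each one. Checking the extension is only a heuristic; a URL can still fail to download or contain something other than what its name suggests.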