Saving part of function output to a file in Python
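I have a simple scraping function that returns specific things from a given URL.
It returns a dictionary whose contents I would like to save to a .md file.
The code is as follows: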
import requests
from bs4 import BeautifulSoup


def get_data(url):
    page = requests.get(url).text
    soup = BeautifulSoup(page, 'html.parser')

    iframe = []
    yt_secondary = []

    div = soup.find_all('div', attrs={'class': 'tags'})
    for entry in div:
        tags = entry.text.strip().replace('#', '').split('\n')
        songs_links = soup.find_all('iframe')[0]
        iframe.append(songs_links)
    entry = {'tags': tags,
             'iframe': songs_links}
    return entry


if __name__ == "__main__":
    print(get_data('https://nikisaku.tumblr.com/post/643205680992485376/test'))
As expected, it returns:
{'tags': ['Tagged: testing, test2, test3, .'], 'iframe': <iframe allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen="" frameborder="0" height="281" id="youtube_iframe" src="https://www.youtube.com/embed/bwKfVwiUpvo?feature=oembed&enablejsapi=1&origin=https://safe.txmblr.com&wmode=opaque" width="500"></iframe>}
Now I would like to be able to save it to a .md file in the following format:
---
tags: Tagged: testing, test2, test3, .
---
<iframe allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen="" frameborder="0" height="281" id="youtube_iframe" src="https://www.youtube.com/embed/bwKfVwiUpvo?feature=oembed&enablejsapi=1&origin=https://safe.txmblr.com&wmode=opaque" width="500"></iframe>
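Is it possible to save it like this?
I need to keep it as this function because I will use it to iterate through X pages of a given site to scrape the tags and links (which works), and for each result I have to create a new .md file.
Thanks in advance!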
Since your function returns a dictionary, you can iterate over it and print each key and value separately:
if __name__ == "__main__":
    raw_data = get_data('https://nikisaku.tumblr.com/post/643205680992485376/test')
    for key, value in raw_data.items():
        # Lists (such as the tags) are joined into one line; everything else is printed as-is.
        if isinstance(value, list):
            print(f"---\n{key}: {', '.join(value)}")
        else:
            print(f"---\n{key}: {value}")
The result looks like this:
---
tags: Tagged: testing, test2, test3, .
---
iframe: <iframe allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" ...
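The loop above only prints, and it prefixes the iframe with its key, which is not quite the layout asked for in the question. A minimal sketch of writing that layout to a file could look like the following. The helper names format_entry and save_entry, the file name post_1.md, and the shortened sample dictionary are assumptions made for illustration; they are not part of the original code.

```python
from pathlib import Path


def format_entry(entry):
    """Build the text for one .md file from a dict shaped like get_data's result."""
    tags = entry["tags"]
    # get_data returns the tags as a list; join it into a single line.
    if isinstance(tags, list):
        tags = ", ".join(tags)
    # str() turns a BeautifulSoup Tag into its HTML markup; a plain string is unchanged.
    iframe = str(entry["iframe"])
    return f"---\ntags: {tags}\n---\n{iframe}\n"


def save_entry(entry, path):
    """Write one entry to `path` (hypothetical helper, not from the original post)."""
    Path(path).write_text(format_entry(entry), encoding="utf-8")


if __name__ == "__main__":
    # Stand-in data with a shortened iframe so the sketch runs without a network request.
    sample = {
        "tags": ["Tagged: testing, test2, test3, ."],
        "iframe": '<iframe id="youtube_iframe"></iframe>',
    }
    save_entry(sample, "post_1.md")
```

To create one file per scraped page, save_entry could be called inside the loop over the URLs with a different file name each time, for example one built from a counter or from the post ID in the URL.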