Is there a best practice for fetching link previews?

Question

Essentially, given any URL, I could fetch the webpage in Ruby using

require 'nokogiri'
require 'open-uri'

doc = Nokogiri::HTML(URI.open(my_url))  # Kernel#open no longer opens URLs as of Ruby 3.0
title = doc.at('meta[property="og:title"]')['content']
...

and extract the elements I need.

Is there a best practice before fetching any links? It seems like a potential security risk as well.

I'm assuming large companies like Facebook might run an image through some model to determine if it should be censored?

Answer 1

Essentially given any url, I could fetch the webpage in Ruby using

I'm using metainspector to get OG data from various media URLs. It works well and might save you some headaches.

Is there a best practice before fetching any links? It seems like a potential security risk as well.

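It depends on your application, what information you scrape, and what you display to your users. If you are worried about obscene words, you can filter them out (there are probably some gems for that), but I usually don't see them in OG meta. You could blacklist adult site domains, or allow only certain domains.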
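As a rough illustration of the blacklist / allow-only idea, here is a minimal sketch that checks a URL before anything is fetched. The names `ALLOWED_HOSTS`, `BLOCKED_HOSTS`, `host_matches?`, `internal_ip_literal?` and `safe_preview_url?` are made up for this example, and the domain lists are placeholders. It uses only Ruby's standard library and does not resolve DNS, so a hostname that points at an internal address would still get through; treat it as a starting point rather than a complete defense.

```ruby
require 'uri'
require 'ipaddr'

# Hypothetical lists for illustration; a real app would load these from config.
ALLOWED_HOSTS = %w[example.com youtube.com].freeze
BLOCKED_HOSTS = %w[adult.example].freeze

# True when host is the domain itself or one of its subdomains.
def host_matches?(host, domain)
  host == domain || host.end_with?(".#{domain}")
end

# True when host is an IP literal in a loopback, private or link-local range.
def internal_ip_literal?(host)
  ip = IPAddr.new(host)
  ip.loopback? || ip.private? || ip.link_local?
rescue IPAddr::Error
  false # not an IP literal; hostnames are covered by the lists below
end

# Decide whether a user-supplied URL may be fetched for a preview.
# With allowlist: nil, every host that is not blocked is accepted.
def safe_preview_url?(url, allowlist: nil)
  uri = URI.parse(url)
  # Only http/https (URI::HTTPS is a subclass of URI::HTTP); rejects file:, ftp:, etc.
  return false unless uri.is_a?(URI::HTTP)

  host = uri.hostname.to_s.downcase
  return false if host.empty? || host == 'localhost'
  return false if internal_ip_literal?(host)
  return false if BLOCKED_HOSTS.any? { |d| host_matches?(host, d) }

  allowlist.nil? || allowlist.any? { |d| host_matches?(host, d) }
rescue URI::InvalidURIError
  false
end
```

Only URLs that pass this check would then be handed to whatever does the fetching (open-uri, metainspector, and so on). Passing `allowlist: ALLOWED_HOSTS` gives the stricter "allow only certain domains" behavior; leaving it out gives the blacklist-only behavior.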

I'm assuming large companies like Facebook might run an image through some model to determine if it should be censored?

Image recognition is one approach, but it takes a lot of work. A lot.
