How to mimic "Save As" of a webpage to get its source code?

Question

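I am trying to download the source code of a web page.

When I use "View page source", the part of the page I am interested in is not there. It appears to be "hidden" in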

<div class="row5" info_abc></div>

and the actual text/formatting is not present.

If I do a "Save As", I get everything, including the part of the page I want. The code also looks slightly different:

<div info_abc="" class="row5">...

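I can also see this code if I use "Inspect Element" in Chrome.

How can I get the full source code of that web page (just that page, not the rest of the site), i.e. what I would get when I "save" the page? Can I do this with curl or wget?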

Answer 1

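curl and wget will probably not work. Pages like this usually load their content dynamically with JavaScript, so the raw HTML the server sends, which is all that curl and wget fetch, does not contain it. One workaround is to use Selenium (or one of its competitors) to simulate a browser. There are tons of postings about this, for example here.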
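To make the Selenium idea concrete, here is a minimal Python sketch. It has not been run against the page in the question. It assumes Selenium 4 and a local Chrome install. The function name fetch_rendered_html, the example URL, and the selector "div.row5 > *" are all made up for illustration; the selector in particular assumes the script fills the div with child elements. The idea is to let a real browser run the page's JavaScript and then read back the resulting DOM, which is roughly what "Save As" and "Inspect Element" show.

```python
import time


def fetch_rendered_html(url, driver=None, wait_for_css=None, timeout=10.0):
    """Load url in a browser and return the HTML after scripts have run.

    driver:       an existing WebDriver-like object; if None, a headless
                  Chrome is started (assumes Selenium 4 and Chrome).
    wait_for_css: optional CSS selector to wait for before reading the
                  page (hypothetical example: "div.row5 > *").
    """
    own_driver = driver is None
    if own_driver:
        # Imported lazily so Selenium is only needed when we start Chrome.
        from selenium import webdriver

        options = webdriver.ChromeOptions()
        options.add_argument("--headless=new")  # no visible window
        driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        if wait_for_css:
            # Poll until the dynamically inserted content shows up.
            deadline = time.monotonic() + timeout
            while not driver.find_elements("css selector", wait_for_css):
                if time.monotonic() >= deadline:
                    raise TimeoutError(f"{wait_for_css!r} never appeared")
                time.sleep(0.2)
        # page_source is the browser's serialization of the current DOM,
        # not the raw bytes the server originally sent.
        return driver.page_source
    finally:
        if own_driver:
            driver.quit()
```

You would call it as something like fetch_rendered_html("https://example.com/page", wait_for_css="div.row5 > *") and write the returned string to a file. The polling loop is there because driver.get() can return before the page's scripts have finished inserting content.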

I have not tried it myself yet, but I recently ran into Shellnium: Simple Selnium WebDriver for Bash, which you might want to take a look at.
