Read a text file from the web and assign it to a variable in Ruby / Chef

Question

I want to read a text file from the web and assign it to a variable in Chef / Ruby.

In PowerShell I would do something like this:

$content = (Invoke-WebRequest http://website.com/string.txt).content

Can anyone tell me how this is done in Ruby?

Answer 1

Use Ruby's OpenURI library.

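require 'open-uri'
# Use URI.open: calling Kernel#open with a URL was deprecated in Ruby 2.7 and removed in Ruby 3.0.
content = URI.open("http://website.com/string.txt").read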

Answer 2

Minimal Example

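If your URL really is just a plain-text file, you can use OpenURI from Ruby's standard library. Your example points to "string.txt", but we'll use a real web page here purely for demonstration.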

require 'open-uri'
# URI.open replaces the bare open call, which no longer accepts URLs as of Ruby 3.0.
content = URI.open('http://google.com').read

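This assigns the entire contents of the URL to the variable content. That may be all you need, but unless you are really dealing with plain text, the result is usually not useful without further processing.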
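When the file holds a single value, the raw read also keeps the trailing newline, and a failed request raises an exception. The following is only a sketch under assumptions: the helper name read_remote_text, the 10-second timeouts, and the choice to return nil on failure are illustrative, not part of OpenURI.

```ruby
require 'open-uri'
require 'timeout'

# Hypothetical helper (not part of OpenURI): returns the body of +url+ as a
# String with the trailing newline removed, or nil if the request fails.
def read_remote_text(url)
  # The block form closes the underlying IO once the body has been read.
  URI.open(url, open_timeout: 10, read_timeout: 10) { |io| io.read.chomp }
rescue OpenURI::HTTPError, SocketError, SystemCallError, Timeout::Error => e
  warn "Could not read #{url}: #{e.message}"
  nil
end

# Usage (placeholder URL from the question):
#   content = read_remote_text('http://website.com/string.txt')
```

Returning nil keeps the calling code short, but you may prefer to let the exception propagate so that a failed download stops the run.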

Parsing HTML with Nokogiri

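Generally, opening a URI that doesn't serve an application/json or text/plain MIME content type gives you one big string that isn't very useful. In that case, use the Nokogiri gem to do something with the output.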
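To decide which case you are in, you can inspect the MIME type that OpenURI reports for the response. This is a rough sketch: the helper names fetch_with_type and usable_as_text? are made up for illustration, and the list of "directly usable" types is an assumption taken from the two types mentioned above.

```ruby
require 'open-uri'

# Hypothetical helper: returns [content_type, body] for the given URL.
# content_type is provided by OpenURI::Meta and omits parameters such as
# the charset, e.g. "text/html".
def fetch_with_type(url)
  URI.open(url) { |io| [io.content_type, io.read] }
end

# Hypothetical helper: true when the body can be used as-is, false when it
# probably needs a parser such as Nokogiri. The type list is an assumption.
def usable_as_text?(content_type)
  %w[text/plain application/json].include?(content_type)
end

# Usage:
#   type, body = fetch_with_type('http://google.com')
#   puts body if usable_as_text?(type)
```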

Example 1: Extracting Form Elements

For example, to extract the form elements from the Google home page:

require 'open-uri'
require 'nokogiri'

uri = 'http://google.com'
# URI.open is required on Ruby 3.0+, where the bare open call no longer accepts URLs.
doc = Nokogiri::HTML(URI.open(uri))
doc.css('title, form input').each { |e| puts e }

This filters the page and prints only the desired elements. In this case, the result will be:

<title>Google</title>
<input name="ie" value="ISO-8859-1" type="hidden">
<input value="en" name="hl" type="hidden">
<input name="source" type="hidden" value="hp">
<input name="biw" type="hidden">
<input name="bih" type="hidden">
<input style="color:#000;margin:0;padding:5px 8px 0 6px;vertical-align:top" autocomplete="off" class="lst" value="" title="Google Search" maxlength="2048" name="q" size="57">
<input class="lsb" value="Google Search" name="btnG" type="submit">
<input class="lsb" value="I'm Feeling Lucky" name="btnI" onclick="if(this.form.q.value)this.checked=1; else top.location='/doodles/'" type="submit">
<input id="gbv" name="gbv" type="hidden" value="1">

Example 2: Extracting Plain Text from Paragraph Elements

As another example, consider this snippet. It extracts the contents of the first two paragraph tags from the Ruby Wikipedia entry.

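require 'open-uri'
require 'nokogiri'

# uri was left undefined in the original snippet; the Ruby Wikipedia entry is assumed here.
uri = 'https://en.wikipedia.org/wiki/Ruby_(programming_language)'
puts Nokogiri::HTML(URI.open(uri)).css('p').map(&:text).slice(0, 2).join("\n\n")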

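By slicing and joining the array of paragraph elements, or by grepping the array elements, you can extract textual data quite easily. Nokogiri's XPath expressions give you even more power. In this case, the result is: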
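The slicing and grepping mentioned above are ordinary Array operations, so they can be tried without any network access. In this sketch the paragraphs array is a stand-in for what css('p').map(&:text) would return; its entries are sentences taken from the output shown in this answer, and the variable names are illustrative.

```ruby
# Stand-in for the array of paragraph texts extracted with Nokogiri.
paragraphs = [
  'Ruby is a dynamic, reflective, object-oriented, general-purpose programming language.',
  'According to its creator, Ruby was influenced by Perl, Smalltalk, Eiffel, Ada, and Lisp.',
  'It also has a dynamic type system and automatic memory management.'
]

# Slice and join: keep the first two entries, separated by a blank line.
first_two = paragraphs.slice(0, 2).join("\n\n")

# Grep: keep only the entries that match a pattern.
influences = paragraphs.grep(/influenced/)

puts first_two
puts influences
```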

Ruby is a dynamic, reflective, object-oriented, general-purpose programming language. It was designed and developed in the mid-1990s by Yukihiro "Matz" Matsumoto in Japan.

According to its creator, Ruby was influenced by Perl, Smalltalk, Eiffel, Ada, and Lisp.[12] It supports multiple programming paradigms, including functional, object-oriented, and imperative. It also has a dynamic type system and automatic memory management.

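You can certainly do a lot more with Nokogiri, but this should get you started. The real takeaway is that parsing HTML is usually better than running regular expressions over a text/html response, although in some cases the MIME type of your response may dictate a more minimalist approach.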

Answer 3

In Chef, the right way is to use the Chef::HTTP client.

Chef::HTTP.new('https://example.com/').get('/string.txt')
