Nokogiri: how to traverse every row of a table with two classes

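I am trying to parse an HTML table with Nokogiri. The table is well marked up and has no structural problems, except that the table header is embedded as an actual row instead of using <thead>. The problem is that I want every row except the first one, because I am not interested in the header, only in everything that follows it. Here is an example of the table structure.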

<table id="foo">
<tbody>
  <tr class="headerrow">....</tr>
  <tr class="row">...</tr>
  <tr class="row_alternate">...</tr>
  <tr class="row">...</tr>
  <tr class="row_alternate">...</tr>
</tbody>
</table>

I am only interested in grabbing the rows with class row or row_alternate. However, as far as I know, this syntax is not valid in Nokogiri:

doc.css('.row .row_alternate').each do |a_row|
  # do stuff with a_row
end

What is the best way to solve this with Nokogiri?

I would try this:

doc.css(".row, .row_alternate").each do |a_row|
  # do stuff with a_row
end

Try doc.at_css(".headerrow").remove and then

doc.css("tr").each do |row|
  # some code
end

A CSS selector can contain multiple components separated by commas:

A comma-separated list of selectors represents the union of all elements selected by each of the individual selectors in the list. (A comma is U+002C.) For example, in CSS when several selectors share the same declarations, they may be grouped into a comma-separated list. White space may appear before and/or after the comma.

doc.css('.row, .row_alternate').each do |a_row|
  p a_row.to_html
end

# "<tr class=\"row\">...</tr>"
# "<tr class=\"row_alternate\">...</tr>"
# "<tr class=\"row\">...</tr>"
# "<tr class=\"row_alternate\">...</tr>"