What's the fastest way to parse this HTML table?
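I'm trying to parse a table in Node that I fetched from a website. The table looks like this. I want to skip the header row and parse only the actual transaction rows in the body.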
<tbody><tr class="dgHeader" style="font-weight:bold;">
<th scope="col">Reference 1</th><th scope="col">Reference 2</th><th scope="col">Reference 3</th><th scope="col">Reference 4</th><th scope="col">Gross Amount</th><th scope="col">Discounts/Surcharges</th><th scope="col">Net Amount</th><th scope="col">Means of Payment</th><th scope="col">Form of Payment</th><th scope="col">Payment Folio</th><th scope="col">Branch</th><th scope="col">Time</th><th scope="col">Maturity Date</th><th scope="col">Payment date</th> </tr><tr align="left">
<td align="left">
<span id="ctl00_Contentplaceholder1_gvConcentracionPagos_ctl02_lblReferencia1">0000000000000000000000000000000X4D649G66</span>
</td><td align="left">
<span id="ctl00_Contentplaceholder1_gvConcentracionPagos_ctl02_lblReferencia2"></span>
</td><td align="left">
<span id="ctl00_Contentplaceholder1_gvConcentracionPagos_ctl02_lblReferencia3"></span>
</td><td align="left">
<span id="ctl00_Contentplaceholder1_gvConcentracionPagos_ctl02_lblReferencia4"></span>
</td><td align="right">
<span id="ctl00_Contentplaceholder1_gvConcentracionPagos_ctl02_lblImporteBruto">.00</span>
</td><td align="left">
<span id="ctl00_Contentplaceholder1_gvConcentracionPagos_ctl02_lblDescuentosRecargos">[=12=].00</span>
</td><td align="right">
<span id="ctl00_Contentplaceholder1_gvConcentracionPagos_ctl02_lblImporteNeto">.00</span>
</td><td align="left">
<span id="ctl00_Contentplaceholder1_gvConcentracionPagos_ctl02_lblMedioPago">Internet</span>
</td><td align="left">
<span id="ctl00_Contentplaceholder1_gvConcentracionPagos_ctl02_lblFormaPago">Cash</span>
</td><td align="left">
<span id="ctl00_Contentplaceholder1_gvConcentracionPagos_ctl02_lblFolioPago">45786172008896142466 </span>
</td><td align="left">
<span id="ctl00_Contentplaceholder1_gvConcentracionPagos_ctl02_lblSucursal">4578</span>
</td><td align="left">
<span id="ctl00_Contentplaceholder1_gvConcentracionPagos_ctl02_lblHora">01:48:59 p.m.</span>
</td><td>
<span id="ctl00_Contentplaceholder1_gvConcentracionPagos_ctl02_lblFechaVencimiento">00/00/0000</span>
</td><td align="left">
<span id="ctl00_Contentplaceholder1_gvConcentracionPagos_ctl02_lblFechaPago">20/06/2016</span>
</td> </tr> </tbody>
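I've been using Cheerio, but I'm having a hard time using the id attributes to pull the data out of the table.
This solved it, and it makes the reference code easy to get: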
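var cheerio = require('cheerio');
var $ = cheerio.load(str, {
    ignoreWhitespace: true
});
// Visit each transaction row, skipping the header row (class "dgHeader").
$('tr').not('.dgHeader').each(function (i, tr) {
    // Match on the id suffix inside this row; the "ctl02" part of the full id appears to be row-specific.
    var reference = $(tr).find('span[id$="_lblReferencia1"]').text();
    console.log(reference);
});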
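To collect every column rather than just the first reference, the same idea can be extended to build one object per transaction row. What follows is only a sketch under a few assumptions: `fieldNameFromId` and `parseRows` are hypothetical helper names, not part of Cheerio, and every value is assumed to sit in a `<span>` whose id ends in `_lbl<FieldName>`, as in the markup above.

```javascript
// Hypothetical helper: turn an ASP.NET-style span id such as
// "ctl00_..._ctl02_lblReferencia1" into the short field name "Referencia1".
// Ids without the "_lbl" marker are returned unchanged.
function fieldNameFromId(id) {
  const marker = '_lbl';
  const at = id.lastIndexOf(marker);
  return at === -1 ? id : id.slice(at + marker.length);
}

// `$` is assumed to be the object returned by cheerio.load(html).
// Returns a plain array with one object per non-header row.
function parseRows($) {
  return $('tr')
    .not('.dgHeader') // skip the header row
    .map((i, tr) => {
      const row = {};
      // Each data cell wraps its value in a span that carries an id.
      $(tr).find('td span[id]').each((j, span) => {
        row[fieldNameFromId($(span).attr('id'))] = $(span).text().trim();
      });
      return row;
    })
    .get(); // convert the Cheerio collection into a plain array
}

// Usage, assuming `str` holds the fetched HTML:
//   const cheerio = require('cheerio');
//   const rows = parseRows(cheerio.load(str));
//   console.log(rows[0].Referencia1, rows[0].FechaPago);
```

Keying on the id suffix avoids hard-coding the long, row-specific ids. One caveat: this assumes `str` contains the enclosing `<table>` element (for example the whole page), since an HTML5-compliant parser may discard `<tr>` and `<td>` tags that appear outside a table.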