How do I convert an HTML table into JSON in Logic Apps?

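I'm building a Logic App to process the email replies we receive to our "call before you dig" requests. The first email is a confirmation that contains a table showing which utility providers have been notified. I want to add the contents of the table to an Excel spreadsheet, adding our own reference number along the way. I found a possible solution in this answer, which was taken from John Dyer's blog.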

    function tableToJson(table) {
        var data = [];

        // first row needs to be headers
        var headers = [];
        for (var i = 0; i < table.rows[0].cells.length; i++) {
            headers[i] = table.rows[0].cells[i].innerHTML.toLowerCase().replace(/ /gi, '');
        }

        // go through the remaining rows, keying each cell by its column header
        for (var i = 1; i < table.rows.length; i++) {

            var tableRow = table.rows[i];
            var rowData = {};

            for (var j = 0; j < tableRow.cells.length; j++) {
                rowData[headers[j]] = tableRow.cells[j].innerHTML;
            }

            data.push(rowData);
        }

        return data;
    }

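I tried using this code in Azure Functions, but things got more and more complicated once it asked me to download VS. I couldn't find a way to add the code in the portal. I installed VS, but I was quickly out of my depth.

I found a converter online and used it to convert the code to C++, thinking I might be able to use it in a .NET function.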

#include <stdio.h>
int main()
{
      printf("function tableToJson(table) {
    var data = [];

    // first row needs to be headers
    var headers = [];
    for (var i=0; i<table.rows[0].cells.length; i++) {
        headers[i] = table.rows[0].cells[i].innerHTML.toLowerCase().replace(/ /gi,'');
    }

    // go through cells
    for (var i=1; i<table.rows.length; i++) {

        var tableRow = table.rows[i];
        var rowData = {};

        for (var j=0; j<tableRow.cells.length; j++) {

            rowData[ headers[j] ] = tableRow.cells[j].innerHTML;

        }

        data.push(rowData);
    }       

    return data;
}\n");
      return 0;
}

I've used two Compose actions to trim the email body down to the table in question:

<table class="MsoNormalTable" border="1" cellspacing="0" cellpadding="0" width="100%" style="width:100.0%; border-collapse:collapse; border:none">
  <tbody>
    <tr>
      <td colspan="3" valign="top" style="border:solid gray 1.0pt; background:#9CCC6B; padding:3.75pt .75pt 3.75pt 3.75pt">
        <p class="MsoNormal">
          <b>
            <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif; color:black">MEMBERS NOTIFIED: The following owners of underground infrastructure in the area of your excavation site have been notified.</span>
          </b>
        </p>
      </td>
    </tr>
    <tr>
      <td width="50%" valign="top" style="width:50.0%; border-top:none; border-left:solid gray 1.0pt; border-bottom:solid gray 1.0pt; border-right:none; background:#9CCC6B; padding:3.75pt .75pt 3.75pt 3.75pt">
        <p class="MsoNormal" align="center" style="text-align:center">
          <b>
            <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif; color:black">Member name</span>
          </b>
        </p>
      </td>
      <td width="25%" valign="top" style="width:25.0%; border-top:none; border-left:solid gray 1.0pt; border-bottom:solid gray 1.0pt; border-right:none; background:#9CCC6B; padding:3.75pt .75pt 3.75pt 3.75pt">
        <p class="MsoNormal" align="center" style="text-align:center">
          <b>
            <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif; color:black">Station Code</span>
          </b>
        </p>
      </td>
      <td width="25%" valign="top" style="width:25.0%; border:solid gray 1.0pt; border-top:none; background:#9CCC6B; padding:3.75pt .75pt 3.75pt 3.75pt">
        <p class="MsoNormal" align="center" style="text-align:center">
          <b>
            <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif; color:black">Initial Status</span>
          </b>
        </p>
      </td>
    </tr>
    <tr>
      <td valign="top" style="border:none; border-left:solid gray 1.0pt; padding:3.75pt .75pt 3.75pt 3.75pt">
        <p class="MsoNormal" align="center" style="text-align:center">
          <span class="value">
            <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif">G-TEL FOR ENBRIDGE GAS (LEGACY UNION GAS) (ENOW01)</span>
          </span>
          <b>
            <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif; color:black"></span>
          </b>
        </p>
      </td>
      <td valign="top" style="border:none; border-left:solid gray 1.0pt; padding:3.75pt .75pt 3.75pt 3.75pt">
        <p class="MsoNormal" align="center" style="text-align:center">
          <span class="value">
            <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif">ENOW01</span>
          </span>
          <b>
            <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif; color:black"></span>
          </b>
        </p>
      </td>
      <td valign="top" style="border-top:none; border-left:solid gray 1.0pt; border-bottom:none; border-right:solid gray 1.0pt; padding:3.75pt .75pt 3.75pt 3.75pt">
        <p class="MsoNormal" align="center" style="text-align:center">
          <span class="value">
            <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif">Notification sent</span>
          </span>
          <b>
            <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif; color:black"></span>
          </b>
        </p>
      </td>
    </tr>
    <tr>
      <td valign="top" style="border:none; border-left:solid gray 1.0pt; padding:3.75pt .75pt 3.75pt 3.75pt">
        <p class="MsoNormal" align="center" style="text-align:center">
          <span class="value">
            <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif">CITY OF STRATFORD (STRATWS01)</span>
          </span>
          <b>
            <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif; color:black"></span>
          </b>
        </p>
      </td>
      <td valign="top" style="border:none; border-left:solid gray 1.0pt; padding:3.75pt .75pt 3.75pt 3.75pt">
        <p class="MsoNormal" align="center" style="text-align:center">
          <span class="value">
            <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif">STRATWS01</span>
          </span>
          <b>
            <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif; color:black"></span>
          </b>
        </p>
      </td>
      <td valign="top" style="border-top:none; border-left:solid gray 1.0pt; border-bottom:none; border-right:solid gray 1.0pt; padding:3.75pt .75pt 3.75pt 3.75pt">
        <p class="MsoNormal" align="center" style="text-align:center">
          <span class="value">
            <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif">Notification sent</span>
          </span>
          <b>
            <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif; color:black"></span>
          </b>
        </p>
      </td>
    </tr>
    <tr>
      <td valign="top" style="border:none; border-left:solid gray 1.0pt; padding:3.75pt .75pt 3.75pt 3.75pt">
        <p class="MsoNormal" align="center" style="text-align:center">
          <span class="value">
            <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif">FESTIVAL HYDRO (LOCAL HYDRO) (FESTH01)</span>
          </span>
          <b>
            <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif; color:black"></span>
          </b>
        </p>
      </td>
      <td valign="top" style="border:none; border-left:solid gray 1.0pt; padding:3.75pt .75pt 3.75pt 3.75pt">
        <p class="MsoNormal" align="center" style="text-align:center">
          <span class="value">
            <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif">FESTH01</span>
          </span>
          <b>
            <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif; color:black"></span>
          </b>
        </p>
      </td>
      <td valign="top" style="border-top:none; border-left:solid gray 1.0pt; border-bottom:none; border-right:solid gray 1.0pt; padding:3.75pt .75pt 3.75pt 3.75pt">
        <p class="MsoNormal" align="center" style="text-align:center">
          <span class="value">
            <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif">Notification sent</span>
          </span>
          <b>
            <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif; color:black"></span>
          </b>
        </p>
      </td>
    </tr>
    <tr>
      <td valign="top" style="border:none; border-left:solid gray 1.0pt; padding:3.75pt .75pt 3.75pt 3.75pt">
        <p class="MsoNormal" align="center" style="text-align:center">
          <span class="value">
            <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif">WIGHTMAN TELECOM - FIBRE - LIMITED (WT01)</span>
          </span>
          <b>
            <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif; color:black"></span>
          </b>
        </p>
      </td>
      <td valign="top" style="border:none; border-left:solid gray 1.0pt; padding:3.75pt .75pt 3.75pt 3.75pt">
        <p class="MsoNormal" align="center" style="text-align:center">
          <span class="value">
            <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif">WT01</span>
          </span>
          <b>
            <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif; color:black"></span>
          </b>
        </p>
      </td>
      <td valign="top" style="border-top:none; border-left:solid gray 1.0pt; border-bottom:none; border-right:solid gray 1.0pt; padding:3.75pt .75pt 3.75pt 3.75pt">
        <p class="MsoNormal" align="center" style="text-align:center">
          <span class="value">
            <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif">Notification sent</span>
          </span>
          <b>
            <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif; color:black"></span>
          </b>
        </p>
      </td>
    </tr>
    <tr>
      <td valign="top" style="border:none; border-left:solid gray 1.0pt; padding:3.75pt .75pt 3.75pt 3.75pt">
        <p class="MsoNormal" align="center" style="text-align:center">
          <span class="value">
            <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif">CLI FOR ROGERS (ROGWAT01)</span>
          </span>
          <b>
            <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif; color:black"></span>
          </b>
        </p>
      </td>
      <td valign="top" style="border:none; border-left:solid gray 1.0pt; padding:3.75pt .75pt 3.75pt 3.75pt">
        <p class="MsoNormal" align="center" style="text-align:center">
          <span class="value">
            <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif">ROGWAT01</span>
          </span>
          <b>
            <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif; color:black"></span>
          </b>
        </p>
      </td>
      <td valign="top" style="border-top:none; border-left:solid gray 1.0pt; border-bottom:none; border-right:solid gray 1.0pt; padding:3.75pt .75pt 3.75pt 3.75pt">
        <p class="MsoNormal" align="center" style="text-align:center">
          <span class="value">
            <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif">Cleared</span>
          </span>
          <b>
            <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif; color:black"></span>
          </b>
        </p>
      </td>
    </tr>
    <tr>
      <td valign="top" style="border-top:none; border-left:solid gray 1.0pt; border-bottom:solid gray 1.0pt; border-right:none; padding:3.75pt .75pt 3.75pt 3.75pt">
        <p class="MsoNormal" align="center" style="text-align:center">
          <span class="value">
            <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif">G-TEL FOR BELL CANADA (BCOW01)</span>
          </span>
          <b>
            <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif; color:black"></span>
          </b>
        </p>
      </td>
      <td valign="top" style="border-top:none; border-left:solid gray 1.0pt; border-bottom:solid gray 1.0pt; border-right:none; padding:3.75pt .75pt 3.75pt 3.75pt">
        <p class="MsoNormal" align="center" style="text-align:center">
          <span class="value">
            <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif">BCOW01</span>
          </span>
          <b>
            <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif; color:black"></span>
          </b>
        </p>
      </td>
      <td valign="top" style="border:solid gray 1.0pt; border-top:none; padding:3.75pt .75pt 3.75pt 3.75pt">
        <p class="MsoNormal" align="center" style="text-align:center">
          <span class="value">
            <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif">Notification sent</span>
          </span>
          <b>
            <span style="font-size:9.0pt; font-family:&quot;Arial&quot;,sans-serif; color:black"></span>
          </b>
        </p>
      </td>
    </tr>
  </tbody>
</table>

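This HTML is the input to the Azure Function action, which calls the Function I created from the converter output, i.e. the C++ code quoted above.

I get an Internal Server Error 500 in the output.

I'm hoping someone can point me in the right direction to solve this. I'm clearly doing something wrong!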

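I hope I'm on the right track here. The code below (although very specific to your use case) will read the HTML table and return a JSON representation of the data.

Just create a new Azure Function in .NET called ConvertHtmlTableToJson and paste this into it.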

#r "Newtonsoft.Json"

using System.Net;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Extensions.Primitives;
using System.Collections.Generic;
using Newtonsoft.Json;
using System.Xml;

public static async Task<IActionResult> Run(HttpRequest req, ILogger log)
{
    var outputTable = new List<List<String>>();

    string requestBody = String.Empty;

    using (StreamReader streamReader = new StreamReader(req.Body))
    {
        requestBody = await streamReader.ReadToEndAsync();
    }

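    // The request body is JSON of the form { "Content": "<Base64-encoded HTML>" }.
    dynamic data = JsonConvert.DeserializeObject(requestBody);
    string xmlString = System.Text.Encoding.UTF8.GetString(Convert.FromBase64String((string)data?.Content));

    // LoadXml requires well-formed XML, so every tag in the HTML must be closed.
    var xmlDocument = new XmlDocument();
    xmlDocument.LoadXml(xmlString);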

    // Get the rows
    var xmlRows = xmlDocument.DocumentElement.SelectNodes("//tr");

    foreach (XmlNode xmlRow in xmlRows)
    {
        // Now get the columns.
        var xmlColumns = xmlRow.SelectNodes(".//td");
        var row = new List<string>();

        foreach (XmlNode xmlColumn in xmlColumns)
        {
            var value = xmlColumn.SelectSingleNode(".//span[@class='value']");

            if (value != null)
                row.Add(value.InnerText);
        }

        if (row.Count > 0)
            outputTable.Add(row);
    }
   
    return new OkObjectResult(outputTable);
}

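It accepts a Base64-encoded string containing the HTML you provided in your example.

A few things to note...

  • The lookup for span elements with the attribute class="value" is hard-coded, which matches the emails you receive and the HTML you provided.
  • It doesn't check for unbalanced columns, i.e. if a row is missing a value, you may end up with one row of two columns and another of three.
  • Headers are ignored, because it only picks up td elements that contain a span element matching the attribute criteria in the first point.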

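As long as your emails stay the same and you pass in the same structure you provided as an example, it will extract the data for you. Beyond that, it will need hardening.

From there, you should be able to use the 2-D array it returns to load the data into your Excel table.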
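
Since you also want to add your own reference number, here is a rough sketch of how the rows could be reshaped into one object per row before they go into Excel. It is JavaScript for illustration only. The column names are copied from the header row of your sample email (the function itself skips headers), while the helper name toRecords, the "Reference" key and the value "REF-0001" are placeholders I made up.

```javascript
// Column names taken from the header row of the sample email.
const COLUMNS = ["Member name", "Station Code", "Initial Status"];

// Hypothetical helper: turn each row array into an object and attach a reference number.
function toRecords(rows, referenceNumber) {
    return rows.map((row) => {
        const record = { "Reference": referenceNumber };
        COLUMNS.forEach((name, i) => {
            // Missing cells become empty strings so every record has the same keys.
            record[name] = row[i] !== undefined ? row[i] : "";
        });
        return record;
    });
}

const records = toRecords(
    [["FESTIVAL HYDRO (LOCAL HYDRO) (FESTH01)", "FESTH01", "Notification sent"]],
    "REF-0001" // placeholder reference number
);
console.log(JSON.stringify(records, null, 2));
```

Each object then lines up with one row of the spreadsheet, with the reference number as an extra column.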

This is how I set it up in Logic Apps...

HTML variable

This is the body of the request...

{
  "Content": "@{base64(variables('HTML'))}"
}
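
To make the encoding step concrete, the round trip looks roughly like this. It is a sketch in JavaScript that assumes Node.js and its Buffer class; it only mirrors what the base64() expression and the UTF-8 decode in the function do, and the short HTML string is a made-up stand-in for your table.

```javascript
// Stand-in for the HTML held in the Logic Apps variable.
const html = '<table><tr><td><span class="value">WT01</span></td></tr></table>';

// Equivalent of the request body: the HTML encoded as Base64 in a "Content" property.
const body = { Content: Buffer.from(html, "utf8").toString("base64") };

// Equivalent of what the function does first: decode the Base64 back into a UTF-8 string.
const decoded = Buffer.from(body.Content, "base64").toString("utf8");

console.log(decoded === html); // true
```

Sending the HTML as Base64 avoids having to escape the quotes and angle brackets inside the JSON body.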

Result

[
  [
    "G-TEL FOR ENBRIDGE GAS (LEGACY UNION GAS) (ENOW01)",
    "ENOW01",
    "Notification sent"
  ],
  [
    "CITY OF STRATFORD (STRATWS01)",
    "STRATWS01",
    "Notification sent"
  ],
  [
    "FESTIVAL HYDRO (LOCAL HYDRO) (FESTH01)",
    "FESTH01",
    "Notification sent"
  ],
  [
    "WIGHTMAN TELECOM - FIBRE - LIMITED (WT01)",
    "WT01",
    "Notification sent"
  ],
  [
    "CLI FOR ROGERS (ROGWAT01)",
    "ROGWAT01",
    "Cleared"
  ],
  [
    "G-TEL FOR BELL CANADA (BCOW01)",
    "BCOW01",
    "Notification sent"
  ]
]