How can I use two variables in a function instead of one to scrape sub-URLs in Power BI and Power Query?

I have this table:

It was created with this code:

let
    Source = Xml.Tables(Web.Contents("https://www.edmunds.com/sitemap_web54-mmy-cost-to-own.xml")),
    Table0 = Source{0}[Table],
    #"Kept First Rows" = Table.FirstN(Table0,10),
    #"Added Custom" = Table.AddColumn(#"Kept First Rows", "Custom", each Web.BrowserContents([loc])),
    #"Added Custom3" = Table.AddColumn(#"Added Custom", "Custom.3", each try Text.Range([Custom],Text.PositionOf([Custom],"<optgroup"),Text.PositionOf([Custom],"</optgroup>")-Text.PositionOf([Custom],"<optgroup")+11) otherwise "<optgroup/>"),
    #"Parsed XML" = Table.TransformColumns(#"Added Custom3",{{"Custom.3", Xml.Tables}}),
    #"Expanded Custom.3" = Table.ExpandTableColumn(#"Parsed XML", "Custom.3", {"option"}, {"option"}),
    #"Expanded option" = Table.ExpandTableColumn(#"Expanded Custom.3", "option", {"Element:Text", "Attribute:value"}, {"Model", "Style"})
in
    #"Expanded option"

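If you look at the loc column, you'll see that multiple models share the same link. Ultimately I want the cost-to-own data for each model, so I created this code in a new query and bound it to the custom column in the query above.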

(PageMake as text)=>
let
    Source = Web.BrowserContents(PageMake),
    #"Extracted Table From Html" = Html.Table(Source, {{"Column1", "SECTION:nth-child(2) > DIV.table-responsive > TABLE.costs-table.text-gray-darker.table.table-borderless > * > TR > :nth-child(1)"}, {"Column2", "SECTION:nth-child(2) > DIV.table-responsive > TABLE.costs-table.text-gray-darker.table.table-borderless > * > TR > :nth-child(2)"}, {"Column3", "SECTION:nth-child(2) > DIV.table-responsive > TABLE.costs-table.text-gray-darker.table.table-borderless > * > TR > :nth-child(3)"}, {"Column4", "SECTION:nth-child(2) > DIV.table-responsive > TABLE.costs-table.text-gray-darker.table.table-borderless > * > TR > :nth-child(4)"}, {"Column5", "SECTION:nth-child(2) > DIV.table-responsive > TABLE.costs-table.text-gray-darker.table.table-borderless > * > TR > :nth-child(5)"}, {"Column6", "SECTION:nth-child(2) > DIV.table-responsive > TABLE.costs-table.text-gray-darker.table.table-borderless > * > TR > :nth-child(6)"}, {"Column7", "SECTION:nth-child(2) > DIV.table-responsive > TABLE.costs-table.text-gray-darker.table.table-borderless > * > TR > :nth-child(7)"}}, [RowSelector="SECTION:nth-child(2) > DIV.table-responsive > TABLE.costs-table.text-gray-darker.table.table-borderless > * > TR"]),
    #"Promoted Headers" = Table.PromoteHeaders(#"Extracted Table From Html", [PromoteAllScalars=true]),
    #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"", type text}, {"Year 1", Currency.Type}, {"Year 2", Currency.Type}, {"Year 3", Currency.Type}, {"Year 4", Currency.Type}, {"Year 5", Currency.Type}, {"Total", Currency.Type}})
in
    #"Changed Type"

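The problem is that the link has no connection to the different models, so when it pulls the Cost to Own table, it just pulls the first one for every model associated with that link.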

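The problem you're facing is that there isn't a distinct table for each vehicle. There is one table that is shared by several. If you do a new web query (Query -> Web) and enter a URL such as https://www.edmunds.com/lexus/ls-460/2016/cost-to-own/?style=401580678, you will see a collection of tables. (I'm pretty sure you already know this from your info above.) But if you look in the Suggested Tables, you'll see that each one contains information for multiple 2016 models. I think you'll need to pull the whole table that has the info you want (I think you would need Table 1) and then parse the table after you get it. You can get to the table using any one of the vehicles that appear in it. (From a quick glance, it seems to me that each table covers all the vehicles in the same year group.)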

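Call your function from a new column in your first query using = Table.AddColumn(#"Expanded option", "Custom.1", each fnGetEdmunds(Text.From([loc])&"?style="&Text.From([Style]))). You will get some errors, because some rows have no style with which to find a cost-to-own page, and some pages have no cost-to-own table. So you'll have to handle those errors.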

Here's the M code:

//The base query:
let
    Source = Xml.Tables(Web.Contents("https://www.edmunds.com/sitemap_web54-mmy-cost-to-own.xml")),
    Table0 = Source{0}[Table],
    #"Kept First Rows" = Table.FirstN(Table0,10),
    #"Added Custom" = Table.AddColumn(#"Kept First Rows", "Custom", each Web.BrowserContents([loc])),
    #"Added Custom3" = Table.AddColumn(#"Added Custom", "Custom.3", each try Text.Range([Custom],Text.PositionOf([Custom],"<optgroup"),Text.PositionOf([Custom],"</optgroup>")-Text.PositionOf([Custom],"<optgroup")+11) otherwise "<optgroup/>"),
    #"Parsed XML" = Table.TransformColumns(#"Added Custom3",{{"Custom.3", Xml.Tables}}),
    #"Expanded Custom.3" = Table.ExpandTableColumn(#"Parsed XML", "Custom.3", {"option"}, {"option"}),
    #"Expanded option" = Table.ExpandTableColumn(#"Expanded Custom.3", "option", {"Element:Text", "Attribute:value"}, {"Model", "Style"}),
    #"Invoked Custom Function" = Table.AddColumn(#"Expanded option", "Custom.1", each fnGetEdmunds(Text.From([loc])&"?style="&Text.From([Style])))
in
    #"Invoked Custom Function"

//The function named fnGetEdmunds
(PageMake as text)=>
let
    Source = Web.BrowserContents(PageMake),
    #"Extracted Table From Html" = Html.Table(Source, {{"Column1", ".col-fixed"}, {"Column2", ".col-padding-left"}, {"Column3", ".col-padding-left + *"}, {"Column4", ".d-none TD:nth-child(4)"}, {"Column5", ".d-none TD:nth-child(5)"}, {"Column6", ".d-none TD:nth-child(6)"}, {"Column7", ".d-none .font-weight-bold"}, {"Column8", ".d-none:nth-child(3) TD:nth-child(4)"}, {"Column9", ".d-none:nth-child(3) TD:nth-child(5)"}, {"Column10", ".d-none:nth-child(3) TD:nth-child(6)"}, {"Column11", ".d-none:nth-child(4) TD:nth-child(4)"}, {"Column12", ".d-none:nth-child(4) TD:nth-child(5)"}, {"Column13", ".d-none:nth-child(4) TD:nth-child(6)"}, {"Column14", ".d-none:nth-child(5) TD:nth-child(4)"}, {"Column15", ".d-none:nth-child(5) TD:nth-child(5)"}, {"Column16", ".d-none:nth-child(5) TD:nth-child(6)"}, {"Column17", ".d-inline"}, {"Column18", ".p-0.heading-4"}, {"Column19", ".mb-1 SPAN"}, {"Column20", "CAPTION"}, {"Column21", "TH:nth-child(4)"}, {"Column22", "TH:nth-child(5)"}, {"Column23", "TH:nth-child(6)"}, {"Column24", "TH:nth-child(7)"}}, [RowSelector=".col-fixed"]),
    #"Promoted Headers" = Table.PromoteHeaders(#"Extracted Table From Html", [PromoteAllScalars=true]),
    #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"", type text}, {"Year 1", type text}, {"Year 2", type text}, {"Column4", Currency.Type}, {"Column5", Currency.Type}, {"Column6", Currency.Type}, {"Column7", Currency.Type}, {"Column8", Currency.Type}, {"Column9", Currency.Type}, {"Column10", Currency.Type}, {"Column11", Currency.Type}, {"Column12", Currency.Type}, {"Column13", Currency.Type}, {"Column14", Currency.Type}, {"Column15", Currency.Type}, {"Column16", Currency.Type}, {"Column17", type text}, {"Column18", type text}, {"Column19", type text}, {"Column20", type text}, {"Year 3", type text}, {"Year 4", type text}, {"Year 5", type text}, {"Total", type text}})
in
    #"Changed Type"
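
If you would rather pass the link and the style as two separate arguments, as the question title asks, one option is a thin wrapper around the function above. This is only a sketch: the name fnGetEdmundsByStyle and the fallback to the bare page URL when the style is missing are my assumptions, not something from the original queries.

```powerquery
// Hypothetical wrapper named fnGetEdmundsByStyle (the name is an assumption).
// It takes the page URL and the style id as two parameters and delegates
// to the existing one-parameter fnGetEdmunds.
(PageUrl as text, StyleId as any) =>
let
    // Assumption: rows without a style fall back to the bare page URL.
    FullUrl =
        if StyleId = null or Text.From(StyleId) = ""
        then PageUrl
        else PageUrl & "?style=" & Text.From(StyleId),
    Result = fnGetEdmunds(FullUrl)
in
    Result
```

It would then be called from the base query with each fnGetEdmundsByStyle([loc], [Style]) in place of the concatenated URL.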

Like I said above: you'll still need to parse the table to pull out the info for a specific vehicle, and handle the errors.
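
For the error handling, a minimal sketch using try ... otherwise is shown here. It runs on its own against made-up rows: fnStub, the example.com URLs, and the Costs column name are all stand-ins I invented for illustration, with fnStub playing the role of fnGetEdmunds and failing when the style is blank.

```powerquery
let
    // Stand-in for fnGetEdmunds so the sketch is self-contained: it raises
    // an error when the URL has no style value after "?style=".
    fnStub = (url as text) as table =>
        if Text.EndsWith(url, "?style=")
        then error "No cost-to-own table"
        else #table({"Url"}, {{url}}),
    // Made-up sample rows; the second one has a blank style.
    Sample = #table({"loc", "Style"}, {
        {"https://example.com/a/", "101"},
        {"https://example.com/b/", ""}
    }),
    // try ... otherwise turns a failed lookup into null instead of an error cell.
    WithCosts = Table.AddColumn(Sample, "Costs",
        each try fnStub([loc] & "?style=" & [Style]) otherwise null),
    // Keep only the rows where a table actually came back.
    Succeeded = Table.SelectRows(WithCosts, each [Costs] <> null)
in
    Succeeded
```

In the real base query the same try ... otherwise null wrapping would go around the fnGetEdmunds call. Whether you drop the null rows or keep them for inspection depends on what you want to do with the models that have no cost-to-own data.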