XPath sorting not persistent?

I have the following XML:

<doc>
<ActivityNarrativeInformation>
  <ActivityID>123456789</ActivityID>
  <ActivityNarrativeInformationID>111111111</ActivityNarrativeInformationID>
  <ActivityNarrativeSequenceNumber>1</ActivityNarrativeSequenceNumber>
  <ActivityNarrativeText>She Sells Sea Shells by the Sea Shore and she also</ActivityNarrativeText>
  </ActivityNarrativeInformation>
 <ActivityNarrativeInformation>
  <ActivityID>123456789</ActivityID>
  <ActivityNarrativeInformationID>111111111</ActivityNarrativeInformationID>
  <ActivityNarrativeSequenceNumber>3</ActivityNarrativeSequenceNumber>
  <ActivityNarrativeText>triple shot frappuccino, extra hot, with whipped cream in a tall cup </ActivityNarrativeText>
</ActivityNarrativeInformation>
<ActivityNarrativeInformation>
  <ActivityID>123456789</ActivityID>
  <ActivityNarrativeInformationID>111111111</ActivityNarrativeInformationID>
  <ActivityNarrativeSequenceNumber>2</ActivityNarrativeSequenceNumber>
  <ActivityNarrativeText>likes to take long walks on the beach while she drinks a</ActivityNarrativeText>
  </ActivityNarrativeInformation>
<ActivityNarrativeInformation>
  <ActivityID>987654321</ActivityID>
  <ActivityNarrativeInformationID>222222222</ActivityNarrativeInformationID>
  <ActivityNarrativeSequenceNumber>486</ActivityNarrativeSequenceNumber>
  <ActivityNarrativeText>It was a dark and stormy night; the rain fell in torrents--except at occasional intervals, when
 </ActivityNarrativeText>
</ActivityNarrativeInformation>
<ActivityNarrativeInformation>
  <ActivityID>987654321</ActivityID>
  <ActivityNarrativeInformationID>222222222</ActivityNarrativeInformationID>
  <ActivityNarrativeSequenceNumber>488</ActivityNarrativeSequenceNumber>
  <ActivityNarrativeText>scene lies), rattling along the housetops, and fiercely agitating the scanty flame of the lamps that
</ActivityNarrativeText>
</ActivityNarrativeInformation>
<ActivityNarrativeInformation>
  <ActivityID>987654321</ActivityID>
  <ActivityNarrativeInformationID>222222222</ActivityNarrativeInformationID>
  <ActivityNarrativeSequenceNumber>487</ActivityNarrativeSequenceNumber>
  <ActivityNarrativeText>was checked by a violent gust of wind which swept up the streets (for it is in London that our
</ActivityNarrativeText>
</ActivityNarrativeInformation>
<ActivityNarrativeInformation>
  <ActivityID>987654321</ActivityID>
  <ActivityNarrativeInformationID>222222222</ActivityNarrativeInformationID>
  <ActivityNarrativeSequenceNumber>489</ActivityNarrativeSequenceNumber>
  <ActivityNarrativeText>struggled against the darkness.
</ActivityNarrativeText>
</ActivityNarrativeInformation>
<ActivityNarrativeInformation>
  <ActivityID>55555555</ActivityID>
  <ActivityNarrativeInformationID>77777777</ActivityNarrativeInformationID>
  <ActivityNarrativeSequenceNumber>31921</ActivityNarrativeSequenceNumber>
  <ActivityNarrativeText>Papa Bear was very big and growly. Mamma Bear was middle-sized and pleasant.
</ActivityNarrativeText>
</ActivityNarrativeInformation>
<ActivityNarrativeInformation>
  <ActivityID>55555555</ActivityID>
  <ActivityNarrativeInformationID>77777777</ActivityNarrativeInformationID>
  <ActivityNarrativeSequenceNumber>31923</ActivityNarrativeSequenceNumber>
  <ActivityNarrativeText>Papa bear loved to fix things around the house; Mama bear loved to grow flowers in her garden; and, Baby bear loved playing in the yard. They were very happy. </ActivityNarrativeText>
</ActivityNarrativeInformation>
<ActivityNarrativeInformation>
  <ActivityID>55555555</ActivityID>
  <ActivityNarrativeInformationID>77777777</ActivityNarrativeInformationID>
  <ActivityNarrativeSequenceNumber>31920</ActivityNarrativeSequenceNumber>
  <ActivityNarrativeText>Once upon a time there were three bears, Papa Bear, Mamma Bear and Baby Bear
</ActivityNarrativeText>
</ActivityNarrativeInformation>
<ActivityNarrativeInformation>
  <ActivityID>55555555</ActivityID>
  <ActivityNarrativeInformationID>77777777</ActivityNarrativeInformationID>
  <ActivityNarrativeSequenceNumber>31922</ActivityNarrativeSequenceNumber>
  <ActivityNarrativeText>And Baby Bear, well, he was small, and
sometimes he squeaked! They lived in a pretty little house on the edge of the forest
</ActivityNarrativeText>
</ActivityNarrativeInformation>
</doc>

I need to group the ActivityNarrativeInformation elements by ActivityID and concatenate their ActivityNarrativeText values in ActivityNarrativeSequenceNumber order.

I managed to sort the elements with the following XPath 3.1 query:

sort(//ActivityNarrativeInformation[ActivityID=123456789], (), function($ActivityNarrativeSequenceNumber) {$ActivityNarrativeSequenceNumber})

So the result looks like this:

<ActivityNarrativeInformation>
  <ActivityID>123456789</ActivityID>
  <ActivityNarrativeInformationID>111111111</ActivityNarrativeInformationID>
  <ActivityNarrativeSequenceNumber>1</ActivityNarrativeSequenceNumber>
  <ActivityNarrativeText>She Sells Sea Shells by the Sea Shore and she also</ActivityNarrativeText>
  </ActivityNarrativeInformation>
<ActivityNarrativeInformation>
  <ActivityID>123456789</ActivityID>
  <ActivityNarrativeInformationID>111111111</ActivityNarrativeInformationID>
  <ActivityNarrativeSequenceNumber>2</ActivityNarrativeSequenceNumber>
  <ActivityNarrativeText>likes to take long walks on the beach while she drinks a</ActivityNarrativeText>
  </ActivityNarrativeInformation>
<ActivityNarrativeInformation>
  <ActivityID>123456789</ActivityID>
  <ActivityNarrativeInformationID>111111111</ActivityNarrativeInformationID>
  <ActivityNarrativeSequenceNumber>3</ActivityNarrativeSequenceNumber>
  <ActivityNarrativeText>triple shot frappuccino, extra hot, with whipped cream in a tall cup </ActivityNarrativeText>
</ActivityNarrativeInformation>

However, the problem is that if I try to restrict the result to just the ActivityNarrativeText elements, either by appending /ActivityNarrativeText

sort(//ActivityNarrativeInformation[ActivityID=123456789], (), function($ActivityNarrativeSequenceNumber) {$ActivityNarrativeSequenceNumber})/ActivityNarrativeText

or by sorting the ActivityNarrativeText elements directly

sort(//ActivityNarrativeInformation[ActivityID=123456789]/ActivityNarrativeText, (), function($seq) {$seq/ActivityNarrativeSequenceNumber})

the order is lost:

<ActivityNarrativeText>She Sells Sea Shells by the Sea Shore and she also</ActivityNarrativeText>
<ActivityNarrativeText>triple shot frappuccino, extra hot, with whipped cream in a tall cup </ActivityNarrativeText>
<ActivityNarrativeText>likes to take long walks on the beach while she drinks a</ActivityNarrativeText>

What am I doing wrong?

If you want to extract the coherent sentence for a given ActivityID from the sample XML, then this expression

string-join(sort(//ActivityNarrativeInformation[ActivityID=123456789]/ActivityNarrativeText/concat(normalize-space()," "), (), function($ActivityNarrativeSequenceNumber) {$ActivityNarrativeSequenceNumber}))

should output

She Sells Sea Shells by the Sea Shore and she also likes to take long walks on the beach while she drinks a triple shot frappuccino, extra hot, with whipped cream in a tall cup 

Testing it here: videlibri.de/cgi-bin/xidelcgi
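One caveat about the expression above: its key function receives the already-concatenated strings, so the sort is on the text itself rather than on ActivityNarrativeSequenceNumber. For 123456789 the two orders happen to coincide under a codepoint comparison. The Python sketch below (standard library only; the abbreviated sample text and the helper name `fragments` are my own, for illustration) shows the 55555555 narrative, where they do not.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the 55555555 records from the question (texts shortened).
XML = """<doc>
<ActivityNarrativeInformation>
  <ActivityID>55555555</ActivityID>
  <ActivityNarrativeSequenceNumber>31921</ActivityNarrativeSequenceNumber>
  <ActivityNarrativeText>Papa Bear was very big and growly.</ActivityNarrativeText>
</ActivityNarrativeInformation>
<ActivityNarrativeInformation>
  <ActivityID>55555555</ActivityID>
  <ActivityNarrativeSequenceNumber>31923</ActivityNarrativeSequenceNumber>
  <ActivityNarrativeText>Papa bear loved to fix things.</ActivityNarrativeText>
</ActivityNarrativeInformation>
<ActivityNarrativeInformation>
  <ActivityID>55555555</ActivityID>
  <ActivityNarrativeSequenceNumber>31920</ActivityNarrativeSequenceNumber>
  <ActivityNarrativeText>Once upon a time there were three bears</ActivityNarrativeText>
</ActivityNarrativeInformation>
<ActivityNarrativeInformation>
  <ActivityID>55555555</ActivityID>
  <ActivityNarrativeSequenceNumber>31922</ActivityNarrativeSequenceNumber>
  <ActivityNarrativeText>And Baby Bear, well, he was small</ActivityNarrativeText>
</ActivityNarrativeInformation>
</doc>"""


def fragments(xml_text, activity_id):
    """Return (sequence number, whitespace-normalized text) pairs in document order."""
    root = ET.fromstring(xml_text)
    pairs = []
    for info in root.iter("ActivityNarrativeInformation"):
        if info.findtext("ActivityID") == activity_id:
            seq = int(info.findtext("ActivityNarrativeSequenceNumber"))
            text = " ".join(info.findtext("ActivityNarrativeText").split())
            pairs.append((seq, text))
    return pairs


pairs = fragments(XML, "55555555")
# Key is the text itself (what the XPath expression effectively does).
by_text = [text for _, text in sorted(pairs, key=lambda p: p[1])]
# Key is the numeric sequence number (what the question asks for).
by_seq = [text for _, text in sorted(pairs, key=lambda p: p[0])]

print(by_text[0])  # And Baby Bear, well, he was small
print(by_seq[0])   # Once upon a time there were three bears
```

To sort on the sequence number in XPath, the key function has to be applied to the ActivityNarrativeInformation elements (or reach the number from the text element), before the text is turned into a string.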

If you are also using a particular tool, please add its tag, and perhaps one for your platform as well (Windows or Unix).

I'm not entirely sure this can be done with XPath alone. I believe you're better off using XQuery.

For the narrative of <ActivityID>123456789</ActivityID>, you could do:

$ xidel -s input.xml --xquery '
  normalize-space(
    for $x in //ActivityNarrativeInformation
    where $x/ActivityID = 123456789
    order by $x/ActivityNarrativeSequenceNumber
    return
    $x/ActivityNarrativeText
  )
'

For all narratives, I'd suggest:

$ xidel -s input.xml --xquery '
  for $narrative at $i in //ActivityNarrativeInformation
  group by $id:=$narrative/ActivityID
  count $i
  return (
    $i,
    normalize-space(
      for $seq in $narrative
      order by $seq/ActivityNarrativeSequenceNumber
      return
      $seq/ActivityNarrativeText
    )
  )
'
1
Once upon a time there were three bears, [...]
2
She Sells Sea Shells by the Sea Shore and [...]
3
It was a dark and stormy night; the rain [...]

This first groups by <ActivityID>, and then, in a second for loop, sorts the sentences by <ActivityNarrativeSequenceNumber>.
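As a cross-check of what the group-then-sort query is meant to produce, here is a minimal Python sketch of the same two steps using only the standard library. The function name `narratives` and the trimmed two-group sample are assumptions made for illustration; they are not part of the XQuery answer.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Trimmed sample: the 123456789 records in full, plus two shortened 987654321 records.
XML = """<doc>
<ActivityNarrativeInformation>
  <ActivityID>123456789</ActivityID>
  <ActivityNarrativeSequenceNumber>1</ActivityNarrativeSequenceNumber>
  <ActivityNarrativeText>She Sells Sea Shells by the Sea Shore and she also</ActivityNarrativeText>
</ActivityNarrativeInformation>
<ActivityNarrativeInformation>
  <ActivityID>123456789</ActivityID>
  <ActivityNarrativeSequenceNumber>3</ActivityNarrativeSequenceNumber>
  <ActivityNarrativeText>triple shot frappuccino, extra hot, with whipped cream in a tall cup </ActivityNarrativeText>
</ActivityNarrativeInformation>
<ActivityNarrativeInformation>
  <ActivityID>123456789</ActivityID>
  <ActivityNarrativeSequenceNumber>2</ActivityNarrativeSequenceNumber>
  <ActivityNarrativeText>likes to take long walks on the beach while she drinks a</ActivityNarrativeText>
</ActivityNarrativeInformation>
<ActivityNarrativeInformation>
  <ActivityID>987654321</ActivityID>
  <ActivityNarrativeSequenceNumber>487</ActivityNarrativeSequenceNumber>
  <ActivityNarrativeText>was checked by a violent gust of wind
</ActivityNarrativeText>
</ActivityNarrativeInformation>
<ActivityNarrativeInformation>
  <ActivityID>987654321</ActivityID>
  <ActivityNarrativeSequenceNumber>486</ActivityNarrativeSequenceNumber>
  <ActivityNarrativeText>It was a dark and stormy night;
</ActivityNarrativeText>
</ActivityNarrativeInformation>
</doc>"""


def narratives(xml_text):
    """Map each ActivityID to its text fragments joined in sequence-number order."""
    root = ET.fromstring(xml_text)
    groups = defaultdict(list)
    # Step 1: group the records by ActivityID.
    for info in root.iter("ActivityNarrativeInformation"):
        seq = int(info.findtext("ActivityNarrativeSequenceNumber"))
        # split/join collapses whitespace, similar in spirit to normalize-space().
        text = " ".join(info.findtext("ActivityNarrativeText").split())
        groups[info.findtext("ActivityID")].append((seq, text))
    # Step 2: within each group, sort numerically by sequence number and join.
    return {
        activity_id: " ".join(text for _, text in sorted(frags, key=lambda f: f[0]))
        for activity_id, frags in groups.items()
    }


result = narratives(XML)
for activity_id, sentence in result.items():
    print(activity_id, sentence)
```

The sequence numbers are converted with `int` so that they compare numerically; comparing them as strings would, for example, place 10 before 9.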

Update 2021-07-05: I forgot about XPath's ! operator. With it, a single for loop is enough:

$ xidel -s input.xml --xquery '
  for $narrative at $i in //ActivityNarrativeInformation
  order by $narrative/ActivityNarrativeSequenceNumber
  group by $id:=$narrative/ActivityID
  count $i
  return (
    $i,
    normalize-space($narrative ! ActivityNarrativeText)
  )
'

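When you write /ActivityNarrativeText you lose the order; it returns the <ActivityNarrativeText> elements in the same order as in the input file.

Applying /something to nodes does not just mean mapping each node to its children.

It means:

  • map

  • reorder all resulting nodes into document order

  • remove duplicates

You can use !ActivityNarrativeText instead, which only maps and keeps the sorted order.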
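The difference between the two operators can be modelled in a few lines of Python. This is only an illustrative model of the semantics described above, not how an XPath engine is implemented; `simple_map`, `path_step` and the shortened sample are hypothetical names and data of my own.

```python
import xml.etree.ElementTree as ET

# Shortened sample: three records whose document order is 1, 3, 2.
XML = """<doc>
<ActivityNarrativeInformation>
  <ActivityNarrativeSequenceNumber>1</ActivityNarrativeSequenceNumber>
  <ActivityNarrativeText>first</ActivityNarrativeText>
</ActivityNarrativeInformation>
<ActivityNarrativeInformation>
  <ActivityNarrativeSequenceNumber>3</ActivityNarrativeSequenceNumber>
  <ActivityNarrativeText>third</ActivityNarrativeText>
</ActivityNarrativeInformation>
<ActivityNarrativeInformation>
  <ActivityNarrativeSequenceNumber>2</ActivityNarrativeSequenceNumber>
  <ActivityNarrativeText>second</ActivityNarrativeText>
</ActivityNarrativeInformation>
</doc>"""

root = ET.fromstring(XML)
# Position of every element in document order, used by the model of `/`.
doc_pos = {element: i for i, element in enumerate(root.iter())}


def simple_map(nodes, child):
    """Model of `!`: map each node to its children, keeping the input order."""
    return [c for node in nodes for c in node.findall(child)]


def path_step(nodes, child):
    """Model of `/`: map, remove duplicates, then reorder into document order."""
    mapped = simple_map(nodes, child)
    unique = list(dict.fromkeys(mapped))  # elements hash by identity
    return sorted(unique, key=lambda el: doc_pos[el])


# Equivalent of sort(..., (), function($n) { number($n/ActivityNarrativeSequenceNumber) }).
sorted_infos = sorted(
    root.findall("ActivityNarrativeInformation"),
    key=lambda n: int(n.findtext("ActivityNarrativeSequenceNumber")),
)

bang = [t.text for t in simple_map(sorted_infos, "ActivityNarrativeText")]
slash = [t.text for t in path_step(sorted_infos, "ActivityNarrativeText")]
print(bang)   # ['first', 'second', 'third']
print(slash)  # ['first', 'third', 'second']
```

The second result is back in file order because the final reordering step discards whatever order the sort had established.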

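In addition to the correct answer of using ! rather than / after sorting, one of your attempts would actually have worked if the sort key function had selected the right element as the sort key: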

sort(//ActivityNarrativeInformation[ActivityID=123456789]/ActivityNarrativeText, (), function($text) {$text/../ActivityNarrativeSequenceNumber})