XQuery: how to make a similarity matrix?
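Let's assume we have n records. I want to compute the similarity between each record and all the other records, and build a similarity matrix from the results. I am new to XQuery, but I am doing my best. I have attached a screenshot of what the similarity between one pair of records should look like.
It is a CSV string. I used the following for loop to produce this sample: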
for $item1 at $index in /rec:Record
let $records:= /rec:Record
for $item2 in $records[$index + 1]
(: here I call the similarity functions :)
return
(: csv output :)
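I need to edit the for loop so that it produces a similarity matrix between every pair of records in the dataset. How can I do that?
Note: the similarity functions are ready; my question is NOT about computing the similarity itself.
You could do something like this. I am not sure what your CSV looks like or how your parser loads it. I have also mocked up the kind of function you say you already have.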
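(: Mock of the similarity function you already have; it ignores its arguments. :)
declare function local:somefn($listA as item()*, $listB as item()*) as xs:string {
  "6,7,10,3"
};

let $data :=
  <csv>
    <row>1,1,1</row>
    <row>2,2,2</row>
    <row>3,3,3</row>
    <row>4,4,4</row>
  </csv>
for $row1 at $pos in $data/row
(: pair each row only with the rows after it, so every unordered pair appears once :)
for $row2 in $data/row[position() > $pos]
let $x := local:somefn($row1, $row2)
return $x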
In BaseX this produces:
6,7,10,3
6,7,10,3
6,7,10,3
6,7,10,3
6,7,10,3
6,7,10,3
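The loop above yields one result per unordered pair (6 results for 4 rows). If a full n x n matrix is wanted as CSV text, a minimal sketch could look like the following. Note that local:similarity here is a hypothetical stand-in (it counts the comma-separated fields two rows share), and the three sample rows are made up for illustration; substitute your own function and data.

```xquery
(: Hypothetical stand-in for the real similarity function:
   counts the distinct comma-separated fields of $a that also occur in $b. :)
declare function local:similarity($a as element(row), $b as element(row)) as xs:integer {
  count(distinct-values(tokenize($a, ","))[. = tokenize($b, ",")])
};

let $data :=
  <csv>
    <row>1,2,3</row>
    <row>2,3,4</row>
    <row>3,4,5</row>
  </csv>
let $rows := $data/row
(: outer join: one CSV line per row; inner join: one cell per column :)
return string-join(
  for $r1 in $rows
  return string-join(
    for $r2 in $rows
    return string(local:similarity($r1, $r2)),
    ","),
  "&#10;")
```

With this sample data the result is three lines: 3,2,1 then 2,3,2 then 1,2,3. The sketch calls the function for both (i, j) and (j, i); if the real similarity function is expensive and symmetric, computing only the pairs above the diagonal, as in the loop above, and mirroring them would halve the work.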
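Edit: added CSV output as a text node at the end.
Think about the power of maps in MarkLogic.
A sample of representing a matrix in MarkLogic is below. I also hooked in two things: a function that acts as a placeholder for your formula (it is also passed your original sequence, in case you need it for analysis), and a small function that shows how to access the map of maps.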
xquery version "1.0-ml";

(: Serialize the map of maps as CSV text: one line per row, cells joined by commas. :)
declare function local:csv($matrix){
  let $nl := "&#10;"
  return text{
    for $x in map:keys($matrix)
    let $row := map:get($matrix, $x)
    order by xs:int($x)
    return fn:string-join(
      for $y in map:keys($row)
      order by xs:int($y)
      return xs:string(map:get($row, $y)),
      ",") || $nl
  }
};

(: Placeholder for your formula; $seq is the original sequence, in case you need it. :)
declare function local:my-formula($x, $y, $seq){
  let $foo := "do something"
  return "your-formula for " || xs:string($x) || " and " || xs:string($y)
};

(: Render the map of maps as XML so its structure is easy to inspect. :)
declare function local:pretty($matrix){
  <matrix>
  {
    for $x in map:keys($matrix)
    order by xs:int($x)
    return <row>
    {
      let $row := map:get($matrix, $x)
      for $y in map:keys($row)
      order by xs:int($y)
      return <cell x="{$x}" y="{$y}">{map:get($row, $y)}</cell>
    }
    </row>
  }
  </matrix>
};

let $matrix := map:map()
(: the duplicate 5 collapses into a single map key, giving 8 rows and columns :)
let $numbers := "1,2,3,4,5,5,6,7,8"
let $seq := fn:tokenize($numbers, ",")
let $_ := for $x in $seq
  let $map := map:map()
  let $_ := for $y in $seq
    return map:put($map, $y, local:my-formula($x, $y, $seq))
  return map:put($matrix, $x, $map)
return local:pretty($matrix)
You can simply dump the map of maps ($matrix) directly. However, the local:pretty function returns a format that makes it easy to see how the map is constructed:
<matrix>
<row>
<cell x="1" y="1">your-formula for 1 and 1</cell>
<cell x="1" y="2">your-formula for 1 and 2</cell>
<cell x="1" y="3">your-formula for 1 and 3</cell>
<cell x="1" y="4">your-formula for 1 and 4</cell>
<cell x="1" y="5">your-formula for 1 and 5</cell>
<cell x="1" y="6">your-formula for 1 and 6</cell>
<cell x="1" y="7">your-formula for 1 and 7</cell>
<cell x="1" y="8">your-formula for 1 and 8</cell>
</row>
<row>
<cell x="2" y="1">your-formula for 2 and 1</cell>
<cell x="2" y="2">your-formula for 2 and 2</cell>
<cell x="2" y="3">your-formula for 2 and 3</cell>
<cell x="2" y="4">your-formula for 2 and 4</cell>
<cell x="2" y="5">your-formula for 2 and 5</cell>
<cell x="2" y="6">your-formula for 2 and 6</cell>
<cell x="2" y="7">your-formula for 2 and 7</cell>
<cell x="2" y="8">your-formula for 2 and 8</cell>
</row>
<row>
<cell x="3" y="1">your-formula for 3 and 1</cell>
<cell x="3" y="2">your-formula for 3 and 2</cell>
<cell x="3" y="3">your-formula for 3 and 3</cell>
<cell x="3" y="4">your-formula for 3 and 4</cell>
<cell x="3" y="5">your-formula for 3 and 5</cell>
<cell x="3" y="6">your-formula for 3 and 6</cell>
<cell x="3" y="7">your-formula for 3 and 7</cell>
<cell x="3" y="8">your-formula for 3 and 8</cell>
</row>
<row>
<cell x="4" y="1">your-formula for 4 and 1</cell>
<cell x="4" y="2">your-formula for 4 and 2</cell>
<cell x="4" y="3">your-formula for 4 and 3</cell>
<cell x="4" y="4">your-formula for 4 and 4</cell>
<cell x="4" y="5">your-formula for 4 and 5</cell>
<cell x="4" y="6">your-formula for 4 and 6</cell>
<cell x="4" y="7">your-formula for 4 and 7</cell>
<cell x="4" y="8">your-formula for 4 and 8</cell>
</row>
<row>
<cell x="5" y="1">your-formula for 5 and 1</cell>
<cell x="5" y="2">your-formula for 5 and 2</cell>
<cell x="5" y="3">your-formula for 5 and 3</cell>
<cell x="5" y="4">your-formula for 5 and 4</cell>
<cell x="5" y="5">your-formula for 5 and 5</cell>
<cell x="5" y="6">your-formula for 5 and 6</cell>
<cell x="5" y="7">your-formula for 5 and 7</cell>
<cell x="5" y="8">your-formula for 5 and 8</cell>
</row>
<row>
<cell x="6" y="1">your-formula for 6 and 1</cell>
<cell x="6" y="2">your-formula for 6 and 2</cell>
<cell x="6" y="3">your-formula for 6 and 3</cell>
<cell x="6" y="4">your-formula for 6 and 4</cell>
<cell x="6" y="5">your-formula for 6 and 5</cell>
<cell x="6" y="6">your-formula for 6 and 6</cell>
<cell x="6" y="7">your-formula for 6 and 7</cell>
<cell x="6" y="8">your-formula for 6 and 8</cell>
</row>
<row>
<cell x="7" y="1">your-formula for 7 and 1</cell>
<cell x="7" y="2">your-formula for 7 and 2</cell>
<cell x="7" y="3">your-formula for 7 and 3</cell>
<cell x="7" y="4">your-formula for 7 and 4</cell>
<cell x="7" y="5">your-formula for 7 and 5</cell>
<cell x="7" y="6">your-formula for 7 and 6</cell>
<cell x="7" y="7">your-formula for 7 and 7</cell>
<cell x="7" y="8">your-formula for 7 and 8</cell>
</row>
<row>
<cell x="8" y="1">your-formula for 8 and 1</cell>
<cell x="8" y="2">your-formula for 8 and 2</cell>
<cell x="8" y="3">your-formula for 8 and 3</cell>
<cell x="8" y="4">your-formula for 8 and 4</cell>
<cell x="8" y="5">your-formula for 8 and 5</cell>
<cell x="8" y="6">your-formula for 8 and 6</cell>
<cell x="8" y="7">your-formula for 8 and 7</cell>
<cell x="8" y="8">your-formula for 8 and 8</cell>
</row>
</matrix>
For CSV, there is a sample function called local:csv, which creates a text node with the following result:
your-formula for 1 and 1,your-formula for 1 and 2,your-formula for 1 and 3,your-formula for 1 and 4,your-formula for 1 and 5,your-formula for 1 and 6,your-formula for 1 and 7,your-formula for 1 and 8
your-formula for 2 and 1,your-formula for 2 and 2,your-formula for 2 and 3,your-formula for 2 and 4,your-formula for 2 and 5,your-formula for 2 and 6,your-formula for 2 and 7,your-formula for 2 and 8
your-formula for 3 and 1,your-formula for 3 and 2,your-formula for 3 and 3,your-formula for 3 and 4,your-formula for 3 and 5,your-formula for 3 and 6,your-formula for 3 and 7,your-formula for 3 and 8
your-formula for 4 and 1,your-formula for 4 and 2,your-formula for 4 and 3,your-formula for 4 and 4,your-formula for 4 and 5,your-formula for 4 and 6,your-formula for 4 and 7,your-formula for 4 and 8
your-formula for 5 and 1,your-formula for 5 and 2,your-formula for 5 and 3,your-formula for 5 and 4,your-formula for 5 and 5,your-formula for 5 and 6,your-formula for 5 and 7,your-formula for 5 and 8
your-formula for 6 and 1,your-formula for 6 and 2,your-formula for 6 and 3,your-formula for 6 and 4,your-formula for 6 and 5,your-formula for 6 and 6,your-formula for 6 and 7,your-formula for 6 and 8
your-formula for 7 and 1,your-formula for 7 and 2,your-formula for 7 and 3,your-formula for 7 and 4,your-formula for 7 and 5,your-formula for 7 and 6,your-formula for 7 and 7,your-formula for 7 and 8
your-formula for 8 and 1,your-formula for 8 and 2,your-formula for 8 and 3,your-formula for 8 and 4,your-formula for 8 and 5,your-formula for 8 and 6,your-formula for 8 and 7,your-formula for 8 and 8
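The map:map() and map:put() calls above are MarkLogic's mutable maps. For a processor that supports standard XQuery 3.1 maps instead (BaseX or Saxon, for example), a rough equivalent of the map-of-maps idea might look like the sketch below. Standard maps are immutable, so the matrix is built in one expression with map:merge and map:entry rather than by repeated puts. The two-argument local:my-formula and the three-item sequence are illustrative assumptions, not part of the original code.

```xquery
xquery version "3.1";

(: Hypothetical placeholder for the real formula. :)
declare function local:my-formula($x, $y) as xs:string {
  "f(" || $x || "," || $y || ")"
};

let $seq := (1, 2, 3)
(: outer map: row key -> inner map; inner map: column key -> cell value :)
let $matrix := map:merge(
  for $x in $seq
  return map:entry($x, map:merge(
    for $y in $seq
    return map:entry($y, local:my-formula($x, $y))
  ))
)
(: look up the cell at row 2, column 3 :)
return $matrix(2)(3)
```

This returns the string f(2,3). A map can be called like a function with a key, so $matrix(2) is the inner map for row 2 and $matrix(2)(3) is a single cell.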