Adding rows from one dataset to another based on certain conditions when the column names don't all match in kdb+

Question

I have two datasets, named data1 and data2. data1 looks like:

id1    id2     exc1  exc2  exc3  exc4
"aa2"  "12ac"  45    54    53    65
"bb"   "23"    23    33    23    12

data2 looks like:

kid1   id2     sf1  sf2  sf3  sf4  exc1  exc2
"aa2"  "ads2"  55   6    55   66   45    54

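The kid1 and id1 columns hold the same entries; the two datasets just have different numbers of rows. Some rows are missing from data1, and I have to pick them up from data2. To do that, I have to combine id1 and id2 in data1, and kid1 and id2 in data2, into a new column called link. In Excel I built this as "aa2 | 12ac", and did the same for data2. I then have to find which link entries exist in data2 but not in data1, and add those rows to data1.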

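The condition for adding new rows from data2 to data1 is: where a column name is the same in data1 and data2, use the data from data2; for column names that exist in data1 but not in data2, copy the data from the row where kid1 = id1.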

I have done all of this in Excel and would like to replicate it in kdb to speed up the process. It would be great if someone could help. Any leads are appreciated. Thanks.

Answer 1

Here is one way to get what you need. First, set up the tables:

t:([]id1:("aa2";"bb");id2:("12ac";"23");exc1:45 23; exc2:54 33;exc3:53 23;exc4:65 12)
q:([]kid:enlist "aa2";id2:enlist "ads2";sf1:(),55;sf2:(),6;sf3:(),55;sf4:(),66;exc1:(),45;exc2:(),54)

Then use the sv keyword to join the id1/id2 and kid/id2 columns into a link column, as you describe above, and key the tables on this new column:

rt:`link xkey update link:`$"|"sv/:flip(id1;id2),id1:`$id1,id2:`$id2 from t
rq:`link xkey update link:`$"|"sv/:flip(kid;id2),kid:`$kid,id2:`$id2 from q
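
As a small illustration of what the sv step does on its own, here is a sketch using the same sample ids as bare lists (the names pairs and link are just for this example):

```q
/ the sample ids as plain lists of strings
id1:("aa2";"bb")
id2:("12ac";"23")
/ flip pairs up the i-th id1 with the i-th id2
pairs:flip(id1;id2)
/ "|" sv joins each pair into one string; `$ casts the strings to symbols
link:`$"|" sv/:pairs
```

link then holds one symbol per row; the first is aa2|12ac.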

Then join the tables with uj. Where a key matches, the values from the second table overwrite those in the first; otherwise the old values are kept:

q)rt uj rq
link     | id1   id2    exc1 exc2 exc3 exc4 kid   sf1 sf2 sf3 sf4
---------| ------------------------------------------------------
aa2|12ac | "aa2" "12ac" 45   54   53   65   ""
bb|23    | "bb"  "23"   23   33   23   12   ""
aa2|ads2 | ""    "ads2" 45   54             "aa2" 55  6   55  66
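
If you only want the rows of the second table whose link is not already in the first, as the question asks, one option is to filter before the join. A minimal sketch, using small hypothetical stand-ins for rt and rq (the values and the name missing are made up for illustration):

```q
/ hypothetical stand-ins for the two tables keyed on link
rt:([link:`$("aa2|12ac";"bb|23")] exc1:45 23; exc2:54 33)
rq:([link:`$("aa2|12ac";"aa2|ads2")] exc1:99 45; sf1:7 55)
/ links present in rq but absent from rt
missing:(exec link from rq) except exec link from rt
/ join on only those rows, so existing rt rows are left untouched
rt uj select from rq where link in missing
```

Without the filter, the aa2|12ac row in rt would have its exc1 overwritten by the value from rq.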

Hope that helps.

Answer 2

Are you looking for something like this?

q)(2!data1) uj `id1`id2 xkey update id1:kid1 from data2
id1   id2   | exc1 exc2 exc3 exc4 kid1  sf1 sf2 sf3 sf4
------------| -----------------------------------------
"aa2" "12ac"| 45   54   53   65   ""
"bb"  "23"  | 23   33   23   12   ""
"aa2" "ads2"| 45   54             "aa2" 55  6   55  66

uj on two keyed tables returns a table with the union of the keys.
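
To see the union-of-keys behaviour in isolation, here is a toy sketch with made-up tables a and b (not from the question):

```q
/ two small keyed tables sharing the key y
a:([k:`x`y] v:1 2; w:10 20)
b:([k:`y`z] v:5 6)
/ result has keys x, y and z; for y, v comes from b while w is kept from a;
/ z is a new row, so its w is null
a uj b
```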

To go on and update exc2 and exc3 from data1 where they are blank, using only id1 as the key, you could try something like this:

q)t:(2!data1) uj `id1`id2 xkey update id1:kid1 from data2   //same as before
q)(t lj 1!select id1,exc2,exc3 from data1)^t                //lj these fields on, use fill to only update null fields
id1   id2   | exc1 exc2 exc3 exc4 kid1  sf1 sf2 sf3 sf4
------------| -----------------------------------------
"aa2" "12ac"| 45   54   53   65   ""
"bb"  "23"  | 23   33   23   12   ""
"aa2" "ads2"| 45   54   53        "aa2" 55  6   55  66
q)cols[data1]#0!(t lj 1!select id1,exc2,exc3 from data1)^t  //use Ryan's suggestion for getting the cols you desire
id1   id2    exc1 exc2 exc3 exc4
--------------------------------
"aa2" "12ac" 45   54   53   65
"bb"  "23"   23   33   23   12
"aa2" "ads2" 45   54   53
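
The fill operator ^ used above keeps the right-hand value wherever it is non-null and falls back to the left-hand value otherwise. A tiny sketch on plain lists (the names fallback and primary are illustrative only):

```q
fallback:10 20 30
primary:0N 5 0N
/ nulls in primary are filled from fallback, giving 10 5 30
fallback^primary
```

This is why the lj result sits on the left of ^ and t on the right: values already in t win, and only its nulls are filled.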

kdb

q-lang