Oracle: Self join XMLTable output
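I'm trying to parse some XML into a table in a WITH clause and then, in the main query, join that table to itself. However, even with a minimal test case, I don't get any records back.
Here is the sample XML: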
<xml>
<entry timestamp="20170330100429" effective="20170329">
<field name="Name">
<ov> <![CDATA[Fran]]> </ov>
<nv> <![CDATA[Frank]]> </nv>
</field>
<field name="Zip">
<ov> <![CDATA[13583]]> </ov>
<nv> <![CDATA[13853]]> </nv>
</field>
</entry>
<entry timestamp="20170401094783" effective="20170331">
<field name="MI">
<ov> <![CDATA[J]]> </ov>
<nv> <![CDATA[A]]> </nv>
</field>
<field name="Suffix">
<ov> <![CDATA[Jr]]> </ov>
<nv> <![CDATA[III]]> </nv>
</field>
</entry>
</xml>
Here is the stripped-down SQL that flattens the XML into a table in a WITH clause:
with myxml as
(
select 1 xml_id,
'<xml> <entry timestamp="20170330100429" effective="20170329"> <field name="Name"> <ov><![CDATA[Fran]]></ov> <nv><![CDATA[Frank]]></nv> </field> <field name="Zip"> <ov><![CDATA[13583]]></ov> <nv><![CDATA[13853]]></nv> </field> </entry> <entry timestamp="20170401094783" effective="20170331"> <field name="MI"> <ov><![CDATA[J]]></ov> <nv><![CDATA[A]]></nv> </field> <field name="Suffix"> <ov><![CDATA[Jr]]></ov> <nv><![CDATA[III]]></nv> </field> </entry> </xml>' x from dual
),
HRH as
(
SELECT au.xml_id,
ts.tran_ts as chg_timestr,
to_date(substr(ts.tran_ts, 0, 8), 'yyyymmdd') chg_date,
cast(substr(ts.tran_ts, 9, 6) as int) chg_time,
f.fieldname,
f.ov old_value,
f.nv new_value
FROM myxml au,
xmltable('/xml'
PASSING XMLTYPE(au.x)
COLUMNS entrynode XMLTYPE PATH 'entry'
) as x1,
xmltable('/entry'
PASSING x1.entrynode
COLUMNS tran_ts VARCHAR(16) PATH '@timestamp',
fieldnode XMLTYPE PATH 'field'
) as ts,
xmltable('/field'
PASSING ts.fieldnode
COLUMNS fieldname VARCHAR(250) PATH '@name',
nv VARCHAR(250) PATH 'nv',
ov VARCHAR(250) PATH 'ov'
) as f
)
select * from HRH
This query gives the following output:
XML_ID CHG_TIMESTR    CHG_DATE   CHG_TIME FIELDNAME OLD_VALUE NEW_VALUE
====== ============== ========== ======== ========= ========= =========
     1 20170330100429 2017-03-30   100429 Name      Fran      Frank
     1 20170330100429 2017-03-30   100429 Zip       13583     13853
     1 20170401094783 2017-04-01    94783 MI        J         A
     1 20170401094783 2017-04-01    94783 Suffix    Jr        III
====== ============== ========== ======== ========= ========= =========
If instead I use either of the following:
select * from HRH h1, HRH h2
select * from HRH h1 join HRH h2 on 1=1
I get an error message:
Line Pos Text
==== === ================================================
17 23 ORA-19032: Expected XML tag , got no content
ORA-06512: at "SYS.XMLTYPE", line 310
ORA-06512: at line 1
==== === ================================================
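How can I get this to work? My overall goal is to get the latest update for a given xml_id-field combination in my data set.
I'm not sure why you're seeing this behaviour; materializing the first CTE stops the error, but then no data is found.
Not directly related, but you can simplify the query (at least for the data you've shown) to a single XMLTable call: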
SELECT au.xml_id,
x.tran_ts as chg_timestr,
to_date(substr(x.tran_ts, 0, 8), 'yyyymmdd') chg_date,
cast(substr(x.tran_ts, 9, 6) as int) chg_time,
x.fieldname,
x.ov old_value,
x.nv new_value
FROM myxml au
CROSS JOIN xmltable('/xml/entry/field'
PASSING XMLTYPE(au.x)
COLUMNS tran_ts VARCHAR2(16) PATH './../@timestamp',
fieldname VARCHAR2(250) PATH '@name',
nv VARCHAR2(250) PATH 'nv',
ov VARCHAR2(250) PATH 'ov'
) x;
    XML_ID CHG_TIMESTR      CHG_DATE     CHG_TIME FIELDNAME  OLD_VALUE  NEW_VALUE
---------- ---------------- ---------- ---------- ---------- ---------- ----------
         1 20170330100429   2017-03-30     100429 Name       Fran       Frank
         1 20170330100429   2017-03-30     100429 Zip        13583      13853
         1 20170401094783   2017-04-01      94783 MI         J          A
         1 20170401094783   2017-04-01      94783 Suffix     Jr         III
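The single-call version relies on the './../@timestamp' column path, which steps from each field node up to its parent entry to read the timestamp attribute. As a minimal sketch of just that step, the query below runs against a small made-up document; the doc root element, the T1/T2 timestamps and the A/B/C field names are all assumptions for illustration and are not part of the original data.

```sql
-- Hypothetical document: two entries, three fields in total.
-- The row-generating XPath produces one row per <field>;
-- './..' then moves up to the enclosing <entry> to read its attribute.
select x.entry_ts,
       x.fieldname
from   xmltable('/doc/entry/field'
         passing xmltype('<doc>
                            <entry timestamp="T1"><field name="A"/><field name="B"/></entry>
                            <entry timestamp="T2"><field name="C"/></entry>
                          </doc>')
         columns entry_ts  varchar2(16) path './../@timestamp',
                 fieldname varchar2(30) path '@name'
       ) x;
```

Each field row should come back paired with its own parent entry's timestamp, which is what lets one XMLTable call replace the three chained calls in the question.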
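Trying to self-join that does get data, but with a lot of nulls, which is also odd.
But even if your real query is too complicated to do that (you did say you'd stripped it down, after all), since your stated goal is to get the latest update, you could do this: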
with myxml as
(
select /*+ materialize */ 1 xml_id,
'<xml> <entry timestamp="20170330100429" effective="20170329"> <field name="Name"> <ov><![CDATA[Fran]]></ov> <nv><![CDATA[Frank]]></nv> </field> <field name="Zip"> <ov><![CDATA[13583]]></ov> <nv><![CDATA[13853]]></nv> </field> </entry> <entry timestamp="20170401094783" effective="20170331"> <field name="MI"> <ov><![CDATA[J]]></ov> <nv><![CDATA[A]]></nv> </field> <field name="Suffix"> <ov><![CDATA[Jr]]></ov> <nv><![CDATA[III]]></nv> </field> </entry> </xml>' x from dual
),
HRH as
(
SELECT au.xml_id,
x.tran_ts as chg_timestr,
to_date(substr(x.tran_ts, 0, 8), 'yyyymmdd') chg_date,
cast(substr(x.tran_ts, 9, 6) as int) chg_time,
x.fieldname,
x.ov old_value,
x.nv new_value,
rank() over (partition by au.xml_id, x.fieldname
order by x.tran_ts desc) rnk
FROM myxml au
CROSS JOIN xmltable('/xml/entry/field'
PASSING XMLTYPE(au.x)
COLUMNS tran_ts VARCHAR2(16) PATH './../@timestamp',
fieldname VARCHAR2(250) PATH '@name',
nv VARCHAR2(250) PATH 'nv',
ov VARCHAR2(250) PATH 'ov'
) x
)
select * from HRH
where rnk = 1;
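As you only have one set of data per ID/field, that gives you the same result as your sample, but if there were more than one it should give you the latest for each. You should also look at the difference between the rank() and dense_rank() analytic functions, and decide what you want to see when there are ties, if ties are possible in your data.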
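To make the point about ties concrete, here is a small sketch that compares the ranking functions side by side. The rows in the changes CTE are invented for illustration (two updates to Name share the hypothetical timestamp 20170401094700), and row_number() with new_value as an arbitrary tie-breaker is an addition not used in the query above.

```sql
-- Made-up data: the two newest Name changes share the same timestamp.
with changes as (
  select 1 xml_id, 'Name' fieldname, '20170330100429' tran_ts, 'Frank' new_value from dual union all
  select 1, 'Name', '20170401094700', 'Franklin' from dual union all
  select 1, 'Name', '20170401094700', 'Frankie'  from dual union all
  select 1, 'Zip',  '20170330100429', '13853'    from dual
)
select xml_id, fieldname, tran_ts, new_value,
       -- tied rows share a rank, and the next rank skips ahead (1, 1, 3)
       rank()       over (partition by xml_id, fieldname order by tran_ts desc) rnk,
       -- tied rows share a rank, with no gap afterwards (1, 1, 2)
       dense_rank() over (partition by xml_id, fieldname order by tran_ts desc) dense_rnk,
       -- always unique; new_value is an arbitrary tie-breaker here (1, 2, 3)
       row_number() over (partition by xml_id, fieldname
                          order by tran_ts desc, new_value) rn
from   changes
order  by xml_id, fieldname, tran_ts desc, new_value;
```

With a filter of = 1, rank() and dense_rank() both keep every tied row, so both Frankie and Franklin would be returned for Name; the two functions only differ for the rows after the tie. If exactly one row per xml_id/field is needed, row_number() with a deliberate tie-breaker is one option, though which row should win is a decision about the data rather than something the query can settle.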