Snowflake: getting values from inner tags of an XML file
I am trying to import an XML file into a Snowflake database table. I have created the table and an XML file format. The file format was created with the following code:
CREATE OR REPLACE FILE FORMAT LAND_XML.PUBLIC.XML_FILE_FORMAT
TYPE = 'XML'
COMPRESSION = 'AUTO'
PRESERVE_SPACE = FALSE
STRIP_OUTER_ELEMENT = TRUE
DISABLE_SNOWFLAKE_DATA = FALSE
DISABLE_AUTO_CONVERT = FALSE
IGNORE_UTF8_ERRORS = FALSE;
The XML file looks like this:
<?xml version="1.0" encoding="utf-8" ?>
<NoticeOfChange version="1.0.0" application_version="v0.0.1-5780-g16dbd00e9"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="https://prod.notices.govt/notices/">
<ProducedBy>
<Name>Land Department</Name>
<Contact>
<Name>Technical Support</Name>
<Phone>0800 xxx xxx</Phone>
<Email>customersupport@land.govt</Email>
</Contact>
</ProducedBy>
<Notices>
<Notice>
<NoticeId>577</NoticeId>
<NoticeType>NoticeOfChange</NoticeType>
<Description>Notification of change of ownership of rating unit</Description>
<Statutory>Under Local Government (Rating) Act 2020</Statutory>
</Notice>
<Notice>
<NoticeId>578</NoticeId>
<NoticeType>NoticeOfChange</NoticeType>
<Description>Notification of change of ownership of rating unit</Description>
<Statutory>Under Local Government (Rating) Act 2020</Statutory>
</Notice>
<Notice>
<NoticeId>579</NoticeId>
<NoticeType>NoticeOfChange</NoticeType>
<Description>Notification of change of ownership of rating unit</Description>
<Statutory>Under Local Government (Rating) Act 2020</Statutory>
</Notice>
<Notice>
<NoticeId>580</NoticeId>
<NoticeType>NoticeOfChange</NoticeType>
<Description>Notification of change of ownership of rating unit</Description>
<Statutory>Under Local Government (Rating) Act 2020</Statutory>
</Notice>
</Notices>
</NoticeOfChange>
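When I import the XML file into the Snowflake table, it shows only two rows instead of the four rows I expected (one per notice). The output is shown in the image below:
The current file format splits the XML file into two rows, one for the <ProducedBy> tag and one for the <Notices> tag. However, I am not interested in the <ProducedBy> tag (I want to discard it), and I want each notice inside the <Notices> tag to become a separate row of the table. With my limited knowledge, I have not been able to modify the file format to produce the output I want. Any help would be much appreciated.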
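The root object is NoticeOfChange, so flattening it gives you that object's children, ProducedBy and Notices, as you have noticed. You only want the latter, so flatten that sub-object instead, via xmlget(d.xml, 'Notices'):"$"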
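To see why flattening the root is not enough, the sketch below flattens the direct children of the root element and lists each child's tag name. It uses a trimmed-down stand-in for the document: the CTE name sample_doc, the shortened XML, and the column name child_tag are assumptions made for illustration. The tag name is read from the "@" key that Snowflake exposes on parsed XML elements.

```sql
-- sample_doc is a hypothetical, shortened stand-in for the real file.
WITH sample_doc AS (
    SELECT PARSE_XML('<NoticeOfChange>
  <ProducedBy><Name>Land Department</Name></ProducedBy>
  <Notices>
    <Notice><NoticeId>577</NoticeId></Notice>
    <Notice><NoticeId>578</NoticeId></Notice>
  </Notices>
</NoticeOfChange>') AS xml
)
SELECT
    f.value:"@"::STRING AS child_tag   -- element name of each direct child of the root
FROM sample_doc AS d,
     LATERAL FLATTEN(input => d.xml:"$") AS f;
```

This should return one row per direct child of the root (ProducedBy and Notices), which matches the two rows seen after loading the file.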
So, with just the following CTE for the data:
WITH data_table AS (
SELECT PARSE_XML('<?xml version="1.0" encoding="utf-8" ?>
<NoticeOfChange version="1.0.0" application_version="v0.0.1-5780-g16dbd00e9"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="https://prod.notices.govt/notices/">
<ProducedBy>
<Name>Land Department</Name>
<Contact>
<Name>Technical Support</Name>
<Phone>0800 xxx xxx</Phone>
<Email>customersupport@land.govt</Email>
</Contact>
</ProducedBy>
<Notices>
<Notice>
<NoticeId>577</NoticeId>
<NoticeType>NoticeOfChange</NoticeType>
<Description>Notification of change of ownership of rating unit</Description>
<Statutory>Under Local Government (Rating) Act 2020</Statutory>
</Notice>
<Notice>
<NoticeId>578</NoticeId>
<NoticeType>NoticeOfChange</NoticeType>
<Description>Notification of change of ownership of rating unit</Description>
<Statutory>Under Local Government (Rating) Act 2020</Statutory>
</Notice>
<Notice>
<NoticeId>579</NoticeId>
<NoticeType>NoticeOfChange</NoticeType>
<Description>Notification of change of ownership of rating unit</Description>
<Statutory>Under Local Government (Rating) Act 2020</Statutory>
</Notice>
<Notice>
<NoticeId>580</NoticeId>
<NoticeType>NoticeOfChange</NoticeType>
<Description>Notification of change of ownership of rating unit</Description>
<Statutory>Under Local Government (Rating) Act 2020</Statutory>
</Notice>
</Notices>
</NoticeOfChange>') as xml
)
the notices can be accessed with:
SELECT
f.value as notice
FROM data_table AS d
,lateral flatten(input=>xmlget(d.xml, 'Notices'):"$")f;
which gives:
NOTICE
<Notice> <NoticeId>577</NoticeId> <NoticeType>NoticeOfChange</NoticeType> <Description>Notification of change of ownership of rating unit</Description> <Statutory>Under Local Government (Rating) Act 2020</Statutory> </Notice>
<Notice> <NoticeId>578</NoticeId> <NoticeType>NoticeOfChange</NoticeType> <Description>Notification of change of ownership of rating unit</Description> <Statutory>Under Local Government (Rating) Act 2020</Statutory> </Notice>
<Notice> <NoticeId>579</NoticeId> <NoticeType>NoticeOfChange</NoticeType> <Description>Notification of change of ownership of rating unit</Description> <Statutory>Under Local Government (Rating) Act 2020</Statutory> </Notice>
<Notice> <NoticeId>580</NoticeId> <NoticeType>NoticeOfChange</NoticeType> <Description>Notification of change of ownership of rating unit</Description> <Statutory>Under Local Government (Rating) Act 2020</Statutory> </Notice>
At this point, the individual parts can be stored or accessed as needed.
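As one possible way to access the individual parts, the sketch below pulls child elements out of each flattened Notice with XMLGET and casts them to column types. The shortened sample XML, the CTE name notices_doc, the output column names, and the chosen casts are assumptions for illustration, not part of the original answer.

```sql
-- notices_doc is a hypothetical, shortened stand-in for the real file.
WITH notices_doc AS (
    SELECT PARSE_XML('<NoticeOfChange>
  <ProducedBy><Name>Land Department</Name></ProducedBy>
  <Notices>
    <Notice>
      <NoticeId>577</NoticeId>
      <NoticeType>NoticeOfChange</NoticeType>
      <Statutory>Under Local Government (Rating) Act 2020</Statutory>
    </Notice>
    <Notice>
      <NoticeId>578</NoticeId>
      <NoticeType>NoticeOfChange</NoticeType>
      <Statutory>Under Local Government (Rating) Act 2020</Statutory>
    </Notice>
  </Notices>
</NoticeOfChange>') AS xml
)
SELECT
    -- XMLGET returns the child element; :"$" extracts its content.
    XMLGET(f.value, 'NoticeId'):"$"::NUMBER   AS notice_id,
    XMLGET(f.value, 'NoticeType'):"$"::STRING AS notice_type,
    XMLGET(f.value, 'Statutory'):"$"::STRING  AS statutory
FROM notices_doc AS d,
     LATERAL FLATTEN(input => XMLGET(d.xml, 'Notices'):"$") AS f;
```

Against a loaded table, the same FLATTEN would presumably be applied to the VARIANT column in place of d.xml, assuming the file was loaded with STRIP_OUTER_ELEMENT = FALSE so that the whole document lands in one row. If a file might contain only a single <Notice>, the "$" value would likely be a single element rather than an array, so wrapping the FLATTEN input in TO_ARRAY(...) may be worth testing against real data.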