How can I parse multiple filenames from an XML file using REGEXP_SUBSTR?
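I have an FTP log stored as XML in a CLOB column, and I need to get the filenames that it found and retrieved. Using TABLE/LATERAL and REGEXP_SUBSTR I can get a known number of files, but I don't know how to handle an unknown number of them. There is a field that returns the number of files found: <FilesProcessed>3</FilesProcessed>. Is there a way to use that field to help parse out the filenames? Or, more importantly, is there a better way to do this?
(I'm using a CTE here, but I will be pulling this from a database table, which has a ton of columns but none for the filenames!)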
With PAYLOAD_DATA(LOGS) as(
VALUES
('<?xml version="1.0"?>
<InboundMFTEventDetailsDocument>
<MFTEventExecutionDetails>
<Status>Successful</Status>
<FilesProcessed>3</FilesProcessed>
<MFTEventLogID>5dn39m00fgmdefo80002g7ki</MFTEventLogID>
<ExecutionLogs>
<Logs>Finding file(s) in VFS Path:/Wholesale/CS/Inbound/SFTP/PROD-OUT-Shipment/, URL:SFTP://ftp.some_server.com:22/UAT/OUT/Shipment/.
Filename Filter = INVPTH*.txt
Found following 3 file(s).
SFTP://ftp.some_server.com:22/UAT/OUT/Shipment/INVPTH033020210320006396.txt
SFTP://ftp.some_server.com:22/UAT/OUT/Shipment/INVPTH033020210320009986.txt
SFTP://ftp.some_server.com:22/UAT/OUT/Shipment/INVPTH092720210320009986.txt
</Logs>
<Logs>Starting copy of file(s) from VFS Path:/Wholesale/CS/Inbound/SFTP/PROD-OUT-Shipment/, URL:SFTP://ftp.some_server.com:22/UAT/OUT/Shipment/INVPTH033020210320006396.txt to VFS Path:/Wholesale/RLM/Outbound/SFTP/SERVERSERVICES-INBOUND-SHIPH/, URL:SFTP://MKWHLDV.kors.local:22/SERVERSERVICES/INBOUND/SHIPH/INVPTH033020210320006396.txt
Copy finished:VFS Path:/Wholesale/CS/Inbound/SFTP/PROD-OUT-Shipment/, URL:SFTP://ftp.some_server.com:22/UAT/OUT/Shipment/INVPTH033020210320006396.txt
</Logs>
<Logs>...</Logs>
<Logs>...</Logs>
<Logs>...</Logs>
</ExecutionLogs>
</MFTEventExecutionDetails>
</InboundMFTEventDetailsDocument>'))
SELECT FILENAME
FROM PAYLOAD_DATA A,
TABLE (VALUES (REGEXP_SUBSTR(LOGS, ' SFTP://.*[\r\n][ |\<]', 1, REGEXP_COUNT(LOGS, ' SFTP://.*[\r\n][ |\<]') - 0)),
(REGEXP_SUBSTR(LOGS, ' SFTP://.*[\r\n][ |\<]', 1, REGEXP_COUNT(LOGS, ' SFTP://.*[\r\n][ |\<]') - 1)),
(REGEXP_SUBSTR(LOGS, ' SFTP://.*[\r\n][ |\<]', 1, REGEXP_COUNT(LOGS, ' SFTP://.*[\r\n][ |\<]') - 2)),
(REGEXP_SUBSTR(LOGS, ' SFTP://.*[\r\n][ |\<]', 1, REGEXP_COUNT(LOGS, ' SFTP://.*[\r\n][ |\<]') - 3))) as G(FILENAME)
Output:
SFTP://ftp.some_server.com:22/UAT/OUT/Shipment/INVPTH092720210320009986.txt
SFTP://ftp.some_server.com:22/UAT/OUT/Shipment/INVPTH033020210320009986.txt
SFTP://ftp.some_server.com:22/UAT/OUT/Shipment/INVPTH033020210320006396.txt
Maybe something like this:
With PAYLOAD_DATA(LOGS) as(
...
),
-- generate one row per regex match: 1, 2, ..., REGEXP_COUNT
occurrences (occ) as (
  values 1
  union all
  select occ + 1 from occurrences
  where occ < (SELECT REGEXP_COUNT(LOGS, ' SFTP://.*[\r\n][ |\<]') FROM PAYLOAD_DATA)
)
-- the final argument 1 returns only capture group 1, i.e. the URL itself
select REGEXP_SUBSTR(LOGS, ' (SFTP://.*)[\r\n][ |\<]', 1, occ, '', 1) filename
from PAYLOAD_DATA, occurrences
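Since the question also asks whether <FilesProcessed> can help, here is a rough variant of the recursive query above that takes its upper bound from that element instead of REGEXP_COUNT. It is only a sketch, assuming Db2 and assuming that the value of <FilesProcessed> always equals the number of URL lines the pattern matches. The FILE_COUNT CTE and its column N are names made up for this illustration, and the sample row is a trimmed copy of the question's data with the URL lines indented by several spaces, because the pattern needs a space in front of each SFTP://.

```sql
-- Sketch only: bound the recursion by <FilesProcessed> instead of REGEXP_COUNT.
-- Assumption: <FilesProcessed> equals the number of URL lines in the log.
WITH PAYLOAD_DATA (LOGS) AS (
  -- trimmed copy of the sample data from the question
  VALUES ('<FilesProcessed>3</FilesProcessed>
<Logs>Found following 3 file(s).
    SFTP://ftp.some_server.com:22/UAT/OUT/Shipment/INVPTH033020210320006396.txt
    SFTP://ftp.some_server.com:22/UAT/OUT/Shipment/INVPTH033020210320009986.txt
    SFTP://ftp.some_server.com:22/UAT/OUT/Shipment/INVPTH092720210320009986.txt
    </Logs>')
),
-- FILE_COUNT and N are hypothetical names; capture group 1 holds the digits
FILE_COUNT (N) AS (
  SELECT INT(CAST(REGEXP_SUBSTR(LOGS, '<FilesProcessed>([0-9]+)</FilesProcessed>',
                                1, 1, '', 1) AS VARCHAR(10)))
  FROM PAYLOAD_DATA
),
-- one row per expected file: 1 .. N
OCCURRENCES (OCC) AS (
  VALUES 1
  UNION ALL
  SELECT OCC + 1 FROM OCCURRENCES
  WHERE OCC < (SELECT N FROM FILE_COUNT)
)
-- pick the OCC-th match and return only the captured URL
SELECT REGEXP_SUBSTR(LOGS, ' (SFTP://.*)[\r\n][ |\<]', 1, OCC, '', 1) AS FILENAME
FROM PAYLOAD_DATA, OCCURRENCES
```

Like the query above, this relies on PAYLOAD_DATA holding a single row, because the scalar subquery in the recursive step must return exactly one value. When <FilesProcessed> is 0 the seed row still produces one NULL filename, so a filter on FILENAME IS NOT NULL may be needed. If the element and the log text can ever disagree, counting the matches with REGEXP_COUNT is probably the safer bound.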