How can I parse multiple filenames from an XML file using REGEXP_SUBSTR?

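I have an FTP log stored as XML in a CLOB column, and I need to get the names of the files it found and retrieved. Using TABLE/LATERAL and REGEXP_SUBSTR I can get a known number of files, but I don't know how to handle an unknown number of files. There is a field that returns the number of files found: <FilesProcessed>3</FilesProcessed>. Is there a way to use that field to help parse the filenames? Or, more importantly, is there a better way to do this?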

(I'm using a CTE here, but in practice I'll be pulling this from a database table, which has a large number of columns but no filename column!)

With PAYLOAD_DATA(LOGS) as(
VALUES
('<?xml version="1.0"?>
<InboundMFTEventDetailsDocument>
  <MFTEventExecutionDetails>
    <Status>Successful</Status>
    <FilesProcessed>3</FilesProcessed>
    <MFTEventLogID>5dn39m00fgmdefo80002g7ki</MFTEventLogID>
    <ExecutionLogs>
      <Logs>Finding file(s) in VFS Path:/Wholesale/CS/Inbound/SFTP/PROD-OUT-Shipment/, URL:SFTP://ftp.some_server.com:22/UAT/OUT/Shipment/. 
        Filename Filter = INVPTH*.txt
        Found following 3 file(s).
           SFTP://ftp.some_server.com:22/UAT/OUT/Shipment/INVPTH033020210320006396.txt
           SFTP://ftp.some_server.com:22/UAT/OUT/Shipment/INVPTH033020210320009986.txt
           SFTP://ftp.some_server.com:22/UAT/OUT/Shipment/INVPTH092720210320009986.txt
      </Logs>
      <Logs>Starting copy of file(s) from VFS Path:/Wholesale/CS/Inbound/SFTP/PROD-OUT-Shipment/, URL:SFTP://ftp.some_server.com:22/UAT/OUT/Shipment/INVPTH033020210320006396.txt to VFS Path:/Wholesale/RLM/Outbound/SFTP/SERVERSERVICES-INBOUND-SHIPH/, URL:SFTP://MKWHLDV.kors.local:22/SERVERSERVICES/INBOUND/SHIPH/INVPTH033020210320006396.txt
Copy finished:VFS Path:/Wholesale/CS/Inbound/SFTP/PROD-OUT-Shipment/, URL:SFTP://ftp.some_server.com:22/UAT/OUT/Shipment/INVPTH033020210320006396.txt
</Logs>
      <Logs>...</Logs>
      <Logs>...</Logs>
      <Logs>...</Logs>
    </ExecutionLogs>
  </MFTEventExecutionDetails>
</InboundMFTEventDetailsDocument>'))

SELECT  FILENAME
FROM PAYLOAD_DATA A,
TABLE (VALUES (REGEXP_SUBSTR(LOGS, '  SFTP://.*[\r\n][   |\<]', 1, REGEXP_COUNT(LOGS, '  SFTP://.*[\r\n][   |\<]') - 0)),
              (REGEXP_SUBSTR(LOGS, '  SFTP://.*[\r\n][   |\<]', 1, REGEXP_COUNT(LOGS, '  SFTP://.*[\r\n][   |\<]') - 1)),
              (REGEXP_SUBSTR(LOGS, '  SFTP://.*[\r\n][   |\<]', 1, REGEXP_COUNT(LOGS, '  SFTP://.*[\r\n][   |\<]') - 2)),
              (REGEXP_SUBSTR(LOGS, '  SFTP://.*[\r\n][   |\<]', 1, REGEXP_COUNT(LOGS, '  SFTP://.*[\r\n][   |\<]') - 3)))  as G(FILENAME)

This returns:

SFTP://ftp.some_server.com:22/UAT/OUT/Shipment/INVPTH092720210320009986.txt
SFTP://ftp.some_server.com:22/UAT/OUT/Shipment/INVPTH033020210320009986.txt
SFTP://ftp.some_server.com:22/UAT/OUT/Shipment/INVPTH033020210320006396.txt

Maybe something like this:

With PAYLOAD_DATA(LOGS) as(
...
),
-- Recursive CTE: generate occurrence numbers 1..N, where N is the number of matches
occurrences (occ) as (
  values 1
  union all
  select occ + 1 from occurrences where occ < (SELECT REGEXP_COUNT(LOGS, '  SFTP://.*[\r\n][   |\<]') FROM PAYLOAD_DATA)
)
select REGEXP_SUBSTR(LOGS, '  (SFTP://.*)[\r\n][   |\<]', 1, occ, '', 1) as filename from PAYLOAD_DATA, occurrences
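
If it helps to check the matching logic outside the database, here is a minimal sketch in Python (not Db2 SQL). It does the same thing as the recursive query above, collecting every indented SFTP:// line however many there are, and it also reads <FilesProcessed> so the count can be compared with what the regex found. The names `SAMPLE` and `extract_filenames` are made up for this illustration, the payload is a trimmed copy of the one in the question, and it assumes each listed file sits on its own indented line inside a <Logs> element.

```python
import re
import xml.etree.ElementTree as ET

# Trimmed copy of the sample payload from the question (first <Logs> entry only).
SAMPLE = """<?xml version="1.0"?>
<InboundMFTEventDetailsDocument>
  <MFTEventExecutionDetails>
    <Status>Successful</Status>
    <FilesProcessed>3</FilesProcessed>
    <ExecutionLogs>
      <Logs>Finding file(s) in VFS Path:/Wholesale/CS/Inbound/SFTP/PROD-OUT-Shipment/, URL:SFTP://ftp.some_server.com:22/UAT/OUT/Shipment/.
        Filename Filter = INVPTH*.txt
        Found following 3 file(s).
           SFTP://ftp.some_server.com:22/UAT/OUT/Shipment/INVPTH033020210320006396.txt
           SFTP://ftp.some_server.com:22/UAT/OUT/Shipment/INVPTH033020210320009986.txt
           SFTP://ftp.some_server.com:22/UAT/OUT/Shipment/INVPTH092720210320009986.txt
      </Logs>
    </ExecutionLogs>
  </MFTEventExecutionDetails>
</InboundMFTEventDetailsDocument>"""

# A line that is only indentation followed by an SFTP URL.
# Assumption: URLs embedded mid-sentence (e.g. "URL:SFTP://...") should be skipped.
URL_LINE = re.compile(r"^[ \t]+(SFTP://\S+)[ \t]*$", re.MULTILINE)


def extract_filenames(xml_text):
    """Return (FilesProcessed value, list of SFTP URLs found in <Logs> text)."""
    root = ET.fromstring(xml_text)
    expected = int(root.findtext(".//FilesProcessed"))
    urls = []
    for logs in root.iter("Logs"):
        # logs.text can be None for an empty element, hence the fallback.
        urls.extend(URL_LINE.findall(logs.text or ""))
    return expected, urls


if __name__ == "__main__":
    expected, urls = extract_filenames(SAMPLE)
    print("FilesProcessed:", expected, "matched:", len(urls))
    for url in urls:
        # Keep only the part after the last slash to get the bare filename.
        print(url.rsplit("/", 1)[-1])
```

On the sample, the regex finds three URLs, which agrees with the <FilesProcessed> value. So that field works better as a sanity check than as a loop bound: the number of matches already tells you how many filenames there are.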