Remove characters from a dynamic string

Question

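I have a file containing some junk values that I need to remove when loading the file into a table. Here are some examples. The file is semicolon-delimited, and the last column contains the junk values.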

2019-02-20;05377378;ABC+xY+++Rohit Anita Chicago
2019-02-20;05201343;ABC+xY++Gustav Russia
2019-02-20;07348738;ABC+xy+++Jain Ram Ambarnath

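I need to load the last column without the ABC+xY+++ prefix. Some rows have ABC+xY+++ and others have ABC+xY++, so the prefix can end in either two or three + characters. Any suggestions for getting rid of it?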

I am using Informatica PowerCenter to load this file, so I need to build the expression from substr/instr functions. I can also test in Oracle SQL to check quickly whether the values come out right.

My expected output is the last column without that prefix.

Any suggestions?

Thanks, 比顿

Answer 1

I think you are looking for the following:

WITH dat AS (SELECT '2019-02-20;05373487378;ABC+xY++Rohit Anita Chicago' AS address FROM dual)
-- \1 and \2 put back the text captured before and after the unwanted prefix
SELECT REGEXP_REPLACE(address, '(.*);ABC\+x[yY]\+{2,3}(.*)', '\1;\2') FROM dat
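To check this kind of pattern against all three sample rows from the question, something like the following could be run in Oracle. It is only a sketch: the CTE name `dat` and the column names `line` and `cleaned_line` are chosen here for illustration, and the pattern assumes the junk prefix always looks like `ABC+xY` (or `ABC+xy`) followed by two or three plus signs.

```sql
WITH dat AS (
  SELECT '2019-02-20;05377378;ABC+xY+++Rohit Anita Chicago' AS line FROM dual UNION ALL
  SELECT '2019-02-20;05201343;ABC+xY++Gustav Russia'        FROM dual UNION ALL
  SELECT '2019-02-20;07348738;ABC+xy+++Jain Ram Ambarnath'  FROM dual
)
SELECT line,
       -- keep everything up to the last ';' before the prefix, drop the prefix itself
       REGEXP_REPLACE(line, '^(.*;)ABC\+x[yY]\+{2,3}', '\1') AS cleaned_line
FROM dat;
-- cleaned_line:
-- 2019-02-20;05377378;Rohit Anita Chicago
-- 2019-02-20;05201343;Gustav Russia
-- 2019-02-20;07348738;Jain Ram Ambarnath
```

Because the match is anchored at the start and only the prefix is consumed, rows that do not contain the prefix are returned unchanged.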

Answer 2

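I'm not sure I understand your question, but this does what I think you are asking, and it should work both in SQL and in an Informatica expression.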

with myrecs as
(select '2019-02-20;870789789707;ABC+xY++Gustav Russia' as myfield from dual union all
 select '2019-02-20;870789789707;ABC+xY+++Carroll Iowa' as myfield from dual)

 select myfield,
    substr(myfield,1, instr(myfield,';',-1)) -- everything up to, and including, the final semicolon
    || -- concatenate
    substr(myfield,instr(myfield,'+',-1)+1) as yourfield -- everything after the final plus sign
 from myrecs;

OUTPUT:
myfield                                         yourfield
2019-02-20;870789789707;ABC+xY++Gustav Russia   2019-02-20;870789789707;Gustav Russia
2019-02-20;870789789707;ABC+xY+++Carroll Iowa   2019-02-20;870789789707;Carroll Iowa
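One edge case may be worth guarding against: if a row has no plus sign at all, `instr(myfield,'+',-1)` returns 0, so the second `substr` returns the whole line and the prefix ends up duplicated. The sketch below adds a check for that. The second sample row is hypothetical (the question does not say whether such rows exist), and in an Informatica expression the `CASE` would presumably be written with `IIF` instead.

```sql
WITH myrecs AS (
  SELECT '2019-02-20;870789789707;ABC+xY++Gustav Russia' AS myfield FROM dual UNION ALL
  SELECT '2019-02-20;870789789707;Plain Value' FROM dual  -- hypothetical row without the junk prefix
)
SELECT myfield,
       CASE
         -- strip only when the last '+' lies after the last ';', i.e. inside the last column
         WHEN INSTR(myfield, '+', -1) > INSTR(myfield, ';', -1)
         THEN SUBSTR(myfield, 1, INSTR(myfield, ';', -1))
              || SUBSTR(myfield, INSTR(myfield, '+', -1) + 1)
         ELSE myfield  -- nothing to strip, keep the line as it is
       END AS yourfield
FROM myrecs;
```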

Answer 3

Informatica PowerCenter provides some functions for working with regular expressions. In this case you need REG_EXTRACT.

A good description of the function is already available - check it out and upvote :)

Based on it, you will most likely need to define a port such as:

your_output_port=REG_EXTRACT(ADDRESS, '([^\+]+)$', 1)
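Since the question mentions testing in Oracle SQL first, the same idea (take the trailing run of characters that are not a plus sign) can be tried with `REGEXP_SUBSTR`. This is a sketch rather than an exact equivalent of the Informatica port: the CTE name `dat` and the column `address` are chosen here for illustration, and it assumes the real value never contains a `+` itself.

```sql
WITH dat AS (
  SELECT 'ABC+xY+++Rohit Anita Chicago' AS address FROM dual UNION ALL
  SELECT 'ABC+xY++Gustav Russia'        FROM dual UNION ALL
  SELECT 'ABC+xy+++Jain Ram Ambarnath'  FROM dual
)
SELECT address,
       -- longest run of non-'+' characters at the end of the string
       REGEXP_SUBSTR(address, '[^+]+$') AS your_output
FROM dat;
-- your_output:
-- Rohit Anita Chicago
-- Gustav Russia
-- Jain Ram Ambarnath
```

Inside a bracket expression the plus sign is a literal, so it does not need a backslash in the Oracle version.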

Here's how I tested it.

Answer 4

Here is the solution.

substr
    (
        Address,
        1,
        instr(Address ,';',-1)       -- up to and including the last semicolon
    )
    ||
substr
    (
        Address,
        instr(Address ,'+',-1) + 1   -- start just after the last plus sign
    )

You may need to adjust the substr start/end positions by 1, depending on your needs.
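The effect of that off-by-one adjustment is easy to see in Oracle by putting both variants side by side. This is a small sketch; the CTE name `t` and the column names are made up for the illustration.

```sql
WITH t AS (
  SELECT '2019-02-20;05201343;ABC+xY++Gustav Russia' AS address FROM dual
)
SELECT SUBSTR(address, INSTR(address, '+', -1))     AS from_last_plus,   -- '+Gustav Russia'
       SUBSTR(address, INSTR(address, '+', -1) + 1) AS after_last_plus   -- 'Gustav Russia'
FROM t;
```

Starting at the position returned by `instr` keeps the final plus sign, while starting one position later drops it.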

Tags: sql, oracle, substring, informatica, informatica-powerexchange