Is there a PostgreSQL function that lets me take a large string, split it into substrings, and then replace items only in those substrings?

Question

Schema: fruit_schema

Table: fruit_table

I have this string in a single cell of the column my_column.

x:y. is text I'm not interested in.

    <Item>
    <Name>First</Name>
    <Subject>x:y. Food: Apple. x:y. </Subject>
    </Item>

    <Item>
    <Name>Second</Name>
    <Subject>x:y. x:y. Food: Apple. Type: Red. x:y. x:y.</Subject>
    </Item>

I want to add 'Type: Red' to the first item. So far I have written this query:

    DO
    $$
    DECLARE
        oldItem varchar := 'Food: Apple.';
        newItem varchar;
    BEGIN
        newItem  = 'Food: Apple. Type: Red.';

        UPDATE 
            fruit_schema.fruit_table AS profile
        SET 
            my_column = REPLACE(my_column, oldItem, newItem)
        WHERE
            profile.my_column ILIKE '%First%</Item>%';
    END$$;

However, this is my output:

    <Item>
    <Name>First</Name>
    <Subject>x:y. Food: Apple. Type: Red. x:y. </Subject>
    </Item>

    <Item>
    <Name>Second</Name>
    <Subject>x:y. x:y. Food: Apple. Type: Red. Type: Red. x:y. x:y.</Subject>
    </Item>

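The first item is updated correctly, but the second item ends up with a duplicated "Type: Red". Is there any way to split this huge string into the first item and the second item separately? I only want to work on the first substring, from <Name>First</Name> to </Item>.

I want to do the same for every other item I need to process. If possible, I would like to avoid manually counting the characters in each item.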

Answer 1

For a specific subject, you can use regexp_replace:

SELECT regexp_replace('<Item>
    <Name>First</Name>
    <Subject>x:y. Food: Apple. x:y. </Subject>
    </Item>

    <Item>
    <Name>Second</Name>
    <Subject>x:y. x:y. Food: Apple. Type: Red. x:y. x:y.</Subject>
    </Item>', 'Food: Apple\. ', 'Food: Apple. Type: Red. ', '');

If your replacements differ, you may need to make some adjustments.
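The call above passes an empty flags argument, so regexp_replace stops after the first match, which is why only the first item changes. A minimal sketch of the difference, using a shortened made-up string rather than the real column value:

```sql
-- No flags: only the first match is replaced.
SELECT regexp_replace('Food: Apple. x | Food: Apple. y',
                      'Food: Apple\. ',
                      'Food: Apple. Type: Red. ',
                      '');
-- Food: Apple. Type: Red. x | Food: Apple. y

-- 'g' flag: every match is replaced.
SELECT regexp_replace('Food: Apple. x | Food: Apple. y',
                      'Food: Apple\. ',
                      'Food: Apple. Type: Red. ',
                      'g');
-- Food: Apple. Type: Red. x | Food: Apple. Type: Red. y
```

This only helps when the item to change happens to come first in the string; targeting an arbitrary item still requires matching or splitting by its name.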

Answer 2

You need the regexp_replace() function and a negative lookahead, for example:

# select replace(e'aaa\naaa bbb', 'aaa', 'aaa bbb');
┌─────────────┐
│   replace   │
├─────────────┤
│ aaa bbb    ↵│
│ aaa bbb bbb │
└─────────────┘

# select regexp_replace(e'aaa\naaa bbb', 'aaa(\s+)(?!bbb)', 'aaa bbb\1', 'g');
┌────────────────┐
│ regexp_replace │
├────────────────┤
│ aaa bbb       ↵│
│ aaa bbb        │
└────────────────┘

With your data:

# select regexp_replace(
  '<Item><Name>First</Name>
     <Subject>x:y. Food: Apple. x:y. </Subject>
   </Item>
   <Item><Name>Second</Name>
     <Subject>x:y. x:y. Food: Apple. Type: Red. x:y. x:y.</Subject>
   </Item>
   <Item><Name>Third</Name>
     <Subject>x:y. Food: Apple. x:y. </Subject>
   </Item>',
  'Food: Apple\.(\s+)(?!Type: Red\.)',
  'Food: Apple. Type: Red.\1', 'g');
┌─────────────────────────────────────────────────────────────────────┐
│                           regexp_replace                            │
├─────────────────────────────────────────────────────────────────────┤
│ <Item><Name>First</Name>                                           ↵│
│      <Subject>x:y. Food: Apple. Type: Red. x:y. </Subject>         ↵│
│    </Item>                                                         ↵│
│    <Item><Name>Second</Name>                                       ↵│
│      <Subject>x:y. x:y. Food: Apple. Type: Red. x:y. x:y.</Subject>↵│
│    </Item>                                                         ↵│
│    <Item><Name>Third</Name>                                        ↵│
│      <Subject>x:y. Food: Apple. Type: Red. x:y. </Subject>         ↵│
│    </Item>                                                          │
└─────────────────────────────────────────────────────────────────────┘

Update: how to replace the text only for specific items:

with t(x) as (values('<Item><Name>First</Name>
     <Subject>x:y. Food: Apple. x:y. </Subject>
   </Item>
   <Item><Name>Second</Name>
     <Subject>x:y. x:y. Food: Apple. Type: Red. x:y. x:y.</Subject>
   </Item>
   <Item><Name>Third</Name>
     <Subject>x:y. Food: Apple. x:y. </Subject>
   </Item>'))
select
    string_agg(
        case
            when xx like any(array['%<Name>Third</Name>%','%<Name>Second</Name>%']) then
                regexp_replace(xx, 'Food: Apple\.(\s+)(?!Type: Red\.)', 'Food: Apple. Type: Red.\1', 'g')
            else xx
        end || '</Item>', e'\n')
from t, unnest(string_to_array(replace(x, e'\n', ''), '</Item>')) as xx
where xx <> '';
┌───────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                                string_agg                                                 │
├───────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ <Item><Name>First</Name>     <Subject>x:y. Food: Apple. x:y. </Subject>   </Item>                        ↵│
│    <Item><Name>Second</Name>     <Subject>x:y. x:y. Food: Apple. Type: Red. x:y. x:y.</Subject>   </Item>↵│
│    <Item><Name>Third</Name>     <Subject>x:y. Food: Apple. Type: Red. x:y. </Subject>   </Item>           │
└───────────────────────────────────────────────────────────────────────────────────────────────────────────┘
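To tie this back to the table in the question, the same split, replace, and re-join idea could be written as an UPDATE. This is only a sketch under some assumptions: the column holds nothing but a sequence of <Item>...</Item> blocks, the item to change is the one named First, and the alias names item, part, and idx are illustrative. It uses WITH ORDINALITY (available since PostgreSQL 9.4) so the items are re-joined in their original order, and `\1` in the replacement puts back the whitespace captured by `(\s+)`.

```sql
UPDATE fruit_schema.fruit_table AS profile
SET my_column = (
    SELECT string_agg(
               CASE
                   -- Only the item named First is rewritten.
                   WHEN item.part LIKE '%<Name>First</Name>%' THEN
                       regexp_replace(item.part,
                                      'Food: Apple\.(\s+)(?!Type: Red\.)',
                                      'Food: Apple. Type: Red.\1',
                                      'g')
                   ELSE item.part
               END || '</Item>',          -- restore the delimiter removed by the split
               '' ORDER BY item.idx)      -- keep the original item order
    FROM unnest(string_to_array(profile.my_column, '</Item>'))
         WITH ORDINALITY AS item(part, idx)
    -- Skip the blank fragment that follows the last </Item>.
    WHERE btrim(item.part, E' \n\r\t') <> ''
)
WHERE profile.my_column LIKE '%<Name>First</Name>%';
```

Because the negative lookahead skips items that already contain "Type: Red.", running the statement a second time should leave the column unchanged. Any non-blank text after the last </Item> would get an extra closing tag appended, so it is worth checking the result with a SELECT before updating.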

postgresql

postgresql-10