How to remove an apostrophe from a JSON array value in a Redshift table?

Question

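I have a JSON-like string in a column of my Redshift table, for example [{'id': 2746, 'name': "HaZore'a", 'lat': Decimal('32.644516'), 'lon': Decimal('35.11918')}]. I need to remove the single quote from the name value (HaZore'a), then replace all remaining single quotes with double quotes and strip the square brackets, because JSON only recognizes double quotes. I tried this: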

replace(replace(left(right((replace(replace(city, '\'', '"'), 'None', '"None"')),
                                              len((replace(replace(city, '\'', '"'), 'None', '"None"'))) - 1),
                                        len((replace(replace(city, '\'', '"'), 'None', '"None"'))) - 2), '\'',
                                   '"'), 'Decimal(', ''), ')', '')

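But it leaves a stray double quote inside the name value, like this: {"id": 2746, "name": "HaZore"a", "lat": "32.644516", "lon": "35.11918 "}.

How should I update the statement above to remove the apostrophe from the name value in the column? I have several rows whose name value contains an apostrophe.

Please help.

Thanks, Ajinkya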

Answer 1

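I think you need to turn to regexp_replace() to do the pattern matching more precisely. See: https://docs.aws.amazon.com/redshift/latest/dg/REGEXP_REPLACE.html

Since this text is JSON-like, I am assuming that any double-quoted text containing a single quote consists only of alphabetic characters (apart from the single quote itself). You can use that for the pattern match.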
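As a rough illustration of that assumption, the sketch below runs the pattern against three made-up names in PostgreSQL syntax, with quotes doubled inside the literals. Only the first name is purely alphabetic on both sides of the apostrophe, so it is the only one the pattern rewrites. The sample names and the aliases `t`, `name` and `cleaned` are hypothetical.

```sql
-- PostgreSQL sketch; the names below are illustrative, not taken from the real table.
select name,
       regexp_replace(name, '("[[:alpha:]]*)''([[:alpha:]]*")', '\1\2') as cleaned
from (values ('"HaZore''a"'),      -- letters only: the apostrophe is removed
             ('"Haifa"'),          -- no apostrophe: left unchanged
             ('"Be''er Sheva"')    -- contains a space: the pattern does not match
     ) as t(name);
```

If the real names can contain spaces, digits or punctuation, the character class would need to be widened before relying on this approach.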

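I have a Postgres fiddle to demonstrate (see the comments). Redshift and Postgres escape single quotes inside string literals differently, so some adjustment is needed, but I'll get to that below.

Data setup: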

create table test as 
select '[{''id'': 2746, ''name'': "HaZore''a", ''lat'': Decimal(''32.644516''), ''lon'': Decimal(''35.11918'')}]'::text 
as txt;

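SQL query:

select replace(regexp_replace(txt, '("[[:alpha:]]*)''([[:alpha:]]*")', '\1-\2'), '''', '"') 
from test;

The \1 and \2 in the replacement put back the text captured on either side of the apostrophe. I changed the apostrophe to a dash to make the result easier to see. Change it or remove it as you need.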

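Now, to escape single quotes in Redshift, I believe you need to backslash them. Because the backslash then acts as an escape character inside the string literal, I am also assuming the backreferences have to be written with doubled backslashes. So the query changes to:

select replace(regexp_replace(txt, '("[[:alpha:]]*)\'([[:alpha:]]*")', '\\1-\\2'), '\'', '"') 
from test;

I don't have a Redshift cluster spun up to test in that environment, so please excuse any mistakes.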
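To tie this back to the full requirement in the question (apostrophe removed, single quotes swapped for double quotes, Decimal(...) wrappers and square brackets stripped), here is a minimal sketch in PostgreSQL syntax. It inlines the sample row in a CTE named `sample` (a hypothetical name) so that it runs on its own. It has not been run on Redshift, where the quoting and backreference syntax may need the adjustments discussed above.

```sql
-- PostgreSQL sketch; behavior on Redshift is untested.
with sample(txt) as (
  select '[{''id'': 2746, ''name'': "HaZore''a", ''lat'': Decimal(''32.644516''), ''lon'': Decimal(''35.11918'')}]'::text
)
select btrim(                                  -- 5. strip the outer [ and ]
         replace(                              -- 4. drop the ) left over from Decimal(...)
           replace(                            -- 3. drop the Decimal( prefix
             replace(                          -- 2. turn every remaining ' into "
               regexp_replace(                 -- 1. remove the apostrophe inside the name
                 txt,
                 '("[[:alpha:]]*)''([[:alpha:]]*")',
                 '\1\2',
                 'g'),
               '''', '"'),
             'Decimal(', ''),
           ')', ''),
         '[]') as cleaned
from sample;
-- cleaned: {"id": 2746, "name": "HaZorea", "lat": "32.644516", "lon": "35.11918"}
```

Step 4 removes every closing parenthesis, so it would also alter a name that happens to contain one. In PostgreSQL, casting the result to json is one way to confirm that it parses. On Redshift, IS_VALID_JSON is intended for the same kind of check.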
