Apply REGEXP pattern to string column in database

Question

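I'm trying to manipulate a string column by applying a REGEXP pattern to it.

All values are actually floats, but cast to strings. The goal is to remove all trailing zeros from the numbers, as shown below.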

input -> output
1.000 -> 1
1.100 -> 1.1
1.001 -> 1.001
0.001 -> 0.001
0.010 -> 0.01

I can achieve the above with the REGEXP ^(\d+(?:\.\d*?[1-9](?=0|\b))?)\.?0*$.
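For reference, the intended behavior of that pattern can be checked locally with a small sketch. This assumes Python's `re` module as the regex engine, which is not the same dialect the database uses, so it only illustrates what the pattern is meant to do. The helper name `strip_trailing_zeros` is made up for this example.

```python
import re

# Pattern copied from the question; evaluated here by Python's `re`,
# not by the database's regex engine.
PATTERN = re.compile(r'^(\d+(?:\.\d*?[1-9](?=0|\b))?)\.?0*$')


def strip_trailing_zeros(value: str) -> str:
    """Return `value` without trailing zeros; non-matching input is returned unchanged."""
    match = PATTERN.match(value)
    return match.group(1) if match else value


samples = ["1.000", "1.100", "1.001", "0.001", "0.010"]
results = [strip_trailing_zeros(s) for s in samples]
for sample, result in zip(samples, results):
    print(f"{sample} -> {result}")
```

This prints the same input -> output pairs as the list above.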

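Question: how can I apply the above REGEXP to all rows of a string column using SQLAlchemy? The operation must be done on the database side; because of the size of the data, moving it out of the database is not an option.

The Postgres equivalent would be: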

select TRIM(trailing '00' FROM CAST(numeric_col::decimal(32, 8) as text))
from my_table;

Any help is appreciated.

Answer 1

Given a table like this:

test# table t71689267;
 id │  col  
════╪═══════
  1 │ 1.000
  2 │ 1.100
  3 │ 1.001
  4 │ 0.001
  5 │ 0.010

You can do this in PostgreSQL like so* (note that regexp_match returns an array):

test# select col, regexp_match(col, '^(\d+(?:\.\d*?[^0])?)\.?0*$') from t71689267;
  col  │ regexp_match 
═══════╪══════════════
 1.000 │ {1}
 1.100 │ {1.1}
 1.001 │ {1.001}
 0.001 │ {0.001}
 0.010 │ {0.01}

The equivalent in SQLAlchemy is

import sqlalchemy as sa
...
with engine.connect() as conn:
    # tbl is a SQLAlchemy Table instance corresponding to the table in the database
    q = sa.select(sa.func.regexp_match(tbl.c.col, r'^(\d+(?:\.\d*?[^0])?)\.?0*$'))
    rows = conn.execute(q)
    for row in rows.scalars():
        print(row)
    print()

which gives

['1']
['1.1']
['1.001']
['0.001']
['0.01']

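* The pattern in the question doesn't seem to work for all cases; the one I used works for all the strings in the question. You can of course substitute any pattern you prefer.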
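Since regexp_match returns an array, a possible variant (my substitution, not part of the answer above) is PostgreSQL's regexp_replace with a `\1` backreference, which yields plain text and can also be used in an UPDATE so that every row is rewritten on the database side. The sketch below assumes SQLAlchemy 1.4 or later and a hypothetical table definition modeled on the sample table; it only compiles the statements to SQL and has not been run against a live database here.

```python
import sqlalchemy as sa
from sqlalchemy.dialects import postgresql

# Hypothetical table definition mirroring the sample table above.
metadata = sa.MetaData()
tbl = sa.Table(
    "t71689267",
    metadata,
    sa.Column("id", sa.Integer, primary_key=True),
    sa.Column("col", sa.String),
)

# Same pattern as in the answer; group 1 is the value without trailing zeros.
PATTERN = r'^(\d+(?:\.\d*?[^0])?)\.?0*$'

# regexp_replace(source, pattern, replacement): the whole match is replaced
# by group 1, so the result is text rather than an array.
trimmed = sa.func.regexp_replace(tbl.c.col, PATTERN, r'\1')

# Read-only variant: original value next to the trimmed one.
select_stmt = sa.select(tbl.c.col, trimmed.label("trimmed"))

# In-place variant: rewrites the column for all rows inside the database.
update_stmt = sa.update(tbl).values(col=trimmed)

# Compile for PostgreSQL without needing a connection.
select_sql = str(select_stmt.compile(dialect=postgresql.dialect()))
update_sql = str(update_stmt.compile(dialect=postgresql.dialect()))
print(select_sql)
print(update_sql)
```

To actually run it, something like `with engine.begin() as conn: conn.execute(update_stmt)` should do, where `engine` is your own PostgreSQL engine. It is probably worth trying the SELECT variant first to confirm the output before updating anything.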

Tags: python, regex, postgresql, sqlalchemy