How to insert long text (>3K chars) in columns with unique constraint
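I am having some problems inserting very long text in postgres. When I have a simple table with a single text column, I can insert text of any length I need (I tested up to 40K characters). However, when I add a unique constraint, I start getting a strange btree error; see the minimal working example (MWE) below.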
MWE:
#!/bin/bash
N=4096 # Aiming for a URL of length 4,096 characters
DB_NAME='foo'
# Generate a random alphanumeric URL of length N (4,096 characters)
URL="http://www.$(cat /dev/urandom | tr -dc 'a-zA-Z0-9' | fold -w $(($N-15)) | head -n 1).com"
# Case 1 we have a table ('website') which has a single column with no
# constraints
TABLE_NAME='website'
sudo -u postgres psql -c "drop database if exists $DB_NAME;"
sudo -u postgres psql -c "create database $DB_NAME;"
sudo -u postgres psql -d $DB_NAME -c "drop table if exists $TABLE_NAME;"
sudo -u postgres psql -d $DB_NAME -c "create table $TABLE_NAME (url text);"
sudo -u postgres psql -d $DB_NAME -c "insert into $TABLE_NAME (url) values ('$URL');"
# Case 2 we have a table ('website2') which has a single column which must be
# unique
TABLE_NAME='website2'
sudo -u postgres psql -c "drop database if exists $DB_NAME;"
sudo -u postgres psql -c "create database $DB_NAME;"
sudo -u postgres psql -d $DB_NAME -c "drop table if exists $TABLE_NAME;"
sudo -u postgres psql -d $DB_NAME -c "create table $TABLE_NAME (url text unique);"
sudo -u postgres psql -d $DB_NAME -c "insert into $TABLE_NAME (url) values ('$URL');"
Output:
$ ./test.sh
DROP DATABASE
CREATE DATABASE
NOTICE: table "website" does not exist, skipping
DROP TABLE
CREATE TABLE
INSERT 0 1
DROP DATABASE
CREATE DATABASE
NOTICE: table "website2" does not exist, skipping
DROP TABLE
CREATE TABLE
ERROR: index row size 4112 exceeds btree version 4 maximum 2704 for index "website2_url_key"
DETAIL: Index row references tuple (0,1) in relation "website2".
HINT: Values larger than 1/3 of a buffer page cannot be indexed.
Consider a function index of an MD5 hash of the value, or use full text indexing.
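The 4112 in the error message can be roughly reproduced by hand. The sketch below is an estimate, not PostgreSQL's actual code: it assumes an 8-byte index tuple header, a 4-byte length header on the text value, 8-byte alignment, and a value that is stored uncompressed (random alphanumeric data compresses poorly). The function name `index_row_size` is made up for illustration.

```shell
#!/bin/bash
# Rough estimate of a single-column btree index row size for a text value.
# Assumptions (not from the original post): 8-byte index tuple header,
# 4-byte length header on the value, 8-byte alignment, no compression.
index_row_size() {
    local payload_bytes=$1
    local raw=$(( 8 + 4 + payload_bytes ))
    # Round up to the next multiple of 8.
    echo $(( (raw + 7) / 8 * 8 ))
}

BTREE_MAX=2704   # limit quoted in the error message

size=$(index_row_size 4096)
echo "estimated index row size: $size"   # prints 4112
if [ "$size" -gt "$BTREE_MAX" ]; then
    echo "too large for a btree index entry (max $BTREE_MAX)"
fi
```

Under these assumptions, any value longer than roughly 2.6K bytes that does not compress will hit the same error, whatever the column type.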
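Question: what should I do if my text is very long (>3K characters)?
- Is there any way to get rid of this btree error?
- Should I drop the unique constraint and check uniqueness only at the application level?
- Should I compress all the URLs?
- Is it simply not possible to do this with PSQL?
I found a solution in "UNIQUE constraint on large VARCHARs - PostgreSQL".
Solution: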
N=4096 # Aiming for a URL of length 4,096 characters
DB_NAME='foo'
# Generate a random alphanumeric URL of length N (4,096 characters)
URL="http://www.$(cat /dev/urandom | tr -dc 'a-zA-Z0-9' | fold -w $(($N-15)) | head -n 1).com"
# To get over the length limitation of a text column with a unique
# constraint we carry out the following
# 1) Remove UNIQUE constraint from column
# 2) Add UNIQUE constraint for md5 of column
TABLE_NAME='websitemd5'
sudo -u postgres psql -c "drop database if exists $DB_NAME;"
sudo -u postgres psql -c "create database $DB_NAME;"
sudo -u postgres psql -d $DB_NAME -c "drop table if exists $TABLE_NAME;"
sudo -u postgres psql -d $DB_NAME -c "create table $TABLE_NAME (url text);"
sudo -u postgres psql -d $DB_NAME -c "create unique index unique_url_index on $TABLE_NAME (md5(url));"
sudo -u postgres psql -d $DB_NAME -c "insert into $TABLE_NAME (url) values ('$URL');"
sudo -u postgres psql -d $DB_NAME -c "insert into $TABLE_NAME (url) values ('$URL');"
Output:
DROP DATABASE
CREATE DATABASE
NOTICE: table "websitemd5" does not exist, skipping
DROP TABLE
CREATE TABLE
CREATE INDEX
INSERT 0 1
ERROR: duplicate key value violates unique constraint "unique_url_index"
DETAIL: Key (md5(url))=(71bf6c554ab335360cd657d060f84c2d) already exists.
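The reason this works is that the index now stores a fixed-length digest instead of the URL itself. A minimal sketch of that property, computed client-side with `md5sum` (assumed to be GNU coreutils); the helper `md5_hex` and both URLs are hypothetical, and the digest is expected to match PostgreSQL's `md5()` only for plain ASCII input like this:

```shell
#!/bin/bash
# Hex MD5 digest of a string, computed client-side with md5sum.
# Assumption: for plain ASCII input this matches PostgreSQL's md5().
md5_hex() {
    printf '%s' "$1" | md5sum | cut -d' ' -f1
}

# Hypothetical inputs: a short URL and a 4,096-character one like the MWE's.
short_url='http://www.example.com'
long_url="http://www.$(head -c 4081 /dev/zero | tr '\0' 'a').com"

short_digest=$(md5_hex "$short_url")
long_digest=$(md5_hex "$long_url")

# Both digests are 32 hex characters, whatever the input length.
echo "short URL: ${#short_url} chars -> digest of ${#short_digest} chars"
echo "long URL:  ${#long_url} chars -> digest of ${#long_digest} chars"
```

One trade-off to keep in mind: uniqueness is now enforced on the digest, so two different URLs that happened to share an MD5 would be rejected as duplicates. For ordinary data this is extremely unlikely.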
Better yet, cast the md5 to type uuid:
CREATE UNIQUE INDEX unique_url_index ON $TABLE_NAME ((md5(url)::uuid)); -- parens required
This makes the index smaller and faster.
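To see why the uuid cast shrinks the index, note that the MD5 digest is 32 hex characters as text but only 16 bytes as a binary value, and uuid is a 16-byte type. The sketch below only illustrates the arithmetic and the 8-4-4-4-12 display form; `hex_to_uuid` is a hypothetical helper, and the digest is the one from the duplicate-key error above.

```shell
#!/bin/bash
# Format a 32-character hex digest in the usual uuid layout (8-4-4-4-12).
hex_to_uuid() {
    echo "$1" | sed -E 's/^(.{8})(.{4})(.{4})(.{4})(.{12})$/\1-\2-\3-\4-\5/'
}

digest='71bf6c554ab335360cd657d060f84c2d'   # digest from the error output

echo "uuid form: $(hex_to_uuid "$digest")"
# Two hex characters encode one byte, so 32 hex chars are 16 bytes of data.
echo "as text: ${#digest} hex chars; as uuid: $(( ${#digest} / 2 )) bytes"
```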