Netezza: ERROR: 65536 : Record size limit exceeded

Question

Can anyone explain the behavior below?

KAP.ADMIN(ADMIN)=> create table char1 ( a char(64000),b char(1516));

CREATE TABLE

KAP.ADMIN(ADMIN)=> create table char2 ( a char(64000),b char(1517));

ERROR: 65536 : Record size limit exceeded

KAP.ADMIN(ADMIN)=> insert into char1 select * from char1;

ERROR: 65540 : Record size limit exceeded => Why does the insert fail with this error when CREATE TABLE succeeded for the same table, as shown above?

KAP.ADMIN(ADMIN)=> \d char1
                      Table "CHAR1"

Attribute |       Type       | Modifier | Default Value

-----------+------------------+----------+---------------

A         | CHARACTER(64000) |          |

B         | CHARACTER(1516)  |          |

Distributed on hash: "A"

./nz_ddl_table KAP char1

Creating table:  "CHAR1"

CREATE TABLE  CHAR1
(
     A        character(64000),
     B        character(1516)
)
DISTRIBUTE ON (A)
;

/*
       Number of columns  2

    (Variable) Data Size  4 - 65520
            Row Overhead  28
  ======================  =============

  Total Row Size (bytes)  32 - 65548

*/

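I would like to know how the row size is calculated in the case above. I checked the Netezza database user's guide, but I could not follow the calculation for the example above.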

Answer 1

First, create a temp table from a single row of data.

create temp table tmptable as
select *
from your_table  -- placeholder: the table whose row size you want to measure
limit 1;

Then check the number of bytes used by the temp table. That should be the size of one row.

select used_bytes
from _v_sys_object_storage_size a inner join
_v_table b
on a.tblid = b.objid
-- unquoted identifiers are normally stored in upper case, so compare case-insensitively
and upper(b.tablename) = 'TMPTABLE';

Netezza has some limits: 1) maximum number of characters in a char/varchar field: 64,000; 2) maximum row size: 65,535 bytes.

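In NZ (Netezza), a record cannot be longer than about 65 k bytes. Although an NZ box offers a huge amount of space, it is still a good idea to size columns from an accurate estimate rather than picking lengths arbitrarily. In your case, does every attribute really need to be char(64000), or could the lengths be reduced after analysing the real data? If they can be reduced, revisit the attribute lengths. Also, in situations like this, avoid insert into char1 select * ...... statements, because they let the system choose the data types, which may end up larger than necessary.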
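One way to act on the advice about revisiting column lengths is to compare the declared lengths with what the data actually needs. The following Python sketch is only an illustration: the function name `suggest_lengths`, the sample rows, and the idea of exporting a sample of rows to Python first are all assumptions, not something from the Netezza tooling.

```python
def suggest_lengths(rows, declared):
    """Compare declared CHAR lengths with the longest value actually seen.

    rows:     list of dicts mapping column name -> string value (a data sample)
    declared: dict mapping column name -> declared CHAR length
    """
    report = {}
    for col, decl in declared.items():
        # CHAR values are blank-padded, so strip trailing spaces before measuring.
        # Lengths are measured in UTF-8 bytes, not characters.
        longest = max(
            (len(row[col].rstrip().encode("utf-8")) for row in rows),
            default=0,
        )
        report[col] = {
            "declared": decl,
            "longest_seen": longest,
            "unused": decl - longest,
        }
    return report


# Hypothetical sample using the column names from the question.
sample_rows = [
    {"a": "short note", "b": "x"},
    {"a": "a somewhat longer note", "b": "xyz"},
]
report = suggest_lengths(sample_rows, {"a": 64000, "b": 1516})
for col, info in report.items():
    print(col, info)
```

A large `unused` value on a representative sample suggests the column could be declared much shorter, which keeps the row well under the record size limit.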

Answer 2

I think this link explains the overhead of Netezza / PDA data types well:

For every row of every table, there is a 24-byte fixed overhead of the rowid, createxid, and deletexid. If you have any nullable columns, a null vector is required and it is N/8 bytes where N is the number of columns in the record. The system rounds up the size of this header to a multiple of 4 bytes.
In addition, the system adds a record header of 4 bytes if any of the following is true:

Column of type VARCHAR
Column of type CHAR where the length is greater than 16 (stored internally as VARCHAR)
Column of type NCHAR
Column of type NVARCHAR

Using UTF-8 encoding, each Unicode code point can require 1 - 4 bytes of storage. A 10-character string requires 10 bytes of storage if it is ASCII and up to 20 bytes if it is Latin, or as many as 40 bytes if it is Kanji.

The only time a record does not contain a header is if all the columns are defined as NOT NULL, there are no character data types larger than 16 bytes, and no variable character data types.

https://www.ibm.com/support/knowledgecenter/SSULQD_7.2.1/com.ibm.nz.dbu.doc/c_dbuser_data_types_calculate_row_size.html
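The rules quoted above can be turned into a rough estimator. The Python sketch below is one reading of that passage, not Netezza's actual implementation: the `Column` class and function names are hypothetical, N/8 is assumed to round up to a whole byte, and declared lengths are treated as bytes (which understates NCHAR/NVARCHAR columns). For the `char1` table in the question it reproduces the 32 and 65548 figures printed by `nz_ddl_table`, but it does not try to reproduce the exact numbers in the error messages.

```python
from dataclasses import dataclass


@dataclass
class Column:
    sql_type: str          # e.g. "char", "varchar", "nchar", "nvarchar", "integer"
    length: int            # declared length, treated as bytes in this sketch
    nullable: bool = True


def round_up_4(n):
    """Round n up to the next multiple of 4."""
    return (n + 3) // 4 * 4


def header_bytes(columns):
    """Per-row overhead according to the quoted rules."""
    size = 24  # rowid, createxid, deletexid
    if any(c.nullable for c in columns):
        # Null vector of N/8 bytes; rounding up to a whole byte is an assumption.
        size += -(-len(columns) // 8)
    size = round_up_4(size)
    # A 4-byte record header is added for variable-length style columns.
    needs_record_header = any(
        c.sql_type in ("varchar", "nchar", "nvarchar")
        or (c.sql_type == "char" and c.length > 16)
        for c in columns
    )
    if needs_record_header:
        size += 4
    return size


def max_row_bytes(columns):
    """Overhead plus every column filled to its declared length."""
    return header_bytes(columns) + sum(c.length for c in columns)


# The char1 table from the question: a char(64000), b char(1516), both nullable.
char1 = [Column("char", 64000), Column("char", 1516)]
print(header_bytes(char1), max_row_bytes(char1))  # 32 65548
```

Under this reading, the 28 bytes of "Row Overhead" in the `nz_ddl_table` output would be the 24 fixed bytes plus a 4-byte rounded null vector, and the extra 4 bytes in the data size would be the record header.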