NZLOAD works while an external table in Netezza fails with "count of bad input rows reached maxerrors limit"

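We have an intermittent issue with a Netezza external table.

The external table fails on a file that the system itself generated (that is, the file was produced by an external table, not by some other source). Yet when we load the same file into another table with the nzload utility, it works without any issues. The problem is inconsistent and most of the time cannot be reproduced.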

CREATE EXTERNAL TABLE SP_PORTFOLIO_EXT_DATA_6128_140
(
    CLIENT_ID INTEGER,
    CONFIG_ID INTEGER,
    SCENARIO_ID INTEGER,
    PORTFOLIO_ID INTEGER,
    PORTFOLIO_NAME CHARACTER VARYING(200),
    CUSTOM13 CHARACTER VARYING(600),
    CUSTOM12 CHARACTER VARYING(500),
    CUSTOM11 CHARACTER VARYING(500),
    CUSTOM10 CHARACTER VARYING(500),
    CUSTOM9 CHARACTER VARYING(500),
    CUSTOM8 CHARACTER VARYING(500),
    CUSTOM7 CHARACTER VARYING(500),
    CUSTOM6 CHARACTER VARYING(2000),
    CUSTOM3 CHARACTER VARYING(500),
    CUSTOM2 CHARACTER VARYING(3000),
    CUSTOM1 CHARACTER VARYING(500),
    CREATIVE CHARACTER VARYING(512),
    PLACEMENT CHARACTER VARYING(5000),
    IMPRESSIONS NUMERIC(38,0),
    CLICKS NUMERIC(38,0),
    CONVERSIONS INTEGER,
    TRUE_CONVERSIONS NUMERIC(38,6),
    OPTMETRIC NUMERIC(38,6),
    LASTAD_OPTMETRIC NUMERIC(38,6),
    CURRSPEND NUMERIC(38,6)
)
USING
(
    DATAOBJECT('/san5/Netezza/CAR/CAR_ZEUS/SPBU/test/SP_PORTFOLIO_EXT_DATA_6128_140.csv')
    DELIMITER 254
    ESCAPECHAR '/'
    TIMESTYLE '24HOUR'
    LOGDIR '/tmp'
    Y2BASE 2000
    ENCODING 'internal'
);

Command completed successfully.

select COUNT(*) from SP_PORTFOLIO_EXT_DATA_6128_140;
ERROR [HY000] ERROR:  External Table : count of bad input rows reached maxerrors limit

NZLOAD approach

CREATE TABLE TEST_LOAD
(
    CLIENT_ID INTEGER,
    CONFIG_ID INTEGER,
    SCENARIO_ID INTEGER,
    PORTFOLIO_ID INTEGER,
    PORTFOLIO_NAME CHARACTER VARYING(200),
    CUSTOM13 CHARACTER VARYING(600),
    CUSTOM12 CHARACTER VARYING(500),
    CUSTOM11 CHARACTER VARYING(500),
    CUSTOM10 CHARACTER VARYING(500),
    CUSTOM9 CHARACTER VARYING(500),
    CUSTOM8 CHARACTER VARYING(500),
    CUSTOM7 CHARACTER VARYING(500),
    CUSTOM6 CHARACTER VARYING(2000),
    CUSTOM3 CHARACTER VARYING(500),
    CUSTOM2 CHARACTER VARYING(3000),
    CUSTOM1 CHARACTER VARYING(500),
    CREATIVE CHARACTER VARYING(512),
    PLACEMENT CHARACTER VARYING(5000),
    IMPRESSIONS NUMERIC(38,0),
    CLICKS NUMERIC(38,0),
    CONVERSIONS INTEGER,
    TRUE_CONVERSIONS NUMERIC(38,6),
    OPTMETRIC NUMERIC(38,6),
    LASTAD_OPTMETRIC NUMERIC(38,6),
    CURRSPEND NUMERIC(38,6)
)
DISTRIBUTE ON RANDOM;

# Loading data from the same file using nzload

nzload -host 10.200.29.30 -u xxxxx -pw xxxxx -db SPBU_REPORT_DB_TEST -t test_load -delim 254 -ctrlChars  -df /san5/Netezza/CAR/CAR_ZEUS/SPBU/test/SP_PORTFOLIO_EXT_DATA_6128_140.csv

Load session of table 'TEST_LOAD' completed successfully

[ja.prod@inet11026 ~]$ cat /san5/Netezza/CAR/CAR_ZEUS/SPBU/test/SP_PORTFOLIO_EXT_DATA_6128_140.csv|wc -l
191322

select count(*) from test_load;
191322

Adding the nzlog:

File Buffer Size (MB): 8                  Load Replay Region (MB): 0
  Encoding:              INTERNAL           Max errors:            1
  Skip records:          0                  Max rows:              0
  FillRecord:            No                 Truncate String:       No
  Escape Char:           '/'                Accept Control Chars:  No
  Allow CR in string:    No                 Ignore Zero:           No
  Quoted data:           NO                 Require Quotes:        No

  BoolStyle:             1_0                Decimal Delimiter:     '.'

  Disable NFC:           No
  Date Style:            YMD                Date Delim:            '-'
  Y2Base:                2000
  Time Style:            24HOUR             Time Delim:            ':'
  Time extra zeros:      No

Found bad records

bad #: input row #(byte offset to last char examined) [field #, declaration] diagnostic, "text consumed"[last char examined]
----------------------------------------------------------------------------------------------------------------------------
1: 25(184) [21, INT4] expected field delimiter or end of record, "0"[.]

Statistics

  number of records read:      25
  number of bad records:       1
  -------------------------------------------------
  number of records loaded:    0

  Elapsed Time (sec): 0.0

-----------------------------------------------------------------------------
Load completed at: 08-Oct-15 09:59:04 EDT

The .nzbad data containing the bad row (pipe characters stand in for the actual delimiter for readability):

140|1305|6128||NULL|SEO|SEO|test.com/vehicledetail/detail/632888199/overview|SEO|SEO|SEO|SEO Brand|SEO Brand|best Tracking|Google(Seo)|SEO|Impression Tracker|Unknown|0|1|0|0.000000|0.000000|0.000000|0.000000

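From the nzlog we can see that the load failed on row 25. Specifically, when it tried to load column 21, it encountered a value that is not an integer.

The log shows that it consumed a 0 and then hit a period. So the data at that position probably contains something like 0.0 or 0.1234, which cannot be loaded as an integer.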

bad #: input row #(byte offset to last char examined) [field #, declaration] diagnostic, "text consumed"[last char examined]
----------------------------------------------------------------------------------------------------------------------------
1: 25(184) [21, INT4] expected field delimiter or end of record, "0"[.]
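
As an illustration of how to read such a diagnostic line, the following Python sketch splits it into its parts. The regular expression and the function name parse_bad_record are assumptions made for this example, based only on the single line shown above. It is not an official nzlog parser, and other diagnostics may be formatted differently.

```python
import re

# Pattern derived from the header of the "bad records" section in the nzlog:
#   bad #: input row #(byte offset) [field #, declaration] diagnostic,
#   "text consumed"[last char examined]
BAD_RECORD = re.compile(
    r'^(?P<bad>\d+): (?P<row>\d+)\((?P<offset>\d+)\) '
    r'\[(?P<field>\d+), (?P<decl>\w+)\] '
    r'(?P<diagnostic>.*), "(?P<consumed>.*)"\[(?P<last_char>.*)\]$'
)


def parse_bad_record(line):
    """Return the parts of one bad-record line as a dict, or None if it does not match."""
    match = BAD_RECORD.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    # Convert the purely numeric parts to integers.
    for key in ("bad", "row", "offset", "field"):
        record[key] = int(record[key])
    return record


record = parse_bad_record(
    '1: 25(184) [21, INT4] expected field delimiter or end of record, "0"[.]'
)
print(record)
```

Read this way, the line says: input row 25, field 21 (declared INT4), the loader consumed "0" and then stopped at "." where it expected a delimiter or the end of the record.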

Using the .nzbad data you provided (with '|' in place of your actual delimiter for readability):

140|1305|6128||NULL|SEO|SEO|test.com/vehicledetail/detail/632888199/overview|SEO|SEO|SEO|SEO Brand|SEO Brand|best Tracking|Google(Seo)|SEO|Impression Tracker|Unknown|0|1|0|0.000000|0.000000|0.000000|0.000000

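One thing I notice here is that you have a varchar field containing '/'. One difference between your external table and your nzload command is that the external table specifies ESCAPECHAR '/' while nzload does not.

You will find that your data 'test.com/vehicledetail/detail/632888199/overview' is loaded as 'test.comvehicledetaildetail632888199overview', because the '/' characters are removed unless they are themselves escaped (e.g. '//').

If a '/' sits directly before a column delimiter in the data, the delimiter is escaped and treated as part of the data. The loader then treats what is really the 22nd column in the data as the 21st column of the table, which matches what we see here.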
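
The effect can be sketched with a small Python simulation. This is not Netezza's actual parser: split_fields is a hypothetical helper that assumes the escape character simply makes the next character literal, as described above. The sample row is made up and uses '|' in place of the real delimiter.

```python
def split_fields(line, delim="|", escape=None):
    """Split a line into fields; if escape is set, the character after it is taken literally."""
    fields, current = [], []
    i = 0
    while i < len(line):
        ch = line[i]
        if escape is not None and ch == escape and i + 1 < len(line):
            # Drop the escape character and keep the next character as data,
            # even if that character is the delimiter.
            current.append(line[i + 1])
            i += 2
            continue
        if ch == delim:
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    fields.append("".join(current))
    return fields


# Hypothetical row: the second field ends with '/', directly before a delimiter.
row = "140|test.com/detail/|0|0.000000"

print(split_fields(row))               # no escape character, as in the nzload run
print(split_fields(row, escape="/"))   # escape character '/', as in the external table
```

Without an escape character the row splits into four fields. With '/' as the escape character, the slashes disappear, the delimiter after the second field is swallowed, and the third field becomes '0.000000', which an integer column would reject in the same way as the "0"[.] diagnostic.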

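As you said, ScottMcG, I compared the nzlog files generated by nzload and by the external table and found that the escape character was the only difference. So I commented that part out and tried again, and everything worked.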

CREATE EXTERNAL TABLE SP_PORTFOLIO_EXT_DATA_6128_140
(
    CLIENT_ID INTEGER,
    CONFIG_ID INTEGER,
    SCENARIO_ID INTEGER,
    PORTFOLIO_ID INTEGER,
    PORTFOLIO_NAME CHARACTER VARYING(200),
    CUSTOM13 CHARACTER VARYING(600),
    CUSTOM12 CHARACTER VARYING(500),
    CUSTOM11 CHARACTER VARYING(500),
    CUSTOM10 CHARACTER VARYING(500),
    CUSTOM9 CHARACTER VARYING(500),
    CUSTOM8 CHARACTER VARYING(500),
    CUSTOM7 CHARACTER VARYING(500),
    CUSTOM6 CHARACTER VARYING(2000),
    CUSTOM3 CHARACTER VARYING(500),
    CUSTOM2 CHARACTER VARYING(3000),
    CUSTOM1 CHARACTER VARYING(500),
    CREATIVE CHARACTER VARYING(512),
    PLACEMENT CHARACTER VARYING(5000),
    IMPRESSIONS NUMERIC(38,0),
    CLICKS NUMERIC(38,0),
    CONVERSIONS INTEGER,
    TRUE_CONVERSIONS NUMERIC(38,6),
    OPTMETRIC NUMERIC(38,6),
    LASTAD_OPTMETRIC NUMERIC(38,6),
    CURRSPEND NUMERIC(38,6)
)
USING
(
    DATAOBJECT('/san5/Netezza/CAR/CAR_ZEUS/SPBU/test/SP_PORTFOLIO_EXT_DATA_6128_140.csv')
    DELIMITER 254
    TIMESTYLE '24HOUR'
    LOGDIR '/tmp'
    Y2BASE 2000
    ENCODING 'internal'
);

select COUNT(*) from SP_PORTFOLIO_EXT_DATA_6128_140;

191322
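
If you want to check which rows of a file would be affected while ESCAPECHAR '/' is still in place, a quick scan for a '/' byte directly followed by the delimiter byte (254) is one option. The sketch below is only an assumption about how such a check could look. The function name rows_with_escaped_delimiter is made up for this example, and the sample lines are invented, not taken from the real file.

```python
DELIM = bytes([254])   # DELIMITER 254 from the external table definition
ESCAPE = b"/"          # ESCAPECHAR '/' from the original definition


def rows_with_escaped_delimiter(lines):
    """Return 1-based row numbers where ESCAPE is directly followed by DELIM.

    Simplification: a doubled escape ('//') before the delimiter is also
    reported, although there the delimiter would not actually be escaped.
    """
    needle = ESCAPE + DELIM
    return [n for n, line in enumerate(lines, start=1) if needle in line]


# Invented sample rows; for a real file, pass a handle opened in binary mode:
#     with open(path, "rb") as handle:
#         print(rows_with_escaped_delimiter(handle))
sample = [
    b"140" + DELIM + b"test.com/overview" + DELIM + b"0\n",
    b"140" + DELIM + b"test.com/overview/" + DELIM + b"0\n",
]
print(rows_with_escaped_delimiter(sample))
```

A file with no such rows would load the same way with or without the escape character, apart from the dropped slashes. That would be consistent with the failure showing up only on some files.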

The data types must be changed as follows: replace CHARACTER VARYING with VARCHAR/NVARCHAR.

CREATE TABLE TEST_LOAD
(
    CLIENT_ID INTEGER,
    CONFIG_ID INTEGER,
    SCENARIO_ID INTEGER,
    PORTFOLIO_ID INTEGER,
    PORTFOLIO_NAME VARCHAR(200),
    CUSTOM13 VARCHAR(600),
    CUSTOM12 VARCHAR(500),
    CUSTOM11 VARCHAR(500),
    CUSTOM10 VARCHAR(500),
    CUSTOM9 VARCHAR(500),
    CUSTOM8 VARCHAR(500),
    CUSTOM7 VARCHAR(500),
    CUSTOM6 VARCHAR(2000),
    CUSTOM3 VARCHAR(500),
    CUSTOM2 VARCHAR(3000),
    CUSTOM1 VARCHAR(500),
    CREATIVE VARCHAR(512),
    PLACEMENT VARCHAR(5000),
    IMPRESSIONS NUMERIC(38,0),
    CLICKS NUMERIC(38,0),
    CONVERSIONS INTEGER,
    TRUE_CONVERSIONS NUMERIC(38,6),
    OPTMETRIC NUMERIC(38,6),
    LASTAD_OPTMETRIC NUMERIC(38,6),
    CURRSPEND NUMERIC(38,6)
)
DISTRIBUTE ON RANDOM;