Removing the EOL delimiter when loading into an external table (Oracle)

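I have added notrim to the rowdata column of my external table, as Alex suggested (this is a continuation of an earlier question),

but now the end-of-line characters are appended to the row data column as well. That is, the line ending (CR-LF) is attached to the end of the row data.

I don't want to use substr() or translate(), because the file is about 1 GB.

My external table creation statement: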

'CREATE TABLE ' || rec.ext_table_name || ' (ROW_DATA VARCHAR2(4000)) ORGANIZATION EXTERNAL ' ||
     '(TYPE ORACLE_LOADER DEFAULT DIRECTORY ' || rec.dir_name || ' ACCESS ' || 'PARAMETERS (RECORDS ' ||
     'DELIMITED by NEWLINE NOBADFILE NODISCARDFILE ' ||
     'FIELDS REJECT ROWS WITH ALL NULL FIELDS (ROW_DATA POSITION(1:4000) char)) LOCATION (' || l_quote ||
     'temp.txt' || l_quote || ')) REJECT LIMIT UNLIMITED'

Is there any other parameter I can add to remove the end-of-line characters? Thanks.

Edit 1:

My file:

Some first line with spaces at end
Some second line with spaces at end

My external table:

Some first line with spaces at end    <EOL>
Some second line with spaces at end   <EOL>

To make it clearer, I will explain in Java terms (as if I assigned the column values to strings, as shown below).

Without notrim:

rowdata[1]="Some first line with spaces at end";
rowdata[2]="Some second line with spaces at end";

With notrim:

rowdata[1]="Some first line with spaces at end    \n";
rowdata[2]="Some second line with spaces at end   \n";

What I want it to be:

rowdata[1]="Some first line with spaces at end    ";
rowdata[2]="Some second line with spaces at end   ";

The delimiter also becomes part of the row data, because notrim is specified.

Edit 2:

Line-Endings : CRLF

Platform:

Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production
PL/SQL Release 12.1.0.1.0 - Production
"CORE 12.1.0.1.0 Production"
TNS for Solaris: Version 12.1.0.1.0 - Production
NLSRTL Version 12.1.0.1.0 - Production

SELECT DUMP(ROW_DATA,1016) FROM EXT_TABLE WHERE ROWNUM = 1;

Typ=1 Len=616 CharacterSet=AL32UTF8: 41,30,30,30,30,30,30,30,30,30,30,31,30,30,30,30,37,36,36,36,44,30,30,30,30,31,32,35,30,38,31,36,32,35,30,38,31,36,31,33,34,37,30,39,44,42,20,41,30,36,31,30,30,30,30,30,30,30,30,30,30,30,30,32,30,30,4d,59,52,20,32,5a,20,30,31,36,30,30,30,31,32,31,32,33,34,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,52,49,42,46,50,58,30,30,30,31,30,30,30,30,30,30,30,30,31,30,36,32,38,30,31,30,32,30,30,47,20,20,20,20,53,20,20,30,30,30,30,30,30,30,30,30,30,30,20,20,20,20,20,20,20,4e,39,32,37,32,20,20,20,20,20,20,30,30,30,30,30,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,30,30,39,39,38,54,45,53,54,52,52,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,54,45,53,54,4f,50,44,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,54,45,53,54,54,52,41,4e,53,49,44,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,54,45,53,54,52,52,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,54,45,53,54,4f,50,44,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,54,45,53,54,54,52,41,4e,53,49,44,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,d

Len should be 615.

Your file has CRLF line endings (suggesting it was created on Windows?), but your database is running on Solaris. As the documentation says:

If DELIMITED BY NEWLINE is specified, then the actual value used is platform-specific. On UNIX platforms, NEWLINE is assumed to be "\n". On Windows operating systems, NEWLINE is assumed to be "\r\n".

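Since your database platform is Unix, it uses only LF (\n) as the record delimiter, so the CR (the trailing 0d in your dump) is left in the data. You can either change the line endings in the file, or change the delimited by clause to look for Windows line endings:

...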
records delimited by "\r\n" nobadfile ...
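If changing the file is the easier route, the conversion could be done before the load with something like the following. This is only a sketch: the file names are made up for illustration, and it assumes a POSIX shell with sed and tr available.

```shell
#!/bin/sh
# Sketch: strip a trailing CR from every line of a CRLF file before loading it.
# The file names below are hypothetical stand-ins for the real data file.
workdir=$(mktemp -d)
src="$workdir/temp_crlf.txt"
dst="$workdir/temp.txt"

# Build a small sample with Windows (CRLF) line endings and trailing spaces.
printf 'Line1sometext       \r\nLine2sometext       \r\n' > "$src"

# Hold a literal carriage return in a variable, so the sed expression does
# not rely on sed understanding the \r escape (not every implementation does).
cr=$(printf '\r')

# Remove one CR at the end of each line; trailing spaces are left alone.
sed "s/${cr}\$//" "$src" > "$dst"

# Count the CR bytes in each file to confirm the conversion.
before=$(tr -cd '\r' < "$src" | wc -c | tr -d ' ')
after=$(tr -cd '\r' < "$dst" | wc -c | tr -d ' ')
echo "CR bytes before: $before, after: $after"
```

Only a CR directly before the line feed is removed, so the trailing spaces that notrim is meant to preserve stay in place.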

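If you might get files with either kind of line ending and have no control over that, you can add a preprocessor step to strip out any carriage returns that are present. Create an executable script file, either in the same directory as the data file or (as Oracle recommends) in a different Oracle-accessible directory, called say remove_cr, which contains: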

/usr/bin/sed -e "s/\r$//" "$1"

You can then add a call to it in the external table definition, and keep the newline terminator:

...
records delimited by newline nobadfile nodiscardfile
preprocessor 'remove_cr'
...

Be sure to read the security warnings in the documentation, though.

A demo with a temp.txt file that has CRLF line endings:
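Before wiring the script into the table definition, it may be worth running it by hand the way the access driver does, with the data file path as its only argument and the cleaned records read from standard output. The sketch below is an assumption-laden illustration: it writes a remove_cr-style script into a temporary directory, uses a literal CR instead of the \r escape in case the local sed does not support it, and falls back to /bin/sed where /usr/bin/sed is missing.

```shell
#!/bin/sh
# Sketch: exercise a remove_cr-style preprocessor script outside Oracle.
# Paths and file names here are hypothetical, for illustration only.
workdir=$(mktemp -d)
script="$workdir/remove_cr"
data="$workdir/temp.txt"

# Write the script. The quoted heredoc keeps $1 and ${cr} literal.
cat > "$script" <<'EOF'
#!/bin/sh
# Prefer /usr/bin/sed as in the original script; fall back to /bin/sed.
sed_bin=/usr/bin/sed
[ -x "$sed_bin" ] || sed_bin=/bin/sed
cr=$(printf '\r')
"$sed_bin" -e "s/${cr}\$//" "$1"
EOF
# Oracle needs the execute bit set on the preprocessor script.
chmod +x "$script"

# Sample data with CRLF line endings and trailing spaces.
printf 'Line1sometext       \r\nLine2sometext       \r\n' > "$data"

# Run the script with the data file as its single argument. It is started
# through sh here so the check does not depend on the temporary directory
# permitting direct execution.
remaining=$(sh "$script" "$data" | tr -cd '\r' | wc -c | tr -d ' ')
lines=$(sh "$script" "$data" | wc -l | tr -d ' ')
echo "lines: $lines, CR bytes remaining: $remaining"
```

If the count of remaining CR bytes is zero and the line count matches the file, the script should behave the same way when the external table calls it.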

create table t42_ext (
  row_data varchar2(4000)
)
organization external
(
  type oracle_loader default directory d42 access parameters
  (
    records delimited by newline nobadfile nodiscardfile
    preprocessor 'remove_cr'
    fields reject rows with all null fields
    (
      row_data position(1:4000) char notrim
    )
  )
  location ('temp.txt')
)
reject limit unlimited;

select '<'|| row_data ||'>' from t42_ext;

'<'||ROW_DATA||'>'                                                             
--------------------------------------------------------------------------------
<Line1sometext       >                                                          
<Line2sometext       >                                                          
<Line3sometext       >