Removing the EOL delimiter when inserting into an external table (Oracle)
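I have included notrim on the rowdata column of my external table, as Alex suggested (this is a continuation of an earlier question),
but now the end-of-line characters are also appended to the row data column; that is, the line ending (CR-LF) is concatenated to the end of the row data.
I don't want to use substr() or translate(), because the file is about 1GB.
My external table creation statement: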
'CREATE TABLE ' || rec.ext_table_name || ' (ROW_DATA VARCHAR2(4000)) ORGANIZATION EXTERNAL ' ||
'(TYPE ORACLE_LOADER DEFAULT DIRECTORY ' || rec.dir_name || ' ACCESS ' || 'PARAMETERS (RECORDS ' ||
'DELIMITED by NEWLINE NOBADFILE NODISCARDFILE ' ||
'FIELDS REJECT ROWS WITH ALL NULL FIELDS (ROW_DATA POSITION(1:4000) char)) LOCATION (' || l_quote ||
'temp.txt' || l_quote || ')) REJECT LIMIT UNLIMITED'
Is there any other parameter I can add to remove the end-of-line characters? Thanks.
Edit 1:
My file:
Some first line with spaces at end
Some second line with spaces at end
My external table:
Some first line with spaces at end <EOL>
Some second line with spaces at end <EOL>
To make it clearer, I will explain in Java terms (as if I assigned the column value to a string, as below).
Without notrim:
rowdata[1]="Some first line with spaces at end";
rowdata[2]="Some second line with spaces at end";
With notrim:
rowdata[1]="Some first line with spaces at end \n";
rowdata[2]="Some second line with spaces at end \n";
What I want it to be:
rowdata[1]="Some first line with spaces at end ";
rowdata[2]="Some second line with spaces at end ";
The delimiter also becomes part of the row data because notrim is specified.
Edit 2:
Line-Endings : CRLF
Platform:
Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production
PL/SQL Release 12.1.0.1.0 - Production
"CORE 12.1.0.1.0 Production"
TNS for Solaris: Version 12.1.0.1.0 - Production
NLSRTL Version 12.1.0.1.0 - Production
SELECT DUMP(ROW_DATA,1016) FROM EXT_TABLE WHERE ROWNUM = 1;
Typ=1 Len=616 CharacterSet=AL32UTF8:
41,30,30,30,30,30,30,30,30,30,30,31,30,30,30,30,37,36,36,36,44,30,30,30,30,31,32,35,30,38,31,36,32,35,30,38,31,36,31,33,34,37,30,39,44,42,20,41,30,36,31,30,30,30,30,30,30,30,30,30,30,30,30,32,30,30,4d,59,52,20,32,5a,20,30,31,36,30,30,30,31,32,31,32,33,34,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,30,52,49,42,46,50,58,30,30,30,31,30,30,30,30,30,30,30,30,31,30,36,32,38,30,31,30,32,30,30,47,20,20,20,20,53,20,20,30,30,30,30,30,30,30,30,30,30,30,20,20,20,20,20,20,20,4e,39,32,37,32,20,20,20,20,20,20,30,30,30,30,30,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,30,30,39,39,38,54,45,53,54,52,52,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,54,45,53,54,4f,50,44,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,54,45,53,54,54,52,41,4e,53,49,44,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,54,45,53,54,52,52,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,54,45,53,54,4f,50,44,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,54,45,53,54,54,52,41,4e,53,49,44,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,20,d
The length should be 615.
Your file's line endings are CRLF (suggesting the file was created on Windows?), but your database is running on Solaris. As the documentation says:
If DELIMITED BY NEWLINE is specified, then the actual value used is platform-specific. On UNIX platforms, NEWLINE is assumed to be "\n". On Windows operating systems, NEWLINE is assumed to be "\r\n".
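Since your database platform is Unix, it uses only LF (\n) as the record delimiter, so the CR is left at the end of each record. You can either change the line endings in the file, or change the delimited by clause to look for Windows line endings:
...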
records delimited by "\r\n" nobadfile ...
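If you are not sure which case a given file falls into, a quick check from the shell can tell you. This is only a sketch: the sample file is a throwaway created with mktemp and its contents are made up, so point the same commands at your real data file instead. It counts how many lines end in a carriage return and prints the last byte of the first record as it looks when split on LF alone, which should match the trailing d in your DUMP output.

```shell
#!/bin/sh
# Throwaway sample with CRLF line endings (hypothetical data, not the real file).
f=$(mktemp)
printf 'first line \r\nsecond line \r\n' > "$f"

# Build a literal carriage return so nothing depends on a tool understanding \r.
CR=$(printf '\r')

# Total number of LF-terminated lines, and how many of them end in CR.
total=$(wc -l < "$f" | tr -d ' ')
crlf=$(grep -c "${CR}\$" "$f")
echo "lines: $total, ending in CR: $crlf"

# Last byte of the first record when only LF is treated as the delimiter.
last_byte=$(head -1 "$f" | tr -d '\n' | tail -c 1 | od -An -tx1 | tr -d ' \n')
echo "last byte of record 1: 0x$last_byte"
```

If every line ends in CR, the file is consistently CRLF and changing the record delimiter is enough. A mix of the two counts means the endings are inconsistent.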
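If you might get files with either kind of line ending and can't control that, you could add a preprocessor step to strip out any carriage returns that are present. Create an executable script file, either in the same directory as the data file or (as Oracle recommends) in a different Oracle-accessible directory, called for example remove_cr. Oracle passes the data file's path to the script as its first argument and reads the records from the script's standard output, so the script contains:
#!/bin/sh
/usr/bin/sed -e "s/\r$//" "$1"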
You can then add a call to it in the external table definition, and keep the newline terminator:
...
records delimited by newline nobadfile nodiscardfile
preprocessor 'remove_cr'
...
Do read the security warnings in the documentation, though.
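Before wiring the script into the table definition, it may be worth running it by hand. The following is a rough test harness, not part of the original setup: the scratch directory, the script name demo_remove_cr and the sample file are all made up for illustration. It also builds the carriage return with printf instead of writing \r in the sed expression, since as far as I know not every sed treats \r as a carriage return. The demo calls sed through PATH, whereas the real preprocessor script should keep the full path (such as /usr/bin/sed).

```shell
#!/bin/sh
# Hypothetical scratch area; the real script would live in an Oracle directory object.
work=$(mktemp -d)

# Sample data with CRLF line endings and a trailing space on each line.
printf 'Line1sometext \r\nLine2sometext \r\n' > "$work/sample.txt"

# Same idea as remove_cr: strip one trailing CR per line from the file named in $1.
# The CR is produced by printf so the sed expression contains the literal byte.
cat > "$work/demo_remove_cr" <<'EOF'
#!/bin/sh
CR=$(printf '\r')
sed -e "s/${CR}\$//" "$1"
EOF
chmod +x "$work/demo_remove_cr"

# Run it the way the access driver would: data file path as the only argument.
cr_count=$("$work/demo_remove_cr" "$work/sample.txt" | tr -cd '\r' | wc -c | tr -d ' ')
first_line=$("$work/demo_remove_cr" "$work/sample.txt" | head -1)

echo "CR bytes left: $cr_count"
echo "first line: <$first_line>"
```

The trailing space inside the angle brackets should survive, which is what notrim is meant to preserve, while the CR count should drop to zero.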
Demo with a temp.txt file that has CRLF line endings:
create table t42_ext (
row_data varchar2(4000)
)
organization external
(
type oracle_loader default directory d42 access parameters
(
records delimited by newline nobadfile nodiscardfile
preprocessor 'remove_cr'
fields reject rows with all null fields
(
row_data position(1:4000) char notrim
)
)
location ('temp.txt')
)
reject limit unlimited;
select '<'|| row_data ||'>' from t42_ext;
'<'||ROW_DATA||'>'
--------------------------------------------------------------------------------
<Line1sometext >
<Line2sometext >
<Line3sometext >