One or more UTF8 fields contain non-UTF 8 data, editing might give unexpected results

Question

I have a text file (ISO-8859-1) on Oracle Linux 7.2, and I am trying to load it into a table in my Oracle DB 12.1c (AL32UTF8).

    declare
        f         Utl_File.File_Type;
        v_Line    varchar2(1000);  -- one line read from the file
        v_Table   Parse.Varchar2_Table;
        v_Nfields integer;
    begin
        f := Utl_File.Fopen('SA', '1.txt', 'R');
        if Utl_File.Is_Open(f) then
            loop
                begin
                    Utl_File.Get_Line(f, v_Line, 1000);
                    if v_Line is null then
                        exit;
                    end if;
                    -- split the tab-delimited line into fields
                    Parse.Delimstring_To_Table(v_Line, v_Table, v_Nfields, Chr(9));
                    --insert into ...
                exception
                    -- Get_Line raises No_Data_Found at end of file
                    when No_Data_Found then
                        exit;
                end;
            end loop;
        end if;
        Utl_File.Fclose(f);
    end;

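The parsing is done with this.

In PL/SQL Developer the output looks fine (correct), but I get the message "One or more UTF8 fields contain non-UTF 8 data, editing might give unexpected results".

In Apex 5 the output is incorrect.

Can I do anything about this? I have tried CONVERT, TRANSLATE, and so on in Oracle...

Update 1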

select *
  from nls_database_parameters
 where parameter like '%CHARACTERSET%';

PARAMETER               VALUE
NLS_NCHAR_CHARACTERSET  AL16UTF16
NLS_CHARACTERSET        AL32UTF8

Answer 1

The UTL_FILE documentation says:

UTL_FILE expects that files opened by UTL_FILE.FOPEN in text mode are encoded in the database character set.

That is clearly not the case here: the file is ISO-8859-1, while the database character set is AL32UTF8.

Use DBMS_LOB.OPEN() to open a BFILE (see BFILENAME), read it as RAW values, and convert them to VARCHAR2 with the UTL_I18N.RAW_TO_CHAR() function.

Then you can use your Parse.Delimstring_To_Table function to parse the lines.
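
A minimal sketch of that approach, not a tested implementation. It assumes that 'SA' is the directory object from the question, that the Oracle name for ISO-8859-1 is WE8ISO8859P1, and that lines are short enough to fit in the buffers; the variable names are illustrative only.

```sql
declare
    v_File    bfile := Bfilename('SA', '1.txt');
    v_Raw     raw(2000);
    v_Amount  integer;
    v_Offset  integer := 1;
    v_Length  integer;
    v_Pending varchar2(32767);  -- decoded text not yet split into lines
    v_Line    varchar2(32767);
    v_Pos     pls_integer;
begin
    Dbms_Lob.Open(v_File, Dbms_Lob.File_Readonly);
    v_Length := Dbms_Lob.Getlength(v_File);
    while v_Offset <= v_Length loop
        v_Amount := 2000;
        -- v_Amount comes back as the number of bytes actually read
        Dbms_Lob.Read(v_File, v_Amount, v_Offset, v_Raw);
        v_Offset := v_Offset + v_Amount;
        -- ISO-8859-1 is single-byte, so a chunk boundary never splits a character
        v_Pending := v_Pending || Utl_I18n.Raw_To_Char(v_Raw, 'WE8ISO8859P1');
        loop
            v_Pos := Instr(v_Pending, Chr(10));
            -- Instr returns NULL once v_Pending is empty
            exit when v_Pos is null or v_Pos = 0;
            v_Line    := Rtrim(Substr(v_Pending, 1, v_Pos - 1), Chr(13));
            v_Pending := Substr(v_Pending, v_Pos + 1);
            -- Parse.Delimstring_To_Table(v_Line, v_Table, v_Nfields, Chr(9));
            Dbms_Output.Put_Line(v_Line);
        end loop;
    end loop;
    -- v_Pending still holds a final line if the file has no trailing newline
    Dbms_Lob.Close(v_File);
end;
/
```

Reading fixed-size chunks is only safe here because the source character set is single-byte; with a multibyte source encoding a chunk could end in the middle of a character.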

Also consider using an EXTERNAL TABLE or SQL*Loader; they may be easier to use.
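
As a rough illustration of the external table option, the definition below declares the file's character set so that Oracle converts it to the database character set on read. The table name Lines_Ext and the columns Col1 and Col2 are hypothetical, since the question does not show the target table; 'SA' and '1.txt' are taken from the question.

```sql
create table Lines_Ext (
    Col1 varchar2(1000),  -- hypothetical columns; adjust to the real file layout
    Col2 varchar2(1000)
)
organization external (
    type oracle_loader
    default directory SA
    access parameters (
        records delimited by newline
        characterset WE8ISO8859P1      -- ISO-8859-1 source file
        fields terminated by 0x'09'    -- tab-delimited, as in the question
        missing field values are null
    )
    location ('1.txt')
)
reject limit unlimited;
```

The rows could then be copied with an ordinary `insert into ... select ... from Lines_Ext`, without any PL/SQL parsing.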

oracle

utf-8

oracle-apex