Why do length(column) and lengthb(column) return the same length?

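length(column) and lengthb(column) return the same length in Oracle, even when the value contains multibyte characters. When I check lengthb by copy-pasting the multibyte column value as a literal, it returns a larger number.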

SELECT column1, 
       Length(column1)              AS length_C, 
       Lengthb(column1)              AS length_B, 
       Lengthb('100749 ¬ 100749 ¬ ') AS bytelength 
FROM   db.sample 
+--------------------+----------+----------+------------+
| column1            | length_C | length_B | bytelength |
+--------------------+----------+----------+------------+
| 100749 ¬ 100749 ¬  |       17 |       17 |         19 |
+--------------------+----------+----------+------------+

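Well, both belong to the same family of LENGTH functions:

  • LENGTH (number of characters)
  • LENGTHB (number of bytes)
  • LENGTHC (Unicode characters, normalized where possible)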

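I will show you an example, using a database whose character set is AL32UTF8. UTF-8 is the most popular Unicode encoding. It uses one byte for standard English letters and symbols, two bytes for additional Latin and Middle Eastern characters, and three bytes for Asian characters. Supplementary characters are represented using four bytes. UTF-8 is backward compatible with ASCII, because the first 128 characters map to the same values.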
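
The byte widths described above can be checked directly in the database. The following is only a minimal sketch, assuming the database character set is AL32UTF8; the aliases descr, n_chars and n_bytes are made up for illustration. It builds each character from its code point with UNISTR, so the result should not depend on how the client encodes the SQL text.

```sql
-- Sketch, assuming the database character set is AL32UTF8.
-- UNISTR builds a character from its Unicode code point (returns NVARCHAR2);
-- TO_CHAR converts that value to the database character set.
with t as (
  select 'ASCII letter a' as descr, 'a' as c from dual
  union all
  select 'u with umlaut (U+00FC)', to_char(unistr('\00FC')) from dual
  union all
  select 'euro sign (U+20AC)', to_char(unistr('\20AC')) from dual
)
select descr,
       length(c)  as n_chars,   -- counts characters
       lengthb(c) as n_bytes    -- counts bytes
from   t;
```

On an AL32UTF8 database each row should report one character, with 1, 2 and 3 bytes respectively.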

SQL> select value from nls_database_parameters where parameter = 'NLS_CHARACTERSET'
  2  ;

VALUE
--------------------------------------------------------------------------------
AL32UTF8

SQL> with t as ( select 'abcdefghijk' as c1, 'üäöß#+#üöä' as c2 from dual )
  2  select length(c1) , lengthb(c1) , lengthc ( c1 ) from t ;

LENGTH(C1) LENGTHB(C1) LENGTHC(C1)
---------- ----------- -----------
        11          11          11

SQL>  with t as ( select 'abcdefghijk' as c1, 'üäöß#+#üöä' as c2 from dual )
  2  select length(c2) , lengthb(c2), lengthc(c2) from t ;

LENGTH(C2) LENGTHB(C2) LENGTHC(C2)
---------- ----------- -----------
        17          45          17

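In the example, c1 contains only plain English letters, so all three functions return the same value. In the case of c2, you can see the difference between characters, bytes and Unicode characters.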

In such cases I always recommend using DUMP(). It is the best way to understand the internal representation of these characters.

SQL>  with t as ( select 'abcdefghijk' as c1, 'üäöß#+#üöä' as c2 from dual )
  2  select length(c1) as length_characters , dump(c1) as dump from t ;

LENGTH_CHARACTERS DUMP
----------------- -------------------------------------------------------
               11 Typ=96 Len=11: 97,98,99,100,101,102,103,104,105,106,107

SQL> with t as ( select 'abcdefghijk' as c1, 'üäöß#+#üöä' as c2 from dual )
  2  select length(c2) as length_characters , dump(c2) as dump from t ;

LENGTH_CHARACTERS
-----------------
DUMP
--------------------------------------------------------------------------------
               17
Typ=96 Len=45: 239,191,189,239,191,189,239,191,189,239,191,189,239,191,189,239,1
91,189,239,191,189,239,191,189,35,43,35,239,191,189,239,191,189,239,191,189,239,
191,189,239,191,189,239,191,189
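
One detail in the dump above is worth a closer look: the repeating triple 239,191,189 (hex EF BF BD) is the UTF-8 encoding of U+FFFD, the Unicode replacement character. That suggests the umlauts did not reach the database intact in this session, possibly because of a client encoding mismatch (this is an inference from the bytes, not something the session states). As a sketch, assuming an AL32UTF8 database, the same string can be built from code points so that the client encoding plays no role; the aliases n_chars, n_bytes and dump_hex are illustrative.

```sql
-- Sketch, assuming the database character set is AL32UTF8.
-- The string is 'üäöß#+#üöä' written as code points, so it does not
-- depend on the client's character set settings.
with t as (
  select to_char(unistr('\00FC\00E4\00F6\00DF#+#\00FC\00F6\00E4')) as c2
  from   dual
)
select length(c2)     as n_chars,
       lengthb(c2)    as n_bytes,
       dump(c2, 1016) as dump_hex   -- 1016 = hex bytes plus character set name
from   t;
```

With the characters stored correctly, this should report 10 characters and 17 bytes (seven two-byte characters plus three ASCII characters). The 17 matches the character count reported in the session above, which is consistent with each incoming byte having been turned into one replacement character.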

In your case, you made a mistake because you used lengthb twice (I guess one of them should be length). Check the internal representation of the string:

SQL> select dump('100749 ¬ 100749 ¬ ',1016) from dual ;

DUMP('100749??100749??',1016)
------------------------------------------------------------------------------------------------------------------------
Typ=96 Len=28 CharacterSet=AL32UTF8: 31,30,30,37,34,39,20,ef,bf,bd,ef,bf,bd,20,31,30,30,37,34,39,20,ef,bf,bd,ef,bf,bd,20

SQL>
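
To compare the pasted literal with what the table actually holds, the same DUMP call can be pointed at the column. This is a sketch that reuses the table and column names from the question (db.sample, column1); the alias stored_bytes is illustrative.

```sql
-- Sketch: inspect the bytes stored in the column rather than a pasted literal.
select column1,
       length(column1)     as length_c,
       lengthb(column1)    as length_b,
       dump(column1, 1016) as stored_bytes
from   db.sample;
```

In an AL32UTF8 database a correctly stored ¬ (U+00AC) would show up as c2,ac. If stored_bytes shows a single byte in that position, or a sequence such as ef,bf,bd, the stored value differs from the pasted literal, which would explain why the two lengthb results disagree.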