Cast binary column to string in Azure SQL Data Warehouse
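I currently have some functions in Postgres and Redshift that take a randomly generated string, hash it, and then use part of the hash to generate a random number between 0 and 99. I am trying to replicate this in Azure SQL Data Warehouse so that I get the same value in SQL DW as in Postgres and Redshift.
The problem I am running into is that when I cast the result to VARCHAR or use a string function on it, the result is a very different string. I would like to get the result of the md5 function as the same VARCHAR.
To illustrate, here is a query in Azure SQL DW: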
SELECT
'abc123' as random_string,
HASHBYTES('md5', 'abc123') as md5,
CAST(HASHBYTES('md5', 'abc123') AS VARCHAR) as md5_varchar,
RIGHT(HASHBYTES('md5', 'abc123'), 5) as md5_right
;
This produces:
random_string,md5,md5_varchar,md5_right
abc123,0xE99A18C428CB38D5F260853678922E03,éšÄ(Ë8Õò`…6x’.,6x’.
As you can see, the resulting varchar is very different from the output of the md5 function. Is there a way to convert the md5 result into the same string?
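To make the difference concrete, here is a minimal sketch in Python rather than T-SQL. It assumes that the plain CAST maps each of the 16 hash bytes to one character, whereas md5 in Postgres and Redshift returns the hexadecimal spelling of those bytes. The Windows-1252 code page is an assumption, chosen because it reproduces the output above; the real code page depends on the database collation.

```python
import hashlib

# The 16 raw bytes of the hash, comparable to what HASHBYTES('md5', ...) returns.
digest = hashlib.md5(b"abc123").digest()

# Byte-for-byte reinterpretation, roughly what CAST(... AS VARCHAR) does.
# cp1252 is an assumed code page; the real one depends on the collation.
as_chars = digest.decode("cp1252")

# Hexadecimal spelling, which is what md5() returns in Postgres and Redshift.
as_hex = digest.hex()

print(len(as_chars), repr(as_chars))
print(len(as_hex), as_hex)
```

The first line of output shows 16 characters, two of them unprintable control characters, which is why the md5_varchar and md5_right columns look shorter than expected.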
In Postgres and Redshift, the md5 function returns a VARCHAR, so converting it is straightforward.
Here are the queries in Redshift and Postgres:
-- Redshift
SELECT
'abc123' as random_string,
right(strtol(right(md5('abc123'), 3), 16), 2)::INT as tranche
;
-- Postgres
SELECT
'abc123' as random_string,
right(('x' || lpad(right(md5('abc123'), 3), 4, '0')) :: BIT(16) :: INT :: VARCHAR, 2) :: INT AS tranche
;
Both queries return 87.
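For reference, the arithmetic both queries perform can be sketched in Python. The function name `tranche` and the UTF-8 encoding of the input are assumptions made for illustration; this is not code from either database.

```python
import hashlib


def tranche(random_string: str) -> int:
    """Map a string to 0-99 using the last three hex digits of its MD5 hash."""
    # UTF-8 is an assumed input encoding; it matches plain ASCII strings like 'abc123'.
    hex_digest = hashlib.md5(random_string.encode("utf-8")).hexdigest()
    value = int(hex_digest[-3:], 16)   # 'e03' -> 3587 for 'abc123'
    return int(str(value)[-2:])        # keep the last two decimal digits


print(tranche("abc123"))  # 87
```

Taking the last two decimal digits is the same as `value % 100`, so any SQL DW port only needs the hex string of the hash to reproduce the value.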
Using CONVERT should solve the problem:
CONVERT(VARCHAR(32),HashBytes('MD5', 'abc123'),2)
This works because CONVERT accepts a style argument, which is what is needed when converting varbinary values. It is described here:
https://technet.microsoft.com/pl-pl/library/ms187928(v=sql.105).aspx
Here is the remarks section of that documentation on binary conversion with CONVERT:
Binary Styles When expression is binary(n), varbinary(n), char(n), or
varchar(n), style can be one of the values shown in the following
table. Style values that are not listed in the table return an error.
0 (default)
Translates ASCII characters to binary bytes or binary
bytes to ASCII characters. Each character or byte is converted 1:1. If
the data_type is a binary type, the characters 0x are added to the
left of the result.
1, 2
If the data_type is a binary type, the
expression must be a character expression. The expression must be
composed of an even number of hexadecimal digits (0, 1, 2, 3, 4, 5, 6,
7, 8, 9, A, B, C, D, E, F, a, b, c, d, e, f). If the style is set to 1
the characters 0x must be the first two characters in the expression.
If the expression contains an odd number of characters or if any of
the characters are invalid an error is raised. If the length of the
converted expression is greater than the length of the data_type the
result will be right truncated. Fixed length data_types that are
larger then the converted result will have zeros added to the right of
the result. If the data_type is a character type, the expression must
be a binary expression. Each binary character is converted into two
hexadecimal characters. If the length of the converted expression is
greater than the data_type length it will be right truncated. If the
data_type is a fix sized character type and the length of the
converted result is less than its length of the data_type; spaces are
added to the right of the converted expression to maintain an even
number of hexadecimal digits. The characters 0x will be added to the
left of the converted result for style 1.
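The rules quoted above can be imitated in a few lines of Python. This is only a sketch of the documented behaviour, not SQL Server's implementation: the helper names `binary_to_char` and `char_to_binary` are hypothetical, the cp1252 code page for style 0 is an assumption, and the uppercase hex output is assumed to match what CONVERT produces.

```python
import hashlib
import string


def binary_to_char(data: bytes, style: int, length: int) -> str:
    """Hypothetical helper imitating CONVERT(VARCHAR(length), data, style)."""
    if style == 0:
        # Each byte becomes one character (assumed code page, undefined bytes replaced).
        text = data.decode("cp1252", errors="replace")
    elif style in (1, 2):
        text = data.hex().upper()      # two hexadecimal characters per byte
        if style == 1:
            text = "0x" + text         # style 1 adds the 0x prefix
    else:
        raise ValueError("unsupported binary style")
    return text[:length]               # longer results are right truncated


def char_to_binary(text: str, style: int) -> bytes:
    """Hypothetical helper imitating CONVERT(VARBINARY, text, style) for styles 1 and 2."""
    if style == 1:
        if not text.startswith("0x"):
            raise ValueError("style 1 requires a leading 0x")
        text = text[2:]
    elif style != 2:
        raise ValueError("unsupported binary style")
    if len(text) % 2 or any(c not in string.hexdigits for c in text):
        raise ValueError("expected an even number of hexadecimal digits")
    return bytes.fromhex(text)


digest = hashlib.md5(b"abc123").digest()
print(binary_to_char(digest, 2, 32))   # E99A18C428CB38D5F260853678922E03
print(binary_to_char(digest, 1, 34))   # 0xE99A18C428CB38D5F260853678922E03
print(char_to_binary("0xE99A", 1))     # b'\xe9\x9a'
```

Because md5 in Postgres and Redshift returns lowercase hex, the CONVERT result would likely need LOWER() around it before comparing the strings directly; parsing the hex digits into a number is unaffected by case.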