Replace Unicode characters in T-SQL

How do I replace only the last character of the string:

select REPLACE('this is the news with a þ', 'þ', '__')

The result I get is:

__is is __e news wi__ a __

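Edit: the server and database collation is Latin1_General_CI_AS.

The actual query I'm running is REPLACE(note, 'þ', ''), where note is an ntext column. The point is to remove the thorn characters, because that character is used later in the process as a column delimiter. (Please don't suggest changing the delimiter; given how widely it is used, that is not going to happen!)

I have also tried using the N prefix, even with the test select statement.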

This might work for you:

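DECLARE @text NVARCHAR(1000) = N'this is the news with a þ';
DECLARE @find NVARCHAR(1000) = N'þ';
DECLARE @replace NVARCHAR(1000) = N'_';

-- Give VARCHAR an explicit length: CAST(... AS VARCHAR) defaults to 30 characters and would truncate longer text.
SELECT REPLACE(CAST(@text AS VARCHAR(1000)), CAST(@find AS VARCHAR(1000)), CAST(@replace AS VARCHAR(1000)));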

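The þ character (Extended ASCII, via ISO-8859-1 and ANSI code page 1252, and Unicode value 254, i.e. U+00FE) is known as "thorn", and in some languages it equates directly to th:

  • Technical info on the character here: http://unicode-table.com/en/00FE/

  • Explanation of that character and collations here: http://userguide.icu-project.org/collation/customization. Search the page (usually Control-F) for "Complex Tailoring Examples" and you will see the following: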

    The letter 'þ' (THORN) is normally treated by UCA/root collation as a separate letter that has primary-level sorting after 'z'. However, in Swedish and some other Scandinavian languages, 'þ' and 'Þ' should be treated as just a tertiary-level difference from the letters "th" and "TH" respectively.
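
To see whether a given collation equates þ with th, a small comparison can help. This is only a sketch and not part of the original answer: the column aliases are made up for illustration, and the exact outcome depends on the collation and SQL Server version in use. On the asker's Latin1_General_CI_AS collation the first column would be expected to report a match, in line with the REPLACE behaviour shown in the question.

```sql
-- Illustrative check (hypothetical aliases): compare þ with 'th'
-- under a linguistic collation and under a binary collation.
SELECT
    CASE WHEN N'þ' = N'th' COLLATE Latin1_General_CI_AS
         THEN 'equal' ELSE 'different' END AS linguistic_compare,
    CASE WHEN N'þ' = N'th' COLLATE Latin1_General_100_BIN2
         THEN 'equal' ELSE 'different' END AS binary_compare;
```

A binary collation compares code points rather than applying linguistic rules, which is why it avoids the expansion of þ to th.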

If you do not want þ to equate to th, then force a binary collation as follows:

SELECT REPLACE(N'this is the news with a þ' COLLATE Latin1_General_100_BIN2,
                 N'þ', N'__');

Returns:

this is the news with a __
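
Applied to the ntext column from the question, the same idea might look like the sketch below. The temp table #notes and the alias cleaned_note are hypothetical stand-ins for the real schema. Two assumptions are worth stating: REPLACE does not accept ntext arguments, so the column is cast to NVARCHAR(MAX) first, and a binary collation is case-sensitive, so the uppercase Þ is handled by a second REPLACE in case it also needs to go.

```sql
-- Hypothetical temp table standing in for the real table with the ntext column.
CREATE TABLE #notes (note NTEXT);
INSERT INTO #notes (note) VALUES (N'this is the news with a þ and a Þ');

-- Cast ntext to NVARCHAR(MAX) because REPLACE rejects ntext arguments.
-- The binary collation stops þ from being matched against 'th';
-- it is repeated on the outer call so both comparisons are binary.
SELECT REPLACE(
           REPLACE(CAST(note AS NVARCHAR(MAX)) COLLATE Latin1_General_100_BIN2,
                   N'þ', N'') COLLATE Latin1_General_100_BIN2,
           N'Þ', N'') AS cleaned_note
FROM #notes;

DROP TABLE #notes;
```

If only the lowercase þ is used as the delimiter, the outer REPLACE can be dropped.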

For more information on working with collations, Unicode, encodings, etc., see: Collations Info