How to get the UTF-8 code from a single character in VBScript
I want to get the UTF-8 code of a character. I tried using a stream, but it doesn't seem to work:
Example: according to https://en.wikipedia.org/wiki/Pe_(Semitic_letter)#Character_encodings, פ should give 16#D7A4
Const adTypeBinary = 1
Dim adoStr, bytesthroughado
Set adoStr = CreateObject("Adodb.Stream")
adoStr.Charset = "utf-8"
adoStr.Open
adoStr.WriteText labelString
adoStr.Position = 0
adoStr.Type = adTypeBinary
adoStr.Position = 3
bytesthroughado = adoStr.Read
Msgbox(LenB(bytesthroughado)) 'gives 2
adoStr.Close
Set adoStr = Nothing
MsgBox(bytesthroughado) ' gives K
Note: AscW gives the Unicode code point, not the UTF-8 bytes.
bytesthroughado is a value of the Byte() subtype (see the first line of the output), so you need to handle it accordingly:
Option Explicit
Dim ss, xx, ii, jj, char, labelString
labelString = "ařЖפ€"
ss = ""
For ii = 1 To Len(labelString)
    char = Mid(labelString, ii, 1)
    xx = BytesThroughAdo(char)
    ' First output line: VarType and TypeName of the returned value
    If ss = "" Then ss = VarType(xx) & " " & TypeName(xx) & vbNewLine
    ss = ss & char & vbTab
    ' Walk the byte array one byte at a time and append each byte in hex
    For jj = 1 To LenB(xx)
        ss = ss & Hex(AscB(MidB(xx, jj, 1))) & " "
    Next
    ss = ss & vbNewLine
Next
Wscript.Echo ss

' Returns the UTF-8 bytes of labelChar as a Byte() value
Function BytesThroughAdo(labelChar)
    Const adTypeBinary = 1 'Indicates binary data.
    Const adTypeText = 2 'Default. Indicates text data.
    Dim adoStream
    Set adoStream = CreateObject("Adodb.Stream")
    adoStream.Charset = "utf-8"
    adoStream.Open
    adoStream.WriteText labelChar
    adoStream.Position = 0          ' Type can only be changed at position 0
    adoStream.Type = adTypeBinary
    adoStream.Position = 3          ' skip the 3-byte UTF-8 BOM (EF BB BF)
    BytesThroughAdo = adoStream.Read
    adoStream.Close
    Set adoStream = Nothing
End Function
Output:
cscript D:\bat\SO368074q.vbs
8209 Byte()
a 61
ř C5 99
Ж D0 96
פ D7 A4
€ E2 82 AC
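The Position = 3 step is easier to follow once you see what the stream holds before the skip. The following is a minimal sketch, not part of the original script, that dumps every byte of the stream starting at position 0. The variable names (stream, bytes, dump) are only illustrative, and it assumes ADODB.Stream is available, as it is on a typical Windows installation.

```vbscript
Option Explicit

Const adTypeBinary = 1

Dim stream, bytes, i, dump
Set stream = CreateObject("ADODB.Stream")
stream.Charset = "utf-8"
stream.Open
stream.WriteText "a"
stream.Position = 0             ' rewind before switching the stream type
stream.Type = adTypeBinary
bytes = stream.Read             ' read from position 0, so the BOM is included
stream.Close
Set stream = Nothing

' Print each byte as two hex digits
dump = ""
For i = 1 To LenB(bytes)
    dump = dump & Right("0" & Hex(AscB(MidB(bytes, i, 1))), 2) & " "
Next
WScript.Echo dump               ' EF BB BF 61
```

The first three bytes are the UTF-8 byte order mark that the stream writes ahead of the text. Only the bytes after it belong to the character, which is why the function above reads from position 3.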
I used the characters ařЖפ€ to demonstrate what your UTF-8 encoder does (the alts8.ps1 PowerShell script comes from another project):
alts8.ps1 "ařЖפ€"
Ch Unicode Dec CP IME UTF-8 ? IME 0405/cs-CZ; CP852; ANSI 1250
a U+0061 97 …97… 0x61 a Latin Small Letter A
ř U+0159 345 …89… 0xC599 Å� Latin Small Letter R With Caron
Ж U+0416 1046 …22… 0xD096 Ð� Cyrillic Capital Letter Zhe
פ U+05E4 1508 …228… 0xD7A4 פ Hebrew Letter Pe
€ U+20AC 8364 …172… 0xE282AC â�¬ Euro Sign
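Since AscW returns the Unicode code point rather than the UTF-8 bytes, the bytes can also be computed from that code point arithmetically, without a stream. The following is a rough sketch of that alternative, not the method used above. Utf8HexOfChar and HexByte are hypothetical helper names. It assumes a single character from the Basic Multilingual Plane; surrogate halves of characters above U+FFFF are not handled.

```vbscript
Option Explicit

' Hypothetical helper: encodes one BMP character to UTF-8 by bit arithmetic
' and returns the bytes as a space-separated hex string.
Function Utf8HexOfChar(ch)
    Dim cp
    cp = AscW(ch)
    ' AscW returns a signed 16-bit Integer; map negative values to 32768..65535
    If cp < 0 Then cp = cp + 65536
    If cp < &H80 Then
        ' 1 byte: 0xxxxxxx
        Utf8HexOfChar = HexByte(cp)
    ElseIf cp < &H800 Then
        ' 2 bytes: 110xxxxx 10xxxxxx
        Utf8HexOfChar = HexByte(&HC0 Or (cp \ &H40)) & " " & _
                        HexByte(&H80 Or (cp And &H3F))
    Else
        ' 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        Utf8HexOfChar = HexByte(&HE0 Or (cp \ &H1000)) & " " & _
                        HexByte(&H80 Or ((cp \ &H40) And &H3F)) & " " & _
                        HexByte(&H80 Or (cp And &H3F))
    End If
End Function

' Formats a byte value as exactly two hex digits
Function HexByte(b)
    HexByte = Right("0" & Hex(b), 2)
End Function

' ChrW keeps the example independent of the script file's encoding
WScript.Echo Utf8HexOfChar(ChrW(&H05E4))   ' D7 A4      (Hebrew letter Pe)
WScript.Echo Utf8HexOfChar(ChrW(&H20AC))   ' E2 82 AC   (Euro sign)
```

Both results match the stream-based output above. The stream approach is still the simpler choice when the system's own encoder is good enough.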