How to get the UTF-8 code from a single character in VBScript

I want to get the UTF-8 code of a character. I tried using a stream, but it does not seem to work:

Example: according to https://en.wikipedia.org/wiki/Pe_(Semitic_letter)#Character_encodings, פ should give 16#D7A4.

Const adTypeBinary = 1
Dim adoStr, bytesthroughado
Set adoStr = CreateObject("Adodb.Stream")
    adoStr.Charset = "utf-8"
    adoStr.Open
    adoStr.WriteText labelString
    adoStr.Position = 0 
    adoStr.Type = adTypeBinary
    adoStr.Position = 3 
    bytesthroughado = adoStr.Read
    Msgbox(LenB(bytesthroughado)) 'gives 2
    adoStr.Close
Set adoStr = Nothing
MsgBox(bytesthroughado) ' gives K

Note: AscW gives the Unicode code point, not the UTF-8 encoding.

bytesthroughado is a value of the Byte() subtype (see the first line of the output), so you need to handle it accordingly:

Option Explicit

Dim ss, xx, ii, jj, char, labelString

labelString = "ařЖפ€"
ss = ""
For ii=1 To Len( labelString)
  char = Mid( labelString, ii, 1)
  xx = BytesThroughAdo( char)
  If ss = "" Then ss = VarType(xx) & " " & TypeName( xx) & vbNewLine
  ss = ss & char & vbTab
  For jj=1 To LenB( xx)
      ss = ss & Hex( AscB( MidB( xx, jj, 1))) & " "
  Next
  ss = ss & vbNewLine
Next   

Wscript.Echo ss

Function BytesThroughAdo( labelChar)
    Const adTypeBinary = 1  'Indicates binary data.
    Const adTypeText   = 2  'Default. Indicates text data.
    Dim adoStream
    Set adoStream = CreateObject( "Adodb.Stream")
    adoStream.Charset = "utf-8"
    adoStream.Open
    adoStream.WriteText labelChar
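    adoStream.Position = 0              'Rewind; Type can only be changed at position 0.
    adoStream.Type = adTypeBinary
    adoStream.Position = 3              'Skip the 3-byte UTF-8 BOM (EF BB BF) written by the stream.
    BytesThroughAdo = adoStream.Read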
    adoStream.Close
    Set adoStream = Nothing
End Function

Output:

cscript D:\bat\SO368074q.vbs
8209 Byte()
a       61
ř       C5 99
Ж       D0 96
פ       D7 A4
€       E2 82 AC
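
As a cross-check on the bytes above, the same values can be derived arithmetically from the code point that AscW returns. The following is only a minimal sketch: Utf8HexFromChar is a hypothetical helper name (not part of the script above), and it assumes a single character from the Basic Multilingual Plane; characters outside the BMP are stored as surrogate pairs in VBScript strings and are not handled here.

```vbscript
Option Explicit

' Hypothetical helper: returns the UTF-8 bytes of one BMP character
' as a space-separated hex string. Surrogate pairs are NOT handled.
Function Utf8HexFromChar(ch)
    Dim cp
    ' AscW returns a signed 16-bit Integer; mask it into the range 0..65535.
    cp = AscW(ch) And &HFFFF&
    If cp < &H80 Then
        ' 1 byte: 0xxxxxxx
        Utf8HexFromChar = Hex(cp)
    ElseIf cp < &H800 Then
        ' 2 bytes: 110xxxxx 10xxxxxx
        Utf8HexFromChar = Hex(&HC0 Or (cp \ &H40)) & " " & _
                          Hex(&H80 Or (cp And &H3F))
    Else
        ' 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        Utf8HexFromChar = Hex(&HE0 Or (cp \ &H1000)) & " " & _
                          Hex(&H80 Or ((cp \ &H40) And &H3F)) & " " & _
                          Hex(&H80 Or (cp And &H3F))
    End If
End Function

' ChrW is used so the example does not depend on the script file's encoding.
WScript.Echo Utf8HexFromChar(ChrW(&H05E4))   ' Hebrew letter Pe
```

For U+05E4 (decimal 1508), 1508 \ 64 = 23 and 1508 And 63 = 36, so the two bytes are &HC0 Or 23 = D7 and &H80 Or 36 = A4, which matches the D7 A4 row in the output above.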

I used the characters ařЖפ€ to demonstrate what your UTF-8 encoder does (the alts8.ps1 PowerShell script comes from another project):

alts8.ps1 "ařЖפ€"
Ch Unicode     Dec    CP    IME     UTF-8   ?  IME 0405/cs-CZ; CP852; ANSI 1250

 a  U+0061      97         …97…      0x61   a  Latin Small Letter A
 ř  U+0159     345         …89…    0xC599  Å�  Latin Small Letter R With Caron
 Ж  U+0416    1046         …22…    0xD096  Ð�  Cyrillic Capital Letter Zhe
 פ  U+05E4    1508        …228…    0xD7A4  פ  Hebrew Letter Pe
 €  U+20AC    8364        …172…  0xE282AC â�¬  Euro Sign