String of bits to Unicode

Question

I have a string of bits, like this: string str = "0111001101101000", which represents the letters "sh".
I need to turn it into Unicode letters. This is what I am doing:

BitArray bn = new BitArray(str.Length); //creating new bitarray
for (int kat = 0; kat < str.Length; kat++)
{
    if (str[kat].ToString() == "0")//adding boolean values into array
    {
        bn[kat] = false;
    }
    else
        bn[kat] = true;
}

byte[] bytes = new byte[bn.Length];//converting to bytes
bn.CopyTo(bytes, 0);
string output = Encoding.Unicode.GetString(bytes); //encoding                          

textBox2.Text = output; // result in textbox

But the output text is a complete mess. How do I do this correctly?

Answer 1

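Your code has a couple of problems.

First, BitArray reverses the bit order within each byte (index 0 becomes the least significant bit), so it is easier to use Convert.ToByte.
Second, your input string contains two bytes (one per character), but you are decoding it with Encoding.Unicode, which is UTF-16 (two bytes per character). Use Encoding.UTF8 instead.

Working code: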

string str = "0111001101101000";

int numOfBytes = str.Length / 8;
byte[] bytes = new byte[numOfBytes];
for (int i = 0; i < numOfBytes; ++i)
{
    bytes[i] = Convert.ToByte(str.Substring(8 * i, 8), 2);
}

string output = Encoding.UTF8.GetString(bytes);
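
To make the two points above concrete, here is a small sketch of what the question's approach actually produces. The class and method names (BitArrayOrderDemo, PackNaively, DecodeAsUtf16) are hypothetical and exist only for illustration. It shows that copying the bits into a BitArray in string order mirrors each byte, and that Encoding.Unicode reads two bytes as a single UTF-16 code unit.

```csharp
using System;
using System.Collections;
using System.Text;

public static class BitArrayOrderDemo
{
    // Hypothetical helper: packs the bit string the way the question's loop does,
    // i.e. character i of the string is stored at BitArray index i.
    public static byte[] PackNaively(string bits)
    {
        var bitArray = new BitArray(bits.Length);
        for (int i = 0; i < bits.Length; i++)
        {
            bitArray[i] = bits[i] == '1';
        }

        // BitArray.CopyTo puts index 0 into the LEAST significant bit of byte 0,
        // so each group of 8 characters ends up mirrored.
        var bytes = new byte[(bits.Length + 7) / 8];
        bitArray.CopyTo(bytes, 0);
        return bytes;
    }

    // Hypothetical helper: decodes bytes as UTF-16 (little-endian), which is
    // what Encoding.Unicode means in .NET.
    public static string DecodeAsUtf16(byte[] bytes)
    {
        return Encoding.Unicode.GetString(bytes);
    }
}
```

With the input "0111001101101000", PackNaively yields 0xCE and 0x16 rather than the intended 0x73 ('s') and 0x68 ('h'). Even with the correct bytes 0x73, 0x68, DecodeAsUtf16 returns one character (U+6873) instead of "sh", which is why a single-byte-per-character encoding is needed here.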

Answer 2

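A) Your string is ASCII, not Unicode: 8 bits per character.

B) The most significant bit of each byte is on the left, hence the odd math used in bn[...].

C) The commented-out line is unnecessary, because false is the default state of a BitArray.

D) Your byte array has the wrong length. 8 bits == 1 byte! :-)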

string str = "0111001101101000";

BitArray bn = new BitArray(str.Length); //creating new bitarray

for (int kat = 0; kat < str.Length; kat++) {
    if (str[kat] == '0')//adding boolean values into array
    {
        //bn[(kat / 8 * 8) + 7 - (kat % 8)] = false;
    } else {
        bn[(kat / 8 * 8) + 7 - (kat % 8)] = true;
    }
}

// 8 bits in a byte
byte[] bytes = new byte[bn.Length / 8];//converting to bytes
bn.CopyTo(bytes, 0);

string output = Encoding.ASCII.GetString(bytes); //encoding
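
The index expression in bn[...] may be easier to follow in isolation. The sketch below pulls it out into a hypothetical helper (BitIndexMath.MirrorWithinByte, a name not used in the original code) so the mapping can be inspected directly: the byte number is kept, and the position inside the byte is flipped.

```csharp
using System;

public static class BitIndexMath
{
    // Hypothetical helper: maps a position in the bit string (most significant
    // bit first) to the BitArray index that CopyTo expects (least significant
    // bit first), without changing which byte the bit belongs to.
    public static int MirrorWithinByte(int stringIndex)
    {
        int byteStart = stringIndex / 8 * 8;   // first index of the containing byte
        int offsetInByte = stringIndex % 8;    // 0 = leftmost character of the byte
        return byteStart + 7 - offsetInByte;   // flip the offset: 0 <-> 7, 1 <-> 6, ...
    }
}
```

For example, string positions 0 and 7 swap with each other, and positions 8 and 15 swap within the second byte.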

Possibly better:

string str = "0111001101101000";

byte[] bytes = new byte[str.Length / 8];

for (int ix = 0, weight = 128, ix2 = 0; ix < str.Length; ix++) {
    if (str[ix] == '1') {
        bytes[ix2] += (byte)weight;
    }

    weight /= 2;

    // Every 8 bits we "reset" the weight 
    // and increment the ix2
    if (weight == 0) {
        ix2++;
        weight = 128;
    }
}

string output = Encoding.ASCII.GetString(bytes); //encoding
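
If this conversion is needed in more than one place, the same idea could be wrapped in a reusable method. The following is only a sketch under assumptions that go beyond the original code: the names BitStringDecoder and Decode are hypothetical, the encoding is passed in by the caller, and the input checks (length divisible by 8, only '0' and '1' allowed) are my additions. It uses a shift-and-or instead of the weight variable, which is a different but equivalent way to build each byte.

```csharp
using System;
using System.Text;

public static class BitStringDecoder
{
    // Hypothetical helper: converts a string of '0'/'1' characters
    // (most significant bit first) into text using the given encoding.
    public static string Decode(string bits, Encoding encoding)
    {
        if (bits == null) throw new ArgumentNullException(nameof(bits));
        if (encoding == null) throw new ArgumentNullException(nameof(encoding));
        if (bits.Length % 8 != 0)
        {
            throw new ArgumentException("Length must be a multiple of 8.", nameof(bits));
        }

        var bytes = new byte[bits.Length / 8];
        for (int i = 0; i < bits.Length; i++)
        {
            char c = bits[i];
            if (c != '0' && c != '1')
            {
                throw new FormatException($"Unexpected character '{c}' at index {i}.");
            }

            // Shift the bits collected so far to the left and append the new one.
            bytes[i / 8] = (byte)((bytes[i / 8] << 1) | (c - '0'));
        }

        return encoding.GetString(bytes);
    }
}
```

Usage would look like BitStringDecoder.Decode("0111001101101000", Encoding.ASCII), which gives "sh" for the string from the question.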
