Is there a way to get a symbol of a non-printable character?
I'd like to find a way to get the symbol of a non-printable character in C# (e.g. "SOH" for Start of Heading, "BS" for Backspace). Any ideas?
Edit: I don't need to visualize the byte value of a non-printable character, but rather its code as shown here: https://web.itu.edu.tr/sgunduz/courses/mikroisl/ascii.html
For example, "NUL" for 0x00, "SOH" for 0x01, and so on.
Visual Studio displays the SOH character (U+0001) just fine, so you can simply encode it like this:
var bytes = Encoding.UTF8.GetBytes("☺");
Now you can do whatever you like with it. For backspace, use U+232B.
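If a single visible glyph per control character is what you are after, Unicode also has a "Control Pictures" block: U+2400 to U+241F mirror the C0 controls U+0000 to U+001F, and U+2421 is the picture for DEL. Here is a minimal sketch of that mapping; the class and method names (`ControlPictures`, `ToPictures`) are made up for illustration, and whether the glyphs actually render depends on your font and console encoding.

```csharp
using System;
using System.Text;

public static class ControlPictures
{
    // Hypothetical helper: replaces C0 control characters (U+0000..U+001F)
    // and DEL (U+007F) with their Unicode "Control Pictures" glyphs.
    public static string ToPictures(string source)
    {
        if (source == null)
            return null;

        var sb = new StringBuilder(source.Length);

        foreach (char c in source)
        {
            if (c < 0x20)
                sb.Append((char)(0x2400 + c)); // e.g. U+0001 -> U+2401 (SOH picture)
            else if (c == 0x7F)
                sb.Append('\u2421');           // DEL picture
            else
                sb.Append(c);                  // everything else is kept as is
        }

        return sb.ToString();
    }
}
```

For instance, `ControlPictures.ToPictures("A\u0001B\bC")` returns the string "A\u2401B\u2408C", i.e. the SOH and BS pictures in place of the raw control characters.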
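You are probably looking for a kind of string dump in order to visualize control characters. You can do this with the help of regular expressions, where \p{Cc} matches control symbols: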
using System;
using System.Text.RegularExpressions;
...
string source = "BEL \u0007 then CR + LF \r\n SOH \u0001 \0\0";
// To get control characters visible, we match them and
// replace with their codes
string result = Regex.Replace(
    source, @"\p{Cc}",
    m => $"[Control: 0x{(int)m.Value[0]:x4}]");
// Let's have a look:
// Initial string
Console.WriteLine(source);
Console.WriteLine();
// Control symbols visualized
Console.WriteLine(result);
Result:
BEL then CR + LF
SOH
BEL [Control: 0x0007] then CR + LF [Control: 0x000d][Control: 0x000a] SOH [Control: 0x0001] [Control: 0x0000][Control: 0x0000]
Edit: If you want to visualize in a different way, you should edit the lambda
m => $"[Control: 0x{(int)m.Value[0]:x4}]"
For instance:
static string[] knownCodes = new string[] {
    "NULL", "SOH", "STX", "ETX", "EOT", "ENQ",
    "ACK", "BEL", "BS", "HT", "LF", "VT",
    "FF", "CR", "SO", "SI", "DLE", "DC1", "DC2",
    "DC3", "DC4", "NAK", "SYN", "ETB", "CAN",
    "EM", "SUB", "ESC", "FS", "GS", "RS", "US",
};
private static string StringDump(string source) {
    if (null == source)
        return source;
    return Regex.Replace(
        source,
        @"\p{Cc}",
        m => {
            int code = (int)(m.Value[0]);
            return code < knownCodes.Length
                ? $"[{knownCodes[code]}]"
                : $"[Control 0x{code:x4}]";
        });
}
Demo:
Console.WriteLine(StringDump(source));
Result:
BEL [BEL] then CR + LF [CR][LF] SOH [SOH] [NULL][NULL]
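If you'd rather avoid a regular expression, the same dump can be sketched with char.IsControl, which is true for U+0000 to U+001F and U+007F to U+009F. The version below is only an illustration under a few assumptions: the names `ControlNames`, `NameOf` and `Dump` are hypothetical, it spells the first entry "NUL" as in the table linked in the question, and it adds a name for DEL (0x7F), which the table above does not cover.

```csharp
using System;
using System.Text;

public static class ControlNames
{
    // Standard ASCII abbreviations for U+0000..U+001F, indexed by code.
    private static readonly string[] C0Names =
    {
        "NUL", "SOH", "STX", "ETX", "EOT", "ENQ", "ACK", "BEL",
        "BS",  "HT",  "LF",  "VT",  "FF",  "CR",  "SO",  "SI",
        "DLE", "DC1", "DC2", "DC3", "DC4", "NAK", "SYN", "ETB",
        "CAN", "EM",  "SUB", "ESC", "FS",  "GS",  "RS",  "US",
    };

    // Returns the abbreviation for a control character, a hex code for
    // control characters without a known name, or the character itself.
    public static string NameOf(char c)
    {
        if (c < C0Names.Length)
            return C0Names[c];
        if (c == '\u007F')
            return "DEL";
        return char.IsControl(c) ? $"0x{(int)c:x4}" : c.ToString();
    }

    // Replaces every control character in source with "[NAME]".
    public static string Dump(string source)
    {
        if (source == null)
            return null;

        var sb = new StringBuilder();

        foreach (char c in source)
        {
            if (char.IsControl(c))
                sb.Append('[').Append(NameOf(c)).Append(']');
            else
                sb.Append(c);
        }

        return sb.ToString();
    }
}
```

With this sketch, `ControlNames.Dump("a\u0001b\u007F")` gives "a[SOH]b[DEL]", and `ControlNames.NameOf('\b')` gives "BS".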