Ruby Cyphering Leads to non Alphanumeric Characters

Question

I'm trying to make a basic cipher.

def caesar_crypto_encode(text, shift)
  (text.nil? or text.strip.empty?) ? "" : text.gsub(/[a-zA-Z]/) { |cstr|
    ((cstr.ord)+shift).chr }
end

But when the shift is too high, I get these kinds of characters:

  Test.assert_equals(caesar_crypto_encode("Hello world!", 127), "eBIIL TLOIA!")

  Expected: "eBIIL TLOIA!", instead got: "\xC7\xE4\xEB\xEB\xEE \xF6\xEE\xF1\xEB\xE3!"

What is this format?

Answer 1

I'm still curious about that format though...

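Those characters are the bytes you get by taking each letter's ordinal (ord) and adding 127, i.e. ((cstr.ord)+shift).chr. The results fall outside the 7-bit ASCII range, so Ruby displays them as hex escapes.

Why? Check Integer#chr in the docs: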

Returns a string containing the character represented by the int's value according to encoding.

So, for example, take your first letter "H":

char_ord = "H".ord
#=> 72

new_char_ord = char_ord + 127
#=> 199

new_char_ord.chr
#=> "\xC7"

So 199 corresponds to "\xC7". Keep changing every letter in "Hello world" and you get "\xC7\xE4\xEB\xEB\xEE \xF6\xEE\xF1\xEB\xE3".
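To see what Integer#chr is doing here, the following is a small sketch. It assumes a default Ruby setup where Encoding.default_internal is not set, and shifted_chr is a hypothetical helper name, not something from the question.

```ruby
# Hypothetical helper: the raw one-byte string that Integer#chr produces
# for a letter shifted by the given amount.
# Assumes Encoding.default_internal is unset (the usual default).
def shifted_chr(letter, shift)
  (letter.ord + shift).chr
end

raw = shifted_chr("H", 127)          # 72 + 127 = 199
puts raw.bytes.inspect               # [199]
puts raw.ascii_only?                 # false, 199 is above the ASCII range

# Passing an encoding treats 199 as a code point rather than a raw byte.
utf8 = ("H".ord + 127).chr(Encoding::UTF_8)
puts utf8.bytes.inspect              # [195, 135], the UTF-8 bytes of U+00C7
```

Without an argument, chr gives back a single raw byte for values 128-255, which is why the result prints as "\xC7" instead of a letter.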

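To avoid this, you need to wrap around so that the result always lands on an ord value that represents a letter (this is answered in the linked possible duplicate).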
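One way to do the wrap-around is with modulo arithmetic. This is only a sketch, not the code from the linked duplicate: it assumes the alphabet is the 52-letter sequence A-Z followed by a-z, which is what the expected value "eBIIL TLOIA!" in the question implies.

```ruby
# Sketch: wrap around with modulo so the result is always a letter.
# The 52-letter alphabet (A-Z then a-z) is an assumption inferred from
# the expected test output in the question.
def caesar_crypto_encode(text, shift)
  return "" if text.nil? || text.strip.empty?

  alphabet = [*'A'..'Z', *'a'..'z']
  text.gsub(/[a-zA-Z]/) do |letter|
    # Position in the alphabet, shifted and wrapped back into 0..51.
    alphabet[(alphabet.index(letter) + shift) % alphabet.size]
  end
end

puts caesar_crypto_encode("Hello world!", 127)  # eBIIL TLOIA!
```

Because Ruby's % always returns a non-negative result for a positive divisor, a negative shift wraps correctly too, so it can be used to decode.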

Answer 2

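The reason you get the verbose output is that Ruby is running with UTF-8 encoding, and your conversion has just produced gibberish (a byte sequence that is invalid under UTF-8).

The ASCII characters A-Z are represented by the decimal numbers (ordinals) 65-90, and a-z by 97-122. When you add 127 you push every character into the 8-bit space (values 192-249), where they are no longer recognizable as proper UTF-8.

This is why Ruby's inspect prints the string in its quoted, escaped form, showing each byte as its hexadecimal number: "\xC7...".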
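The arithmetic and the validity check can be reproduced with a few lines. This is just an illustration; shifted_ordinals is a hypothetical helper, and the string is tagged as UTF-8 explicitly so the example does not depend on the source file encoding.

```ruby
# Hypothetical helper: the ordinals of some letters after adding the shift.
def shifted_ordinals(letters, shift)
  letters.map { |letter| letter.ord + shift }
end

# The ends of both letter ranges all land above 127.
puts shifted_ordinals(%w[A Z a z], 127).inspect   # [192, 217, 224, 249]

# Tag the bytes as UTF-8 explicitly (an assumption made for this example).
garbled = "\xC7\xE4\xEB\xEB\xEE \xF6\xEE\xF1\xEB\xE3!".dup.force_encoding(Encoding::UTF_8)
puts garbled.valid_encoding?   # false, the bytes are not well-formed UTF-8
puts garbled.inspect           # invalid bytes are shown as \x.. escapes
```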

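If you want to get some semblance of characters out of it, you can re-encode the gibberish as ISO8859-1, which supports 8-bit characters.

Here's what you get if you do that: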

>> s = "\xC7\xE4\xEB\xEB\xEE \xF6\xEE\xF1\xEB\xE3!"
>> s.encoding
=> #<Encoding:UTF-8>

# Relabel the bytes as ISO8859-1 (force_encoding changes the tag, not the bytes).
# Your terminal (and Ruby) is using UTF-8, so Ruby will refuse to print these yet.
>> s.force_encoding('iso8859-1')
=> "\xC7\xE4\xEB\xEB\xEE \xF6\xEE\xF1\xEB\xE3!"

# In order to be able to print ISO8859-1 on an UTF-8 terminal, you have to 
# convert them back to UTF-8 by re-encoding. This way your terminal (and Ruby)
# can display the ISO8859-1 8-bit characters using UTF-8 encoding:
>> s.encode('UTF-8')
=> "Çäëëî öîñëã!"

# Another way is just to repack the bytes into UTF-8:
>> s.bytes.pack('U*')
=> "Çäëëî öîñëã!"
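If you would rather not mutate the string with force_encoding, String#encode also accepts a source encoding as its second argument. A minimal sketch, where latin1_to_utf8 is a hypothetical helper name:

```ruby
# Hypothetical helper: read the bytes as ISO-8859-1 and transcode them to
# UTF-8, leaving the original string untouched.
def latin1_to_utf8(bytes)
  bytes.encode(Encoding::UTF_8, Encoding::ISO_8859_1)
end

s = "\xC7\xE4\xEB\xEB\xEE \xF6\xEE\xF1\xEB\xE3!"
readable = latin1_to_utf8(s)
puts readable                  # Çäëëî öîñëã!
puts readable.valid_encoding?  # true
```

This should give the same result as the force_encoding plus encode steps above, just in one call.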

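Of course, the correct thing to do is to never let the numbers overflow into the 8-bit space in the first place. Your encryption algorithm has a bug, and you need to ensure that the output stays within the 7-bit ASCII range.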
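A quick way to check whether a result stayed in the 7-bit range is to look at its bytes. The helper name seven_bit_ascii? below is hypothetical; the built-in String#ascii_only? performs a similar check.

```ruby
# Hypothetical check: true when every byte is in the 7-bit ASCII range.
def seven_bit_ascii?(text)
  text.bytes.all? { |byte| byte < 128 }
end

puts seven_bit_ascii?("eBIIL TLOIA!")        # true
puts seven_bit_ascii?(("H".ord + 127).chr)   # false, the byte is 199
```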

A better solution

Like @tadman suggested, you can use tr instead:

AZ_SEQUENCE = [*'A'..'Z', *'a'..'z']

"Hello world!".tr(AZ_SEQUENCE.join, AZ_SEQUENCE.rotate(127).join)
=> "eBIIL TLOIA!"
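The tr approach can be wrapped in a method with the same nil and blank handling as the original function. This is a sketch under the same 52-letter alphabet assumption; caesar_tr_encode and caesar_tr_decode are hypothetical names.

```ruby
# Sketch: the tr approach as a method. Array#rotate wraps around by itself,
# so any shift, however large, maps letters to letters.
def caesar_tr_encode(text, shift)
  return "" if text.nil? || text.strip.empty?

  letters = [*'A'..'Z', *'a'..'z']
  text.tr(letters.join, letters.rotate(shift).join)
end

# Rotating in the opposite direction undoes the shift.
def caesar_tr_decode(text, shift)
  caesar_tr_encode(text, -shift)
end

encoded = caesar_tr_encode("Hello world!", 127)
puts encoded                          # eBIIL TLOIA!
puts caesar_tr_decode(encoded, 127)   # Hello world!
```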


ruby

caesar-cipher