Cyrillic string value in Lua script with C#

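I am trying to add Lua scripting to a C# project using the NLua library (nlua.org). My problem is that Cyrillic characters in string values come back incorrectly. My C# code is: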

Lua lua = new Lua();
lua.DoFile("script.lua");
Console.WriteLine(lua["var"]);

The script file contains:

var = 'кириллица'

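Changing the script file's encoding did not help. I also tried to find the correct script file encoding by brute force, with the following code: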

foreach (EncodingInfo ei in Encoding.GetEncodings()) {
    Encoding e = ei.GetEncoding ();
    string s1 = "cyrillic кириллица";
    System.IO.File.Delete ("script.lua");
    System.IO.File.AppendAllText ("script.lua", "var = '" + s1 + "'", e);
    string s2;
    try {
        Lua lua = new Lua ();
        lua.DoFile ("script.lua");
        s2 = lua ["var"] as string;
    } catch {
        s2 = "error in lua";
    }
    Console.WriteLine ("[{0}]\t({1})", s2, e.HeaderName);
}

Here is the console output:

[error in lua] (IBM037)
[cyrillic ?????????] (IBM437)
[error in lua] (IBM500)
[cyrillic ?????????] (asmo-708)
[cyrillic ?????????] (ibm850)
[cyrillic ?????????] (ibm852)
[cyrillic Æ·á·Ðз¤*] (ibm855)
[cyrillic ?????????] (ibm857)
[cyrillic ?????????] (IBM00858)
[cyrillic ?????????] (ibm860)
[cyrillic ?????????] (ibm861)
[cyrillic ?????????] (ibm861)
[cyrillic ?????????] (IBM863)
[cyrillic ?????????] (ibm864)
[cyrillic ?????????] (IBM865)
[cyrillic ª¨à¨««¨æ*] (ibm866)
[cyrillic ?????????] (ibm869)
[error in lua] (ibm870)
[cyrillic ?????????] (windows-874)
[error in lua] (ibm875)
[cyrillic
{
y
‚
y
|
|
y

p] (iso-2022-jp)
[cyrillic §Ü§Ú§â§Ú§Ý§Ý§Ú§è§Ñ] (gb2312)
[cyrillic ¬Ü¬Ú¬â¬Ú¬Ý¬Ý¬Ú¬è¬Ñ] (ks_c_5601-1987)
[cyrillic ?????????] (big5)
[error in lua] (ibm1026)
[error in lua] (ibm1047)
[error in lua] (IBM01140)
[error in lua] (IBM01141)
[error in lua] (IBM01142)
[error in lua] (IBM01143)
[error in lua] (ibm1144)
[error in lua] (ibm1145)
[error in lua] (ibm1146)
[error in lua] (ibm1147)
[error in lua] (ibm1148)
[error in lua] (ibm1149)
[error in lua] (utf-16)
[error in lua] (utf-16BE)
[cyrillic ?????????] (windows-1250)
[cyrillic êèðèëëèöà] (windows-1251)
[cyrillic ?????????] (Windows-1252)
[cyrillic ?????????] (windows-1253)
[cyrillic ?????????] (windows-1254)
[cyrillic ?????????] (windows-1255)
[cyrillic ?????????] (windows-1256)
[cyrillic ?????????] (windows-1257)
[cyrillic ?????????] (windows-1258)
[cyrillic ?????????] (macintosh)
[cyrillic ?????????] (x-mac-icelandic)
[error in lua] (utf-32)
[error in lua] (utf-32BE)
[cyrillic ?????????] (us-ascii)
[error in lua] (IBM273)
[error in lua] (IBM277)
[error in lua] (IBM278)
[error in lua] (IBM280)
[error in lua] (IBM284)
[error in lua] (IBM285)
[error in lua] (IBM290)
[error in lua] (IBM297)
[error in lua] (IBM420)
[error in lua] (IBM424)
[cyrillic ËÉÒÉÌÌÉÃÁ] (koi8-r)
[error in lua] (IBM871)
[error in lua] (IBM1025)
[cyrillic ËÉÒÉÌÌÉÃÁ] (koi8-u)
[cyrillic ?????????] (iso-8859-1)
[cyrillic ?????????] (iso-8859-2)
[cyrillic ?????????] (iso-8859-3)
[cyrillic ?????????] (iso-8859-4)
[cyrillic ÚØàØÛÛØæÐ] (iso-8859-5)
[cyrillic ?????????] (iso-8859-6)
[cyrillic ?????????] (iso-8859-7)
[cyrillic ?????????] (iso-8859-8)
[cyrillic ?????????] (iso-8859-9)
[cyrillic ?????????] (iso-8859-15)
[cyrillic ?????????] (windows-38598)
[cyrillic ?????????] (iso-2022-jp)
[cyrillic ?????????] (iso-2022-jp)
[cyrillic ?????????] (iso-2022-jp)
[cyrillic §Ü§Ú§â§Ú§Ý§Ý§Ú§è§Ñ] (euc-jp)
[cyrillic ¬Ü¬Ú¬â¬Ú¬Ý¬Ý¬Ú¬è¬Ñ] (euc-kr)
[cyrillic §Ü§Ú§â§Ú§Ý§Ý§Ú§è§Ñ] (GB18030)
[cyrillic ?????????] (x-iscii-de)
[cyrillic ?????????] (x-iscii-be)
[cyrillic ?????????] (x-iscii-ta)
[cyrillic ?????????] (x-iscii-te)
[cyrillic ?????????] (x-iscii-as)
[cyrillic ?????????] (x-iscii-or)
[cyrillic ?????????] (x-iscii-ka)
[cyrillic ?????????] (x-iscii-ma)
[cyrillic ?????????] (x-iscii-gu)
[cyrillic ?????????] (x-iscii-pa)
[error in lua] (utf-7)
[error in lua] (utf-8) 

As you can see, none of the variants is correct, so I don't know how to solve this problem.

I have now made some progress. Here is my code:

foreach (EncodingInfo ei1 in Encoding.GetEncodings()) {
    Encoding e1 = ei1.GetEncoding ();
    string s1 = "кириллица";
    System.IO.File.Delete ("script.lua");
    System.IO.File.AppendAllText ("script.lua", "var = '" + s1 + "'", e1);
    string s2;
    try {
        Lua lua = new Lua ();
        lua.DoFile ("script.lua");
        s2 = lua ["var"] as string;
        foreach (EncodingInfo ei2 in Encoding.GetEncodings()) {
            Encoding e2 = ei2.GetEncoding ();
            byte[] bytes = e2.GetBytes (s2);
            foreach (EncodingInfo ei3 in Encoding.GetEncodings()) {
                try {
                    Encoding e3 = ei3.GetEncoding ();
                    string s3 = e3.GetString (bytes);
                    if (s1 == s3)
                        Console.WriteLine ("({0})=>({1})=>({2}):[{3}]",e1.HeaderName, e2.HeaderName, e3.HeaderName, s3);
                } catch { }
            }
        }
    } catch { }
}

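I wrote the script file in every encoding. Then I read the value back and converted it from every encoding to every other encoding. After that I compared the initial text with the final text. The console output lists the combinations that reproduce the original text: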

(ibm855)=>(Windows-1252)=>(ibm855):[кириллица]
(ibm855)=>(iso-8859-1)=>(ibm855):[кириллица]
(ibm866)=>(Windows-1252)=>(ibm866):[кириллица]
(ibm866)=>(windows-1254)=>(ibm866):[кириллица]
(ibm866)=>(windows-1258)=>(ibm866):[кириллица]
(ibm866)=>(iso-8859-1)=>(ibm866):[кириллица]
(ibm866)=>(iso-8859-9)=>(ibm866):[кириллица]
(iso-2022-jp)=>(asmo-708)=>(iso-2022-jp):[кириллица]
(iso-2022-jp)=>(iso-8859-1)=>(iso-2022-jp):[кириллица]
(iso-2022-jp)=>(iso-8859-2)=>(iso-2022-jp):[кириллица]
(iso-2022-jp)=>(iso-8859-3)=>(iso-2022-jp):[кириллица]
(iso-2022-jp)=>(iso-8859-4)=>(iso-2022-jp):[кириллица]
(iso-2022-jp)=>(iso-8859-5)=>(iso-2022-jp):[кириллица]
(iso-2022-jp)=>(iso-8859-6)=>(iso-2022-jp):[кириллица]
(iso-2022-jp)=>(iso-8859-7)=>(iso-2022-jp):[кириллица]
(iso-2022-jp)=>(iso-8859-8)=>(iso-2022-jp):[кириллица]
(iso-2022-jp)=>(iso-8859-9)=>(iso-2022-jp):[кириллица]
(iso-2022-jp)=>(iso-8859-15)=>(iso-2022-jp):[кириллица]
(iso-2022-jp)=>(windows-38598)=>(iso-2022-jp):[кириллица]
(gb2312)=>(Windows-1252)=>(gb2312):[кириллица]
(gb2312)=>(Windows-1252)=>(euc-jp):[кириллица]
(gb2312)=>(Windows-1252)=>(GB18030):[кириллица]
(gb2312)=>(iso-8859-1)=>(gb2312):[кириллица]
(gb2312)=>(iso-8859-1)=>(euc-jp):[кириллица]
(gb2312)=>(iso-8859-1)=>(GB18030):[кириллица]
(gb2312)=>(iso-8859-15)=>(gb2312):[кириллица]
(gb2312)=>(iso-8859-15)=>(euc-jp):[кириллица]
(gb2312)=>(iso-8859-15)=>(GB18030):[кириллица]
(ks_c_5601-1987)=>(Windows-1252)=>(ks_c_5601-1987):[кириллица]
(ks_c_5601-1987)=>(Windows-1252)=>(euc-kr):[кириллица]
(ks_c_5601-1987)=>(iso-8859-1)=>(ks_c_5601-1987):[кириллица]
(ks_c_5601-1987)=>(iso-8859-1)=>(euc-kr):[кириллица]
(ks_c_5601-1987)=>(iso-8859-15)=>(ks_c_5601-1987):[кириллица]
(ks_c_5601-1987)=>(iso-8859-15)=>(euc-kr):[кириллица]
(windows-1251)=>(Windows-1252)=>(windows-1251):[кириллица]
(windows-1251)=>(iso-8859-1)=>(windows-1251):[кириллица]
(windows-1251)=>(iso-8859-15)=>(windows-1251):[кириллица]
(koi8-r)=>(Windows-1252)=>(koi8-r):[кириллица]
(koi8-r)=>(Windows-1252)=>(koi8-u):[кириллица]
(koi8-r)=>(windows-1254)=>(koi8-r):[кириллица]
(koi8-r)=>(windows-1254)=>(koi8-u):[кириллица]
(koi8-r)=>(iso-8859-1)=>(koi8-r):[кириллица]
(koi8-r)=>(iso-8859-1)=>(koi8-u):[кириллица]
(koi8-r)=>(iso-8859-9)=>(koi8-r):[кириллица]
(koi8-r)=>(iso-8859-9)=>(koi8-u):[кириллица]
(koi8-r)=>(iso-8859-15)=>(koi8-r):[кириллица]
(koi8-r)=>(iso-8859-15)=>(koi8-u):[кириллица]
(koi8-u)=>(Windows-1252)=>(koi8-r):[кириллица]
(koi8-u)=>(Windows-1252)=>(koi8-u):[кириллица]
(koi8-u)=>(windows-1254)=>(koi8-r):[кириллица]
(koi8-u)=>(windows-1254)=>(koi8-u):[кириллица]
(koi8-u)=>(iso-8859-1)=>(koi8-r):[кириллица]
(koi8-u)=>(iso-8859-1)=>(koi8-u):[кириллица]
(koi8-u)=>(iso-8859-9)=>(koi8-r):[кириллица]
(koi8-u)=>(iso-8859-9)=>(koi8-u):[кириллица]
(koi8-u)=>(iso-8859-15)=>(koi8-r):[кириллица]
(koi8-u)=>(iso-8859-15)=>(koi8-u):[кириллица]
(iso-8859-5)=>(Windows-1252)=>(iso-8859-5):[кириллица]
(iso-8859-5)=>(iso-8859-1)=>(iso-8859-5):[кириллица]
(iso-8859-5)=>(iso-8859-15)=>(iso-8859-5):[кириллица]
(euc-jp)=>(Windows-1252)=>(gb2312):[кириллица]
(euc-jp)=>(Windows-1252)=>(euc-jp):[кириллица]
(euc-jp)=>(Windows-1252)=>(GB18030):[кириллица]
(euc-jp)=>(iso-8859-1)=>(gb2312):[кириллица]
(euc-jp)=>(iso-8859-1)=>(euc-jp):[кириллица]
(euc-jp)=>(iso-8859-1)=>(GB18030):[кириллица]
(euc-jp)=>(iso-8859-15)=>(gb2312):[кириллица]
(euc-jp)=>(iso-8859-15)=>(euc-jp):[кириллица]
(euc-jp)=>(iso-8859-15)=>(GB18030):[кириллица]
(euc-kr)=>(Windows-1252)=>(ks_c_5601-1987):[кириллица]
(euc-kr)=>(Windows-1252)=>(euc-kr):[кириллица]
(euc-kr)=>(iso-8859-1)=>(ks_c_5601-1987):[кириллица]
(euc-kr)=>(iso-8859-1)=>(euc-kr):[кириллица]
(euc-kr)=>(iso-8859-15)=>(ks_c_5601-1987):[кириллица]
(euc-kr)=>(iso-8859-15)=>(euc-kr):[кириллица]
(GB18030)=>(Windows-1252)=>(gb2312):[кириллица]
(GB18030)=>(Windows-1252)=>(euc-jp):[кириллица]
(GB18030)=>(Windows-1252)=>(GB18030):[кириллица]
(GB18030)=>(iso-8859-1)=>(gb2312):[кириллица]
(GB18030)=>(iso-8859-1)=>(euc-jp):[кириллица]
(GB18030)=>(iso-8859-1)=>(GB18030):[кириллица]
(GB18030)=>(iso-8859-15)=>(gb2312):[кириллица]
(GB18030)=>(iso-8859-15)=>(euc-jp):[кириллица]
(GB18030)=>(iso-8859-15)=>(GB18030):[кириллица]

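Here the first name is the encoding of the script file. The value read from Lua is then converted as if it were in the second encoding into the third (I hope that makes sense). For example, look at this line: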

(windows-1251)=>(Windows-1252)=>(windows-1251):[кириллица]

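It means the script file was written in windows-1251, but to get the correct text you have to convert the value read from Lua from windows-1252 to windows-1251 (encode the string to bytes as windows-1252, then decode those bytes as windows-1251). I don't know whether this is an NLua problem or something else.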
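The results above are consistent with each byte of the script file being turned into one character with the same numeric value on the way back into C#. Under that assumption, the conversion can be sketched as a small helper. LuaStringRepair, Garble and Repair are hypothetical names for illustration, not part of NLua. The sketch uses iso-8859-1 as the intermediate encoding because it maps every byte 0x00-0xFF to the code point with the same value, so no byte is lost.

```csharp
using System;
using System.Text;

// Hypothetical helper (not part of NLua). Assumes the string returned by
// lua["var"] holds one char per raw byte of the script file.
public static class LuaStringRepair
{
    // iso-8859-1 maps byte N to code point N, so it recovers the raw bytes exactly.
    private static readonly Encoding Latin1 = Encoding.GetEncoding("iso-8859-1");

    // Reproduces the garbling: encode with the file's encoding,
    // then read the bytes back one byte per character.
    public static string Garble(string text, Encoding fileEncoding)
    {
        return Latin1.GetString(fileEncoding.GetBytes(text));
    }

    // Reverses it: recover the raw bytes, then decode them
    // with the encoding the script file was really saved in.
    public static string Repair(string garbled, Encoding fileEncoding)
    {
        return fileEncoding.GetString(Latin1.GetBytes(garbled));
    }
}
```

For a script saved as windows-1251, the call would be LuaStringRepair.Repair(lua["var"] as string, Encoding.GetEncoding("windows-1251")). On .NET Core and .NET 5+, windows-1251 is only available after calling Encoding.RegisterProvider(CodePagesEncodingProvider.Instance).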

I found a simple solution for my project. I use

lua.DoString (System.IO.File.ReadAllText ("script.lua", enc));

instead of

lua.DoFile ("script.lua");

Here enc is the encoding of my script file.