String returns only numbers after separatedBy

Question

I am trying to split a string like this:

let path = "/Users/user/Downloads/history.csv"

    do {
        let contents = try NSString(contentsOfFile: path, encoding: String.Encoding.utf8.rawValue )
        let rows = contents.components(separatedBy: "\n")

        print("contents: \(contents)")
        print("rows: \(rows)")  

    }
    catch {
    }

I have two files that look almost identical. The first file produces this output:

Output for file 1:

contents: 2017-07-31 16:29:53,0.10109999,9.74414271,0.98513273,0.15%,42302999779,-0.98513273,9.72952650
2017-07-31 16:29:53,0.10109999,0.25585729,0.02586716,0.25%,42302999779,-0.02586716,0.25521765


rows: ["2017-07-31 16:29:53,0.10109999,9.74414271,0.98513273,0.15%,42302999779,-0.98513273,9.72952650", "2017-07-31 16:29:53,0.10109999,0.25585729,0.02586716,0.25%,42302999779,-0.02586716,0.25521765", "", ""]

Output for file 2:

contents: 40.75013313,0.00064825,5/18/2017 7:17:01 PM

19.04004820,0.00059900,5/19/2017 9:17:03 PM

rows: ["4\00\0.\07\05\00\01\03\03\01\03\0,\00\0.\00\00\00\06\04\08\02\05\0,\05\0/\01\08\0/\02\00\01\07\0 \07\0:\01\07\0:\00\01\0 \0P\0M\0", "\0", "1\09\0.\00\04\00\00\04\08\02\00\0,\00\0.\00\00\00\05\09\09\00\00\0,\05\0/\01\09\0/\02\00\01\07\0 \09\0:\01\07\0:\00\03\0 \0P\0M\0", "\0", "\0", "\0"]

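So both files can be read as a string, since print(contents) works. But as soon as the string is split, the second file is no longer readable. I tried different encodings, but nothing worked. Does anyone know how to keep the string from the second file readable after splitting?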

Answer 1

Your file is apparently UTF-16 (little-endian) encoded:

$ hexdump fullorders4.csv 
0000000 4f 00 72 00 64 00 65 00 72 00 55 00 75 00 69 00
0000010 64 00 2c 00 45 00 78 00 63 00 68 00 61 00 6e 00
0000020 67 00 65 00 2c 00 54 00 79 00 70 00 65 00 2c 00
0000030 51 00 75 00 61 00 6e 00 74 00 69 00 74 00 79 00
...

For ASCII characters, the first byte of the UTF-16 little-endian encoding is the ASCII code and the second byte is zero.

If the file is read as UTF-8, the zero bytes are converted to ASCII NUL characters, which is what you see as \0 in the output.
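
To make the effect concrete, here is a minimal sketch that reproduces it in memory rather than from a file. The sample string "40.75" and the variable names are assumptions for illustration only; they are not taken from the actual CSV.

```swift
import Foundation

// Encode a short ASCII sample as UTF-16 little-endian (no BOM is added
// for the explicit-endianness encodings).
let sample = "40.75"
let utf16Bytes = sample.data(using: .utf16LittleEndian)!

// Each ASCII character becomes its ASCII byte followed by a zero byte.
let hex = utf16Bytes.map { String(format: "%02x", $0) }.joined(separator: " ")
print(hex)

// Decoding the same bytes as UTF-8 succeeds, because a zero byte is a
// valid UTF-8 code unit (NUL), but every second character is now NUL.
let misread = String(data: utf16Bytes, encoding: .utf8)!
print(misread.debugDescription)

// Decoding with the matching encoding recovers the original text.
let correct = String(data: utf16Bytes, encoding: .utf16LittleEndian)!
print(correct)
```

The misread string has twice as many characters as the original, which is why printing it looks fine (NUL is invisible in the console) while the array description shows the escapes.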

Therefore, specifying the encoding as utf16LittleEndian works in your case:

let contents = try NSString(contentsOfFile: path, encoding: String.Encoding.utf16LittleEndian.rawValue)
// or:
let contents = try String(contentsOfFile: path, encoding: .utf16LittleEndian)
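
Once the text is decoded with the right encoding, splitting behaves as expected. The following is a rough sketch that uses in-memory bytes as a stand-in for the file; the two sample lines are copied from the question, while the blank line between them and the filtering of empty rows are assumptions about what the caller wants.

```swift
import Foundation

// Stand-in for the file contents: two records separated by a blank line,
// encoded as UTF-16 little-endian without a BOM.
let fileText = "40.75013313,0.00064825,5/18/2017 7:17:01 PM\n\n19.04004820,0.00059900,5/19/2017 9:17:03 PM\n"
let fileBytes = fileText.data(using: .utf16LittleEndian)!

// Decode with the matching encoding, then split on any newline character.
let decoded = String(data: fileBytes, encoding: .utf16LittleEndian)!
let rows = decoded
    .components(separatedBy: .newlines)
    .filter { !$0.isEmpty }   // drop the empty strings left by blank lines

for row in rows {
    // Each row can now be split further into its comma-separated fields.
    let fields = row.components(separatedBy: ",")
    print(fields)
}
```

Splitting on `.newlines` instead of "\n" also copes with files that use CRLF line endings, at the cost of producing extra empty strings, which the filter removes.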

There is also a method that tries to detect the encoding used (compare iOS: What's the best way to detect a file's encoding). In Swift that would be

var enc: UInt = 0
let contents = try NSString(contentsOfFile: path, usedEncoding: &enc)
// or:
var enc = String.Encoding.ascii
let contents = try String(contentsOfFile: path, usedEncoding: &enc)

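However, in your particular case, this would read the file as UTF-8 again, because it is valid UTF-8. Prepending a byte order mark (BOM) to the file (FF FE for UTF-16 little-endian) would solve that problem reliably.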
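
As an illustration of why the BOM helps, here is a small in-memory sketch. The helper `prependingUTF16LEBOM` and the sample record are hypothetical names and data introduced for this example; they are not part of Foundation or of the original file.

```swift
import Foundation

// Hypothetical helper: returns the data with a UTF-16 little-endian BOM
// (FF FE) in front, unless the data already starts with one.
func prependingUTF16LEBOM(_ data: Data) -> Data {
    let bom = Data([0xFF, 0xFE])
    if data.starts(with: bom) { return data }
    return bom + data
}

// Sample record encoded as UTF-16 little-endian without a BOM.
let raw = "19.04,0.000599\n".data(using: .utf16LittleEndian)!
let marked = prependingUTF16LEBOM(raw)

// With the BOM in place the bytes are no longer valid UTF-8,
// so a UTF-8 decode fails instead of silently producing NULs.
let asUTF8 = String(data: marked, encoding: .utf8)

// The generic .utf16 decoder reads the BOM to pick the byte order.
let asUTF16 = String(data: marked, encoding: .utf16)

print(asUTF8 == nil)
print(asUTF16 ?? "decoding failed")
```

Because FF and FE can never appear in valid UTF-8, a detector that tries UTF-8 first has to reject the file, which is what makes the BOM a reliable signal.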
