Converting any UTF-8 encoded Characters in String in Swift from API

I am making an app to check grades and assignments for my school. When you view an assignment online, you see:

But the server actually returns a string in which the regular characters come through as-is, while the Chinese characters stay in an encoded form:

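How would I parse the raw string in Swift and decode any UTF-8 encoded characters? I'm having a hard time finding, or even working out, a solution online. FYI, I can't change anything on the backend.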

What you have are a few HTML/XML entities. You can convert them to "normal text" like this:

// Class declaration in ViewController.h
@interface ViewController : UIViewController <NSXMLParserDelegate>
@end

// Implementation of methods in ViewController.m
@implementation ViewController

- (void)viewDidLoad {
    [super viewDidLoad];

    NSString *xml = @"<root>&#21271;</root>";
    // Encode as UTF-8; [xml length] counts UTF-16 units, not bytes, so it is
    // only a safe byte count for pure-ASCII input.
    NSData *data = [xml dataUsingEncoding:NSUTF8StringEncoding];
    NSXMLParser *parser = [[NSXMLParser alloc] initWithData:data];
    parser.delegate = self;

    [parser parse];
}

// May be called several times per element; append the pieces if you need the full text.
- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string {
    NSLog(@"string: %@", string);
}

@end

The log output is:

string: 北
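
Since the question asks about Swift, the same XML-parser idea can be sketched in Swift as follows. This is only a rough sketch: `EntityCollector` and `decodeXMLEntities` are hypothetical names introduced here, and it assumes the fragment is well-formed XML once wrapped in a root element (a bare `&` or `<` in the input would make parsing fail).

```swift
import Foundation
#if canImport(FoundationXML)
import FoundationXML  // XMLParser lives here on non-Apple platforms
#endif

// Hypothetical helper: accumulates every chunk of character data the parser reports.
final class EntityCollector: NSObject, XMLParserDelegate {
    private(set) var text = ""

    func parser(_ parser: XMLParser, foundCharacters string: String) {
        // The parser may deliver the text in several pieces, so append.
        text += string
    }
}

// Hypothetical helper: wraps the fragment in a root element and lets
// XMLParser resolve the character references.
func decodeXMLEntities(_ fragment: String) -> String? {
    let xml = "<root>\(fragment)</root>"
    let parser = XMLParser(data: Data(xml.utf8))
    let collector = EntityCollector()
    parser.delegate = collector
    // parse() returns false if the wrapped fragment is not well-formed XML.
    return parser.parse() ? collector.text : nil
}

print(decodeXMLEntities("test&#21271;&#20140;") ?? "parse failed")
```

Returning an optional keeps the failure case visible to the caller instead of silently handing back partially decoded text.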

You can use NSAttributedString to convert these HTML entities to a string.

// Swift 4+ syntax; needs UIKit or AppKit, and the HTML importer should run on the main thread.
let htmlString = "test&#21271;&#20140;&#30340;test"
if let htmlData = htmlString.data(using: .utf8),
   let attributedString = try? NSAttributedString(
       data: htmlData,
       options: [.documentType: NSAttributedString.DocumentType.html,
                 .characterEncoding: String.Encoding.utf8.rawValue],
       documentAttributes: nil) {
    let finalString = attributedString.string
    print(finalString)
    // output: test北京的test
}

If you only need to convert numeric entities, you can use CFStringTransform(_:_:_:_:).

Declaration

func CFStringTransform(_ string: CFMutableString!, 
                     _ range: UnsafeMutablePointer<CFRange>!, 
                     _ transform: CFString!, 
                     _ reverse: Bool) -> Bool

...

transform

A CFString object that identifies the transformation to apply. For a list of valid values, see Transform Identifiers for CFStringTransform. In macOS 10.4 and later, you can also use any valid ICU transform ID defined in the ICU User Guide for Transforms.

(Code tested in Swift 3/Xcode 8, iOS 8.4 simulator.)

func decodeNumericEntities(_ input: String) -> String {
    let nsMutableString = NSMutableString(string: input)
    CFStringTransform(nsMutableString, nil, "Any-Hex/XML10" as CFString, true)
    return nsMutableString as String
}

print(decodeNumericEntities("from &#21271;&#20140;")) //->from 北京

Or, if you prefer a computed property in an extension:

extension String {
    var decodingNumericEntities: String {
        let nsMutableString = NSMutableString(string: self)
        CFStringTransform(nsMutableString, nil, "Any-Hex/XML10" as CFString, true)
        return nsMutableString as String
    }
}

print("from &#21271;&#20140;".decodingNumericEntities) //->from 北京
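
To make it clearer what decoding a numeric entity involves, here is a rough standard-library-only sketch of the same idea, which may also help where CoreFoundation is not available. `decodeNumericReferences` is a hypothetical name, and the hex form (`&#x...;`) is an extension beyond what the answer above was tested with.

```swift
// Hypothetical helper: decodes "&#21271;" (decimal) and "&#x5317;" (hex)
// references; anything else is copied through unchanged.
func decodeNumericReferences(_ input: String) -> String {
    var output = ""
    var index = input.startIndex
    while index < input.endIndex {
        // Look for "&#" here and a terminating ";" somewhere after it.
        if input[index...].hasPrefix("&#"),
           let semicolon = input[index...].firstIndex(of: ";") {
            var body = input[input.index(index, offsetBy: 2)..<semicolon]
            var radix = 10
            if body.first == "x" || body.first == "X" {
                radix = 16
                body = body.dropFirst()
            }
            // The digit check rejects signs such as "+", which UInt32's
            // string parser would otherwise accept.
            if !body.isEmpty,
               body.allSatisfy({ $0.isHexDigit }),
               let value = UInt32(body, radix: radix),
               let scalar = Unicode.Scalar(value) {
                output.unicodeScalars.append(scalar)
                index = input.index(after: semicolon)
                continue
            }
        }
        // Not a valid numeric reference: keep the character as-is.
        output.append(input[index])
        index = input.index(after: index)
    }
    return output
}

print(decodeNumericReferences("from &#21271;&#20140;"))
```

Invalid references (for example a value that is not a Unicode scalar) are left untouched rather than dropped, so the function never loses input text.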

Keep in mind that the code above does not work for named character entities such as &gt; or &amp;.
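
If the API only ever sends a handful of named entities, one possible workaround is a small lookup table. The sketch below is an assumption-laden illustration, not a full HTML decoder: `namedEntities` and `decodeNamedEntities` are hypothetical names, and the table covers only the five entities predefined by XML.

```swift
import Foundation

// Hypothetical, deliberately tiny table: only the five entities predefined by XML.
let namedEntities: [(name: String, value: String)] = [
    ("&lt;", "<"),
    ("&gt;", ">"),
    ("&quot;", "\""),
    ("&apos;", "'"),
    ("&amp;", "&"),  // kept last so "&amp;lt;" becomes "&lt;", not "<"
]

// Hypothetical helper: replaces each known named entity with its character.
func decodeNamedEntities(_ input: String) -> String {
    var result = input
    for (name, value) in namedEntities {
        result = result.replacingOccurrences(of: name, with: value)
    }
    return result
}

print(decodeNamedEntities("1 &lt; 2 &amp;&amp; 3 &gt; 2"))
```

If you combine this with numeric decoding, run the numeric step first and this step second; otherwise an escaped sequence such as `&amp;#21271;` would be decoded twice. For the full set of HTML named entities, the NSAttributedString approach is likely the simpler route.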

(From this thread on スタック・オーバーフロー, the Japanese Stack Overflow.)