iOS Objective-C: NSData to NSString returns nil, how to ignore invalid UTF-8
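data is downloaded from a website. With
NSString *html = [[NSString alloc] initWithData:data encoding:NSUTF8StringEncoding];
html is nil, but with
NSString *html = [[NSString alloc] initWithData:data encoding:NSASCIIStringEncoding];
html has content.
The website contains Chinese characters, so decoding it as ASCII cannot display them. I guess the page contains some invalid UTF-8 byte sequences, which is why the first line fails.
Is there a way to keep using UTF-8 but ignore the invalid sequences?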
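One way to test the guess about invalid UTF-8 is to scan the downloaded bytes and report where the first malformed sequence starts. The following is a minimal sketch in plain C, which also compiles inside an Objective-C file. The names utf8_seq_len and first_invalid_utf8 are hypothetical helpers introduced here for illustration; they are not part of Foundation or iconv.

```c
#include <assert.h>
#include <stddef.h>

/* Length (1-4) of the well-formed UTF-8 sequence starting at p, or 0 if the
   bytes at p do not form one. n is the number of bytes available at p. */
static size_t utf8_seq_len(const unsigned char *p, size_t n)
{
    assert(p != NULL && n > 0);
    unsigned char b = p[0];
    unsigned char lo = 0x80, hi = 0xBF;  /* allowed range for the 2nd byte */
    size_t len;

    if (b < 0x80) {
        return 1;                        /* ASCII */
    } else if (b >= 0xC2 && b <= 0xDF) {
        len = 2;
    } else if (b >= 0xE0 && b <= 0xEF) {
        len = 3;
        if (b == 0xE0) lo = 0xA0;        /* reject overlong encodings */
        if (b == 0xED) hi = 0x9F;        /* reject UTF-16 surrogates */
    } else if (b >= 0xF0 && b <= 0xF4) {
        len = 4;
        if (b == 0xF0) lo = 0x90;        /* reject overlong encodings */
        if (b == 0xF4) hi = 0x8F;        /* reject code points above U+10FFFF */
    } else {
        return 0;                        /* 0x80-0xC1 and 0xF5-0xFF never start a sequence */
    }

    if (n < len) return 0;               /* sequence is cut off by the end of the buffer */
    if (p[1] < lo || p[1] > hi) return 0;
    for (size_t i = 2; i < len; i++) {
        if ((p[i] & 0xC0) != 0x80) return 0;  /* must be a continuation byte */
    }
    return len;
}

/* Offset of the first byte that breaks UTF-8 well-formedness,
   or len if the whole buffer is valid. */
size_t first_invalid_utf8(const void *bytes, size_t len)
{
    const unsigned char *p = (const unsigned char *)bytes;
    size_t i = 0;
    while (i < len) {
        size_t k = utf8_seq_len(p + i, len - i);
        if (k == 0) break;
        i += k;
    }
    return i;
}
```

From Objective-C this could be called as first_invalid_utf8(data.bytes, data.length); a result smaller than data.length would confirm that the download contains a malformed sequence and show where it begins.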
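I think I found the solution.
Vincent Guerci's answer
Add libiconv to your project and let it strip the invalid UTF-8. After cleaning, the NSData can safely be passed to [[NSString alloc] initWithData:data encoding:NSUTF8StringEncoding];
The concrete steps are:
- Add "libiconv.2.dylib" to your target under "Link Binary With Libraries" (newer Xcode versions list it as libiconv.tbd).
- Add the include:
#include "iconv.h"
- Add this function: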
Objective-C:
- (NSData *)cleanUTF8:(NSData *)data {
    // Adapted from Vincent Guerci's answer.
    if (data.length == 0) {
        return data; // nothing to clean; also avoids malloc(0)
    }
    iconv_t cd = iconv_open("UTF-8", "UTF-8"); // convert to UTF-8 from UTF-8
    if (cd == (iconv_t)-1) {
        NSLog(@"iconv_open failed");
        return nil;
    }
    int one = 1;
    iconvctl(cd, ICONV_SET_DISCARD_ILSEQ, &one); // discard invalid sequences
    size_t inbytesleft, outbytesleft;
    inbytesleft = outbytesleft = data.length;
    char *inbuf = (char *)data.bytes;
    char *outbuf = malloc(data.length);
    if (outbuf == NULL) {
        iconv_close(cd);
        return nil;
    }
    char *outptr = outbuf;
    NSData *result = nil;
    if (iconv(cd, &inbuf, &inbytesleft, &outptr, &outbytesleft) == (size_t)-1) {
        NSLog(@"this should not happen, seriously");
    } else {
        result = [NSData dataWithBytes:outbuf length:data.length - outbytesleft];
    }
    // Release the converter and the buffer on both the success and error paths.
    iconv_close(cd);
    free(outbuf);
    return result;
}
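iconvctl and ICONV_SET_DISCARD_ILSEQ are extensions of GNU libiconv, so they may not be available with every iconv implementation. As a rough alternative, the same cleanup can be sketched with only the standard iconv calls: convert until iconv reports EILSEQ, skip one offending byte, and continue. This is an illustrative sketch, not the method from the answer above; strip_invalid_utf8 is a hypothetical name, and the code assumes the platform's iconv validates input when converting UTF-8 to UTF-8.

```c
#include <assert.h>
#include <errno.h>
#include <iconv.h>
#include <stddef.h>

/* Copies the well-formed UTF-8 from in[0..inlen) into out and drops the
   bytes that iconv rejects. out must have room for at least inlen bytes.
   Returns the number of bytes written, or (size_t)-1 if the converter
   could not be opened. */
size_t strip_invalid_utf8(const char *in, size_t inlen, char *out)
{
    assert(in != NULL && out != NULL);

    iconv_t cd = iconv_open("UTF-8", "UTF-8");
    if (cd == (iconv_t)-1) {
        return (size_t)-1;
    }

    char *inptr = (char *)in;   /* iconv takes char **, the input is only read */
    char *outptr = out;
    size_t inleft = inlen;
    size_t outleft = inlen;     /* UTF-8 to UTF-8 output is never longer than the input */

    while (inleft > 0) {
        if (iconv(cd, &inptr, &inleft, &outptr, &outleft) != (size_t)-1) {
            break;              /* everything that was left has been converted */
        }
        if (errno == EILSEQ) {
            inptr++;            /* skip one invalid byte and retry from the next one */
            inleft--;
        } else {
            break;              /* EINVAL: truncated sequence at the end of the input */
        }
    }

    iconv_close(cd);
    return (size_t)(outptr - out);
}
```

In cleanUTF8: this could replace the iconvctl and iconv calls, for example strip_invalid_utf8(data.bytes, data.length, outbuf), with the returned count used as the length of the resulting NSData. Unlike the version above, a sequence cut off at the very end of the download is silently dropped instead of making the whole conversion fail.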