iOS NSXMLParser - Consistently Derive Image Source URL From XML Tag

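I'm using an RSS feed in my app, specifically the Drudge Report's. I'm new to this kind of thing, and new to using NSXMLParser in Xcode. Each feed entry evidently represents one article, and each entry is delimited by <item></item> tags.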

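Inside those tags, the <description></description> tag holds a description of the entry. Within the description, some articles have an image associated with them, as shown in the following screenshot: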

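The part I highlighted is the image I need to get (specifically, its URL string). I can already extract each article's description as an NSMutableString, but how do I extract the image URL while parsing the XML with NSXMLParser? Here is the code I use for all of this: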

@interface ViewController () <NSXMLParserDelegate, UITableViewDataSource, UITableViewDelegate> {
    NSXMLParser *parser;
    NSMutableArray *feeds;
    NSMutableDictionary *item;
    NSMutableString *title;
    NSMutableString *link;
    NSMutableString *description;
    NSString *element;
}
.
.(other code)
.
#pragma mark - NSXMLParserDelegate

- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName attributes:(NSDictionary *)attributeDict
{
    element = elementName;
    if ([element isEqualToString:@"item"]) {
        item        = [[NSMutableDictionary alloc] init];
        title       = [[NSMutableString alloc] init];
        link        = [[NSMutableString alloc] init];
        description = [[NSMutableString alloc] init];
    }
}

- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string {

    if ([element isEqualToString:@"title"]) {
        [title appendString:string];
    }
    else if ([element isEqualToString:@"feedburner:origLink"]) {
        [link appendString:string];
    }
    else if ([element isEqualToString:@"description"]) {
        [description appendString:string];
    }
}

- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName {

    if ([elementName isEqualToString:@"item"]) {
        NSString *filteredTitle = [title stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
        NSString *filteredLink = [link stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];

        if (![filteredLink containsString:@"https://itunes.apple.com/"]) {
            [item setObject:filteredTitle forKey:@"title"];
            [item setObject:filteredLink forKey:@"link"];
            [item setObject:description forKey:@"description"];

            [feeds addObject:[item copy]];
        }
    }
}

- (void)parserDidEndDocument:(NSXMLParser *)parser {
    [self.tableView reloadData];
}

Progress

So far I have added the following to my didEndElement method:

- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName {

    if ([elementName isEqualToString:@"item"]) {
        NSString *filteredTitle = [title stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
        NSString *filteredLink = [link stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];

        if (![filteredLink containsString:@"https://itunes.apple.com/"]) {
            [item setObject:filteredTitle forKey:@"title"];
            [item setObject:filteredLink forKey:@"link"];
            [item setObject:description forKey:@"description"];
            if ([description rangeOfString:@"img style"].location != NSNotFound)
            {

            }

            [feeds addObject:[item copy]];
        }
    }
}

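Now that I know the description contains the string img style, I need to get src="whateverImageURL". How can I use a regular expression to get the first occurrence of this image URL?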
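For reference, here is a minimal sketch of what a regular-expression approach could look like with NSRegularExpression. The helper name FirstImageSrc is hypothetical (it is not from the post), and the sketch assumes the description text has already been entity-decoded by the parser, that the src value is wrapped in double quotes, and that at least one other attribute or a space precedes src inside the img tag.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the src value of the first <img ...> tag in an
// HTML fragment, or nil when no double-quoted src attribute is found.
static NSString *FirstImageSrc(NSString *html) {
    if (html.length == 0) {
        return nil;
    }
    NSError *error = nil;
    // <img, any attributes, whitespace, src = "captured value"
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"<img[^>]*?\\ssrc\\s*=\\s*\"([^\"]*)\""
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:&error];
    if (regex == nil) {
        return nil; // The pattern failed to compile.
    }
    NSTextCheckingResult *match = [regex firstMatchInString:html
                                                    options:0
                                                      range:NSMakeRange(0, html.length)];
    if (match == nil || match.numberOfRanges < 2) {
        return nil;
    }
    // Capture group 1 is the text between the quotes.
    return [html substringWithRange:[match rangeAtIndex:1]];
}
```

In didEndElement this could be called as FirstImageSrc(description) once the item is complete, and the result stored under whatever key suits the item dictionary.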

You have to implement this delegate method:

- (void)parser:(NSXMLParser *)parser foundAttributeDeclarationWithName:(NSString *)attributeName forElement:(NSString *)elementName type:(nullable NSString *)type defaultValue:(nullable NSString *)defaultValue;

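This is meant to give you the attributes of every element found. Note that this particular callback reports attribute declarations from a DTD; the attributes of an element that is actually parsed arrive in the attributeDict argument of parser:didStartElement:namespaceURI:qualifiedName:attributes:.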

Let me know if this helps :)

Update

Here is some code that finds the URL of the first img in a given string:

NSString *descriptionString = @"&lt;br&gt;&lt;tt&gt;&lt;font size=\"3\" color=\"blue\"&gt;&lt;b&gt;&lt;u&gt;LIST: 10 Worst Winter Storms in Washington History...&lt;/u&gt;&lt;/b&gt;&lt;/font&gt;&lt;/tt&gt;&lt;br&gt;&lt;br&gt;&lt;br&gt;&lt;font face=\"Arial\" size=\"1\"&gt;&lt;i&gt;(Top headline, 3rd story, &lt;a href=\"http://www.nbcwashington.com/news/local/Ten-Worst-Storms-in-DC-History-365815301.html\"&gt;link&lt;/a&gt;)&lt;/i&gt;&lt;/font&gt;&lt;hr style=\"height: 1px; border-style: none; color: #666666; background-color: #666666;\"/&gt;&lt;font face=\"Arial\" size=\"2\"&gt;Related stories:&lt;div class=\"related-links\" id=\"R:H1:S3\"&gt;&lt;a href=\"http://www.wunderground.com/US/DC/001.html#WIN\"&gt;BLIZZARD WARNING ISSUED FOR DC; BURBS UP TO 30\"...&lt;/a&gt;&lt;br&gt;&lt;a href=\"http://washington.cbslocal.com/2016/01/19/winter-is-finally-here-deep-freeze-and-snow-in-the-forecast/\"&gt;Mayor Requests Help From National Guard...&lt;/a&gt;&lt;br&gt;&lt;a href=\"http://www.accuweather.com/en/weather-news/snow-storm-travel-disruptions-aim-for-nyc-dc-boston-philadelphia-friday-saturday/54870622\"&gt;UPDATE...&lt;/a&gt;&lt;br&gt;&lt;a href=\"http://www.infowars.com/snowmaggedon2016-empty-store-shelves-as-panicked-shoppers-ransack-grocery-stores/\"&gt;Anxious Shoppers Ransack Grocery Stores...&lt;/a&gt;&lt;br&gt;&lt;a href=\"http://motherboard.vice.com/read/dark-web-users-are-worried-snowstorm-jonas-will-disrupt-their-deliveries\"&gt;Dark Web Users Fear Delivery Disruptions...&lt;/a&gt;&lt;br&gt;&lt;a href=\"https://www.washingtonpost.com/news/to-your-health/wp/2016/01/21/heres-why-some-people-drop-dead-while-shoveling-snow/\"&gt;Cold weather, shoveling form heart attack 'perfect storm'...&lt;/a&gt;&lt;br&gt;&lt;/div&gt;&lt;/font&gt;&lt;br&gt;&lt;div class=\"feedflare\"&gt;    &lt;a href=\"http://feeds.feedburner.com/~ff/DrudgeReportFeed?a=Mtf4NlmV8XU:vDGXzaysxPw:yIl2AUoC8zA\"&gt;&lt;img src=\"http://feeds.feedburner.com/~ff/DrudgeReportFeed?d=yIl2AUoC8zA\" border=\"0\"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href=\"http://feeds.feedburner.com/~ff/DrudgeReportFeed?a=Mtf4NlmV8XU:vDGXzaysxPw:V_sGLiPBpWU\"&gt;&lt;img src=\"http://feeds.feedburner.com/~ff/DrudgeReportFeed?i=Mtf4NlmV8XU:vDGXzaysxPw:V_sGLiPBpWU\" border=\"0\"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href=\"http://feeds.feedburner.com/~ff/DrudgeReportFeed?a=Mtf4NlmV8XU:vDGXzaysxPw:qj6IDK7rITs\"&gt;&lt;img src=\"http://feeds.feedburner.com/~ff/DrudgeReportFeed?d=qj6IDK7rITs\" border=\"0\"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href=\"http://feeds.feedburner.com/~ff/DrudgeReportFeed?a=Mtf4NlmV8XU:vDGXzaysxPw:gIN9vFwOqvQ\"&gt;&lt;img src=\"http://feeds.feedburner.com/~ff/DrudgeReportFeed?i=Mtf4NlmV8XU:vDGXzaysxPw:gIN9vFwOqvQ\" border=\"0\"&gt;&lt;/img&gt;&lt;/a&gt; &lt;/div&gt;&lt;img src=\"http://feeds.feedburner.com/~r/DrudgeReportFeed/~4/Mtf4NlmV8XU\" height=\"1\" width=\"1\" alt=\"\"/&gt;";
// Removing spaces lets "src =" and "src=" be matched the same way.
NSString *stringWithoutWhiteSpace = [descriptionString stringByReplacingOccurrencesOfString:@" " withString:@""];
NSUInteger srcLocation = [stringWithoutWhiteSpace rangeOfString:@"src="].location;
if (srcLocation != NSNotFound) {
    NSString *firstSrcImg = [stringWithoutWhiteSpace substringFromIndex:srcLocation];
    // Splitting on the double quote leaves the URL as the second component.
    NSArray *components = [firstSrcImg componentsSeparatedByString:@"\""];
    if (components.count > 1) {
        NSString *url = components[1];
        NSLog(@"%@", url);
    }
}

Please try it and tell me whether it answers your question... I can also post code that returns all the img URLs :)

Second update: for example, here is a method you can use:

- (NSString *)getNextURLFromString:(NSString *)str withURLTag:(NSString *)urlTag {
    // Removing spaces lets "src =" and "src=" be matched the same way.
    NSString *stringWithoutWhiteSpace = [str stringByReplacingOccurrencesOfString:@" " withString:@""];
    NSUInteger srcLocation = [stringWithoutWhiteSpace rangeOfString:urlTag].location;
    if (srcLocation != NSNotFound) {
        NSString *firstSrcImg = [stringWithoutWhiteSpace substringFromIndex:srcLocation];
        // Splitting on the double quote leaves the URL as the second component.
        NSArray *components = [firstSrcImg componentsSeparatedByString:@"\""];
        if (components.count > 1) {
            return components[1];
        }
    }
    return nil;
}

For the urlTag parameter pass @"src=", and for the str parameter pass the value of the description tag.

Update 3

Here is a method that returns all the image URLs:

- (NSArray *)getAllURLFromString:(NSString *)str withURLTag:(NSString *)urlTag {
    NSMutableArray *result = [NSMutableArray array];
    NSString *stringWithoutWhiteSpace = [str stringByReplacingOccurrencesOfString:@" " withString:@""];
    NSUInteger srcLocation = [stringWithoutWhiteSpace rangeOfString:urlTag].location;
    if (srcLocation != NSNotFound) {
        NSString *firstSrcImg = [stringWithoutWhiteSpace substringFromIndex:srcLocation];
        NSArray *components = [firstSrcImg componentsSeparatedByString:@"\""];
        if (components.count > 1) {
            NSString *url = components[1];
            [result addObject:url];

            // Recurse on the text that follows this URL to collect the remaining ones.
            NSRange urlRange = [firstSrcImg rangeOfString:url];
            if (urlRange.location != NSNotFound) {
                NSString *rest = [firstSrcImg substringFromIndex:NSMaxRange(urlRange)];
                [result addObjectsFromArray:[self getAllURLFromString:rest withURLTag:urlTag]];
            }
        }
    }
    return result;
}

For the urlTag parameter pass @"src=", and for the str parameter pass the value of the description tag.
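As a possible alternative to the recursive string splitting above, the same list could be collected with NSRegularExpression. This is only a sketch: the function name AllImageSources is hypothetical, it assumes every src value is wrapped in double quotes, and it matches any src attribute rather than only those inside img tags.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns every double-quoted src value in an HTML
// fragment, in document order. Returns an empty array when there are none.
static NSArray<NSString *> *AllImageSources(NSString *html) {
    NSMutableArray<NSString *> *urls = [NSMutableArray array];
    if (html.length == 0) {
        return urls;
    }
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"\\bsrc\\s*=\\s*\"([^\"]*)\""
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:NULL];
    if (regex == nil) {
        return urls; // The pattern failed to compile.
    }
    NSArray<NSTextCheckingResult *> *matches =
        [regex matchesInString:html options:0 range:NSMakeRange(0, html.length)];
    for (NSTextCheckingResult *match in matches) {
        // Capture group 1 is the text between the quotes.
        [urls addObject:[html substringWithRange:[match rangeAtIndex:1]]];
    }
    return urls;
}
```

Unlike the space-stripping approach, this leaves the input untouched, so a URL that happens to contain a space is returned as written.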

You have to do the following in the foundCharacters: method.

    else if ([element isEqualToString:@"description"]) {
        [description appendString:string];
        if ([description rangeOfString:@"img"].location != NSNotFound) {
            // firstRange marks where the URL starts (right after src=").
            NSRange firstRange = [description rangeOfString:@"src=\""];
            if (firstRange.location != NSNotFound) {
                NSString *afterSrc = [description substringFromIndex:NSMaxRange(firstRange)];
                // endRange marks where the URL ends (the quote before the width attribute).
                NSRange endRange = [afterSrc rangeOfString:@"\" width=\""];
                if (endRange.location != NSNotFound) {
                    // previewImage is assumed to be an NSString instance variable.
                    previewImage = [afterSrc substringToIndex:endRange.location];
                }
            }
        }
    }
}
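  • Because foundCharacters already gives you the text of the description tag, you need to search the string you have appended to description.
  • You can scan the whole string and store the substring you need (i.e. your URL link) in a variable.
  • Use the firstRange variable to mark where the string you want begins.
  • Use the endRange variable to mark where you want the string to end (in your case, the end of the URL).

Here I store the URL in previewImage.

Hope it works for you...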

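After some research I managed to solve my problem. I just needed some practice with NSRange. In my case the idea is that when a description contains the string "img style", I know I need the first src="whateverImageURL" string I can find. I do this in the code below: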

- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
{
    if ([elementName isEqualToString:@"item"]) {
        NSString *filteredTitle = [title stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
        NSString *filteredLink = [link stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];

        if (![filteredLink containsString:@"https://itunes.apple.com/"]) {
            [item setObject:filteredTitle forKey:@"title"];
            [item setObject:filteredLink forKey:@"link"];
            [item setObject:description forKey:@"description"];
            if ([description rangeOfString:@"img style"].location != NSNotFound) {
                NSString *finalImageURL = nil;
                NSRange startRange = [description rangeOfString:@"src=\""];
                if (startRange.location != NSNotFound) {
                    // Drop everything up to and including src=" ...
                    finalImageURL = [description substringFromIndex:NSMaxRange(startRange)];
                    // ... then cut at the closing quote.
                    NSRange endRange = [finalImageURL rangeOfString:@"\""];
                    if (endRange.location != NSNotFound) {
                        finalImageURL = [finalImageURL substringToIndex:endRange.location];
                    }
                }
            }

            [feeds addObject:[item copy]];
        }
    }
}