One of my most popular blog posts on the old, co-mingled site, with 24,000 reads, was a short snippet on how to strip HTML tags from a block of content in Objective-C. It’s been used by many an iOS developer (which was the original intent).
An intrepid reader & user (“Brian” – no other attribution available) found a memory leak that really rears its ugly head when parsing large blocks of content. The updated code is below (with the original post text) and also in the comments on the old site. If Brian reads this, please post full attribution info in the comments or to @hrbrmstr so I can give you proper credit.
I needed to strip the tags from some HTML that was embedded in an XML feed so I could display a short summary from the full content in a UITableView. Rather than go through the effort of parsing HTML on the iPhone (as I already parsed the XML file), I built this simple method from some half-finished snippets I found. It has worked in all of the cases I have needed, but your mileage may vary. It is at least a working method (which cannot be said about most of the other examples). It works both in iOS (iPhone/iPad) and in plain-old OS X code, too.
- (NSString *)stripTags:(NSString *)str {
    NSMutableString *html = [NSMutableString stringWithCapacity:[str length]];
    NSScanner *scanner = [NSScanner scannerWithString:str];
    NSString *tempText = nil;

    while (![scanner isAtEnd]) {
        // Collect the text that precedes the next tag.
        [scanner scanUpToString:@"<" intoString:&tempText];
        if (tempText != nil)
            [html appendString:tempText];

        // Skip over the tag itself, discarding its contents.
        [scanner scanUpToString:@">" intoString:NULL];

        // Step past the closing ">" unless the input ended first.
        if (![scanner isAtEnd])
            [scanner setScanLocation:[scanner scanLocation] + 1];

        tempText = nil;
    }

    return html;
}
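For the UITableView summary case mentioned above, something like the following might be a starting point. It is only a sketch: `StripTags` repeats the scanning loop as a plain function so the block compiles on its own, and `SummaryFromHTML` and its `maxLength` parameter are made-up names for illustration, not part of the original snippet.

```objc
#import <Foundation/Foundation.h>

// Same scanning approach as the stripTags: method above, written as a plain
// function so this sketch compiles on its own.
static NSString *StripTags(NSString *str) {
    NSMutableString *text = [NSMutableString stringWithCapacity:[str length]];
    NSScanner *scanner = [NSScanner scannerWithString:str];
    NSString *tempText = nil;
    while (![scanner isAtEnd]) {
        [scanner scanUpToString:@"<" intoString:&tempText];
        if (tempText != nil) {
            [text appendString:tempText];
        }
        [scanner scanUpToString:@">" intoString:NULL];
        if (![scanner isAtEnd]) {
            [scanner setScanLocation:[scanner scanLocation] + 1];
        }
        tempText = nil;
    }
    return text;
}

// Hypothetical helper (an assumption, not from the original post): strip the
// tags, then shorten the result to roughly maxLength UTF-16 units.
static NSString *SummaryFromHTML(NSString *html, NSUInteger maxLength) {
    NSString *plain = StripTags(html);
    if ([plain length] <= maxLength) {
        return plain;
    }
    // Widen the cut to a composed-character boundary so an emoji or an
    // accented letter is not split in half.
    NSRange range = [plain rangeOfComposedCharacterSequencesForRange:NSMakeRange(0, maxLength)];
    return [[plain substringWithRange:range] stringByAppendingString:@"..."];
}
```

In a table view data source, the result could be assigned to something like `cell.detailTextLabel.text`. One thing to watch for: NSScanner skips leading whitespace by default, so input such as `<b>Hello</b> world` comes out as `Helloworld`. Calling `[scanner setCharactersToBeSkipped:nil]` before the loop should preserve that space, if it matters for your content.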
2 Comments
Hi, please, how do I view the original code? The link provided here seems broken. I’d really appreciate it, as I would like to implement this on a current project. Thanks so much for your kind assistance.
I modified the post to not mangle the source. It should be copy/paste-able.