Convert HTML to Plain Text in Swift


Ios Problem Overview


I'm working on a simple RSS reader app as a beginner project in Xcode. It currently parses the feed and displays each post's title, publication date, description, and content in a WebView.

I recently decided to show the description (or a truncated version of the content) in the TableView used to select a post. However, when doing so:

cell.textLabel?.text = item.title?.uppercaseString
cell.detailTextLabel?.text = item.itemDescription //.itemDescription is a String

It shows the raw HTML of the post.

I would like to know how to convert the HTML into plain text for just the TableView cell's detail label (detailTextLabel).

Thanks!

Ios Solutions


Solution 1 - Ios

You can add this extension to convert your HTML to a plain string:

edit/update:

> Discussion: The HTML importer should not be called from a background thread (that is, the options dictionary includes documentType with a value of html). It will try to synchronize with the main thread, fail, and time out. Calling it from the main thread works (but can still time out if the HTML contains references to external resources, which should be avoided at all costs). The HTML import mechanism is meant for implementing something like markdown (that is, text styles, colors, and so on), not for general HTML import.
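The threading constraint in the quoted discussion could be handled with a small wrapper along these lines. This is only a sketch: `convertOnMainThread`, `convert`, and `completion` are hypothetical names, and the converter is passed in as a closure so the helper does not depend on any particular extension.

```swift
import Foundation

/// Runs `convert` on the main thread and hands the result to `completion`.
/// `convert` stands in for any HTML-to-text routine built on the
/// NSAttributedString HTML importer (a hypothetical parameter for this sketch).
func convertOnMainThread(
    _ html: String,
    using convert: @escaping (String) -> String,
    completion: @escaping (String) -> Void
) {
    if Thread.isMainThread {
        // Already on the main thread: convert right away.
        completion(convert(html))
    } else {
        // Hop to the main thread instead of importing from a background queue.
        DispatchQueue.main.async {
            completion(convert(html))
        }
    }
}
```

With the html2String property from this answer, the call might look like convertOnMainThread(item.itemDescription, using: { $0.html2String }) { text in ... }, with the label updated inside the completion closure.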

Xcode 11.4 • Swift 5.2

extension Data {
    var html2AttributedString: NSAttributedString? {
        do {
            return try NSAttributedString(data: self, options: [.documentType: NSAttributedString.DocumentType.html, .characterEncoding: String.Encoding.utf8.rawValue], documentAttributes: nil)
        } catch {
            print("error:", error)
            return nil
        }
    }
    var html2String: String { html2AttributedString?.string ?? "" }
}

extension StringProtocol {
    var html2AttributedString: NSAttributedString? {
        Data(utf8).html2AttributedString
    }
    var html2String: String {
        html2AttributedString?.string ?? ""
    }
}

cell.detailTextLabel?.text = item.itemDescription.html2String
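Since the quoted discussion says the importer is not meant for general HTML import, a short table view preview might get by with a much rougher approach. The sketch below is a different technique from the extension above: it strips tags with a regular expression and decodes only a handful of entities. It is not a real HTML parser, and `strippingHTMLTags` is a hypothetical name.

```swift
import Foundation

/// Rough HTML-to-text fallback: removes tags and decodes a few entities.
/// Not a real HTML parser; `strippingHTMLTags` is a hypothetical name.
extension String {
    var strippingHTMLTags: String {
        // Replace anything that looks like a tag (<p>, </a>, <br/>) with a space
        // so that adjacent paragraphs do not run together.
        var text = replacingOccurrences(of: "<[^>]+>", with: " ", options: .regularExpression)
        // Decode a few common entities; "&amp;" goes last so "&amp;lt;" becomes "&lt;".
        let entities = [("&nbsp;", " "), ("&lt;", "<"), ("&gt;", ">"),
                        ("&quot;", "\""), ("&#39;", "'"), ("&amp;", "&")]
        for (entity, character) in entities {
            text = text.replacingOccurrences(of: entity, with: character)
        }
        // Collapse the runs of whitespace left behind by removed tags.
        text = text.replacingOccurrences(of: "\\s+", with: " ", options: .regularExpression)
        return text.trimmingCharacters(in: .whitespacesAndNewlines)
    }
}
```

Because it only touches Foundation string APIs, this version has no main-thread requirement, at the cost of ignoring most entities and any text inside script or style elements.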

Solution 2 - Ios

Swift 4, Xcode 9

extension String {
    
    var utfData: Data {
        return Data(utf8)
    }
    
    var attributedHtmlString: NSAttributedString? {
        
        do {
            return try NSAttributedString(data: utfData, options: [
              .documentType: NSAttributedString.DocumentType.html,
              .characterEncoding: String.Encoding.utf8.rawValue
            ], 
            documentAttributes: nil)
        } catch {
            print("Error:", error)
            return nil
        }
    }
}

extension UILabel {
    func setAttributedHtmlText(_ html: String) {
        if let attributedText = html.attributedHtmlString {
            self.attributedText = attributedText
        }
    }
}

Solution 3 - Ios

Here is my suggested answer, in case you prefer a function over an extension:

func decodeString(encodedString: String) -> NSAttributedString? {
    // Convert the HTML string to UTF-8 data for the importer.
    let encodedData = Data(encodedString.utf8)
    do {
        return try NSAttributedString(
            data: encodedData,
            options: [
                .documentType: NSAttributedString.DocumentType.html,
                .characterEncoding: String.Encoding.utf8.rawValue
            ],
            documentAttributes: nil)
    } catch {
        print(error.localizedDescription)
        return nil
    }
}

Then call the function and read the plain text from the result's string property:

let attributedString = self.decodeString(encodedString: encodedString)
let message = attributedString?.string // nil if the conversion failed

Solution 4 - Ios

Try this code for the detailTextLabel:

// Encode the HTML; fall back to empty data if the conversion fails.
let data = item.itemDescription.data(using: .unicode, allowLossyConversion: true) ?? Data()
let attrStr = try? NSAttributedString(
        data: data,
        options: [.documentType: NSAttributedString.DocumentType.html],
        documentAttributes: nil)
// text expects a String, so use the attributed string's plain-text content.
cell.detailTextLabel?.text = attrStr?.string

Solution 5 - Ios

Try this solution in Swift 3:

extension String {
    func convertHtml() -> NSAttributedString {
        guard let data = data(using: .utf8) else { return NSAttributedString() }
        do {
            return try NSAttributedString(data: data, options: [NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType, NSCharacterEncodingDocumentAttribute: String.Encoding.utf8.rawValue], documentAttributes: nil)
        } catch {
            return NSAttributedString()
        }
    }
}

Usage:

self.lblValDesc.attributedText = str_postdescription.convertHtml()

Solution 6 - Ios

Swift 4.0 extension

extension String {
    // Despite its name, this property returns the plain-text String, not an NSAttributedString.
    var html2AttributedString: String? {
        guard let data = data(using: .utf8) else { return nil }
        do {
            return try NSAttributedString(data: data, options: [.documentType: NSAttributedString.DocumentType.html, .characterEncoding: String.Encoding.utf8.rawValue], documentAttributes: nil).string
        } catch let error as NSError {
            print(error.localizedDescription)
            return nil
        }
    }
}
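The question also mentions showing a truncated version of the content. Once the HTML has been reduced to a plain String, a preview could be produced with something like the sketch below. `previewText` and `maxLength` are hypothetical names, and the default of 120 characters is an arbitrary assumption.

```swift
import Foundation

/// Shortens plain text for a table view subtitle.
/// `previewText` and `maxLength` are hypothetical names chosen for this sketch.
func previewText(_ text: String, maxLength: Int = 120) -> String {
    // Collapse newlines and repeated spaces so the preview reads as one line.
    let collapsed = text
        .components(separatedBy: .whitespacesAndNewlines)
        .filter { !$0.isEmpty }
        .joined(separator: " ")
    guard collapsed.count > maxLength else { return collapsed }
    // Cut at the character limit and mark the truncation with an ellipsis.
    let cut = String(collapsed.prefix(max(maxLength, 0)))
    return cut.trimmingCharacters(in: .whitespaces) + "…"
}
```

A UILabel already truncates text that does not fit, so the main benefit here is removing the blank lines and paragraph breaks that converted HTML tends to contain.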

Solution 7 - Ios

I used Danboz's answer and changed it only to return a plain String instead of a rich-text (attributed) string:

static func htmlToText(encodedString: String) -> String? {
    // Convert the HTML string to UTF-8 data for the importer.
    let encodedData = Data(encodedString.utf8)
    do {
        // Import the HTML, then keep only the plain-text content.
        return try NSAttributedString(
            data: encodedData,
            options: [
                .documentType: NSAttributedString.DocumentType.html,
                .characterEncoding: String.Encoding.utf8.rawValue
            ],
            documentAttributes: nil).string
    } catch {
        print(error.localizedDescription)
        return nil
    }
}

It works like a charm for me. Thanks, Danboz!
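A table view can request the same cell many times while scrolling, so it may be worth converting each description only once. The sketch below memoizes any String-to-String converter. `ConversionCache` is a hypothetical helper, the converter is injected as a closure, and the class is not thread-safe, on the assumption that it is only used from the main thread.

```swift
import Foundation

/// Memoizes an expensive String -> String conversion.
/// `ConversionCache` is a hypothetical helper; the converter is injected.
final class ConversionCache {
    private var storage: [String: String] = [:]
    private let convert: (String) -> String

    init(convert: @escaping (String) -> String) {
        self.convert = convert
    }

    /// Returns the cached result for `html`, converting it only on first use.
    func plainText(for html: String) -> String {
        if let cached = storage[html] { return cached }
        let result = convert(html)
        storage[html] = result
        return result
    }
}
```

The closure could wrap the htmlToText function above and fall back to the raw string when it returns nil.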

Solution 8 - Ios

let content = givenString // string containing HTML
// Avoid force-unwrapping: only update the label if the conversion succeeds.
if let data = content.data(using: .unicode, allowLossyConversion: true),
   let attrStr = try? NSAttributedString(data: data, options: [.documentType: NSAttributedString.DocumentType.html], documentAttributes: nil) {
    self.labelName.attributedText = attrStr
}

Attributions

All content for this solution is sourced from the original question on Stack Overflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author
Question | Zaid Syed
Solution 1 - Ios | Leo Dabus
Solution 2 - Ios | Suhit Patil
Solution 3 - Ios | Danboz
Solution 4 - Ios | Altimir Antonov
Solution 5 - Ios | Hardik Thakkar
Solution 6 - Ios | Maulik Patel
Solution 7 - Ios | Shaybc
Solution 8 - Ios | shahana mh