How to strip or escape HTML tags in Android

Android Problem Overview


PHP has a strip_tags function that strips HTML and PHP tags from a string.

Does Android have a way to strip or escape HTML?

Android Solutions


Solution 1 - Android

The solutions in the answer linked to by @sparkymat generally require either a regex, which is an error-prone approach, or a third-party library such as jsoup or Jericho. A better solution on Android devices is simply to use the Html.fromHtml() function:

public String stripHtml(String html) {
    if (android.os.Build.VERSION.SDK_INT >= android.os.Build.VERSION_CODES.N) {
        return Html.fromHtml(html, Html.FROM_HTML_MODE_LEGACY).toString();
    } else {
        return Html.fromHtml(html).toString();
    }
}

This uses Android's built-in HTML parser to build a Spanned representation of the input without any HTML tags. The span markup is then stripped by converting the output back into a String.

Note that the behaviour of Html.fromHtml changed in Android N, which is why the code checks the SDK version. See the documentation for more info.
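To illustrate why the regex approach mentioned above is error-prone, here is a minimal plain-Java sketch. The class and method names (RegexStripPitfall, naiveStrip) and the pattern are hypothetical, chosen only for illustration; they are not taken from the linked answer.

```java
final class RegexStripPitfall {
    // Naive approach: delete everything between '<' and the next '>'.
    static String naiveStrip(String html) {
        return html.replaceAll("<[^>]*>", "");
    }

    public static void main(String[] args) {
        // Simple markup works as expected.
        System.out.println(naiveStrip("<b>Hello</b> world")); // Hello world

        // A '>' inside an attribute value ends the match too early,
        // so part of the tag leaks into the output.
        System.out.println(naiveStrip("<a title=\"a > b\">link</a>"));

        // Character entities are not decoded.
        System.out.println(naiveStrip("<p>Fish &amp; chips</p>")); // Fish &amp; chips
    }
}
```

A real HTML parser handles both cases, which is the main argument for Html.fromHtml() or a library over a hand-written pattern.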

Solution 2 - Android

Sorry for the late post, but I think this might help others.

To simply remove the HTML tags:

Html.fromHtml(htmltext).toString()

This strips the HTML tags, but the resulting string will not be formatted properly. Hence I did:

Html.fromHtml(htmltext).toString().replaceAll("\n", "").trim()

This removes the newline characters and then trims any leading and trailing whitespace. You can remove other unwanted characters in the same way.
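A slightly more general cleanup can be sketched in plain Java. It assumes the input is the String already returned by Html.fromHtml(...).toString(); the class and method names (PlainTextCleaner, collapseWhitespace) are hypothetical.

```java
final class PlainTextCleaner {
    // Hypothetical helper: collapses every run of whitespace (including the
    // newlines left behind by block-level tags) into one space, then trims.
    static String collapseWhitespace(String text) {
        return text.replaceAll("\\s+", " ").trim();
    }

    public static void main(String[] args) {
        // Stand-in for the output of Html.fromHtml(...).toString().
        String stripped = "Title\n\nFirst paragraph.\n\n";
        System.out.println(collapseWhitespace(stripped)); // Title First paragraph.
    }
}
```

Replacing newlines with a space rather than an empty string keeps words from adjacent paragraphs from running together.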

Solution 3 - Android

You can alternatively use Html.escapeHtml(String) if you are targeting API 16 or above.

To also support versions below API 16, you can use the class below by calling HtmlUtils.escapeHtml(String). I simply pulled it from the source of Html.escapeHtml(String).

public class HtmlUtils {

    public static String escapeHtml(CharSequence text) {
        StringBuilder out = new StringBuilder();
        withinStyle(out, text, 0, text.length());
        return out.toString();
    }

    private static void withinStyle(StringBuilder out, CharSequence text,
                                    int start, int end) {
        for (int i = start; i < end; i++) {
            char c = text.charAt(i);

            if (c == '<') {
                out.append("&lt;");
            } else if (c == '>') {
                out.append("&gt;");
            } else if (c == '&') {
                out.append("&amp;");
            } else if (c >= 0xD800 && c <= 0xDFFF) {
                if (c < 0xDC00 && i + 1 < end) {
                    char d = text.charAt(i + 1);
                    if (d >= 0xDC00 && d <= 0xDFFF) {
                        i++;
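                        // Character.toCodePoint combines the pair correctly for all planes;
                        // OR-ing 0x010000 with the shifted bits miscomputes code points >= U+20000.
                        int codepoint = Character.toCodePoint(c, d);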
                        out.append("&#").append(codepoint).append(";");
                    }
                }
            } else if (c > 0x7E || c < ' ') {
                out.append("&#").append((int) c).append(";");
            } else if (c == ' ') {
                while (i + 1 < end && text.charAt(i + 1) == ' ') {
                    out.append("&nbsp;");
                    i++;
                }

                out.append(' ');
            } else {
                out.append(c);
            }
        }
    }
}

I am using this class, and it works fine.
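The surrogate-pair branch in the class above turns two UTF-16 chars into a single code point before writing a numeric character reference. As a worked example of that arithmetic, here is a small plain-Java sketch; the class and method names (SurrogateDemo, combine, toNumericEntity) are hypothetical. Note that the offset 0x10000 has to be added, because a bitwise OR gives the wrong value once the shifted high-surrogate bits overlap it (code points at or above U+20000).

```java
final class SurrogateDemo {
    // Standard UTF-16 decoding: offset + 10 high bits + 10 low bits.
    static int combine(char high, char low) {
        return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00);
    }

    // Formats a code point as a decimal numeric character reference.
    static String toNumericEntity(int codePoint) {
        return "&#" + codePoint + ";";
    }

    public static void main(String[] args) {
        String emoji = "\uD83D\uDE00"; // U+1F600 as a surrogate pair
        int cp = combine(emoji.charAt(0), emoji.charAt(1));
        System.out.println(toNumericEntity(cp)); // &#128512;

        // The JDK helper performs the same computation.
        System.out.println(cp == Character.toCodePoint(emoji.charAt(0), emoji.charAt(1))); // true
    }
}
```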

Solution 4 - Android

This is the newer alternative (API 16+). Html.escapeHtml already returns a String, so no toString() call is needed:

android.text.Html.escapeHtml(your_html);

Solution 5 - Android

Html.fromHtml can be extremely slow for large HTML strings.

Here's how you can do it easily and quickly with jsoup:

Add this line to the dependencies block of your Gradle file:

implementation 'org.jsoup:jsoup:1.11.3'

Check for the latest jsoup version here: https://jsoup.org/download

Add this line to your code:

String text = Jsoup.parse(htmlStr).text();

See this question to learn how to preserve line breaks:

https://stackoverflow.com/questions/5640334/how-do-i-preserve-line-breaks-when-using-jsoup-to-convert-html-to-plain-text

Solution 6 - Android

Spanned spanned;
if (android.os.Build.VERSION.SDK_INT >= android.os.Build.VERSION_CODES.N) {
    spanned = Html.fromHtml(textToShare, Html.FROM_HTML_MODE_LEGACY);
} else {
    spanned = Html.fromHtml(textToShare);
}
tv.setText(spanned.toString());

Solution 7 - Android

This is dead simple with jsoup:

public static String html2text(String html) {
   return Jsoup.parse(html).text();
}

Solution 8 - Android

As it has not been mentioned yet, the way to do this in a backwards-compatible manner is to use the HtmlCompat utility class and simply call fromHtml, passing HtmlCompat.FROM_HTML_MODE_LEGACY (which equals 0) if you require no specific flags:

HtmlCompat.fromHtml(inputString, HtmlCompat.FROM_HTML_MODE_LEGACY).toString()

Under the hood it already does all the required API checks for you:

if (Build.VERSION.SDK_INT >= 24) {
   return Html.fromHtml(source, flags);
}
return Html.fromHtml(source);

So for the input

<a href="https://www.stackoverflow.com">Click me!</a>

you will receive only the string 'Click me!' as output.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

| Content Type | Original Author | Original Content on Stackoverflow |
| --- | --- | --- |
| Question | Kris | View Question on Stackoverflow |
| Solution 1 - Android | Nick Street | View Answer on Stackoverflow |
| Solution 2 - Android | yubaraj poudel | View Answer on Stackoverflow |
| Solution 3 - Android | Buddy | View Answer on Stackoverflow |
| Solution 4 - Android | Tomero Indonesia | View Answer on Stackoverflow |
| Solution 5 - Android | live-love | View Answer on Stackoverflow |
| Solution 6 - Android | Atif Mahmood | View Answer on Stackoverflow |
| Solution 7 - Android | Jayakrishnan | View Answer on Stackoverflow |
| Solution 8 - Android | Hrafn | View Answer on Stackoverflow |