Detect HTML tags in a string

Php Problem Overview


I need to detect whether a string contains HTML tags.

if(!preg_match('(?<=<)\w+(?=[^<]*?>)', $string)){ 
    return $string;
}

The above regex gives me an error:

preg_match() [function.preg-match]: Unknown modifier '\'

I'm not very familiar with regex, so I'm not sure what the problem is. I tried escaping the \ and it made no difference.

Is there a better solution than regex? If not, what would be the correct regex to use with preg_match()?

Php Solutions


Solution 1 - Php

A simple solution is:

if($string != strip_tags($string)) {
    // contains HTML
}

The benefit of this over a regex is that it's easier to understand. I can't comment on how the execution speed of the two solutions compares.
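As a minimal sketch, the comparison above can be wrapped in a small helper. The function name contains_html is an assumption made for illustration and is not part of the original answer. The sketch uses the strict !== operator so that PHP's loose comparison rules for numeric strings cannot affect the result. Note that strip_tags() also removes HTML comments and PHP tags, so those count as "HTML" here too.

```php
<?php
// Hypothetical helper name; not part of the original answer.
function contains_html(string $string): bool
{
    // strip_tags() removes anything it recognises as a tag, so any
    // difference from the input means at least one tag was present.
    return $string !== strip_tags($string);
}

var_dump(contains_html('Hello <b>world</b>')); // bool(true)
var_dump(contains_html('Hello world'));        // bool(false)
```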

Solution 2 - Php

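You need to 'delimit' the regex with some character or another. PHP's PCRE functions expect the pattern to be enclosed in delimiters. Without them, the leading ( is taken as the opening delimiter and the first ) as the closing one, so everything after it, starting with \, is read as pattern modifiers. That is what produces the "Unknown modifier" error. Try this: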

if(!preg_match('#(?<=<)\w+(?=[^<]*?>)#', $string)){ 
    return $string;
}
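To see what the delimited pattern does, here is a small sketch under the assumption that you only care about opening tags. The sample strings are made up for illustration. The lookbehind (?<=<) requires a < directly before the tag name, so closing tags such as </b> are not matched by this pattern.

```php
<?php
$pattern = '#(?<=<)\w+(?=[^<]*?>)#';

// preg_match() returns 1 on a match, 0 on no match, and false on error.
var_dump(preg_match($pattern, 'Hello <b>world</b>')); // int(1)
var_dump(preg_match($pattern, 'Hello world'));        // int(0)

// preg_match_all() collects every opening tag name the pattern finds.
preg_match_all($pattern, '<p class="x">Hi <em>there</em></p>', $matches);
print_r($matches[0]); // the two names "p" and "em"
```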

Solution 3 - Php

If you just want to detect or replace certain tags: this function searches for specific HTML tags and wraps them in square brackets. That is not useful in itself, so modify the callback to do whatever you need with the tags.

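$html = preg_replace_callback(
	'|\</?([a-zA-Z]+[1-6]?)(\s[^>]*)?(\s?/)?\>|',
	function ($found) {
		// $found[0] is the whole tag, $found[1] is the tag name
		if(isset($found[1]) && in_array(
			strtolower($found[1]), // tag names are case-insensitive
			array('div','p','span','b','a','strong','center','br','h1','h2','h3','h4','h5','h6','hr'))
		) {
			return '[' . $found[0] . ']';
		}
		// leave every other tag unchanged; returning nothing would delete it
		return $found[0];
	},
	$html
);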

Explanation of the regex:

\< ... \>   //start and ends with tag brackets
\</?        //can start with a slash for closing tags
([a-zA-Z]+[1-6]?)    //the tag itself (for example "h1")
(\s[^>]*)? //anything such as class=... style=... etc.
(\s?/)?     //allow self-closing tags such as <br />
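The capture groups described above can be inspected directly. This is a sketch that assumes the same pattern and a made-up sample string. It uses preg_match_all() instead of the callback so that the whole tags and the tag names are easy to see.

```php
<?php
$pattern = '|\</?([a-zA-Z]+[1-6]?)(\s[^>]*)?(\s?/)?\>|';
$html = '<h1 class="title">Hi</h1><br />';

preg_match_all($pattern, $html, $matches);

// $matches[0] holds the whole tags, $matches[1] the tag names (group 1).
print_r($matches[0]); // <h1 class="title">, </h1>, <br />
print_r($matches[1]); // h1, h1, br
```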

Solution 4 - Php

If the purpose is just to check whether a string contains an HTML tag, regardless of whether the tag is valid, you can try this.

function is_html($string) {
  // Check if string contains any html tags.
  // preg_match() returns 1, 0 or false, so compare to get a boolean.
  return preg_match('/<\s?[^\>]*\/?\s?>/i', $string) === 1;
}

This matches any <...> pair, whether or not it is a valid HTML tag, so it also matches plain text such as "a < b > c". You can test it here: https://regex101.com/r/2g7Fx4/3
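Because the pattern above accepts anything between < and >, it can report plain text as HTML. The sketch below contrasts it with a stricter variant that requires a letter right after < or </. The stricter pattern and both function names are assumptions added for illustration, not part of the original answer, and the stricter pattern can still be fooled by text like "a <b and c> d".

```php
<?php
// The pattern from the answer: any "<...>" pair counts.
function is_html_loose(string $s): bool
{
    return preg_match('/<\s?[^\>]*\/?\s?>/i', $s) === 1;
}

// Hypothetical stricter variant: "<" or "</" must be followed by a tag name.
function is_html_strict(string $s): bool
{
    return preg_match('/<\/?[a-z][a-z0-9]*\b[^>]*>/i', $s) === 1;
}

var_dump(is_html_loose('if a < b and c > d'));  // bool(true), a false positive
var_dump(is_html_strict('if a < b and c > d')); // bool(false)
var_dump(is_html_strict('<b>bold</b>'));        // bool(true)
```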

Solution 5 - Php

I would compare the lengths with strlen() rather than comparing the strings themselves. A string comparison is done character by character and that can be slow, though I would expect the comparison to quit as soon as it finds a difference.
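A minimal sketch of the length comparison, assuming it is applied to the result of strip_tags(). The function name is hypothetical. Whether this is measurably faster than comparing the strings is not something this sketch demonstrates, since strip_tags() still has to scan the whole input either way.

```php
<?php
// Hypothetical helper name; compares lengths instead of string contents.
function contains_html_by_length(string $string): bool
{
    // strip_tags() only removes characters, so a changed length
    // means something was stripped.
    return strlen($string) !== strlen(strip_tags($string));
}

var_dump(contains_html_by_length('<p>Hi</p>')); // bool(true)
var_dump(contains_html_by_length('Hi'));        // bool(false)
```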

Solution 6 - Php

I would recommend allowing only a defined set of tags. You don't want the user to be able to enter a <script> tag, which could cause an XSS vulnerability.

Try it with:

$string = '<strong>hello</strong>';
// Allowed tags are: <p>, <span>, <b>, <strong>, <i> and <u>.
// \b stops "p" from matching "<pre>", and \1 requires the matching closing tag.
$pattern = '/<(p|span|b|strong|i|u)\b[^>]*>(.*?)<\/\1>/is';
preg_match($pattern, $string, $matches);

if (!empty($matches)) {
    echo 'Good, you have used an HTML tag.';
}
else {
    echo 'You didn\'t use an HTML tag or it is not allowed.';
}
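Note that finding one allowed tag does not prove that no disallowed tag is present in the same string. As a sketch of the allowlist idea, the hypothetical helper below collects every tag name and reports the ones that are not allowed. The function name and the tag-name pattern are assumptions for illustration. This is a check, not a sanitizer: for XSS protection you would still escape output, for example with htmlspecialchars().

```php
<?php
// Hypothetical helper: returns the tag names in $html that are not allowed.
function disallowed_tags(string $html, array $allowed): array
{
    // Capture the name of every opening or closing tag.
    preg_match_all('/<\/?([a-z][a-z0-9]*)\b[^>]*>/i', $html, $matches);

    // Normalise case and remove duplicates before comparing.
    $found = array_unique(array_map('strtolower', $matches[1]));

    return array_values(array_diff($found, $allowed));
}

$allowed = ['p', 'span', 'b', 'strong', 'i', 'u'];

print_r(disallowed_tags('<strong>hello</strong>', $allowed));             // empty array
print_r(disallowed_tags('<b>hi</b><script>alert(1)</script>', $allowed)); // contains "script"
```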

Solution 7 - Php

Parsing HTML in general is a hard problem; there is some good material here:

But regarding your question (a 'better' solution): can you be more specific about what you are trying to achieve and what tools are available to you?

Solution 8 - Php

If you're not good at regular expressions (like me), there are lots of regex libraries out there that usually help me accomplish my task.

Here is a little tutorial that will explain what you're trying to do in PHP.

Here is one of those libraries I was referring to.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | bcmcfc | View Question on Stackoverflow
Solution 1 - Php | Diarmaid | View Answer on Stackoverflow
Solution 2 - Php | simon | View Answer on Stackoverflow
Solution 3 - Php | Gerfried | View Answer on Stackoverflow
Solution 4 - Php | MutantMahesh | View Answer on Stackoverflow
Solution 5 - Php | slsdoug | View Answer on Stackoverflow
Solution 6 - Php | Reza Saadati | View Answer on Stackoverflow
Solution 7 - Php | Addys | View Answer on Stackoverflow
Solution 8 - Php | clamchoda | View Answer on Stackoverflow