Equals(=) vs. LIKE

SqlPerformanceEqualsSql Like

Sql Problem Overview


When using SQL, are there any benefits of using = in a WHERE clause instead of LIKE?

Without any special operators, LIKE and = are the same, right?

Sql Solutions


Solution 1 - Sql

Different Operators

LIKE and = are different operators. Most answers here focus on the wildcard support, which is not the only difference between these operators!

= is a comparison operator that operates on numbers and strings. When comparing strings, the comparison operator compares whole strings.

LIKE is a string operator that compares character by character.

To complicate matters, both operators use a collation which can have important effects on the result of the comparison.

Motivating Example

Let us first identify an example where these operators produce obviously different results. Allow me to quote from the MySQL manual:

> Per the SQL standard, LIKE performs matching on a per-character basis, thus it can produce results different from the = comparison operator:

mysql> SELECT 'ä' LIKE 'ae' COLLATE latin1_german2_ci;
+-----------------------------------------+
| 'ä' LIKE 'ae' COLLATE latin1_german2_ci |
+-----------------------------------------+
|                                       0 |
+-----------------------------------------+
mysql> SELECT 'ä' = 'ae' COLLATE latin1_german2_ci;
+--------------------------------------+
| 'ä' = 'ae' COLLATE latin1_german2_ci |
+--------------------------------------+
|                                    1 |
+--------------------------------------+

Please note that this page of the MySQL manual is called String Comparison Functions, and = is not discussed, which implies that = is not strictly a string comparison function.

How Does = Work?

The SQL Standard § 8.2 describes how = compares strings:

> The comparison of two character strings is determined as follows: > > a) If the length in characters of X is not equal to the length > in characters of Y, then the shorter string is effectively > replaced, for the purposes of comparison, with a copy of > itself that has been extended to the length of the longer > string by concatenation on the right of one or more pad > characters, where the pad character is chosen based on CS. If > CS has the NO PAD attribute, then the pad character is an > implementation-dependent character different from any > character in the character set of X and Y that collates less > than any string under CS. Otherwise, the pad character is a > <space>. > > b) The result of the comparison of X and Y is given by the > collating sequence CS. > > c) Depending on the collating sequence, two strings may > compare as equal even if they are of different lengths or > contain different sequences of characters. When the operations > MAX, MIN, DISTINCT, references to a grouping column, and the > UNION, EXCEPT, and INTERSECT operators refer to character > strings, the specific value selected by these operations from > a set of such equal values is implementation-dependent.

(Emphasis added.)

What does this mean? It means that when comparing strings, the = operator is just a thin wrapper around the current collation. A collation is a library that has various rules for comparing strings. Here is an example of a binary collation from MySQL:

static int my_strnncoll_binary(const CHARSET_INFO *cs __attribute__((unused)),
                               const uchar *s, size_t slen,
                               const uchar *t, size_t tlen,
                               my_bool t_is_prefix)
{
  size_t len= MY_MIN(slen,tlen);
  int cmp= memcmp(s,t,len);
  return cmp ? cmp : (int)((t_is_prefix ? len : slen) - tlen);
}

This particular collation happens to compare byte-by-byte (which is why it's called "binary" — it doesn't give any special meaning to strings). Other collations may provide more advanced comparisons.

For example, here is a UTF-8 collation that supports case-insensitive comparisons. The code is too long to paste here, but go to that link and read the body of my_strnncollsp_utf8mb4(). This collation can process multiple bytes at a time and it can apply various transforms (such as case insensitive comparison). The = operator is completely abstracted from the vagaries of the collation.

How Does LIKE Work?

The SQL Standard § 8.5 describes how LIKE compares strings:

> The <predicate> > > M LIKE P > > is true if there exists a partitioning of M into substrings > such that: > > i) A substring of M is a sequence of 0 or more contiguous > <character representation>s of M and each <character > representation> of M is part of exactly one substring. > > ii) If the i-th substring specifier of P is an arbitrary > character specifier, the i-th substring of M is any single > <character representation>. > > iii) If the i-th substring specifier of P is an arbitrary string > specifier, then the i-th substring of M is any sequence of > 0 or more <character representation>s. > > iv) If the i-th substring specifier of P is neither an > arbitrary character specifier nor an arbitrary string specifier, > then the i-th substring of M is equal to that substring > specifier according to the collating sequence of > the <like predicate>, without the appending of <space> > characters to M, and has the same length as that substring > specifier. > > v) The number of substrings of M is equal to the number of > substring specifiers of P.

(Emphasis added.)

This is pretty wordy, so let's break it down. Items ii and iii refer to the wildcards _ and %, respectively. If P does not contain any wildcards, then only item iv applies. This is the case of interest posed by the OP.

In this case, it compares each "substring" (individual characters) in M against each substring in P using the current collation.

Conclusions

The bottom line is that when comparing strings, = compares the entire string while LIKE compares one character at a time. Both comparisons use the current collation. This difference leads to different results in some cases, as evidenced in the first example in this post.

Which one should you use? Nobody can tell you that — you need to use the one that's correct for your use case. Don't prematurely optimize by switching comparison operators.

Solution 2 - Sql

The equals (=) operator is a "comparison operator compares two values for equality." In other words, in an SQL statement, it won't return true unless both sides of the equation are equal. For example:

SELECT * FROM Store WHERE Quantity = 200;

The LIKE operator "implements a pattern match comparison" that attempts to match "a string value against a pattern string containing wild-card characters." For example:

SELECT * FROM Employees WHERE Name LIKE 'Chris%';

LIKE is generally used only with strings and equals (I believe) is faster. The equals operator treats wild-card characters as literal characters. The difference in results returned are as follows:

SELECT * FROM Employees WHERE Name = 'Chris';

And

SELECT * FROM Employees WHERE Name LIKE 'Chris';

Would return the same result, though using LIKE would generally take longer as its a pattern match. However,

SELECT * FROM Employees WHERE Name = 'Chris%';

And

SELECT * FROM Employees WHERE Name LIKE 'Chris%';

Would return different results, where using "=" results in only results with "Chris%" being returned and the LIKE operator will return anything starting with "Chris".

Hope that helps. Some good info can be found here.

Solution 3 - Sql

This is a copy/paste of another answer of mine for question https://stackoverflow.com/questions/6142235/sql-like-vs-performance/28609481#28609481:

A personal example using mysql 5.5: I had an inner join between 2 tables, one of 3 million rows and one of 10 thousand rows.

When using a like on an index as below(no wildcards), it took about 30 seconds:

where login like '12345678'

using 'explain' I get:

enter image description here

When using an '=' on the same query, it took about 0.1 seconds:

where login ='12345678'

Using 'explain' I get:

enter image description here

As you can see, the like completely cancelled the index seek, so query took 300 times more time.

Solution 4 - Sql

LIKE and = are different. LIKE is what you would use in a search query. It also allows wildcards like _ (simple character wildcard) and % (multi-character wildcard).

= should be used if you want exact matches and it will be faster.

This site explains LIKE

Solution 5 - Sql

One difference - apart from the possibility to use wildcards with LIKE - is in trailing spaces: The = operator ignores trailing space, but LIKE does not.

Solution 6 - Sql

Depends on the database system.

Generally with no special characters, yes, = and LIKE are the same.

Some database systems, however, may treat collation settings differently with the different operators.

For instance, in MySQL comparisons with = on strings is always case-insensitive by default, so LIKE without special characters is the same. On some other RDBMS's LIKE is case-insensitive while = is not.

Solution 7 - Sql

For this example we take it for granted that varcharcol doesn't contain '' and have no empty cell against this column

select * from some_table where varcharCol = ''
select * from some_table where varcharCol like ''

The first one results in 0 row output while the second one shows the whole list. = is strictly-match case while like acts like a filter. if filter has no criteria, every data is valid.

like - by the virtue of its purpose works a little slower and is intended for use with varchar and similar data.

Solution 8 - Sql

Using = avoids wildcards and special characters conflicts in the string when you build the query at run time.

This makes the programmer's life easier by not having to escape all special wildcard characters that might slip in the LIKE clause and not producing the intended result. After all, = is the 99% use case scenario, it would be a pain to have to escape them every time.

rolls eyes at '90s

I also suspect it's a little bit slower, but I doubt it's significant if there are no wildcards in the pattern.

Solution 9 - Sql

If you search for an exact match, you can use both, = and LIKE.

Using "=" is a tiny bit faster in this case (searching for an exact match) - you can check this yourself by having the same query twice in SQL Server Management Studio, once using "=", once using "LIKE", and then using the "Query" / "Include actual execution plan".

Execute the two queries and you should see your results twice, plus the two actual execution plans. In my case, they were split 50% vs. 50%, but the "=" execution plan has a smaller "estimated subtree cost" (displayed when you hover over the left-most "SELECT" box) - but again, it's really not a huge difference.

But when you start searching with wildcards in your LIKE expression, search performance will dimish. Search "LIKE Mill%" can still be quite fast - SQL Server can use an index on that column, if there is one. Searching "LIKE %expression%" is horribly slow, since the only way SQL Server can satisfy this search is by doing a full table scan. So be careful with your LIKE's !

Marc

Solution 10 - Sql

To address the original question regarding performance, it comes down to index utilization. When a simple table scan occurs, "LIKE" and "=" are identical. When indexes are involved, it depends on how the LIKE clause is formed. More specifically, what is the location of the wildcard(s)?


Consider the following:

CREATE TABLE test(
	txt_col  varchar(10) NOT NULL
)
go

insert test (txt_col)
select CONVERT(varchar(10), row_number() over (order by (select 1))) r
  from master..spt_values a, master..spt_values b
go

CREATE INDEX IX_test_data 
    ON test (txt_col);
go 

--Turn on Show Execution Plan
set statistics io on

--A LIKE Clause with a wildcard at the beginning
DBCC DROPCLEANBUFFERS
SELECT txt_Col from test where txt_col like '%10000'
--Results in
--Table 'test'. Scan count 3, logical reads 15404, physical reads 2, read-ahead reads 15416, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
--Index SCAN is 85% of Query Cost

--A LIKE Clause with a wildcard in the middle
DBCC DROPCLEANBUFFERS
SELECT txt_Col from test where txt_col like '1%99'
--Results in
--Table 'test'. Scan count 1, logical reads 3023, physical reads 3, read-ahead reads 3018, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
--Index Seek is 100% of Query Cost for test data, but it may result in a Table Scan depending on table size/structure

--A LIKE Clause with no wildcards
DBCC DROPCLEANBUFFERS
SELECT txt_Col from test where txt_col like '10000'
--Results in
--Table 'test'. Scan count 1, logical reads 3, physical reads 2, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
--Index Seek is 100% of Query Cost
GO

--an "=" clause = does Index Seek same as above
DBCC DROPCLEANBUFFERS
SELECT txt_Col from test where txt_col = '10000'
--Results in
--Table 'test'. Scan count 1, logical reads 3, physical reads 2, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
--Index Seek is 100% of Query Cost
GO


DROP TABLE test

There may be also negligible difference in the creation of the query plan when using "=" vs "LIKE".

Solution 11 - Sql

Besides the wildcards, the difference between = AND LIKE will depend on both the kind of SQL server and on the column type.

Take this example:

CREATE TABLE testtable (
  varchar_name VARCHAR(10),
  char_name CHAR(10),
  val INTEGER
);

INSERT INTO testtable(varchar_name, char_name, val)
    VALUES ('A', 'A', 10), ('B', 'B', 20);

SELECT 'VarChar Eq Without Space', val FROM testtable WHERE varchar_name='A'
UNION ALL
SELECT 'VarChar Eq With Space', val FROM testtable WHERE varchar_name='A '
UNION ALL
SELECT 'VarChar Like Without Space', val FROM testtable WHERE varchar_name LIKE 'A'
UNION ALL
SELECT 'VarChar Like Space', val FROM testtable WHERE varchar_name LIKE 'A '
UNION ALL
SELECT 'Char Eq Without Space', val FROM testtable WHERE char_name='A'
UNION ALL
SELECT 'Char Eq With Space', val FROM testtable WHERE char_name='A '
UNION ALL
SELECT 'Char Like Without Space', val FROM testtable WHERE char_name LIKE 'A'
UNION ALL
SELECT 'Char Like With Space', val FROM testtable WHERE char_name LIKE 'A '
  • Using MS SQL Server 2012, the trailing spaces will be ignored in the comparison, except with LIKE when the column type is VARCHAR.

  • Using MySQL 5.5, the trailing spaces will be ignored for =, but not for LIKE, both with CHAR and VARCHAR.

  • Using PostgreSQL 9.1, spaces are significant with both = and LIKE using VARCHAR, but not with CHAR (see documentation).

    The behaviour with LIKE also differs with CHAR.

    Using the same data as above, using an explicit CAST on the column name also makes a difference:

      SELECT 'CAST none', val FROM testtable WHERE char_name LIKE 'A'
      UNION ALL
      SELECT 'CAST both', val FROM testtable WHERE
          CAST(char_name AS CHAR) LIKE CAST('A' AS CHAR)
      UNION ALL
      SELECT 'CAST col', val FROM testtable WHERE CAST(char_name AS CHAR) LIKE 'A'
      UNION ALL
      SELECT 'CAST value', val FROM testtable WHERE char_name LIKE CAST('A' AS CHAR)
    

    This only returns rows for "CAST both" and "CAST col".

Solution 12 - Sql

Really it comes down to what you want the query to do. If you mean an exact match then use =. If you mean a fuzzier match, then use LIKE. Saying what you mean is usually a good policy with code.

Solution 13 - Sql

The LIKE keyword undoubtedly comes with a "performance price-tag" attached. That said, if you have an input field that could potentially include wild card characters to be used in your query, I would recommend using LIKE only if the input contains one of the wild cards. Otherwise, use the standard equal to comparison.

Best regards...

Solution 14 - Sql

= is much faster than LIKE.

Tested on MySQL with 11GB of data and more than 10 million of records, the f_time column is indexed.

SELECT * FROM XXXXX WHERE f_time = '1621442261' - took 0.00sec and return 330 records

SELECT * FROM XXXXX WHERE f_time like '1621442261' - took 44.71sec and return 330 records

Solution 15 - Sql

In Oracle, a ‘like’ with no wildcards will return the same result as an ‘equals’, but could require additional processing. According to Tom Kyte, Oracle will treat a ‘like’ with no wildcards as an ‘equals’ when using literals, but not when using bind variables.

Solution 16 - Sql

= and LIKE is not the same;

  1. = matches the exact string
  2. LIKE matches a string that may contain wildcards (%)

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionTravisView Question on Stackoverflow
Solution 1 - SqlMark E. HaaseView Answer on Stackoverflow
Solution 2 - Sqlachinda99View Answer on Stackoverflow
Solution 3 - SqlArisView Answer on Stackoverflow
Solution 4 - SqlWalterJ89View Answer on Stackoverflow
Solution 5 - SqlISWView Answer on Stackoverflow
Solution 6 - SqlʞɔıuView Answer on Stackoverflow
Solution 7 - SqlArnabView Answer on Stackoverflow
Solution 8 - SqlCoincoinView Answer on Stackoverflow
Solution 9 - Sqlmarc_sView Answer on Stackoverflow
Solution 10 - SqlLaramieView Answer on Stackoverflow
Solution 11 - SqlBrunoView Answer on Stackoverflow
Solution 12 - SqlnotnotView Answer on Stackoverflow
Solution 13 - SqlJosh StodolaView Answer on Stackoverflow
Solution 14 - SqlbeloncfyView Answer on Stackoverflow
Solution 15 - SqlChris BView Answer on Stackoverflow
Solution 16 - SqlbarettaView Answer on Stackoverflow