SQL 'like' vs '=' performance

SqlPerformance

Sql Problem Overview


https://stackoverflow.com/questions/543580/equals-vs-like"> This question skirts around what I'm wondering, but the answers don't exactly address it.

It would seem that in general '=' is faster than 'like' when using wildcards. This appears to be the conventional wisdom. However, lets suppose I have a column containing a limited number of different fixed, hardcoded, varchar identifiers, and I want to select all rows matching one of them:

select * from table where value like 'abc%'

and

select * from table where value = 'abcdefghijklmn'

'Like' should only need to test the first three chars to find a match, whereas '=' must compare the entire string. In this case it would seem to me that 'like' would have an advantage, all other things being equal.

This is intended as a general, academic question, and so should not matter which DB, but it arose using SQL Server 2005.

Sql Solutions


Solution 1 - Sql

See https://web.archive.org/web/20150209022016/http://myitforum.com/cs2/blogs/jnelson/archive/2007/11/16/108354.aspx

Quote from there:

> the rules for index usage with LIKE > are loosely like this: > > > - If your filter criteria uses equals = > and the field is indexed, then most > likely it will use an INDEX/CLUSTERED > INDEX SEEK
> > - If your filter criteria uses LIKE, > with no wildcards (like if you had a > parameter in a web report that COULD > have a % but you instead use the full > string), it is about as likely as #1 > to use the index. The increased cost > is almost nothing. > > - If your filter criteria uses LIKE, but > with a wildcard at the beginning (as > in Name0 LIKE '%UTER') it's much less > likely to use the index, but it still > may at least perform an INDEX SCAN on > a full or partial range of the index. > > - HOWEVER, if your filter criteria uses > LIKE, but starts with a STRING FIRST > and has wildcards somewhere AFTER that > (as in Name0 LIKE 'COMP%ER'), then SQL > may just use an INDEX SEEK to quickly > find rows that have the same first > starting characters, and then look > through those rows for an exact match. > > > (Also keep in mind, the SQL engine > still might not use an index the way > you're expecting, depending on what > else is going on in your query and > what tables you're joining to. The > SQL engine reserves the right to > rewrite your query a little to get the > data in a way that it thinks is most > efficient and that may include an > INDEX SCAN instead of an INDEX SEEK)

Solution 2 - Sql

It's a measureable difference.

Run the following:

Create Table #TempTester (id int, col1 varchar(20), value varchar(20))
go

INSERT INTO #TempTester (id, col1, value)
VALUES
(1, 'this is #1', 'abcdefghij')
GO

INSERT INTO #TempTester (id, col1, value)
VALUES
(2, 'this is #2', 'foob'),
(3, 'this is #3', 'abdefghic'),
(4, 'this is #4', 'other'),
(5, 'this is #5', 'zyx'),
(6, 'this is #6', 'zyx'),
(7, 'this is #7', 'zyx'),
(8, 'this is #8', 'klm'),
(9, 'this is #9', 'klm'),
(10, 'this is #10', 'zyx')
GO 10000
    
CREATE CLUSTERED INDEX ixId ON #TempTester(id)CREATE CLUSTERED INDEX ixId ON #TempTester(id)
    
CREATE NONCLUSTERED INDEX ixTesting ON #TempTester(value)

Then:

SET SHOWPLAN_XML ON

Then:

SELECT * FROM #TempTester WHERE value LIKE 'abc%'

SELECT * FROM #TempTester WHERE value = 'abcdefghij'

The resulting execution plan shows you that the cost of the first operation, the LIKE comparison, is about 10 times more expensive than the = comparison.

If you can use an = comparison, please do so.

Solution 3 - Sql

You should also keep in mind that when using like, some sql flavors will ignore indexes, and that will kill performance. This is especially true if you don't use the "starts with" pattern like your example.

You should really look at the execution plan for the query and see what it's doing, guess as little as possible.

This being said, the "starts with" pattern can and is optimized in sql server. It will use the table index. EF 4.0 switched to like for StartsWith for this very reason.

Solution 4 - Sql

If value is unindexed, both result in a table-scan. The performance difference in this scenario will be negligible.

If value is indexed, as Daniel points out in his comment, the = will result in an index lookup which is O(log N) performance. The LIKE will (most likely - depending on how selective it is) result in a partial scan of the index >= 'abc' and < 'abd' which will require more effort than the =.

Note that I'm talking SQL Server here - not all DBMSs will be nice with LIKE.

Solution 5 - Sql

You are asking the wrong question. In databases is not the operator performance that matters, is always the SARGability of the expression, and the coverability of the overall query. Performance of the operator itself is largely irrelevant.

So, how do LIKE and = compare in terms of SARGability? LIKE, when used with an expression that does not start with a constant (eg. when used LIKE '%something') is by definition non-SARGabale. But does that make = or LIKE 'something%' SARGable? No. As with any question about SQL performance the answer does not lie with the query of the text, but with the schema deployed. These expression may be SARGable if an index exists to satisfy them.

So, truth be told, there are small differences between = and LIKE. But asking whether one operator or other operator is 'faster' in SQL is like asking 'What goes faster, a red car or a blue car?'. You should eb asking questions about the engine size and vechicle weight, not about the color... To approach questions about optimizing relational tables, the place to look is your indexes and your expressions in the WHERE clause (and other clauses, but it usually starts with the WHERE).

Solution 6 - Sql

A personal example using mysql 5.5: I had an inner join between 2 tables, one of 3 million rows and one of 10 thousand rows.

When using a like on an index as below(no wildcards), it took about 30 seconds:

where login like '12345678'

using 'explain' I get:

enter image description here

When using an '=' on the same query, it took about 0.1 seconds:

where login ='12345678'

Using 'explain' I get:

enter image description here

As you can see, the like completely cancelled the index seek, so query took 300 times more time.

Solution 7 - Sql

= is much faster than LIKE, even without wildcard. I tested on MySQL with 11GB of data and more than 100 million of records, the f_time column is indexed.

SELECT * FROM XXXXX WHERE f_time = '1621442261' 
#took 0.00sec and return 330 records

SELECT * FROM XXXXX WHERE f_time LIKE '1621442261' 
#took 44.71sec and return 330 records

Solution 8 - Sql

Besides all the answers, there this to consider:

'like' is case insensitive, so every character needs to be compared twice, whereas the '=' only compares once for identical characters.

This issue arises with or without indexes.

Solution 9 - Sql

Maybe you are looking about https://msdn.microsoft.com/en-us/library/ms142571.aspx">Full Text Search.

>In contrast to full-text search, the LIKE Transact-SQL predicate works on > character patterns only. Also, you cannot use the LIKE predicate to > query formatted binary data. Furthermore, a LIKE query against a large > amount of unstructured text data is much slower than an equivalent > full-text query against the same data. A LIKE query against millions > of rows of text data can take minutes to return; whereas a full-text > query can take only seconds or less against the same data, depending > on the number of rows that are returned.

Solution 10 - Sql

First things first ,

they are not always equal

    select 'Hello' from dual where 'Hello  ' like 'Hello';

    select 'Hello' from dual where 'Hello  ' =  'Hello';

when things are not always equal , talking about their performance isn't that relevant.

If you are working on strings and only char variables , then you can talk about performance . But don't use like and "=" as being generally interchangeable .

As you would have seen in many posts ( above and other questions) , in cases when they are equal the performance of like is slower owing to pattern matching (collation)

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionMickeyfAgain_BeforeExitOfSOView Question on Stackoverflow
Solution 1 - SqlBonyTView Answer on Stackoverflow
Solution 2 - SqlJNKView Answer on Stackoverflow
Solution 3 - SqlBlindyView Answer on Stackoverflow
Solution 4 - SqlWill AView Answer on Stackoverflow
Solution 5 - SqlRemus RusanuView Answer on Stackoverflow
Solution 6 - SqlArisView Answer on Stackoverflow
Solution 7 - SqlbeloncfyView Answer on Stackoverflow
Solution 8 - SqlMarco WobbenView Answer on Stackoverflow
Solution 9 - Sqluser586399View Answer on Stackoverflow
Solution 10 - Sqluser5190021View Answer on Stackoverflow