how to use a like with a join in sql?

MysqlSqlJoinSql Like

Mysql Problem Overview


I have 2 tables, say table A and table B and I want to perform a join, but the matching condition has to be where a column from A 'is like' a column from B meaning that anything can come before or after the column in B:

for example: if the column in A is 'foo'. Then the join would match if column in B is either: 'fooblah', 'somethingfooblah', or just 'foo'. I know how to use the wildcards in a standard like statement, but am confused when doing a join. Does this make sense? Thanks.

Mysql Solutions


Solution 1 - Mysql

Using INSTR:

SELECT *
  FROM TABLE a
  JOIN TABLE b ON INSTR(b.column, a.column) > 0

Using LIKE:

SELECT *
  FROM TABLE a
  JOIN TABLE b ON b.column LIKE '%'+ a.column +'%'

Using LIKE, with CONCAT:

SELECT *
  FROM TABLE a
  JOIN TABLE b ON b.column LIKE CONCAT('%', a.column ,'%')

Mind that in all options, you'll probably want to drive the column values to uppercase BEFORE comparing to ensure you are getting matches without concern for case sensitivity:

SELECT *
  FROM (SELECT UPPER(a.column) 'ua'
         TABLE a) a
  JOIN (SELECT UPPER(b.column) 'ub'
         TABLE b) b ON INSTR(b.ub, a.ua) > 0

The most efficient will depend ultimately on the EXPLAIN plan output.

JOIN clauses are identical to writing WHERE clauses. The JOIN syntax is also referred to as ANSI JOINs because they were standardized. Non-ANSI JOINs look like:

SELECT *
  FROM TABLE a,
       TABLE b
 WHERE INSTR(b.column, a.column) > 0

I'm not going to bother with a Non-ANSI LEFT JOIN example. The benefit of the ANSI JOIN syntax is that it separates what is joining tables together from what is actually happening in the WHERE clause.

Solution 2 - Mysql

In MySQL you could try:

SELECT * FROM A INNER JOIN B ON B.MYCOL LIKE CONCAT('%', A.MYCOL, '%');

Of course this would be a massively inefficient query because it would do a full table scan.

Update: Here's a proof


create table A (MYCOL varchar(255));
create table B (MYCOL varchar(255));
insert into A (MYCOL) values ('foo'), ('bar'), ('baz');
insert into B (MYCOL) values ('fooblah'), ('somethingfooblah'), ('foo');
insert into B (MYCOL) values ('barblah'), ('somethingbarblah'), ('bar');
SELECT * FROM A INNER JOIN B ON B.MYCOL LIKE CONCAT('%', A.MYCOL, '%');
+-------+------------------+
| MYCOL | MYCOL            |
+-------+------------------+
| foo   | fooblah          |
| foo   | somethingfooblah |
| foo   | foo              |
| bar   | barblah          |
| bar   | somethingbarblah |
| bar   | bar              |
+-------+------------------+
6 rows in set (0.38 sec)

Solution 3 - Mysql

If this is something you'll need to do often...then you may want to denormalize the relationship between tables A and B.

For example, on insert to table B, you could write zero or more entries to a juncion table mapping B to A based on partial mapping. Similarly, changes to either table could update this association.

This all depends on how frequently tables A and B are modified. If they are fairly static, then taking a hit on INSERT is less painful then repeated hits on SELECT.

Solution 4 - Mysql

Using conditional criteria in a join is definitely different than the Where clause. The cardinality between the tables can create differences between Joins and Where clauses.

For example, using a Like condition in an Outer Join will keep all records in the first table listed in the join. Using the same condition in the Where clause will implicitly change the join to an Inner join. The record has to generally be present in both tables to accomplish the conditional comparison in the Where clause.

I generally use the style given in one of the prior answers.

tbl_A as ta
    LEFT OUTER JOIN tbl_B AS tb
		    ON ta.[Desc] LIKE '%' + tb.[Desc] + '%'

This way I can control the join type.

Solution 5 - Mysql

When writing queries with our server LIKE or INSTR (or CHARINDEX in T-SQL) takes too long, so we use LEFT like in the following structure:

select *
from little
left join big
on left( big.key, len(little.key) ) = little.key

I understand that might only work with varying endings to the query, unlike other suggestions with '%' + b + '%', but is enough and much faster if you only need b+'%'.

Another way to optimize it for speed (but not memory) is to create a column in "little" that is "len(little.key)" as "lenkey" and user that instead in the query above.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionjimView Question on Stackoverflow
Solution 1 - MysqlOMG PoniesView Answer on Stackoverflow
Solution 2 - MysqlAsaphView Answer on Stackoverflow
Solution 3 - MysqlDavid AndresView Answer on Stackoverflow
Solution 4 - MysqlGeoffrey FullerView Answer on Stackoverflow
Solution 5 - MysqlVitalijs ArkulinskisView Answer on Stackoverflow