Performance of inner join compared to cross join

SqlPerformance

Sql Problem Overview


The effect of issuing an inner join is the same as stating a cross join with the join condition in the WHERE-clause. I noticed that many people in my company use cross joins, where I would use inner joins. I didn't notice any significant performance gain after changing some of these queries and was wondering if it was just a coincidence or if the DBMS optimizes such issues transparently (MySql in our case). And here a concrete example for discussion:

SELECT User.*
FROM User, Address
WHERE User.addressId = Address.id;

SELECT User.*
FROM User
INNER JOIN Address ON (User.addressId = Address.id);

Sql Solutions


Solution 1 - Sql

Cross Joins produce results that consist of every combination of rows from two or more tables. That means if table A has 6 rows and table B has 3 rows, a cross join will result in 18 rows. There is no relationship established between the two tables – you literally just produce every possible combination.

With an inner join, column values from one row of a table are combined with column values from another row of another (or the same) table to form a single row of data.

If a WHERE clause is added to a cross join, it behaves as an inner join as the WHERE imposes a limiting factor.

As long as your queries abide by common sense and vendor specific performance guidelines (i), I like to think of the decision on which type of join to use to be a simple matter of taste.

(i) Vendor Specific Performance Guidelines

  1. MySQL Performance Tuning and Optimization Resources
  2. PostgreSQL Performance Optimization

Solution 2 - Sql

There is no difference other than the inner join is a lot clearer because it defines the join, leaving the where clause to be the actual limiting condition.

Solution 3 - Sql

Use EXPLAIN to view the query plan for both queries, and see if there's any difference. Quite possibly MySQL will use the same execution plan in both cases. I use the INNER JOIN syntax mainly because it's a lot clearer.

Solution 4 - Sql

I find that work-places that allow the first syntax (comma separated tables) tend to have significant time taken up debugging cases where more rows are returned than intended. Unintentional cross joins are the bane of a system, and can bring even the most well-tuned database to it's knees. It has brought our pre-prod system to a screeching halt on at least two occasions in the last year.

The second syntax (join syntax) forces the writer to think about how the tables are joined together first, and then only return the interesting rows. It is impossible to accidentally do a cross join using this syntax, and thus the danger of accidental poorly performing queries is reduced.

However, that issue aside, I have never noticed any speed difference between the two syntaxes in any systems I have had.

Solution 5 - Sql

The first example is functionally the same as the second example. However, this syntax should be avoided for several reasons. First it is much easier to accidentally get a cross join when using this syntax especially when there are mulitple joins in the table. If you see a lot of this type of query with the keyword distinct, you probably have someone who is trying to fix the cross joins.

Next, the left and right join syntax using the older style is deprecated and will no longer be supported. Further, it doesn't work correctly now anyway. Sometimes it misinterprets the outer join and sends back the wrong results set. So any queries you have using = or = in the where clause should immediately be replaced.

Third, ANSI standard joins are easier to understand and maintain. An understanding of joins is one of the most critical basic skills that anyone querying any relational database needs to have. It has been my experience that some people who use the older style don't really understand joins and how they work and thus write queries that do not actually do what they intended.

Solution 6 - Sql

SQL Server said "When a WHERE turns a Cross Join into an Inner Join", so there are not difference. http://msdn.microsoft.com/en-us/library/ms190690.aspx

I did SQL server "Execution plan" the Performance is same.

Solution 7 - Sql

Explaining both queries gives same output

mysql> explain select * from t T1, t T2 where T1.ID=T2.ID;
+----+-------------+-------+------+---------------+------+---------+------+------+--------------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra                          |
+----+-------------+-------+------+---------------+------+---------+------+------+--------------------------------+
|  1 | SIMPLE      | T1    | ALL  | PRIMARY       | NULL | NULL    | NULL |    3 |                                |
|  1 | SIMPLE      | T2    | ALL  | PRIMARY       | NULL | NULL    | NULL |    3 | Using where; Using join buffer |
+----+-------------+-------+------+---------------+------+---------+------+------+--------------------------------+
2 rows in set (0.00 sec)

mysql> explain select * from t T1  join t T2 on T1.ID=T2.ID;
+----+-------------+-------+------+---------------+------+---------+------+------+--------------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra                          |
+----+-------------+-------+------+---------------+------+---------+------+------+--------------------------------+
|  1 | SIMPLE      | T1    | ALL  | PRIMARY       | NULL | NULL    | NULL |    3 |                                |
|  1 | SIMPLE      | T2    | ALL  | PRIMARY       | NULL | NULL    | NULL |    3 | Using where; Using join buffer |
+----+-------------+-------+------+---------------+------+---------+------+------+--------------------------------+
2 rows in set (0.00 sec)

But using inner join syntax is preferable as its more clearer and more precise. Mysql may internally tune Left and Right join queries to select less data as compared to Cross Join.

Solution 8 - Sql

The order in which you join tables or you put your ON / WHERE conditions should not matter.

The query optimizer should optimize and use the best order anyway (and chosing how to best filter the data, where to start, etc)

As many others though, I suggest using the INNER JOIN syntax, since it makes things much more readable, it is more transparent with the syntax of LEFT or FULL joins as well.

There's a somewhat more dwelling text about it here: http://linus.brimstedt.se/?/article/articleview/SQL Syntax

/B

Solution 9 - Sql

One additional benefit of the first syntax is you can be more general in your limiting condition. Not just equality.

But if you are using equality, why trust the optimizer? Make sure it won't first generate the cross join and then eliminate rows. Use the second one.

Solution 10 - Sql

There is not much difference between cross join with where clause and inner join they are probably the same but with the use of inner join we would decrease some part in code.

Solution 11 - Sql

Since the beginning of time optimizers have being built around classic restrict-project-cartesian product syntax. Virtually all the vendors copied the design pioneered by System R. Then, grudgingly, vendors adopted "the latest-and-greatest" ANSI syntax and retrofitted their SQL execution engines. Contrary to what marketing brochure can tell you ("use the latest syntax"), not much on physical implementation level has been changed: it is still [indexed] nested loops, or hash or sort-merge join. Therefore, there is no basis to assume superiority of one syntax over the other.

To my personal taste, the new syntax is redundant, noisy, and inconsistent. As to being sanctioned by the committee, "walk into any park in every city and you'll find no statue of committee".

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionsoulmergeView Question on Stackoverflow
Solution 1 - Sqlkarim79View Answer on Stackoverflow
Solution 2 - SqlOtávio DécioView Answer on Stackoverflow
Solution 3 - SqlEmil HView Answer on Stackoverflow
Solution 4 - SqlJonathanView Answer on Stackoverflow
Solution 5 - SqlHLGEMView Answer on Stackoverflow
Solution 6 - SqlnisiumiView Answer on Stackoverflow
Solution 7 - Sqlminhas23View Answer on Stackoverflow
Solution 8 - SqlBrimstedtView Answer on Stackoverflow
Solution 9 - SqlOzgur OzturkView Answer on Stackoverflow
Solution 10 - SqlNiharView Answer on Stackoverflow
Solution 11 - SqlTegiri NenashiView Answer on Stackoverflow