INNER JOIN vs INNER JOIN (SELECT . FROM)

SqlSql ServerJoinInner Join

Sql Problem Overview


Is there any difference in terms of performance between these two versions of the same query?

--Version 1
SELECT p.Name, s.OrderQty
FROM Product p
INNER JOIN SalesOrderDetail s on p.ProductID = s.ProductID

--Version 2
SELECT p.Name, s.OrderQty
FROM Product p
INNER JOIN (SELECT ProductID, OrderQty FROM SalesOrderDetail) s on p.ProductID = s.ProductID

I've heard it said (by a DBA) that Version 2 is faster because it fetches, within the inner SELECT statement, only the columns that are required for the query. But that doesn't seem to make sense, since query performance (as I know) is based on number of rows affected and final list of columns returned.

The query plans for both are identical, so I'm guessing there isn't any difference between the two.

Am I correct?

Sql Solutions


Solution 1 - Sql

You are correct. You did exactly the right thing, checking the query plan rather than trying to second-guess the optimiser. :-)

Solution 2 - Sql

There won't be much difference. Howver version 2 is easier when you have some calculations, aggregations, etc that should be joined outside of it

--Version 2 
SELECT p.Name, s.OrderQty 
FROM Product p 
INNER JOIN 
(SELECT ProductID, SUM(OrderQty) as OrderQty FROM SalesOrderDetail GROUP BY ProductID
HAVING SUM(OrderQty) >1000) s 
on p.ProductID = s.ProdctId 

Solution 3 - Sql

Seems to be identical just in case that SQL server will not try to read data which is not required for the query, the optimizer is clever enough

It can have sense when join on complex query (i.e which have joings, groupings etc itself) then, yes, it is better to specify required fields.

But there is one more point. If the query is simple there is no difference but EVERY extra action even which is supposed to improve performance makes optimizer works harder and optimizer can fail to get the best plan in time and will run not optimal query. So extras select can be a such action which can even decrease performance

Solution 4 - Sql

You did the right thing by checking from query plans. But I have 100% confidence in version 2. It is faster when the number off records are on the very high side.

My database has around 1,000,000 records and this is exactly the scenario where the query plan shows the difference between both the queries. Further, instead of using a where clause, if you use it in the join itself, it makes the query faster :
SELECT p.Name, s.OrderQty
FROM Product p
INNER JOIN (SELECT ProductID, OrderQty FROM SalesOrderDetail) s on p.ProductID = s.ProductID WHERE p.isactive = 1

The better version of this query is :

SELECT p.Name, s.OrderQty
FROM Product p
INNER JOIN (SELECT ProductID, OrderQty FROM SalesOrderDetail) s on p.ProductID = s.ProductID AND p.isactive = 1

(Assuming isactive is a field in product table which represents the active/inactive products).

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionKarunView Question on Stackoverflow
Solution 1 - SqlChristian HayterView Answer on Stackoverflow
Solution 2 - SqlMadhivananView Answer on Stackoverflow
Solution 3 - SqlVladmirView Answer on Stackoverflow
Solution 4 - SqlNG.View Answer on Stackoverflow