INNER JOIN vs INNER JOIN (SELECT . FROM)
SqlSql ServerJoinInner JoinSql Problem Overview
Is there any difference in terms of performance between these two versions of the same query?
--Version 1
SELECT p.Name, s.OrderQty
FROM Product p
INNER JOIN SalesOrderDetail s on p.ProductID = s.ProductID
--Version 2
SELECT p.Name, s.OrderQty
FROM Product p
INNER JOIN (SELECT ProductID, OrderQty FROM SalesOrderDetail) s on p.ProductID = s.ProductID
I've heard it said (by a DBA) that Version 2 is faster because it fetches, within the inner SELECT statement, only the columns that are required for the query. But that doesn't seem to make sense, since query performance (as I know) is based on number of rows affected and final list of columns returned.
The query plans for both are identical, so I'm guessing there isn't any difference between the two.
Am I correct?
Sql Solutions
Solution 1 - Sql
You are correct. You did exactly the right thing, checking the query plan rather than trying to second-guess the optimiser. :-)
Solution 2 - Sql
There won't be much difference. Howver version 2 is easier when you have some calculations, aggregations, etc that should be joined outside of it
--Version 2
SELECT p.Name, s.OrderQty
FROM Product p
INNER JOIN
(SELECT ProductID, SUM(OrderQty) as OrderQty FROM SalesOrderDetail GROUP BY ProductID
HAVING SUM(OrderQty) >1000) s
on p.ProductID = s.ProdctId
Solution 3 - Sql
Seems to be identical just in case that SQL server will not try to read data which is not required for the query, the optimizer is clever enough
It can have sense when join on complex query (i.e which have joings, groupings etc itself) then, yes, it is better to specify required fields.
But there is one more point. If the query is simple there is no difference but EVERY extra action even which is supposed to improve performance makes optimizer works harder and optimizer can fail to get the best plan in time and will run not optimal query. So extras select can be a such action which can even decrease performance
Solution 4 - Sql
You did the right thing by checking from query plans. But I have 100% confidence in version 2. It is faster when the number off records are on the very high side.
My database has around 1,000,000 records and this is exactly the scenario where the query plan shows the difference between both the queries.
Further, instead of using a where clause, if you use it in the join itself, it makes the query faster :
SELECT p.Name, s.OrderQty
FROM Product p
INNER JOIN (SELECT ProductID, OrderQty FROM SalesOrderDetail) s on p.ProductID = s.ProductID
WHERE p.isactive = 1
The better version of this query is :
SELECT p.Name, s.OrderQty
FROM Product p
INNER JOIN (SELECT ProductID, OrderQty FROM SalesOrderDetail) s on p.ProductID = s.ProductID AND p.isactive = 1
(Assuming isactive is a field in product table which represents the active/inactive products).