MySQL Join Where Not Exists

Mysql Problem Overview

I have a MySQL query that joins two tables

Voters
Households

They join on voter.House_ID and household.id.

Now I need to modify it so that the voter table is also joined to a third table called elimination, on voter.ID and elimination.voter_id. The catch is that I want to exclude any records in the voter table that have a corresponding record in the elimination table.

How do I craft a query to do this?

This is my current query:

SELECT `voter`.`ID`, `voter`.`Last_Name`, `voter`.`First_Name`,
       `voter`.`Middle_Name`, `voter`.`Age`, `voter`.`Sex`,
       `voter`.`Party`, `voter`.`Demo`, `voter`.`PV`,
       `household`.`Address`, `household`.`City`, `household`.`Zip`
FROM (`voter`)
JOIN `household` ON `voter`.`House_ID`=`household`.`id`
WHERE `CT` = '5'
AND `Precnum` = 'CTY3'
AND  `Last_Name`  LIKE '%Cumbee%'
AND  `First_Name`  LIKE '%John%'
ORDER BY `Last_Name` ASC
LIMIT 30

Mysql Solutions

Solution 1 - Mysql

I'd probably use a LEFT JOIN, which will return rows even if there's no match, and then you can select only the rows with no match by checking for NULLs.

So, something like:

SELECT V.*
FROM voter V LEFT JOIN elimination E ON V.id = E.voter_id
WHERE E.voter_id IS NULL

Whether that's more or less efficient than using a subquery depends on optimization, indexes, whether it's possible to have more than one elimination per voter, etc.
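To make the "more than one elimination per voter" case concrete, here is a minimal sketch of the LEFT JOIN ... IS NULL pattern. It runs the SQL through Python's sqlite3 module with an in-memory database standing in for MySQL (an assumption, chosen so the example is self-contained); the table and column names follow the question, and the sample rows are made up.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (assumption); sample rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE voter (id INTEGER PRIMARY KEY, last_name TEXT);
    CREATE TABLE elimination (voter_id INTEGER);
    INSERT INTO voter VALUES (1, 'Cumbee'), (2, 'Smith'), (3, 'Jones');
    -- voter 2 has two elimination records
    INSERT INTO elimination VALUES (2), (2);
""")

# Anti-join: keep only voters for which the LEFT JOIN found no match.
rows = conn.execute("""
    SELECT V.id, V.last_name
    FROM voter V
    LEFT JOIN elimination E ON V.id = E.voter_id
    WHERE E.voter_id IS NULL
    ORDER BY V.id
""").fetchall()

print(rows)  # [(1, 'Cumbee'), (3, 'Jones')]
```

Voter 2 matches two elimination rows, but both are filtered out by the IS NULL check, so the remaining voters each appear exactly once.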

Solution 2 - Mysql

I'd use a 'where not exists' -- exactly as you suggest in your title:

SELECT `voter`.`ID`, `voter`.`Last_Name`, `voter`.`First_Name`,
       `voter`.`Middle_Name`, `voter`.`Age`, `voter`.`Sex`,
       `voter`.`Party`, `voter`.`Demo`, `voter`.`PV`,
       `household`.`Address`, `household`.`City`, `household`.`Zip`
FROM (`voter`)
JOIN `household` ON `voter`.`House_ID`=`household`.`id`
WHERE `CT` = '5'
AND `Precnum` = 'CTY3'
AND  `Last_Name`  LIKE '%Cumbee%'
AND  `First_Name`  LIKE '%John%'

AND NOT EXISTS (
  SELECT * FROM `elimination`
   WHERE `elimination`.`voter_id` = `voter`.`ID`
)

ORDER BY `Last_Name` ASC
LIMIT 30

That may be marginally faster than doing a left join (depending, of course, on your indexes, the cardinality of your tables, etc.), and is likely to be faster than using NOT IN, especially on older MySQL versions.
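For anyone who wants to try the NOT EXISTS form on a toy data set, here is a small sketch. It uses Python's sqlite3 module with an in-memory database in place of MySQL (an assumption), a trimmed-down set of columns, and invented sample rows; only the shape of the query is taken from the answer above.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (assumption); schema is simplified.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE household (id INTEGER PRIMARY KEY, address TEXT);
    CREATE TABLE voter (id INTEGER PRIMARY KEY, last_name TEXT, house_id INTEGER);
    CREATE TABLE elimination (voter_id INTEGER);
    INSERT INTO household VALUES (10, '1 Main St'), (20, '2 Oak Ave');
    INSERT INTO voter VALUES (1, 'Cumbee', 10), (2, 'Cumbee', 10), (3, 'Cumbee', 20);
    INSERT INTO elimination VALUES (2);
""")

# Join voter to household, then drop voters that have an elimination record.
rows = conn.execute("""
    SELECT v.id, v.last_name, h.address
    FROM voter v
    JOIN household h ON v.house_id = h.id
    WHERE v.last_name LIKE '%Cumbee%'
      AND NOT EXISTS (
          SELECT 1 FROM elimination e WHERE e.voter_id = v.id
      )
    ORDER BY v.id
""").fetchall()

print(rows)  # [(1, 'Cumbee', '1 Main St'), (3, 'Cumbee', '2 Oak Ave')]
```

Voter 2 is the only one with an elimination record, so it is the only one missing from the result.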

Solution 3 - Mysql

There are three possible ways to do that.

Option 1 (LEFT JOIN ... IS NULL)

 SELECT  lt.*
 FROM    table_left lt
 LEFT JOIN
         table_right rt
 ON      rt.value = lt.value
 WHERE   rt.value IS NULL

Option 2 (NOT IN)

 SELECT  lt.*
 FROM    table_left lt
 WHERE   lt.value NOT IN
         (
         SELECT  value
         FROM    table_right rt
         )

Option 3 (NOT EXISTS)

 SELECT  lt.*
 FROM    table_left lt
 WHERE   NOT EXISTS
         (
         SELECT  NULL
         FROM    table_right rt
         WHERE   rt.value = lt.value
         )
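The three forms agree as long as the right-hand column contains no NULLs. Once it does, NOT IN behaves differently under standard SQL three-valued logic. The sketch below shows this using Python's sqlite3 module with an in-memory database standing in for MySQL (an assumption); the table names match the options above and the data is invented.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (assumption); sample rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_left (value INTEGER);
    CREATE TABLE table_right (value INTEGER);
    INSERT INTO table_left VALUES (1), (2), (3);
    INSERT INTO table_right VALUES (2);
""")

QUERIES = {
    "left_join": """SELECT lt.value FROM table_left lt
                    LEFT JOIN table_right rt ON rt.value = lt.value
                    WHERE rt.value IS NULL ORDER BY lt.value""",
    "not_in": """SELECT lt.value FROM table_left lt
                 WHERE lt.value NOT IN (SELECT value FROM table_right)
                 ORDER BY lt.value""",
    "not_exists": """SELECT lt.value FROM table_left lt
                     WHERE NOT EXISTS (SELECT NULL FROM table_right rt
                                       WHERE rt.value = lt.value)
                     ORDER BY lt.value""",
}

def run_all(connection):
    """Run each anti-join variant and collect its result values."""
    return {name: [row[0] for row in connection.execute(sql)]
            for name, sql in QUERIES.items()}

before = run_all(conn)  # no NULLs on the right: all three agree

# Add a NULL to the right-hand table and run the same queries again.
conn.execute("INSERT INTO table_right VALUES (NULL)")
after = run_all(conn)

print(before)
print(after)
```

With the NULL present, `lt.value NOT IN (2, NULL)` can never evaluate to true, so the NOT IN variant returns no rows while the other two still return 1 and 3. If the compared column is nullable, that is a reason to prefer NOT EXISTS or the LEFT JOIN form.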

Solution 4 - Mysql

Be wary of LEFT JOINs. A LEFT JOIN is a LEFT OUTER JOIN, and different RDBMS query parsers and optimizers may handle outer joins very differently. Take, for instance, how MySQL's query optimizer simplifies LEFT (OUTER) JOINs, and how the execution plans it ends up with can differ:

https://dev.mysql.com/doc/refman/8.0/en/outer-join-simplification.html

A LEFT JOIN returns a well-defined result, but the plan chosen for it is not always the one you expect: as the page above describes, MySQL may convert an outer join into an inner join when the WHERE clause rejects the NULL-extended rows. IMO they deserve extra scrutiny before going into production code.

I prefer to write JOIN-type statements in a more "old school" way first, leaving out any specific JOIN declarations. Let the RDBMS query parser do what it's designed to do: analyze your statement and translate it into the most optimal execution plan based on its evaluation of your index stats and data model design. That said, the built-in query parsers and optimizers can still get it wrong; trust me, I've seen it happen many times. In general, I find that taking this approach first provides a sufficient baseline for making informed tuning decisions in most cases.

To illustrate - using the question query from this thread:

SELECT `voter`.`ID`, `voter`.`Last_Name`, `voter`.`First_Name`,
       `voter`.`Middle_Name`, `voter`.`Age`, `voter`.`Sex`,
       `voter`.`Party`, `voter`.`Demo`, `voter`.`PV`,
       `household`.`Address`, `household`.`City`, `household`.`Zip`
FROM (`voter`)
JOIN `household` ON `voter`.`House_ID`=`household`.`id`
WHERE `CT` = '5'
AND `Precnum` = 'CTY3'
AND  `Last_Name`  LIKE '%Cumbee%'
AND  `First_Name`  LIKE '%John%'

AND NOT EXISTS (
  SELECT * FROM `elimination`
   WHERE `elimination`.`voter_id` = `voter`.`ID`
)

ORDER BY `Last_Name` ASC
LIMIT 30

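Consider it rewritten with an implicit (comma) join between voter and household instead of the explicit JOIN (this assumes the unqualified fields in the WHERE clause belong to the voter table). Note that the exclusion still needs NOT EXISTS: adding `elimination` to the FROM list and filtering on v.`ID` != e.`voter_id` would pair each voter with every non-matching elimination row rather than exclude eliminated voters.

SELECT v.`ID`, v.`Last_Name`, v.`First_Name`,
       v.`Middle_Name`, v.`Age`, v.`Sex`,
       v.`Party`, v.`Demo`, v.`PV`,
       h.`Address`, h.`City`, h.`Zip`
FROM `voter` v, `household` h
WHERE v.`House_ID` = h.`id`
AND v.`CT` = '5'
AND v.`Precnum` = 'CTY3'
AND  v.`Last_Name`  LIKE '%Cumbee%'
AND  v.`First_Name`  LIKE '%John%'
AND NOT EXISTS (
  SELECT * FROM `elimination` e
   WHERE e.`voter_id` = v.`ID`
)
ORDER BY v.`Last_Name` ASC
LIMIT 30;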

Try writing some of your future SQL queries BOTH ways syntactically going forward, compare their results, and see what you think. Writing your SQL in the style I have suggested above comes with the added benefit of being more RDBMS agnostic, also.
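When comparing the two styles, it is worth checking that they really return the same rows. The sketch below is one such check, showing that listing `elimination` in the FROM clause and filtering with != is not equivalent to NOT EXISTS. It uses Python's sqlite3 module with an in-memory database standing in for MySQL (an assumption), a reduced schema, and invented rows.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (assumption); sample rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE voter (id INTEGER PRIMARY KEY);
    CREATE TABLE elimination (voter_id INTEGER);
    INSERT INTO voter VALUES (1), (2), (3);
    INSERT INTO elimination VALUES (2), (3);
""")

CROSS_JOIN_SQL = """
    SELECT v.id FROM voter v, elimination e
    WHERE v.id != e.voter_id
    ORDER BY v.id
"""
NOT_EXISTS_SQL = """
    SELECT v.id FROM voter v
    WHERE NOT EXISTS (SELECT 1 FROM elimination e WHERE e.voter_id = v.id)
    ORDER BY v.id
"""

def ids(sql):
    """Return the voter ids produced by a query as a plain list."""
    return [row[0] for row in conn.execute(sql)]

cross_join_ids = ids(CROSS_JOIN_SQL)   # eliminated voters 2 and 3 still appear
not_exists_ids = ids(NOT_EXISTS_SQL)   # only the non-eliminated voter remains

# With an empty elimination table the cross join returns nothing at all.
conn.execute("DELETE FROM elimination")
cross_join_empty = ids(CROSS_JOIN_SQL)
not_exists_empty = ids(NOT_EXISTS_SQL)

print(cross_join_ids, not_exists_ids)      # [1, 1, 2, 3] [1]
print(cross_join_empty, not_exists_empty)  # [] [1, 2, 3]
```

The != filter keeps every voter/elimination pair that does not match, so eliminated voters survive whenever some other elimination row exists, and non-eliminated voters are duplicated. Only the NOT EXISTS version (or the LEFT JOIN ... IS NULL form) answers the original question.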

Cheers!

Content Type	Original Author	Original Content on Stackoverflow
Question	gsueagle2008	View Question on Stackoverflow
Solution 1 - Mysql	NickZoic	View Answer on Stackoverflow
Solution 2 - Mysql	Ian Clelland	View Answer on Stackoverflow
Solution 3 - Mysql	Dumindu Madushanka	View Answer on Stackoverflow
Solution 4 - Mysql	Christopher Bishop	View Answer on Stackoverflow
