How to find duplicates in 2 columns not 1
MysqlDuplicatesMysql Problem Overview
I have a MySQL database table with two columns that interest me. Individually they can each have duplicates, but they should never have a duplicate of BOTH of them having the same value.
stone_id
can have duplicates as long as for each upsharge
title is different, and in reverse. But say for example stone_id
= 412 and upcharge_title
= "sapphire" that combination should only occur once.
This is ok:
stone_id = 412 upcharge_title = "sapphire"
stone_id = 412 upcharge_title = "ruby"
This is NOT ok:
stone_id = 412 upcharge_title = "sapphire"
stone_id = 412 upcharge_title = "sapphire"
Is there a query that will find duplicates in both fields? And if possible is there a way to set my data-base to not allow that?
I am using MySQL version 4.1.22
Mysql Solutions
Solution 1 - Mysql
You should set up a composite key between the two fields. This will require a unique stone_id and upcharge_title for each row.
As far as finding the existing duplicates try this:
select stone_id,
upcharge_title,
count(*)
from your_table
group by stone_id,
upcharge_title
having count(*) > 1
Solution 2 - Mysql
I found it helpful to add a unqiue index using an "ALTER IGNORE" which removes the duplicates and enforces unique records which sounds like you would like to do. So the syntax would be:
ALTER IGNORE TABLE `table` ADD UNIQUE INDEX(`id`, `another_id`, `one_more_id`);
This effectively adds the unique constraint meaning you will never have duplicate records and the IGNORE deletes the existing duplicates.
You can read more about eh ALTER IGNORE here: http://mediakey.dk/~cc/mysql-remove-duplicate-entries/
Update: I was informed by @Inquisitive that this may fail in versions of MySql> 5.5 :
> It fails On MySQL > 5.5 and on InnoDB table, and in Percona because of
> their InnoDB fast index creation feature [http://bugs.mysql.com/bug.php?id=40344]. In this case
> first run set session old_alter_table=1
and then the above command
> will work fine
Update - ALTER IGNORE
Removed In 5.7
From the docs > As of MySQL 5.6.17, the IGNORE clause is deprecated and its use > generates a warning. IGNORE is removed in MySQL 5.7.
One of the MySQL dev's give two alternatives:
- Group by the unique fields and delete as seen above
- Create a new table, add a unique index, use
INSERT IGNORE
, ex:
CREATE TABLE duplicate_row_table LIKE regular_row_table;
ALTER TABLE duplicate_row_table ADD UNIQUE INDEX (id, another_id);
INSERT IGNORE INTO duplicate_row_table SELECT * FROM regular_row_table;
DROP TABLE regular_row_table;
RENAME TABLE duplicate_row_table TO regular_row_table;
But depending on the size of your table, this may not be practical
Solution 3 - Mysql
You can find duplicates like this..
Select
stone_id, upcharge_title, count(*)
from
particulartable
group by
stone_id, upcharge_title
having
count(*) > 1
Solution 4 - Mysql
To find the duplicates:
select stone_id, upcharge_title from tablename group by stone_id, upcharge_title having count(*)>1
To constrain to avoid this in future, create a composite unique key on these two fields.
Solution 5 - Mysql
Incidentally, a composite unique constraint on the table would prevent this from occurring in the first place.
ALTER TABLE table
ADD UNIQUE(stone_id, charge_title)
(This is valid T-SQL. Not sure about MySQL.)
Solution 6 - Mysql
this SO post helped me, but i too wanted to know how to delete and keep one of the rows... here's a PHP solution to delete the duplicate rows and keep one (in my case there were only 2 columns and it is in a function for clearing duplicate category associations)
$dupes = $db->query('select *, count(*) as NUM_DUPES from PRODUCT_CATEGORY_PRODUCT group by fkPRODUCT_CATEGORY_ID, fkPRODUCT_ID having count(*) > 1');
if (!is_array($dupes))
return true;
foreach ($dupes as $dupe) {
$db->query('delete from PRODUCT_CATEGORY_PRODUCT where fkPRODUCT_ID = ' . $dupe['fkPRODUCT_ID'] . ' and fkPRODUCT_CATEGORY_ID = ' . $dupe['fkPRODUCT_CATEGORY_ID'] . ' limit ' . ($dupe['NUM_DUPES'] - 1);
}
the (limit NUM_DUPES - 1) is what preserves the single row...
thanks all
Solution 7 - Mysql
This is what worked for me (ignoring null and blank). Two different email columns:
SELECT *
FROM members
WHERE email IN (SELECT soemail
FROM members
WHERE NOT Isnull(soemail)
AND soemail <> '');