I've been write SQL to remove duplicate rows in PostgreSQL and many other trick with huge DB.
Now I try with MySQL.
Find all duplicate rows by composite columns, ie. user_id and file_date. Both is integer: file_date in unix_timestamp.
select id,
user_id,
file_date,
count(*)
from detailed_claims
group by user_id,
file_date
having count(*) > 1
We just need id column to remove, so only select id:
select id from (select id, user_id, file_date, count(*) from detailed_claims group by user_id, file_date having count(*) > 1 ) as blondie
Result:
|id |
| 46400 |
| 46421 |
| 46402 |
| 46159 |
| 46414 |
| 46157 |
| 46161 |
| 46163 |
| 46405 |
| 46164 |
| 46165 |
| 46166 |
| 46158 |
| 46167 |
| 46162 |
| 46406 |
| 46417 |
| 46177 |
+--------+
4885 rows in set (1.14 sec)
Now remove:
delete from detailed_claims where id in (select id from (select id,
user_id,
file_date,
count(*)
from detailed_claims
group by user_id,
file_date
having count(*) > 1
) as blondie
)
OK, now we can add Unique indexes to 2 columns user_id and file_date without error msg.
Now I try with MySQL.
Find all duplicate rows by composite columns, ie. user_id and file_date. Both is integer: file_date in unix_timestamp.
select id,
user_id,
file_date,
count(*)
from detailed_claims
group by user_id,
file_date
having count(*) > 1
We just need id column to remove, so only select id:
select id from (select id, user_id, file_date, count(*) from detailed_claims group by user_id, file_date having count(*) > 1 ) as blondie
Result:
|id |
| 46400 |
| 46421 |
| 46402 |
| 46159 |
| 46414 |
| 46157 |
| 46161 |
| 46163 |
| 46405 |
| 46164 |
| 46165 |
| 46166 |
| 46158 |
| 46167 |
| 46162 |
| 46406 |
| 46417 |
| 46177 |
+--------+
4885 rows in set (1.14 sec)
Now remove:
delete from detailed_claims where id in (select id from (select id,
user_id,
file_date,
count(*)
from detailed_claims
group by user_id,
file_date
having count(*) > 1
) as blondie
)
OK, now we can add Unique indexes to 2 columns user_id and file_date without error msg.
Comments
Post a Comment