
MySQL: remove duplicates on a composite of 2 or more columns

I've written SQL to remove duplicate rows in PostgreSQL, along with many other tricks for huge databases.

Now I'm trying the same with MySQL.

First, find all rows duplicated on a composite of columns, e.g. user_id and file_date. Both are integers; file_date holds a Unix timestamp.

-- min(id) keeps the query valid under ONLY_FULL_GROUP_BY and picks a deterministic row
select   min(id) as id,
         user_id,
         file_date,
         count(*)
from     detailed_claims
group by user_id,
         file_date
having   count(*) > 1
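
Before deleting anything, it can help to know how much data is involved. A minimal sketch, assuming the same detailed_claims table; the aliases copies, dup, duplicate_groups and surplus_rows are names made up for this illustration:

```sql
-- Count the duplicate groups and the rows that would have to go
-- to leave exactly one row per (user_id, file_date) pair.
select count(*)        as duplicate_groups,
       sum(copies - 1) as surplus_rows
from   (select   count(*) as copies
        from     detailed_claims
        group by user_id,
                 file_date
        having   count(*) > 1
       ) as dup;
```

If surplus_rows is larger than duplicate_groups, at least one group holds more than two copies.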

We only need the id column to delete rows, so wrap the query and select just id:

select id
from   (select   min(id) as id,
                 user_id,
                 file_date,
                 count(*)
        from     detailed_claims
        group by user_id,
                 file_date
        having   count(*) > 1
       ) as blondie

Result:
+-------+
| id    |
+-------+
| 46400 |
| 46421 |
| 46402 |
| 46159 |
| 46414 |
| 46157 |
| 46161 |
| 46163 |
| 46405 |
| 46164 |
| 46165 |
| 46166 |
| 46158 |
| 46167 |
| 46162 |
| 46406 |
| 46417 |
| 46177 |
+-------+
4885 rows in set (1.14 sec)

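Now remove them. The subquery has to stay wrapped in the derived table (blondie): MySQL refuses to delete from a table that the subquery selects from directly (error 1093).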

delete from detailed_claims
where  id in (select id
              from   (select   min(id) as id,
                               user_id,
                               file_date,
                               count(*)
                      from     detailed_claims
                      group by user_id,
                               file_date
                      having   count(*) > 1
                     ) as blondie
             )
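
The statement above removes one row per duplicate group on each run. A common alternative in MySQL, not the method used in this post, is a multi-table DELETE with a self-join, which clears all surplus rows in a single pass. A minimal sketch, assuming id is unique and that keeping the row with the lowest id is acceptable; the aliases newer and older are made up for this illustration:

```sql
-- Delete every row for which another row with the same
-- (user_id, file_date) pair and a smaller id exists.
-- The row with the lowest id in each group survives.
delete newer
from   detailed_claims as newer
join   detailed_claims as older
  on   newer.user_id   = older.user_id
 and   newer.file_date = older.file_date
 and   newer.id        > older.id;
```

On a huge table the self-join can be slow without an index on (user_id, file_date), so it is worth trying on a copy first.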


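Each run deletes one row per duplicate group, so if any group holds more than two copies, rerun the delete until it affects 0 rows. After that, a unique index on the two columns user_id and file_date can be added without an error message.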
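
A sketch of that final step. The index name uq_user_file_date is hypothetical; pick whatever fits your naming scheme:

```sql
-- Sanity check: this should return an empty set once the duplicates are gone.
select   user_id,
         file_date,
         count(*)
from     detailed_claims
group by user_id,
         file_date
having   count(*) > 1;

-- Add the composite unique index so new duplicates are rejected.
alter table detailed_claims
  add unique index uq_user_file_date (user_id, file_date);
```

If the ALTER TABLE still fails with a duplicate-entry error, some group has copies left and the delete needs another run.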

