How to delete duplicate rows

Basically, there are several ways to do this. In this post, I show two different ways. Let’s start with a basic example.

Set Up

create table t1
(
id number,
c1 varchar2(50),
c2 varchar2(50),
c3 varchar2(50),
c4 date
);

insert into t1 values(1, 'row1', 'txt1', 'txt1', trunc(sysdate));

insert into t1 values(2, 'row1', 'txt1', 'txt1', trunc(sysdate)); --duplicate row

insert into t1 values(3, 'row1', 'txt1', 'txt1', trunc(sysdate)); --duplicate row

insert into t1 values(4, 'row2', 'txt2', 'txt2', trunc(sysdate));

insert into t1 values(5, 'row2', 'txt2', 'txt2', trunc(sysdate)); --duplicate row

insert into t1 values(6, 'row2', 'txt2', 'txt2', trunc(sysdate)); --duplicate row

commit;

Checking if the table contains duplicate rows.

select c1, c3, c3, c4, count(*) from t1 group by c1, c3, c3, c4 having count(*) > 1;
Solution 1) If you have unique id in your table

You can delete duplicate rows by id. Let’s say that you want to keep the first inserted rows but delete the others. You can do this by using ROW_NUMBER() analytic function.

delete from t1 
where id in
(
    select id from (
        select 
            row_number() over(partition by c1, c2, c3, c4 order by id) rn, 
            t.* 
        from t1 t
    ) where rn != 1
);
Solution 2) If your table doesn’t contain id column.

In that case, you can delete rows by rowid as shown below.

delete from t1 
where rowid not in (
    select min(rowid) from t1 group by c1, c2, c3, c4
);

As you can see, it is quite easy. The order in which you want to delete rows may vary depending on your business logic. I hope this helps you.

Leave a Reply

Your email address will not be published. Required fields are marked *