OracleBrains.Com header image 2

Deleting duplicate row in certain condition using LEAD an analytic function!

September 3rd, 2007 by Rajender Singh · 1 Comment

Today, while going through Oracle Forum, I came across this really interesting solution!

Lets suppose, We have a table tab1 having following columns (never mind the data type):

TimeIn
DateIn
Locationin
UseridIn
Status

Now let suppose I need to delete the duplicate rows in tab1 and leave only the minimum DateIn and TimeIn in that table by UseridIn.

If I had this problem I would have solved the problem as follows:

DELETE FROM tab1
WHERE rowid NOT IN (
SELECT MIN(rowid)
FROM tab1
WHERE (UseridIn, TO_DATE(TO_CHAR(DateIn,’YYYYMMDD’) || TimeIn,’YYYYMMDDHH24:MI:SS’)) IN
( SELECT UseridIn, MIN(TO_DATE(TO_CHAR(DateIn,’YYYYMMDD’) || timein,’YYYYMMDDHH24:MI:SS’))
FROM tab1
GROUP BY UseridIn)
GROUP BY UseridIn
)

But I found solution provided by William Robertson interesting which was as follows:

DELETE tab1
WHERE rowid IN
( SELECT LEAD(rowid)
OVER(PARTITION BY useridin ORDER BY datein,timein)
FROM tab1 );

Where he is using (which gives the access to a row at a given physical offset) with OVER keyword, PARTITION it by useridin and order by datein and timein.

Now let’s check out, how it is working:

Now when I executed following sql

SELECT useridin,
timein,
rowid current_rowid,
LEAD(rowid) OVER(PARTITION BY useridin ORDER BY datein,timein) lead_rowid
FROM tab1

USE TIMEIN CURRENT_ROWID LEAD_ROWID
— ——– —————— ——————
U01 07:20:00 AAAOdoAAEAAAAKEAAC AAAOdoAAEAAAAKEAAB
U01 08:10:30 AAAOdoAAEAAAAKEAAB AAAOdoAAEAAAAKEAAA
U01 08:20:00 AAAOdoAAEAAAAKEAAA
U02 06:10:30 AAAOdoAAEAAAAKEAAF AAAOdoAAEAAAAKEAAE
U02 06:10:30 AAAOdoAAEAAAAKEAAE AAAOdoAAEAAAAKEAAD
U02 08:14:00 AAAOdoAAEAAAAKEAAD

As we can see for each “useridin”, result data is order in asc order by datain and timein. So minimum datein and timein for each useridin will be the first row in the resultset for each useridin if we partition it by useridin as shown above.

And this minimum row will be not any rows LEAD_ROWID as it is the first row for each resultset partition by useridin.

Note:
If you do not specify offset(as in case above), then default will be 1.
If you don not specify how to do order by, then its default will be ASC order.


Tags: SQL and PL/SQL

1 response so far ↓

  • 1 anonymous // Sep 5, 2007 at 10:46 am

    cooooool!

Leave a Comment