lundi 24 août 2015

T-SQL - Remove All Duplicates Except Most Recent (SQL Server 2005)

This problem is beginning to drive me crazy and I have been unsuccessful at locating an easily adaptable answer for what I'm trying to do. The basic idea is that I have a T-SQL function that will pull all records inserted into a main table within the last 60 minutes and insert them into a table variable. I've then got some more code that will filter that set into another table variable to be returned.

In this set I'm expecting some records to have multiple occurrences but they will have a unique date time. I would like to delete every record that has greater than or equal to 3 occurrences, but keep the one with the most recent datetime value. So far I've set up a CTE that manages to delete all the records with a count greater than or equal to 3, but this does not leave the record with the most recent occurrence.

    WITH cte AS (
                  SELECT ColA, ColB, /*DateTimeColumn,*/ ColC, ROW_NUMBER() OVER (PARTITION BY ColA, ColB, /*DateTimeColumn,*/ ColC ORDER BY ColA /*,DateTimeColumn*/) AS r_count
                  FROM @table_variable_2
                                    )
                  DELETE
                  FROM      cte
                  WHERE     r_count >= 3 -- I think I need an AND NOT IN subquery here

Any advice would be greatly appreciated!

Aucun commentaire:

Enregistrer un commentaire