samedi 12 septembre 2015

Aggregate value by any of two columns

Suppose I have a Customers Table:

Cutsomers
-----------------------------------------------
Id INTEGER
SSN NCHAR(11)
FullName NVARCHAR(100)
LastPurchaseDate DATETIME

There are many stores around the city, and the customer can register in any of them, each one giving him a different Id. Whenever he buys, the corresponding Id gets it's LastPurchaseDate updated.

Now I need to get the Id corresponding to the 'latest' LastPurchaseDate by person. Problem is, due to X different reasons, there can be typos on either the SSN or the FullName. Let's say I have the next data:

Id          SSN            FullName      LastPurchaseDate
----------- -----------    ------------- -----------------
200123      123-45-6789    John Doe      10-09-2015 
201978      456-78-9012    Mary Jane     15-08-2015 
380789      789-01-2345    Pete Zahut    01-08-2015 
389236      123-45-6789    Jhon Doe      23-07-2015 
215875      456-87-9012    Mary Jane     30-08-2015 
974186      123456789      John Doe      28-04-2015 
123758      789-01-2345    Pete Zaut     18-08-2015 

A customer is considered to be the same person if it has either the same SSN or the same FullName. So in this sample, customers 200123, 389236 and 974186 are the same person. Therefore, the resulting Ids should be

200123
215875
123758

How could I achieve this?

Aucun commentaire:

Enregistrer un commentaire