lundi 16 novembre 2015

Why does query performance diminish with additional join conditions

I have a simple query with two tables. massfehler_hardware has 50K rows and no indexes. f_produkt has 38M rows with a clustered unique index on account_number, item_number, contract_number, [and some other columns]

Following query takes 11 seconds:

select count(distinct a.serial_number)
from massfehler_hardware a
join f_produkt f
on a.account_number = f.account_number
where f.product_code = 'VOD'

With two additional join conditions, it gets 5x slower:

select count(distinct a.serial_number)
from massfehler_hardware a
join f_produkt f
on a.account_number = f.account_number
and a.service_address_id = f.service_address_id
and a.outlet_location = f.outlet_code
where f.product_code = 'VOD'

Why is the impact of additional join conditions so extreme? Joining with only account_number already limits the number of results to approx 46K, so the additional join conditions (service_address_id and outlet_location) should only run on those 46K rows, and shouldn't increase the costs so much. What do I miss?

Strangely enough, when I compare the execution plans, the estimated relative cost of the faster running query is bigger! (57%)

Here are the two execution plans respectively:

Faster running query: Faster running query

Slower running query: Slower running query

Aucun commentaire:

Enregistrer un commentaire