mercredi 1 avril 2015

Force "TOP 100 PERCENT" in EF "sub-query" queryable

Update: I was mistaken about top 100 percent generating a better query plan (the plan is still much better for a reasonably sized top N, and probably has to do with parameter sniffing).


While I still think this focused question has merit, it is not "a useful solution" for my problem2, and might not be for yours ..




I am running into some queries which SQL Server optimizes poorly. The statistics appear correct, and SQL Server chooses the 'worse' plan that performs a seek over millions of records even though the estimated and actual values are the same - but this question is not about that1.


In the problematic queries are of the simplified form:



select * from x
join y on ..
join z on ..
where z.q = ..


However (and since I know the cardinalities better, apparently) the following form consistently results in a much better query plan:



select * from x
join (
-- the result set here is 'quite small'
select top 100 percent *
from y on ..
join z on ..
where z.q = ..) t on ..


In L2S the Take function can be used to limit to top N, but the "problem" I have with this approach is that requires a finite/fixed N such that some query could hypothetically just break, instead of just running really slow with the forced materialization.


While I could choose a 'very large' value for the top N this, ironically (wrt to the initial problem), increases the SQL query execution time as the value of N increases. The expected intermediate result is only expected to be a few dozen to a few hundred records. The current code I have runs a top 100 and then, if such was detected to contain too many results, runs the query again with a top 1000: but this feels like a kludge .. on top of a kludge.


The question is then: can a EF/L2E/LINQ query generate the equivalent of a top 100 percent on an EF Queryable?


(Forcing materialization via ToList is not an option because the result should be an EF Queryable and remain in LINQ to Entities, not LINQ to Objects.)


While I am currently dealing with EF4, if this is [only] possible in a later version of EF I would accept such as an answer - such is useful knowledge and does answer the question asked.




1 If wishing to answer with "don't do that" or an "alternative", please make it is an secondary answer or aside along with an answer to the actual question being asked. Otherwise, feel free to use the comments.




2 In addition to top 100 percent not generating a better query plan, I forgot to include the 'core issue' at stake, which is bad parameter sniffing (instance is SQL Server 2005).


The following query takes a very long to to complete while direct variable substitution runs "in the blink of an eye" indicating an issue with the parameter sniffing.



declare @x int
set @x = 19348659

select
op.*
from OrderElement oe
join OrderRatePlan rp on oe.OrdersElementID = rp.OrdersElementID
join OrderPrice op on rp.OrdersRatePlanID = op.OrdersRatePlanID
where oe.OrdersProductID = @x


The kludged-but-workable query



select
op.*
from OrderPrice op
join (
-- Choosing a 'small value of N' runs fast and it slows down as the
-- value of N is increases where N >> 1000 simply "takes too long".
-- Using TOP 100 PERCENT also "takes too long".
select top 100
rp.*
from OrderElement oe
join OrderRatePlan rp on oe.OrdersElementID = rp.OrdersElementID
where oe.OrdersProductID = @x
) rp
on rp.OrdersRatePlanID = op.OrdersRatePlanID

Aucun commentaire:

Enregistrer un commentaire