Suppose I have a query with a "complex" filtering condition
SELECT
<fields>
FROM
<tables>
WHERE
<complex condition on non-indexed fields>
ORDER BY
<indexed fields>
Now further suppose the database is very large (as in hundredthousands of records)
And this query is part of a webservice (which is what I'm writing) used from a third party website (which I'm not).
In order to limit the time the DB is busy processing the request, and because I don't want to return potentially thousands of records at once, I want to retrieve a set number of records at a time which would be presented on the website one page at a time with previous/next buttons.
I know I can restrict the sql query with say "TOP 10", but what would be a good way to do the next page (11 to 20) and the page after (21 to 30) etc. in such a way that the DB doesn't have to restart the entire query over from the start. or in other words, what's the "optimal way" to continue a previous (set of) TOP X
We haven't settled on a DB manager yet, ideally we would want to be supporting several. I'm not sure we can "require" MSSQL 2012, some of our smaller customers might have budget issues with that. A more generic solution would be prefered.
I'm not sure I fully understand the Offset / Fetch next approach. Or rather, how the DB manages to optimize this efficiently. The issue here is the "complex condition".
if I do an offset X, fetch next Y. How does this prevent the DB from starting the query all over again, and rerunning the query to find the first X records. It is very possible that it may need to process thousands and thousands of records to obtain the first top X.
If what the DB does is "fetch top X+Y, and don't return first X in the resultset" then this probably won't work efficiently enough.
if I do an offset X, fetch next Y. How does this prevent the DB from starting the query all over again, and rerunning the query to find the first X records. It is very possible that it may need to process thousands and thousands of records to obtain the first top X.
If what the DB does is "fetch top X+Y, and don't return first X in the resultset" then this probably won't work efficiently enough.
I don't know how it works internally. Could you take some performance metrics and see if the results return consistently? If the implementation internally is correct, I would expect the timings of the last set of records to be similar to the first set.
I don't have MSSQL 2012 installed, so can't do any tests atm, nor can I easily convert the current database to MSSQL. This whole thing is currently ISAM based, and I'm looking at doing the same functionality with the SQL/ODBC layer, so we can transitionaway from ISAM and into relational. With ISAM, this whole thing is easy enough, we simply return the record position for each record, and for a next/previous, we call the webservice with either the record position of the first record in the current "page" (for a previous) or the last record (for a next).
Ideally we want this to work with any db that has a sufficiently compatible SQL syntax/features. We have customers on MSSQL already, but for the smaller customers, MSSQL really isn't an option because of the pricing.
Bookmarks