I've seen the following technique in several beginner code samples for demonstrating
SQL Server 2005's ability to return paged results.
I've added the TotalRows = Count(*) OVER() line to demonstrate how to return the total
number of matching rows in addition to the rows in the paged set. This removes
the need for a second query to get the total rows available for paging
in your application. In your application, check that your resultset
has records, then grab the first record and read its TotalRows column
value.
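As a rough illustration of that application-side check, here is a minimal Python sketch. It assumes the page comes back as a list of dictionaries keyed by column name; the function names and the sample rows are made up for the example and are not part of any data access library.

```python
def read_total_rows(rows):
    """Return TotalRows from the first record, or 0 when the page is empty."""
    if not rows:
        return 0
    return rows[0]["TotalRows"]


def page_count(total_rows, max_rows):
    """Number of pages needed to show total_rows at max_rows rows per page."""
    return (total_rows + max_rows - 1) // max_rows


# Hypothetical page of results; every row repeats the same TotalRows value.
page = [
    {"UserID": 7, "TotalRows": 47, "RowNum": 1},
    {"UserID": 3, "TotalRows": 47, "RowNum": 2},
]

total = read_total_rows(page)
pages = page_count(total, 10)
```

Because every row carries the same TotalRows value, reading it from the first record is enough. The empty-page check matters because a page past the end of the data returns no rows at all, and therefore no TotalRows column to read.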
Notice that in this query, the JOIN between the Orders table and the Users table
is being run across all records that are found, NOT just the records returned
in the paged set.
declare @StartRow int
declare @MaxRows int
select @StartRow = 1
select @MaxRows = 10
select *
from
(select o.*,u.FirstName,u.LastName,
TotalRows=Count(*) OVER(),
ROW_NUMBER() OVER(ORDER BY o.CreateDateTime desc) as RowNum
from Orders o , Users u
WHERE o.CreateDateTime > getdate() -30
AND (o.UserID = u.UserID)
) as MyTable -- SQL Server requires an alias on a derived table
WHERE RowNum BETWEEN @StartRow AND (@StartRow + @MaxRows) -1
ORDER BY RowNum -- without this, the order of the returned page is not guaranteed
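The BETWEEN bounds in the query above are 1-based and inclusive, so page 1 with 10 rows covers RowNum 1 through 10. If your application tracks page numbers rather than row numbers, a small helper along these lines could compute the value to pass as @StartRow. This is only a sketch; the function name is hypothetical.

```python
def page_to_row_range(page_number, max_rows):
    """Map a 1-based page number to the inclusive (start_row, end_row) pair."""
    if page_number < 1 or max_rows < 1:
        raise ValueError("page_number and max_rows must be >= 1")
    start_row = (page_number - 1) * max_rows + 1
    # Same arithmetic as the SQL: (@StartRow + @MaxRows) - 1
    end_row = start_row + max_rows - 1
    return start_row, end_row


first = page_to_row_range(1, 10)
third = page_to_row_range(3, 10)
```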
If you adjust your query as follows, you should see a substantial boost in performance.
Notice that this query performs the join only on the rows in the returned page, which is
a much, much smaller set.
SELECT MyTable.*,u.FirstName,u.LastName
FROM
(SELECT o.*,
TotalRows=Count(*) OVER(),
ROW_NUMBER() OVER(ORDER BY o.CreateDateTime desc) as RowNum
FROM Orders o
WHERE o.CreateDateTime > getdate() -30
) as MyTable, Users u
WHERE RowNum BETWEEN @StartRow AND (@StartRow + @MaxRows) -1
and (MyTable.UserID = u.UserID)
ORDER BY RowNum -- without this, the order of the returned page is not guaranteed
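To make the difference between the two queries concrete, here is a small in-memory model in Python that counts how many user lookups each approach performs. It is not how SQL Server executes anything (the optimizer picks the real plan); it only models the row counts involved. The sample data, the OrderNo field, and both function names are invented for the illustration.

```python
def join_then_page(orders, users, start_row, max_rows):
    """First query: join every order to its user, then cut out one page."""
    lookups = 0
    joined = []
    for order in orders:
        user = users.get(order["UserID"])
        lookups += 1
        if user is not None:
            joined.append({**order, **user})
    return joined[start_row - 1 : start_row - 1 + max_rows], lookups


def page_then_join(orders, users, start_row, max_rows):
    """Second query: cut out one page of orders, then join only those rows."""
    lookups = 0
    page = []
    for order in orders[start_row - 1 : start_row - 1 + max_rows]:
        user = users.get(order["UserID"])
        lookups += 1
        if user is not None:
            page.append({**order, **user})
    return page, lookups


# Hypothetical data: 5 users, 1000 orders already sorted for paging.
users = {uid: {"FirstName": f"First{uid}", "LastName": f"Last{uid}"}
         for uid in range(1, 6)}
orders = [{"OrderNo": n, "UserID": (n % 5) + 1} for n in range(1, 1001)]

page_a, lookups_a = join_then_page(orders, users, 1, 10)
page_b, lookups_b = page_then_join(orders, users, 1, 10)
```

With this data both functions return the same ten rows, but the first performs 1000 lookups and the second performs 10. One caveat: the two queries are only equivalent when every order has a matching user. If some orders have no user, the second query counts them in TotalRows and can return a short page, because they are dropped after the page has been cut.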
Having recently reviewed ways to speed up the paging grids at EggHeadCafe (they were
never slow but I'm always looking for ways to optimize things), I decided to
test SQL Server's new paging mechanism against standard
queries used in conjunction with TABLE variables. Across 5 different very large tables
at EggHeadCafe, the TABLE variable option performed twice as fast as the suggested
paging mechanism above. Why I didn't think of the reason much sooner is beyond me...
When you look at the inner query in the paging sample above, you notice that it is
pulling back the entire Orders record (o.*) for every record that passes the WHERE
clause, not just for the few rows on the requested page.
Then, the outer query queries the inner query's results.
The TABLE variable sample stored procedure below doesn't perform the same JOINs as above.
It is included here just to give you an idea of how you might write your own. Of course,
depending on how often your data changes, you could even implement a cache of the primary
keys to speed this up even more.
I saw the biggest performance gains when querying for pages deep in the result set. In our case, that was tens of thousands of pages deep.
CREATE PROCEDURE [dbo].[GetRecordsPaged]
(
@StartRow int,
@MaxRows int
)
AS
declare @TotalRows bigint
declare @Pager table
(
RowNumber int IDENTITY (1, 1) NOT NULL, -- primary key is declared once, below
RecordID bigint,
Primary Key Clustered(RowNumber)
)
-- Notice that this INSERT INTO query can get 100% of its results from the clustered
-- primary key index.
INSERT INTO @Pager (RecordID)
SELECT RecordID
FROM dbo.Record
ORDER BY RecordID desc -- You can ORDER BY datetime columns if your primary key is not a number oriented column.
SELECT @TotalRows = COUNT(*) from @Pager -- Did this because it is a little faster than TotalRows=Count(*) OVER()
-- You would append your JOINS etc. to the result query below
SELECT Record.*,
@TotalRows as TotalRows
FROM dbo.Record
WHERE RecordID in (SELECT RecordID
FROM @Pager i
WHERE i.RowNumber BETWEEN @StartRow AND (@StartRow
+ @MaxRows) - 1)
ORDER BY RecordID desc
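The primary key cache mentioned earlier could look something like the following Python sketch. It mirrors what the @Pager table variable does (an ordered list of keys that is sliced by row number), but keeps the list in application memory for a short time so that repeated page requests skip the key query. The class name, the time-to-live setting, and the load_keys callback are all assumptions made for this example; load_keys stands in for whatever code runs the SELECT RecordID ... ORDER BY RecordID desc query.

```python
import time


class PagerKeyCache:
    """Keeps an ordered list of primary keys and slices pages out of it."""

    def __init__(self, load_keys, ttl_seconds=60.0, clock=time.monotonic):
        self._load_keys = load_keys      # callable returning keys in display order
        self._ttl = ttl_seconds          # how long a loaded key list stays valid
        self._clock = clock
        self._keys = None
        self._loaded_at = 0.0

    def _fresh_keys(self):
        now = self._clock()
        if self._keys is None or now - self._loaded_at >= self._ttl:
            self._keys = list(self._load_keys())
            self._loaded_at = now
        return self._keys

    def page(self, start_row, max_rows):
        """Return (keys for the page, total row count); start_row is 1-based."""
        if start_row < 1 or max_rows < 1:
            raise ValueError("start_row and max_rows must be >= 1")
        keys = self._fresh_keys()
        return keys[start_row - 1 : start_row - 1 + max_rows], len(keys)


load_calls = []

def load_keys():
    # Stand-in for the key query; returns 1000 keys in descending order.
    load_calls.append(1)
    return range(1000, 0, -1)

cache = PagerKeyCache(load_keys, ttl_seconds=60.0)
first_page, total_rows = cache.page(1, 10)
deep_page, _ = cache.page(991, 10)
```

The keys returned for a page would then go into the final query's WHERE RecordID IN (...) filter. The trade-off is staleness: rows inserted or deleted during the time-to-live window will not show up in the cached key list, so how long to keep it depends on how often your data changes.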