Monday, September 11, 2006

SQL Server 2005 Ordered View and Inline Function Problems

SQL Server 2000 introduced the TOP n [PERCENT] operator to enable restricting the number of rows in the resultset of a view or inline table-valued function (TVF). In SQL Server 2000, starting the SQL statement for a view or TVF with SELECT TOP n [PERCENT] ... enables adding an ORDER BY clause to order the query's internal rowset so as to return the correct resultset. As a byproduct of that operation, views and TVFs returned resultsets sorted in accordance with the ORDER BY clause. A SELECT TOP 100 PERCENT query with an ORDER BY clause in SQL Server 2000 or MSDE 2000 views and TVFs returns all rows sorted. SQL Server Management Studio [Express] and the Microsoft Access Upsizing Wizard expect this behavior, and the Wizard depends on it.

SQL Server 2005 [Express] does not respect the SELECT TOP 100 PERCENT operator or its SQL Server 2005-only functional equivalent, SELECT TOP (100) PERCENT, and ignores the ORDER BY clause in views and TVFs. Thus resultsets are not sorted.

ANSI SQL specifies that ORDER BY clauses are valid only in the top-most (outer) SELECT block of a query. But no ANSI SQL version supports the TOP operator, so the ANSI rules don't apply to SELECT TOP queries. Most SQL Server DBAs don't permit use of ordered views or TVFs, preferring to apply the sort with an ORDER BY clause in the SELECT block that invokes the view or TVF. Visual Studio and Microsoft Access developers, however, commonly use ordered views and TVFs as a shortcut.

As mentioned earlier, SQL Server 2005's corresponding TOP (100) PERCENT function also enables adding an ORDER BY clause to a view or TVF. However, the SQL Server 2005 query optimizer ignores the TOP (100) PERCENT and ORDER BY clauses and does not return a resultset with the specified custom sort. The query optimizer "optimizes out" the TOP (100) PERCENT function as redundant, which deactivates the ORDER BY clause without warning. Fortunately, there's a workaround that I'll discuss shortly.
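
The pattern described above can be sketched in T-SQL as follows. This is a minimal illustration, not a definition taken from any Microsoft sample: the view name dbo.vwProductsByName is hypothetical, and the code assumes the Northwind sample database's dbo.Products table with its ProductID, ProductName, and UnitPrice columns.

```sql
-- Hypothetical ordered view (assumes Northwind's dbo.Products table).
CREATE VIEW dbo.vwProductsByName
AS
SELECT TOP (100) PERCENT ProductID, ProductName, UnitPrice
FROM dbo.Products
ORDER BY ProductName;   -- SQL Server 2005 optimizes this sort away
GO

-- Row order is NOT guaranteed here; SQL Server 2005 typically
-- returns the rows unsorted.
SELECT ProductID, ProductName, UnitPrice
FROM dbo.vwProductsByName;

-- Row order IS guaranteed here, because the ORDER BY clause
-- is in the outermost query.
SELECT ProductID, ProductName, UnitPrice
FROM dbo.vwProductsByName
ORDER BY ProductName;
```

The second SELECT is the form that DBAs generally prefer: the sort is stated by the caller rather than hidden inside the view.
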
SQL Server Books Online's ORDER BY Clause (Transact-SQL) topic has a note that states:

When ORDER BY is used in the definition of a view, inline function, derived table, or subquery, the clause is used only to determine the rows returned by the TOP clause. The ORDER BY clause does not guarantee ordered results when these constructs are queried, unless ORDER BY is also specified in the query itself.

The upshot of this note is that there is no guarantee whatsoever that SELECT TOP ... ORDER BY ... query constructs will return rows in the sequence specified by the ORDER BY clause. My experience, and that of most other commentators on this issue, is that SQL Server 2000 returns rows in the expected order for all valid TOP integer and TOP float PERCENT values, as does SQL Server 2005 for valid integer and all float values except 100.

The da Vinci Toolset Obscures the Issue

The Visual Data Tools (VDT), more commonly called the da Vinci toolset, first appeared in the SQL Server 7.0 version of SQL Server Enterprise Manager (codenamed "Starfighter"). SQL Server 2005 Management Studio (SSMS), SQL Server Management Studio Express (SSMSX), Visual Studio 2005, and Access 2007 use the current da Vinci toolset version. The da Vinci toolset's query parser automatically adds a TOP (100) PERCENT operator when you save or execute an ALTER VIEW or ALTER FUNCTION definition that contains an ORDER BY clause in its SELECT block.

Note: SQL Server Express (SSX) SP1 offers SSMSX as an individual download or bundled with SSX SP1 in SSX with Advanced Services SP1.

Here's an example of a TOP (100) PERCENT query (Access's Alphabetical List of Products) in SSMSX's view design window:

Notice that the resultset in the lower Results pane is ordered alphabetically by the ProductName column. However, when you right-click the Alphabetical List of Products item in Object Explorer and choose Open View, the resultset isn't ordered by ProductName (or any other set of column values), as seen here:

The sorted display of the Results pane is undoubtedly due to the SELECT query in the SQL pane being the outermost query at this point. This ambiguous presentation for views is likely to lead unsuspecting developers down the primrose path.

Note: SSMSX and a Microsoft Access query from the upsized Northwind.accdb are used as the example, because many more data-intensive application developers use (or will use) SQL Server 2005 Express (SSX) with SSMSX and Visual Studio Express editions or Access than SSMS and SQL Server 2005 Standard or Enterprise editions. (Visual Studio Express editions don't include the VDT.)

The da Vinci Tools in the Access 2007 Project Designer

Microsoft Access 2000, 2002, 2003, and 2007 use the da Vinci toolset as the Project Designer for Access Data Projects (ADPs). However, the Access 2007 Beta 2 Technical Refresh version shown here appears to use a slightly different query parser. Adding a sort to a view or inline TVF adds the SQL Server 2000-style TOP 100 PERCENT operator (without the parentheses) for backward compatibility with the Microsoft Desktop Engine (MSDE) 2000.

Note: Access 2007 Beta 2 TR won't display the contents of the Diagram or Grid panes if you or SSMS[X] add parentheses to the TOP operator. Addition of parentheses by SSMS[X] prevents round-tripping of view or TVF designs to Access 2007 without loss of the Diagram and Grid panes.

Running the preceding query displays the unordered resultset in Access's Datasheet View, as shown here:

Access substitutes its Datasheet View for SSMS[X]'s Results pane. The preceding resultset obviously isn't ordered by ProductName, so developers will know immediately that the sort wasn't applied.

Note: In Access 2007 Beta 2, clicking the Run (!) button in the Tools group of the Function & View Tools, Design ribbon doesn't update the row order of an open Datasheet View. You must close and reopen the Datasheet View to see the effect of a change to the view or inline TVF row order.

A Simple Workaround for the TOP (100) PERCENT Problem

If you must apply a custom sort order to views and TVFs, a practice that is generally considered to be a hack, substitute TOP (VeryLargeInteger) for TOP (100) PERCENT. The largest integer that the query parser will accept is the maximum int value, 2147483647. This limitation appears to be for backward compatibility, because SQL Server 2005 converts TOP (n) values to the bigint data type. Here's the view's new query in SSMSX:

In this case, the Results pane above reflects reality, as shown below:

The TOP 2147483647 operator also behaves correctly in Access views and queries. However, Access developers have the option to set the Order By property of a view or inline TVF to Column1 [DESC][, Column2 [DESC] ... ] on the Data page of the ObjectName Properties dialog. This property specifies the ORDER BY clause of the outer query. Visual Studio developers and others might need to alter their TableAdapters or SELECT queries to apply an ex post facto sort order.
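
The workaround described above might look like the following in T-SQL. This is a sketch only: the view name dbo.vwProductsByNameSorted is hypothetical, and the code assumes the Northwind sample database's dbo.Products table. Keep in mind that Books Online still doesn't guarantee the order, so the sorted result is observed behavior rather than documented behavior.

```sql
-- Hypothetical view using the TOP (VeryLargeInteger) workaround
-- (assumes Northwind's dbo.Products table).
CREATE VIEW dbo.vwProductsByNameSorted
AS
SELECT TOP (2147483647) ProductID, ProductName, UnitPrice  -- maximum int value
FROM dbo.Products
ORDER BY ProductName;   -- not optimized away, unlike TOP (100) PERCENT
GO

-- In the author's tests this returns rows ordered by ProductName,
-- but the order remains undocumented and unguaranteed.
SELECT ProductID, ProductName, UnitPrice
FROM dbo.vwProductsByNameSorted;
```
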

Note: SQL Server 2005 TOP (n) views are updatable if the view without the TOP (n) function is updatable.

Conclusion

It's surprising that the SQL Server tools group didn't alter the query parser to replace TOP (100) PERCENT with TOP (2147483647). The group also should have fixed, or at least warned users about, the ambiguous presentation in the Results pane for views.

The appropriate solution for SQL Server 2005 [Express] SP2 is to require SQL Server 2005's query processor to respect SELECT TOP (100) PERCENT ... ORDER BY views and TVFs by emulating SQL Server 2000 behavior, and change the party line by guaranteeing that SELECT TOP ... ORDER BY views and TVFs return a sorted resultset.

I haven't tested the workaround exhaustively, but it's likely to work in most or all cases. However, you're far better off from a performance and reusability standpoint to follow the ANSI SQL rules and apply the ORDER BY clause within the outermost query block, not in views or inline TVFs.
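
The recommended approach can be sketched with an inline TVF that contains no TOP or ORDER BY clause, leaving the sort to the calling query. The function name dbo.fnProductsInCategory and its @CategoryID parameter are hypothetical, and the code assumes the Northwind sample database's dbo.Products table.

```sql
-- Hypothetical inline TVF with no TOP or ORDER BY clause
-- (assumes Northwind's dbo.Products table).
CREATE FUNCTION dbo.fnProductsInCategory (@CategoryID int)
RETURNS TABLE
AS
RETURN
(
    SELECT ProductID, ProductName, UnitPrice
    FROM dbo.Products
    WHERE CategoryID = @CategoryID
);
GO

-- The caller states the sort in the outermost query block,
-- which is the only place the order is guaranteed.
SELECT ProductID, ProductName, UnitPrice
FROM dbo.fnProductsInCategory(1)
ORDER BY ProductName;
```

Because the function itself is unordered, different callers can reuse it with different sort orders, and the optimizer is free to choose a plan for each outer query.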

Note: For more information about the use of ORDER BY in SQL Server 2005, see the Ordering guarantees in SQL Server... item by Conor Cunningham, Query Optimizer Development Lead, in the SQL Engine Tips blog. See Books Online's TOP (Transact-SQL) topic for more detailed information on the TOP function.

Updates: Minor updates with clarifications on 9/27/2006 and 9/29/2006.

5 comments:

john said...

Thanks for this article - was pulling my hair out until I saw this. Looks like someone has changed something without telling the 5 other teams!

David said...

Actually, nothing changed. Order of queries without ORDER BY has always been undefined (7.0,2000,2005). I did post a comment here demonstrating that the suggested solution doesn't work. Somehow my comment disappeared :(

--rj said...

All comments were lost from earlier posts when I changed the blog format in late February 2007.

--rj

Anonymous said...

All I can say is thank you !

I have been struggling for some time trying to figure out the cause of this mysterious issue.

Regards

Umar

IDisposable said...

Microsoft released a hotfix that deals with this today (6/13/2007) here