Monday, October 01, 2007

Save Server Round Trips by Preloading LINQ to SQL EntityRefs

Replacing foreign key values with readable names from property values of associated (related) objects makes grid-based lists or detail panes much more user-friendly. For example, an employee's last (or first and last) name is much more meaningful to a application's user than an integer EmployeeID value.

As mentioned in the Problems Using Stored Procedures for LINQ to SQL Data Retrieval (Updated 9/30/2007) post and explained in my "Optimize LINQ to SQL Performance" article for the November 2007 issue of Visual Studio Magazine, you can save a substantial number of round trips to the server per postback by preloading all or selected records of tables that persist EntityRefs for the root objects of your Windows form session or Web page.

The process is most useful when you use stored procedures for data retrieval because LINQ for SQL ordinarily requires executing a stored procedure to retrieve a row for each EntityRef object. For example, an unsorted web page that displays the last 10 rows of Northwind Orders requires 17 stored procedure executions without preloading and 5 queries when you preload the associated entities.

Preload Code

I call the process preloading because you execute a stored procedure to retrieve each group of entities needed for the session or page after specifying the eager-loading (also called pre-fetching) options and before you execute the stored procedure to populate the root objects. Here's the code in the LinqDataSource_Selecting event handler that preloads entities to replace the CustomerID, EmployeeID, and ShipVia fields with readable names:

Protected Sub ldsOrders_Selecting(ByVal sender As Object, _
        ByVal e As System.Web.UI.WebControls.LinqDataSourceSelectEventArgs) _
        Handles ldsOrders.Selecting
    dcNwind = New NorthwindDataContext()

    'Specify eager loading before any query is executed
    Dim dlOrder As New DataLoadOptions()
    dlOrder.LoadWith(Of Order)(Function(o) o.Customer)
    dlOrder.LoadWith(Of Order)(Function(o) o.Employee)
    dlOrder.LoadWith(Of Order)(Function(o) o.Shipper)
    dcNwind.LoadOptions = dlOrder

    'Preload EntityRefs (m:1 associations) to minimize round-trips

    'Get the total row count
    ldsOrders.AutoPage = False
    e.Arguments.RetrieveTotalRowCount = False
    e.Arguments.TotalRowCount = _
    dcNwind.ExecuteQuery(Of Integer) _
    ("EXEC usp_GetOrdersCountByCountry 'USA'").First()

    'Retrieve the number of orders specified by MaximumRows
    e.Result = dcNwind.usp_GetOrdersByCountryPaged("USA", _
        e.Arguments.MaximumRows, e.Arguments.StartRowIndex).ToList()
End Sub

Preloading works because the DataContext's IdentityManager detects the presence of the entities and doesn't execute the stored procedure function overloads specified by the DataLoadOptions collection.

Following is a typical code block for a templated GridView column that displays the Customer.CompanyName property value by default and the Order.CustomerID property value when editing:

<asp:templatefield HeaderText="Customer" SortExpression="Customer.CompanyName">
        <asp:TextBox ID="TextBox1" runat="server"
             Text='<%# Bind("CustomerID") %>'></asp:TextBox>
        <asp:Label ID="Label1" runat="server"
             Text='<%# Eval("Customer.CompanyName") %>'></asp:Label>

The DataGridView control doesn't support substituting associated entity (sub-object) properties for foreign key values. It's unfortunate that the Windows form toolbox doesn't include a customizable grid that's similar to a GridView with template fields.

Web Page with Preloaded EntityRefs

Here's a screen capture of the last 10 Northwind orders for US customers sorted by OrderID descending in a GridView. (This is the OrdersLinqDataSourceSP.sln test harness for pre-loading EntityRef objects and server-side sorting with a LinqDataSource control.)

Click image for full-size screen capture.

Customer.Name, Employee.LastName, and Shipper.CompanyName EntityRefs substitute for CustomerID, EmployeeID, and ShipVia property values respectively. The 17 queries executed without preloading retrieve the following:

  1. 6 unique Customer.CompanyName values
  2. 6 unique Employee.LastName values
  3. 3 Shipper.CompanyName values
  4. 1 row count
  5. 1 set of 10 Orders rows

Preloading reduces the number of executions for items 1 through 3 to one each for a total of 5.

Preloading Performance Trade-Offs

The Northwind Customers, Employees, and Shippers tables have an artificially small set of records, so loading most or all rows is practical from a performance statement. The sample stored procedures load only those rows required for orders by US customers.

Another criterion might be date to return only the rows required for the last n days' orders. Such stored procedures require joins or subselects, so their complexity might impact performance. For example, the average cached data retrieval time for the sample page is 23 milliseconds for ordinary eager loading and 48 milliseconds for preloaded eager loading.

However, the performance is likely to be much better than that of the dynamic SQL "JOIN Query from Hell" described in the "David Hayden Comments on Soma's LINQ to SQL Performance Post" topic of my LINQ and Entity Framework Posts for 9/28/2007+ post.