Thursday, November 09, 2006

ADO.NET Sync Framework for Occasionally Connected Systems

WiFi Internet connectivity is becoming ubiquitous in first-world urban areas, but there are classes of data collection/management applications that must be operable during wired and/or wireless network outages—emergency and health services, for example—or exhibit dramatically decreased user productivity without a live network connection—field sales workers come to mind. Some occupations are inherently occasionally connected: governmental, academic, and corporate field workers in rural locations; land surveyors; forest rangers; environmental scientists; marine biologists and the like. Seldom, occasionally, and usually-connected users need the ability to cache data locally and synchronize local data with large back-end databases. The preferred caching and synchronization model is Outlook 2003/2007, where clients always work with their local data store and the server only propagates data changes to the client. Following are the most common methods of client-side caching and synchronization:

  • ADO.NET Datasets persisted as XML files
  • Remote Data Access (RDA) for SQL Mobile/SQLce databases
  • Merge replication to SQL Server 2005 (not Express)
  • Occasionally Connected Systems (OCS)Sync Framework (future)
ADO.NET Datasets Microsoft's developers designed ADO.NET Datasets to support data management applications for usually, occasionally, or seldom-connected mobile users. Datasets persist dynamic (read/write) and a subset of catalog (read-only, slowly changing) relational data in XML files. Datasets replace combinations of merge and snapshot replication with a lighter-weight, XML file-based system that uses DataAdapters to communicate with back-end RDBMSes. I described disk and RAM resource consumption issues with large ADO.NET 1.1 and 2.0 Datasets in my May 2004 "Store Large Lookup Tables in DataSets" article for Visual Studio Magazine (reading page 2 requires free registration.) The article's Table 1 shows the time to load Datasets of various sizes from XML files Although multi-MB Datasets are practical for today's laptop, tablet, and ultra-mobile (portable) PCs, their resource consumption precludes use on Windows CE, PocketPC, SmartPhone, and similar devices. Single-file relational databases, such as SQL Server 2005 Compact Edition (SQLce), are better suited than Datasets for portable PCs and devices when data storage requirements exceed a few MB. Remote Data Access RDA is the basic application for synchronizing data between SQL Server Mobile or, more recently, SQLce—formerly called SQL Server Everywhere (SQLEv). RDA pulls data from a remote SQL Server 2000 or 2005 table to a client's SQLce table. The SQLce database keeps track of changes changes to the pulled data and, on demand, pushes the changes to the back-end server. RDA also can execute CRUD (CREATE, RETRIEVE, UPDATE, and DELETE) operations on the remote server with Transact-SQL (T-SQL) statements. SQL Server doesn't push changes in its data to SQLce; thus, RDA is a one-way replication technology.

RDA connects to the back-end database with the HTTP or HTTPS protocols to Internet Information Services (IIS) on the server and pass compressed data through firewalls. Custom C#/VB code or the Connection Manager application handles the details of the connection. RDA is the only method of replicating data between SQL Server 2005 Express (SQLX) and SQLce, because SQLX (unlike MSDE 2000) isn't a merge replication publisher. RDA can detect but doesn't handle concurrency or referential integrity conflicts, and schema changes aren't allowed.

Merge Replication Replication is the traditional means of updating data between SQL Server instances, and merge replication is the most common type. Merge replication uses a different Connection Manager to handle communication with the back-end database using HTTP or HTTPS protocols to pass compressed data through firewalls. Merge replication is two-way, supports referential integrity, handles concurrency conflicts, and allows schema changes. Merge replication, including subscription and publication, is a heavyweight process compared to RDA push/pull operations, but is quite reliable. OCS Sync Framework The OCS Sync Framework involves a SyncAgent on the client that uses a SyncTable to manage a ClientSyncProvider for each type of local store, such as SQLce or SQLX. Server-side components include a ServerSyncProvider, SyncGroup, SyncAdapter, and SyncAdapter commands. The SyncAdapter and SyncAdapter commands (DBCommand objects) are ADO.NET database-agnostic objects so synchronization can occur with SQL Server, Oracle, IBM DB2 or any other RDBMSes that have an .NET native data provider. IncrementalInserts, IncrementalUpdates and IncrementalDeletes enable clients to request the server to update catalog data. The client and server can push changes directly with SQL Server TDS or another RDBMS's wire protocol when a physical network connection is available. Otherwise the client can push and pull changes with a proxy that connects to a service. Synchronization methods can be:
  • Pull only from server (unidirectional)
  • Push only to server (unidirectional)
  • Push and pull (bidirectional)
  • Snapshot (no synchronization)

The OCS Sync Framework appears to fall midway between the lightweight RDA and heavyweight merge replication method in operational complexity and resource consumption, but the current (very early) implementation requires a substantial amount of client code to specify synchronization type and SQL statements for DBCommands.

Live From Redmond: Smart Client: Offline Data Synchronization and Caching for Smart Clients is Steve Lasker's 94-minute Webcast that covers Dataset, RDA, merge replication, SQL Query Notifications, and a preview of the OCS Sync Framework (requires free registration). Steve is a Microsoft program manager/technical lead and informal evangelist for OCS, Windows forms, VB .NET, and ADO.NET Datasets. Steve also delivered a more general Live From Redmond: Visual Studio: Developing Local and Mobile Data Solutions with SQL Server Compact Edition Webcast on December 4. Steve also offers an Occasionally Connected Systems slide deck and a screencast that demonstrates how the OCS Sync Framework operates and VB WinForm code to synchronize a client with SQL Server. Most public OCS-related information presently comes from Steve's blog. His "Tech Ed 06 Slides and Demos on SQL Server Everywhere Edition" post has links to DEV351: Developing Local and Mobile Data Solutions with SQL Server Everywhere Edition and DEV343: Offline Data Synchronization and Caching with SQL Server Everywhere Edition.

Update: 12/9/2006: 12/4/2006 Webcast name updated.

Technorati tags: , , , , , , , , , , , , ,