Monday, January 22, 2007

More LINQ to XSD Rumblings

It's been quiet on the LINQ to XSD (a.k.a. "l2xsd") front since Microsoft Data Programmability Team program manager Ralf Lämmel announced the public preview on November 27, 2006, with the release of the Alpha 0.1 version.

Mike Champion gave LINQ to XSD a boost in his "Convergence Zones" post, a 2007 preview of the future of XML:

The better the XML is hidden, the more the bulk of the world likes it. Being out of sight, it's also out of mind. Few noticed that "Asynchronous JavaScript And XML" actually refers to the XmlHttpRequest API much more than XML content, so the substitution of JSON for XML as the typical bits on the wire format caused little concern. Actually that understates the case—most people are happy to use more familiar technologies instead of XML or in front of XML. What might we predict from this pattern?

  • I'll make a (self serving!) prediction that the LINQ approach of focusing on what is common across data formats will get more mainstream traction than will the notion that users want specialized tools for their XML data.
  • Tools that make it relatively easy to consume XML directly into programming objects such as LINQ to XSD will continue to mature technically and be adopted by pragmatic developers.

Helping LINQ to XSD to "continue to mature technically" is a new technical paper, "Programming with Triangular Circles," by Ralf Lämmel and Dave Remy, presented at the XML 2006 conference in Boston on December 5, 2006. Here's the introduction and abstract:

When deriving object types from schema types, developers are impaired by the infamous X/O impedance mismatch. We deliver a typed programming approach for XML that greatly reduces the impairment by leveraging rather than abandoning XML semantics.

Given the capabilities of OO programming, its maturity and generality, it is a sane expectation to adopt OO as the paradigm of choice for XML processing. To serve that expectation, one would map XML schemas to object models so that XML data can be processed through familiar objects. The first generation of mapping technologies has been somewhat disappointing and the term X/O impedance mismatch has been coined in this context. By devising a mapping that caters for XML-aware objects as typed views on untyped XML trees (as opposed to plain objects with fields), and by leveraging functional OO programming, one can actually deliver on the expectation to adopt the (enriched) OO paradigm for XML processing.
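To make the abstract's "typed views on untyped XML trees" idea concrete, here is a minimal sketch in Python rather than C#. It is not the LINQ to XSD API: the Order and Item classes, the element names, and the sample document are all hypothetical. Each wrapper keeps a reference to the underlying element and converts values on access, instead of copying them into plain fields.

```python
import xml.etree.ElementTree as ET


class Item:
    """Typed view over an untyped <Item> element (hypothetical schema)."""

    def __init__(self, element):
        self._element = element  # the XML tree stays the single source of truth

    @property
    def price(self) -> float:
        return float(self._element.findtext("Price"))

    @price.setter
    def price(self, value: float) -> None:
        # Writes go straight through to the underlying tree.
        self._element.find("Price").text = str(value)

    @property
    def quantity(self) -> int:
        return int(self._element.findtext("Quantity"))


class Order:
    """Typed view over an untyped <Order> element (hypothetical schema)."""

    def __init__(self, element):
        self._element = element

    @property
    def items(self):
        # Views are created on demand; no data is copied out of the tree.
        return [Item(e) for e in self._element.findall("Item")]

    def total(self) -> float:
        return sum(item.price * item.quantity for item in self.items)

    def to_xml(self) -> str:
        return ET.tostring(self._element, encoding="unicode")


ORDER_XML = (
    "<Order>"
    "<Item><Price>2.5</Price><Quantity>4</Quantity></Item>"
    "<Item><Price>10</Price><Quantity>1</Quantity></Item>"
    "</Order>"
)

order = Order(ET.fromstring(ORDER_XML))
```

Because the objects are views, an assignment such as order.items[0].price = 3.0 changes the XML that order.to_xml() returns. A mapping to plain objects with fields would need a separate serialization step to get the same result.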

Ralf's "Style Normalization for Canonical X-to-O Mappings" paper for the ACM SIGPLAN 2007 Workshop on Partial Evaluation and Program Manipulation (PEPM 2007), held in Nice, France, on January 15 and 16, 2007, tackles "normalization of schema-organization styles and its support by automated transformations." Here's the first paragraph of the Introduction and the abstract:

This paper deals with the overall problem of mapping XML schemas to object models. (We use the term X-to-O mapping from here on.) More specifically, this paper devises transformations for normalizing styles of schema organization as a separate aspect of X-to-O mappings. This separation of concerns is helpful in attacking the so-called X/O impedance mismatch—the overall difficulty of viewing domain-specific XML data as objects of a schema-derived object model in a satisfactory manner.

An X-to-O mapping takes an XML schema as input and returns an object model as output; this object model is meant for programmatic, schema-aware access to XML data. The provision of X-to-O mappings involves various challenges; one of them is addressed by the present paper: variation in style of schema organization, which should not unduly affect the outcome of X-to-O mappings. We devise transformations for style normalization (and conversion); these transformations operate at both levels of the X-to-O mapping: schemas and object models. An important byproduct of the present work is to showcase functional OO programming as a viable setup for devising software transformations.

Ralf also presented a LINQ to XML-based paper at XML 2006, "API-based XML Streaming with FLWOR Power and Functional Updates," with this introduction and abstract:

We enable streaming for an in-memory XML API so that a convenient (declarative) programming model is preserved. Some restrictions are to be imposed on the streaming mode of the API, thereby achieving also predictability and controllability of the streaming power.

The size of XML trees that can be processed by an XML API for declarative, in-memory processing is bound by the available memory. As a result, certain scenarios involving huge trees or many processes on large trees may fail to be feasible with (declarative) in-memory processing; lower-level APIs for XML parsing and unparsing must be reconsidered. As a remedy, we enable streaming for an in-memory XML API so that the convenient (declarative) programming model of the baseline API is preserved. Some restrictions are to be imposed on the streaming mode of the API, thereby achieving also predictability and controllability of the streaming power. The presented development uses LINQ to XML (formerly called XLinq) as the baseline API and works out one particular API extension for a streaming-enabled LINQ to XML API. As an important byproduct of this effort, functional updates are proposed, which complement the existing, imperative updates of the LINQ to XML API.
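The memory trade-off the abstract describes can be sketched in Python with the standard library's incremental parser. This illustrates the general idea only, not the streaming extension to LINQ to XML that the paper proposes. The stream_elements and total_amount helpers, the element names, and the sample data are assumptions made for the example.

```python
import io
import xml.etree.ElementTree as ET


def stream_elements(source, tag):
    """Yield each completed <tag> element, then drop its contents."""
    for _event, element in ET.iterparse(source, events=("end",)):
        if element.tag == tag:
            yield element
            # Runs once the consumer asks for the next element: frees the
            # children and text, though an empty shell stays under the root.
            element.clear()


def total_amount(source):
    """Declarative-style query evaluated over the stream, one order at a time."""
    return sum(
        float(order.findtext("Amount"))
        for order in stream_elements(source, "Order")
    )


SAMPLE = (
    b"<Orders>"
    b"<Order><Amount>10.5</Amount></Order>"
    b"<Order><Amount>4.5</Amount></Order>"
    b"</Orders>"
)

total = total_amount(io.BytesIO(SAMPLE))
```

The query in total_amount reads much like one over an in-memory tree, but only one Order subtree is populated at a time. The restriction is that each element must be consumed before the next is requested, which is a small instance of the kind of limit on streaming mode that the abstract mentions.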

Let's hope both of these technologies make Visual Studio's Orcas release. A post-RTM add-on might suffice for the streaming API, but LINQ to XSD definitely deserves the resources required to make the VS vNext cut.
