Wednesday, November 23, 2005

Jim Gray Podcast About SQL Server 2005 (and Later)

The Aussie SQL Down Under site features frequent podcasts about new features of SQL Server 2005. Their latest (November 22, 2005) one-hour production features Microsoft Research's Jim Gray discussing the future of SQL Server, LINQ, and T-SQL, among a host of other SQL Server 2005 topics. To make access to particular topics easier, I've logged the WMA version of the podcast with brief descriptions of most major topics:

00:00 - Introduction, Jim Gray's CV, and how he came to Microsoft Research.

05:20 - Why it took five years to release SQL Server 2005. "Database systems have become ecosystems in which you have the traditional tabular data store, an XML store, data mining, cubes, an extract-transform-load service, a whole security model, management [applications], and self-tuning [features]."

07:00 - Unification of SQL Server and programming languages. The SQL Server team expected to ship V.Next in 2003, but underestimated the effort required to unify SQL Server and the .NET Framework. It was a very painful experience.

09:30 - Issues with feature currency and large development teams. The currency inside SQL Server is a dataset or a Tabular Data Stream, although we're gradually moving away from TDS toward the Web services model. The T-SQL command-in/dataset-out model is today's key to unifying access to relational data, text, and XML.

11:15 - Release frequency. Annual releases are very destabilizing, but less-frequent releases result in huge changes instead of "lots of little ones."

13:45 - CLR, LINQ, and "T-SQL is dead" - "FORTRAN isn't dead." Any CLR program has T-SQL at its root. T-SQL is loosely typed and late-bound, so it's very easy to write.

17:05 - Looseness of T-SQL typing. LINQ is wonderful, but it's compiled and its data definitions are static.

19:30 - DB2 and Oracle are much more strict about data typing. T-SQL uses data-type coercion. Jim mentions an ANSI flag to prevent coercion (but I'm not aware that such a flag exists).

20:45 - LINQ. "I'm wildly enthusiastic about LINQ." Microsoft isn't very good at supporting embedded SQL because of type conflicts between T-SQL and programming languages. LINQ treats a table as a class and its rows as objects. Tables are enumerable; you can do a For Each on a table or on the answer to a query. Tables are collections, so cursors go away. "The syntax is a little screwy to make IntelliSense work."

25:00 - What's the story on DLinq and XLinq? Both will become extremely important to folks who like to program in VB and C#. "It's one of the things that might attract you away from T-SQL because it really [offers] early-binding. The amount of gunk you need to write for ADO.NET to get the null program to work is just disgusting." The big selling point for LINQ is that it's so easy to get started.

26:45 - CLR types vs. SQL types. "It may seem like a mismatch, but I declare everything to be a SQL type and everything works out great. ... A friend wrote the 'Null Memo,' which was an impassioned plea that we get rid of null values, but we had a group of theory guys who loved three-state logic. We're stuck with nulls."

29:00 - Object purists want to treat the database as a repository for objects. "Just put my objects in the database." The result is a fat table with many sparse columns, which pivots to a skinny table with three columns.

31:45 - Inheritance in LINQ. A LINQ table is a minimalist class that doesn't support much inheritance. The specification is silent about how inheritance works in the LINQ model. One way inheritance would work is with "a universal relationship at the bottom."

34:00 - Inheritance in T-SQL. T-SQL doesn't have a class concept at all. It's so loosely typed that its only classes are tables, but you can't pass tables as parameters. "T-SQL is a great scripting language, but it's never going to be as clean as C#, ever, period."

35:15 - Break

37:00 - TerraServer, SkyServer, and spatial indexing in SQL Server 2005.

44:45 - Where are spatial applications heading? Billions of cellphones means that location services are central to future applications. Going beyond four dimensions (latitude, longitude, altitude, and time) is difficult.

48:10 - Very large databases. VLDBs are in our future and most will be spatially oriented. The goal is to tell users about things that are nearby.

49:40 - Evolution of SQL Server. "I came to Microsoft to scale up SQL Server. We've done a reasonable job of scaling up and scaling down, but we haven't done a good job of scaling out to self-organizing arrays of SQL Server instances. Over the next five years, we'll deliver on scale-out; what Oracle calls 'RAC' [Real Application Clusters]. We're getting beat up pretty badly about that, because it's the one thing we don't do."

52:00 - SQL Server parity with DB2 and Oracle. "We made a decision not to chase DB2 or Oracle tailpipes. Instead, we made SQL Server solve the next-generation rather than the last-generation problems. So we added data mining, automanagement, XML support, and a bunch of things we think are forward-looking."

52:20 - Limited resources caused a few things to slide. "In the next five years you'll see many things that were thrown out of the lifeboat just before SQL Server 2005 shipped: WinFS, LINQ, better integration with Visual Studio, more data mining rules, deeper XML support, and more Web services."

54:00 - Scale-out will have a major Web services component story. "Having Web services and Service Broker built into SQL Server means that you don't need IIS any more."

55:15 - What's coming up in Jim Gray's world? "We are working very hard to get scientific literature as well as scientific data online. PubMed Central is run by the National Library of Medicine (NLM) on SQL Server and has the abstracts, mostly in XMLish format, of all of the [NLM's] medical literature. The U.S. Congress has mandated that any research that the National Institutes of Health (NIH) sponsors be deposited with the NLM and be published within six months of its publication in a journal. This is called taxpayer access, so if you get some exotic disease, you can go to the NLM and see the research that your tax dollars paid for, instead of paying $50 to get a copy of it. We've made a portable version of PubMed that's been installed in the U.K., Italy, and South Africa, and will be installed in Japan and elsewhere. The copies federate with one another using Web services. When a document is deposited in one place, it goes to all the other places. PubMed is a poster child for XML Web services."

59:03 - End
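
A footnote to the 29:00 item: the remark that a fat table with many sparse columns "pivots to a skinny table with three columns" is easier to picture with a tiny example. What follows is only a minimal Python sketch of that idea, not anything shown in the podcast. The column names, the sample rows, and the `to_skinny` helper are all invented for illustration, and I'm assuming the three columns are an object key, an attribute name, and a value.

```python
# Hypothetical "fat" rows: one column per possible property, mostly empty (None).
fat_rows = [
    {"id": 1, "color": "red", "weight_kg": None, "voltage": None},
    {"id": 2, "color": None, "weight_kg": 4.5, "voltage": 12},
]


def to_skinny(rows, key="id"):
    """Pivot sparse wide rows into (key, attribute, value) triples.

    Empty cells (None) are dropped, so the skinny table stores only
    the properties an object actually has.
    """
    return [
        (row[key], name, value)
        for row in rows
        for name, value in row.items()
        if name != key and value is not None
    ]


skinny_rows = to_skinny(fat_rows)
for triple in skinny_rows:
    print(triple)
```

The eight cells of the fat table shrink to three triples, one for each property that is actually populated. The cost is that every value ends up in a single, loosely typed value column.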