Sunday, October 30, 2005

OpenOffice 2.0 Base Matches Microsoft Access? Surely You Jest.

Updated 10/30/2010: See end of post.

eWeek's October 21, 2005 "OpenOffice Packs a Powerful New Database Punch" article by Lisa Vaas opens with the following lead: "OpenOffice.org's latest update includes a database that matches Microsoft's popular and competing Access database, experts say." If you believe this sentence, I have a bridge to sell you.

Full Disclosure: I make a significant part of my living by writing books about and consulting on Microsoft Access and other Microsoft databases, development platforms, and operating systems.

I've been testing the RC2 and release version of OOo 2.0's Base feature with registered (i.e., attached or linked) SQL Server 2005/Express tables, plus Access 2003 (Jet) *.mdb and comma-separated-value (*.csv) files for the past 10 days or so. My first two sets of tests, which I've documented here and here, indicate that OOo 2.0 Base isn't even close to competitive with Access 2000+ as a desktop database development platform.

The only "expert" Ms. Vaas quoted—who wasn't an OpenOffice.org member—was RedMonk analyst Stephen O'Grady. Mr. O'Grady, however, admitted: "I have not evaluated [Base] to the extent that I can call it 100 percent at functional parity with Access". I can find no evidence at all in the article that any "expert" said or even implied that "OpenOffice.org's latest update includes a database that matches Microsoft's popular and competing Access database ...".

The Details

Ms. Vaas quotes Mr. O'Grady as saying, "Both large and small businesses have investments in Access as a database. From that standpoint, the capabilities in previous versions of OpenOffice.org just didn't cut it. They couldn't help businesses make the transition."

OOo Base 2.0 still won't "help businesses make the transition" with existing Access applications that include Access macros or VBA code, have crosstab queries, or use complex forms and reports. You'll need to manually convert Access macros and VBA code to OOo macros in the OOo Basic dialect, where possible. Base uses OOo Writer forms, which raise fewer events than Access data entry forms. Form controls also raise fewer events than their Access counterparts.

Data Entry Forms

You can't import Access forms to OOo Base. The OOo 2.0 Base Form Wizard lets you create a conventional single-table or master-details form. This master-details form displays a Northwind Orders row and its Order Details line items:

After you create the basic form design with the Wizard, you can customize it with common form controls (widgets). It's a major project to create a navigation form that's equivalent to one you can produce in a few minutes with Access's built-in Switchboard manager.

Formatted Reports

Nor can you import Access reports. The OOo 2.0 Base Report Wizard creates reports with HTML-like tables to display column names and values. There's no evident method for adding subreports, which are critical for most business applications. The following was about the best I could do with the Report Wizard and a complex query from the AdventureWorks 2005 database:

Access provides a Report Chart Wizard to generate reports that include charts whose data source is a table or query. I wasn't able to find a direct method of adding such a chart to an OOo Base report.

HSQLDB Files Are NOT Stored as XML

Ms. Vaas states on the second page of her article, "The full version of the HSQL database engine included in OOo Base stores data in XML files." HSQLDB stores data in a proprietary binary format as a data file that's stored in a compressed (ZIPped ) *.odb file.

Here's a capture of the first few records of the Northwind Customers table from the 1.5-MB decompressed data file that contains all tables from the Northwind sample database:

Stating that HSQLDB stores data in XML files seems to me to be a gratuitous attempt to represent HSQLDB storage as similar to the OpenDocument 1.0 XML format for other OOo documents. There is no way that anyone who had investigated HSQLDB's storage format could confuse the binary format with XML.

If OOo Base can incorporate proprietary binary formats in what are purported to be OpenDocument 1.0-compliant files, why not incorporate *.mdb, *.doc, *.xls, or *.ppt binary images in corresponding *.od? files?

Embedded HSQLDB Does NOT Have JDBC (or ODBC) Support

eWeek's Steven J. Vaughan-Nichols says in his October 20, 2005 "Why OpenOffice.org 2.0 Is Your Best Choice" column: "OpenOffice.org's Base Java-based HSQLDB database engine is a solid database. Its real selling point to me is that it has both good SQL and JDBC (Java Database Connect) support. In short, while I still look to FoxPro for PC-based databases, I can see using OpenOffice.org 2.0 to model a serious, server-based DBMS. The idea of even attempting that same job with Microsoft Access makes me ill."

OOo Base uses the embedded version of HSQLDB that stores its data in the *.ods file. According to Frank Schoenheit of Sun Microsystems Germany, there are no current plans to implement either a JDBC or ODBC driver for the embedded version. Embedded HSQLDB isn't a server-based DBMS and, according to Schoenheit, "is not capable of concurrent multi-user access."

Mr. Vaughan-Nichols offers no explanation why using Access as a desktop database development application makes him "ill." I find Access to be well-suited to creating sophisticated database front ends with minimal effort and development investment. He also mentions that OOo 2.0 lacks a grammar checker but doesn't mention OOo 2.0's omission of a Microsoft Outlook clone.

Query Designer and Relations(hips) Windows

OOo Base's Query Designer window is a da Vinci database tools clone. SQL Server 2000 Enterprise Manager, SQL Server 2005 Management Studio, Visual Studio 2005, and Access data projects (*.adp) use the da Vinci tools.

Here's an example of an T-SQL query against the AdventureWorks database running under SQL Server 2005 Express Edition:

The preceding query serves as the data source for the earlier report sample.

Here's an example of a graphical design for the preceding query:

On the whole, I found OOo Base's query design features to rival those of Access data projects. They probably are an adequate substitute for Access's default query-by-example (QBE) designer. Just don't try executing a Jet crosstab query with PIVOT and TRANSFORM keywords. SQL syntax error messages are cryptic, at best.

The Relations(hips) window also clones the Access version and enables specifying relations between primary and foreign key fields. This screen capture shows the relationships between the four AdventureWorks tables:

The Relations(hips) designer persists the table icons, but not the relation lines for SQL Server 2005 tables. However, the designer does save Jet database relationships correctly.

The Table Wizard for speeding creation of common table designs successfully duplicates the appearance and operation of Access's Table Wizard.

Exporting HSQLDB table data in formats other than spreadsheet or *.csv files doesn't appear to be supported. The 64K row limit for spreadsheet files would appear to limit the number of rows in a *.csv file.

Programming Event-Handlers

Like Access, OOo Base doesn't support recording macros. The OpenOffice.org API (Application Programming Interface) provides a set of platform- and language-independent interfaces for programming event handlers. Universal Network Objects (UNO) implements the API for languages that OOo supports. The 900-page OOo Developer's Guide provides API/UNO documentation with an emphasis on Java and C++ sample code.

Choosing Tools, Macros, Organize Macros displays a choice of four programming languages—OpenOffice.org Basic, JavaScript, BeanShell, and Python. Choosing OpenOffice.org Basic opens this dialog:

The OpenOffice.org Macros node contains sets of standard (global) read-only macros. Following is the OOo Basic IDE displaying the ChooseMarket event-handler:

You won't find many of the conveniences offered by Access's VBA Editor—such as IntelliSense, statement completion, and color-coded reserved words. Documentation for migrating from Access VBA to OOo Base macros—such as "Porting Excel/VBA to Calc/Star Basic," one of the how-to documents for OOo version 1.0—would enable developers to estimate the time required to port VBA code behind forms or in modules to OOo Basic.

Performance Comparisons

Unlike ZDNet's George Ou, I haven't published performance or resource consumption comparisons between Access 2003 and OOo Base at this point. Most of my tests so far have been conduced with with simple tables having fewer than 5,000 rows. I've noticed that opening an embedded HSQLDB clone of the Access 2000 Northwind database (tables only), executing SELECT and INSERT queries, and navigating records in a simple master-details form appear slower than the same operations in Access 2003. Mr. Ou recently gathered more detailed memory and CPU resource consumption comparisons by taking advantage of Sysinternals' free Process Explorer utility.

Comparing performance with directly-connected and linked Jet and SQL Server Northwind-style tables that have, say, 500,000 Orders rows and 2,000,000 Order Details rows and timing execution of the INSERT statements is relatively easy. What's not so simple is generating these rows and timing their addition to an embedded HSQLDB database. Importing *.csv data 64K rows at a time isn't my idea of a great way to spend a weekend, so I'll probably need to get up to speed writing lengthy DDL batches or OOo Basic macros to generate dynamic SQL for the project. Performance tests on complex queries also require evaluating the influence of indexes added to WHERE clause fields.

Conclusion

The development investment to migrate a moderately complex Access front end to OOo 2.0—even without moving the underlying data from the Jet *.mdb file to HSQLDB and the *.odb file—is likely to be much greater than the potential cost recovery from moving to free OOo 2.0. Changing from Windows to Linux requires adopting another RDBMS—such as MySQL, PostgreSQL, or Oracle 10g Express Edition—for multi-user applications or using the embedded HSQLDB for single-user projects that require OpenDocument 1.0-complient files. (This assumes that embedding table data in a compressed, proprietary binary format qualifies as OpenDocument-complient.) As mentioned earlier, embedded HSQLDB databases don't support multiple concurrent users.

Moving data from other RDBMS tables to HSQLDB appears to require comma-separated-value (CSV) or fixed-width files. Importing CSV files with date, time, or date/time fields is a very tedious process, as described in the "OpenOffice Base 2.0 vs. Microsoft Access 2003 - Part II HSQLDB Tables" post's "Importing CSV Dates, Times, or Both" section.

OOo 2.0's Base feature is version 1.0 and appears to have been rushed to release with the other updated OOo 2.0 applications. Base v.1.0 might be useful for simple database projects with one or two tables, such as address lists for mail merge operations or a set of Customers and Contacts tables. But from what I've seen so far, I wouldn't advise clients to migrate from Access multi-user production applications that require automated object navigation, or sophisticated forms and reports. Neither would I advise clients to invest developer time to gain proficiency in writing OOo Basic macros without assurance of substantial usability and developer improvements to Base in forthcoming upgrades.

Frank Shoenheit provides this insight on future OOo Base upgrades:

"This is what I would like to focus [on] for 2.x: Make Base capable to create database-driven 'solutions' or 'applications' or however you would call this. This includes the auto-run forms, dialog-base forms, reasonable scripting support, perhaps a more comprehensible form API, and much more.

So, my priorities here would be: Improve the application, so that people can work with it without too much database knowledge, and then improve (or expose) the export core functionality."

Of course, it might be a while before OOo 2.x with the capability of creating database-driven solutions or applications appears.

[Minor update on 10/23/2005]

According to ZDNet's David Berlind, "... the Massachusetts Senate Post Audit Committee has apparently called for hearing to take place at 1pm on Monday October 31st ..." to discuss "Massachusetts' recent ratification of the OpenDocument Format (ODF) as one of two standard file formats (the other is Adobe's Portable Document Format) that all state agencies must start using on January 1, 2007." ("Is Massachusetts' OpenDocument decision on the rocks?" October 24, 2005).

[Minor updates on 10/25/2005]

1. Is there a connection between Google's forthcoming Google Base and OpenOffice.org Base? The short-lived exposure of Google Base screen captures indicates that you'll be able to upload small database and other document files (<10MB) directly and files >=10MB to Google Base via FTP.

Google's following Terms of Service, however, might cause potential users to have second-thoughts about using Google Base as a future replacement for Intuit's QuickBase service.

“By submitting, posting or displaying Content on or through Google services which are intended to be available to the general public, you grant Google a worldwide, non-exclusive, royalty-free license to reproduce, modify, adapt, publish and otherwise use, with or without attribution such Content on Google services solely for the purpose of displaying, distributing and promoting Google services.”

There's no indication from the screen captures I've seen that one can restrict file access to specific users or user groups. Stay tuned for posts that point to public sample OOo Base (*.odb), Jet (*.mdb), and SQL Server (*.mdf/*.ldf) examples for linking after Google Base goes live. Search Engine Roundtable reports that Google Base would "be introduced during the 'Google Zeitgeist '05 Partner Forum' to be held [10/25 to 10/27/2005] at Google HQ in California."

My take is that the phantom preview was intended to demonstrate Google's hype factor—not their shadowy weltanshauung—to Zeitgeist attendees. 2. Information Week: "Microsoft Lines Up Politician Support In Mass. Format Battle" by W. David Gardner; ZDNet: "InfoWeek: OpenDocument tangled in MA political showdown" by David Berlind; and BetaNews: "Mass. Secretary Attacks Open Doc Plan" by Nate Mook.

[A Later Update on 10/30/2005]

William Grosso comments on the speculation surrounding Google Base and proposes a new unit of measure of Web-application effort named the craig. The effort unit reminds me of the Technocracy movement's attempt to quantify physical and intellectual work by the erg or Joule: "10,000 lines of debugged AJAX code is worth how many ergs?"

PBS's Robert X. Cringely weighs in with a pre-Google Base (10/20/2005) article titled "Energy Crisis: It's Hard to Make a Buck When You are Storing Everyone's Stuff for Free." Cringely notes that "NOW we know why Google bought those 30 acres on the Columbia River in Oregon right next to a generating station from the Bonneville Power Administration. It's a source of cheap, uninterruptible power."

Cringely's latest (10/27/2005) piece, "Changing the Guard: Retiring Baby Boomers Are Going to Invigorate open Source," postulates that future retirees with IT talent will devote their golden years to writing code for Open Source projects.

Technorati: