Sunday, October 30, 2005

OpenOffice 2.0 Base Matches Microsoft Access? Surely You Jest.

Updated 10/30/2010: See end of post.

eWeek's October 21, 2005 "OpenOffice Packs a Powerful New Database Punch" article by Lisa Vaas opens with the following lead: "OpenOffice.org's latest update includes a database that matches Microsoft's popular and competing Access database, experts say." If you believe this sentence, I have a bridge to sell you.

Full Disclosure: I make a significant part of my living by writing books about and consulting on Microsoft Access and other Microsoft databases, development platforms, and operating systems.

I've been testing the RC2 and release version of OOo 2.0's Base feature with registered (i.e., attached or linked) SQL Server 2005/Express tables, plus Access 2003 (Jet) *.mdb and comma-separated-value (*.csv) files for the past 10 days or so. My first two sets of tests, which I've documented here and here, indicate that OOo 2.0 Base isn't even close to competitive with Access 2000+ as a desktop database development platform.

The only "expert" Ms. Vaas quoted—who wasn't an OpenOffice.org member—was RedMonk analyst Stephen O'Grady. Mr. O'Grady, however, admitted: "I have not evaluated [Base] to the extent that I can call it 100 percent at functional parity with Access". I can find no evidence at all in the article that any "expert" said or even implied that "OpenOffice.org's latest update includes a database that matches Microsoft's popular and competing Access database ...".

The Details

Ms. Vaas quotes Mr. O'Grady as saying, "Both large and small businesses have investments in Access as a database. From that standpoint, the capabilities in previous versions of OpenOffice.org just didn't cut it. They couldn't help businesses make the transition."

OOo Base 2.0 still won't "help businesses make the transition" with existing Access applications that include Access macros or VBA code, have crosstab queries, or use complex forms and reports. You'll need to manually convert Access macros and VBA code to OOo macros in the OOo Basic dialect, where possible. Base uses OOo Writer forms, which raise fewer events than Access data entry forms. Form controls also raise fewer events than their Access counterparts.

Data Entry Forms

You can't import Access forms to OOo Base. The OOo 2.0 Base Form Wizard lets you create a conventional single-table or master-details form. This master-details form displays a Northwind Orders row and its Order Details line items:

After you create the basic form design with the Wizard, you can customize it with common form controls (widgets). It's a major project to create a navigation form that's equivalent to one you can produce in a few minutes with Access's built-in Switchboard manager.

Formatted Reports

Nor can you import Access reports. The OOo 2.0 Base Report Wizard creates reports with HTML-like tables to display column names and values. There's no evident method for adding subreports, which are critical for most business applications. The following was about the best I could do with the Report Wizard and a complex query from the AdventureWorks 2005 database:

Access provides a Report Chart Wizard to generate reports that include charts whose data source is a table or query. I wasn't able to find a direct method of adding such a chart to an OOo Base report.

HSQLDB Files Are NOT Stored as XML

Ms. Vaas states on the second page of her article, "The full version of the HSQL database engine included in OOo Base stores data in XML files." HSQLDB stores data in a proprietary binary format as a data file that's stored in a compressed (ZIPped ) *.odb file.

Here's a capture of the first few records of the Northwind Customers table from the 1.5-MB decompressed data file that contains all tables from the Northwind sample database:

Stating that HSQLDB stores data in XML files seems to me to be a gratuitous attempt to represent HSQLDB storage as similar to the OpenDocument 1.0 XML format for other OOo documents. There is no way that anyone who had investigated HSQLDB's storage format could confuse the binary format with XML.

If OOo Base can incorporate proprietary binary formats in what are purported to be OpenDocument 1.0-compliant files, why not incorporate *.mdb, *.doc, *.xls, or *.ppt binary images in corresponding *.od? files?

Embedded HSQLDB Does NOT Have JDBC (or ODBC) Support

eWeek's Steven J. Vaughan-Nichols says in his October 20, 2005 "Why OpenOffice.org 2.0 Is Your Best Choice" column: "OpenOffice.org's Base Java-based HSQLDB database engine is a solid database. Its real selling point to me is that it has both good SQL and JDBC (Java Database Connect) support. In short, while I still look to FoxPro for PC-based databases, I can see using OpenOffice.org 2.0 to model a serious, server-based DBMS. The idea of even attempting that same job with Microsoft Access makes me ill."

OOo Base uses the embedded version of HSQLDB that stores its data in the *.ods file. According to Frank Schoenheit of Sun Microsystems Germany, there are no current plans to implement either a JDBC or ODBC driver for the embedded version. Embedded HSQLDB isn't a server-based DBMS and, according to Schoenheit, "is not capable of concurrent multi-user access."

Mr. Vaughan-Nichols offers no explanation why using Access as a desktop database development application makes him "ill." I find Access to be well-suited to creating sophisticated database front ends with minimal effort and development investment. He also mentions that OOo 2.0 lacks a grammar checker but doesn't mention OOo 2.0's omission of a Microsoft Outlook clone.

Query Designer and Relations(hips) Windows

OOo Base's Query Designer window is a da Vinci database tools clone. SQL Server 2000 Enterprise Manager, SQL Server 2005 Management Studio, Visual Studio 2005, and Access data projects (*.adp) use the da Vinci tools.

Here's an example of an T-SQL query against the AdventureWorks database running under SQL Server 2005 Express Edition:

The preceding query serves as the data source for the earlier report sample.

Here's an example of a graphical design for the preceding query:

On the whole, I found OOo Base's query design features to rival those of Access data projects. They probably are an adequate substitute for Access's default query-by-example (QBE) designer. Just don't try executing a Jet crosstab query with PIVOT and TRANSFORM keywords. SQL syntax error messages are cryptic, at best.

The Relations(hips) window also clones the Access version and enables specifying relations between primary and foreign key fields. This screen capture shows the relationships between the four AdventureWorks tables:

The Relations(hips) designer persists the table icons, but not the relation lines for SQL Server 2005 tables. However, the designer does save Jet database relationships correctly.

The Table Wizard for speeding creation of common table designs successfully duplicates the appearance and operation of Access's Table Wizard.

Exporting HSQLDB table data in formats other than spreadsheet or *.csv files doesn't appear to be supported. The 64K row limit for spreadsheet files would appear to limit the number of rows in a *.csv file.

Programming Event-Handlers

Like Access, OOo Base doesn't support recording macros. The OpenOffice.org API (Application Programming Interface) provides a set of platform- and language-independent interfaces for programming event handlers. Universal Network Objects (UNO) implements the API for languages that OOo supports. The 900-page OOo Developer's Guide provides API/UNO documentation with an emphasis on Java and C++ sample code.

Choosing Tools, Macros, Organize Macros displays a choice of four programming languages—OpenOffice.org Basic, JavaScript, BeanShell, and Python. Choosing OpenOffice.org Basic opens this dialog:

The OpenOffice.org Macros node contains sets of standard (global) read-only macros. Following is the OOo Basic IDE displaying the ChooseMarket event-handler:

You won't find many of the conveniences offered by Access's VBA Editor—such as IntelliSense, statement completion, and color-coded reserved words. Documentation for migrating from Access VBA to OOo Base macros—such as "Porting Excel/VBA to Calc/Star Basic," one of the how-to documents for OOo version 1.0—would enable developers to estimate the time required to port VBA code behind forms or in modules to OOo Basic.

Performance Comparisons

Unlike ZDNet's George Ou, I haven't published performance or resource consumption comparisons between Access 2003 and OOo Base at this point. Most of my tests so far have been conduced with with simple tables having fewer than 5,000 rows. I've noticed that opening an embedded HSQLDB clone of the Access 2000 Northwind database (tables only), executing SELECT and INSERT queries, and navigating records in a simple master-details form appear slower than the same operations in Access 2003. Mr. Ou recently gathered more detailed memory and CPU resource consumption comparisons by taking advantage of Sysinternals' free Process Explorer utility.

Comparing performance with directly-connected and linked Jet and SQL Server Northwind-style tables that have, say, 500,000 Orders rows and 2,000,000 Order Details rows and timing execution of the INSERT statements is relatively easy. What's not so simple is generating these rows and timing their addition to an embedded HSQLDB database. Importing *.csv data 64K rows at a time isn't my idea of a great way to spend a weekend, so I'll probably need to get up to speed writing lengthy DDL batches or OOo Basic macros to generate dynamic SQL for the project. Performance tests on complex queries also require evaluating the influence of indexes added to WHERE clause fields.

Conclusion

The development investment to migrate a moderately complex Access front end to OOo 2.0—even without moving the underlying data from the Jet *.mdb file to HSQLDB and the *.odb file—is likely to be much greater than the potential cost recovery from moving to free OOo 2.0. Changing from Windows to Linux requires adopting another RDBMS—such as MySQL, PostgreSQL, or Oracle 10g Express Edition—for multi-user applications or using the embedded HSQLDB for single-user projects that require OpenDocument 1.0-complient files. (This assumes that embedding table data in a compressed, proprietary binary format qualifies as OpenDocument-complient.) As mentioned earlier, embedded HSQLDB databases don't support multiple concurrent users.

Moving data from other RDBMS tables to HSQLDB appears to require comma-separated-value (CSV) or fixed-width files. Importing CSV files with date, time, or date/time fields is a very tedious process, as described in the "OpenOffice Base 2.0 vs. Microsoft Access 2003 - Part II HSQLDB Tables" post's "Importing CSV Dates, Times, or Both" section.

OOo 2.0's Base feature is version 1.0 and appears to have been rushed to release with the other updated OOo 2.0 applications. Base v.1.0 might be useful for simple database projects with one or two tables, such as address lists for mail merge operations or a set of Customers and Contacts tables. But from what I've seen so far, I wouldn't advise clients to migrate from Access multi-user production applications that require automated object navigation, or sophisticated forms and reports. Neither would I advise clients to invest developer time to gain proficiency in writing OOo Basic macros without assurance of substantial usability and developer improvements to Base in forthcoming upgrades.

Frank Shoenheit provides this insight on future OOo Base upgrades:

"This is what I would like to focus [on] for 2.x: Make Base capable to create database-driven 'solutions' or 'applications' or however you would call this. This includes the auto-run forms, dialog-base forms, reasonable scripting support, perhaps a more comprehensible form API, and much more.

So, my priorities here would be: Improve the application, so that people can work with it without too much database knowledge, and then improve (or expose) the export core functionality."

Of course, it might be a while before OOo 2.x with the capability of creating database-driven solutions or applications appears.

[Minor update on 10/23/2005]

According to ZDNet's David Berlind, "... the Massachusetts Senate Post Audit Committee has apparently called for hearing to take place at 1pm on Monday October 31st ..." to discuss "Massachusetts' recent ratification of the OpenDocument Format (ODF) as one of two standard file formats (the other is Adobe's Portable Document Format) that all state agencies must start using on January 1, 2007." ("Is Massachusetts' OpenDocument decision on the rocks?" October 24, 2005).

[Minor updates on 10/25/2005]

1. Is there a connection between Google's forthcoming Google Base and OpenOffice.org Base? The short-lived exposure of Google Base screen captures indicates that you'll be able to upload small database and other document files (<10MB) directly and files >=10MB to Google Base via FTP.

Google's following Terms of Service, however, might cause potential users to have second-thoughts about using Google Base as a future replacement for Intuit's QuickBase service.

“By submitting, posting or displaying Content on or through Google services which are intended to be available to the general public, you grant Google a worldwide, non-exclusive, royalty-free license to reproduce, modify, adapt, publish and otherwise use, with or without attribution such Content on Google services solely for the purpose of displaying, distributing and promoting Google services.”

There's no indication from the screen captures I've seen that one can restrict file access to specific users or user groups. Stay tuned for posts that point to public sample OOo Base (*.odb), Jet (*.mdb), and SQL Server (*.mdf/*.ldf) examples for linking after Google Base goes live. Search Engine Roundtable reports that Google Base would "be introduced during the 'Google Zeitgeist '05 Partner Forum' to be held [10/25 to 10/27/2005] at Google HQ in California."

My take is that the phantom preview was intended to demonstrate Google's hype factor—not their shadowy weltanshauung—to Zeitgeist attendees. 2. Information Week: "Microsoft Lines Up Politician Support In Mass. Format Battle" by W. David Gardner; ZDNet: "InfoWeek: OpenDocument tangled in MA political showdown" by David Berlind; and BetaNews: "Mass. Secretary Attacks Open Doc Plan" by Nate Mook.

[A Later Update on 10/30/2005]

William Grosso comments on the speculation surrounding Google Base and proposes a new unit of measure of Web-application effort named the craig. The effort unit reminds me of the Technocracy movement's attempt to quantify physical and intellectual work by the erg or Joule: "10,000 lines of debugged AJAX code is worth how many ergs?"

PBS's Robert X. Cringely weighs in with a pre-Google Base (10/20/2005) article titled "Energy Crisis: It's Hard to Make a Buck When You are Storing Everyone's Stuff for Free." Cringely notes that "NOW we know why Google bought those 30 acres on the Columbia River in Oregon right next to a generating station from the Bonneville Power Administration. It's a source of cheap, uninterruptible power."

Cringely's latest (10/27/2005) piece, "Changing the Guard: Retiring Baby Boomers Are Going to Invigorate open Source," postulates that future retirees with IT talent will devote their golden years to writing code for Open Source projects.

Technorati:

Saturday, October 29, 2005

Oracle Releases Free 10g Express Edition Beta

Not to be outdone by Microsoft's October 28, 2005 posting of the release version of SQL Server 2005 Express Edition (requires subscription) on MSDN, Oracle the same day posted a beta version of Oracle 10g Release 2 Express Edition for Linux x86 on the Oracle Technical Network (OTN). The Oracle 10.2 XE download, which requires an OTN account, weighs in at 198.5 MB, while SQL Server 2005 Express is a more svelte 54 MB. Martin Lamonica's "Oracle to offer free database" CNet article, also of 10/28/2005, quotes Oracle's Andrew Mendelsohn, senior vice president of Oracle's server technologies division:

There is definitely a market there [for low-end databases] and a demand. And we want them to be using Oracle and not MySQL or SQL Server Express. It's definitely a reaction to the market interest. Even though the database is initially free, standards progress and those university students who are playing with the database today will eventually be working at corporations and making product decisions. We want to have mind-share with those people.

Oracle 10g and SQL Server 2005 Express editions share a 1-CPU with 1-GB RAM limit and a 4-GB maximum database size. IBM has offered a 90-day trial version of its IBM DB2 Universal Database Express Edition since 10/9/2004, and a free download of its DB2 v8.2.3 Personal Developers Edition for single-user desktop database applications since 9/23/2005 (requires registration). The DB2 Express Edition supports up to 2 CPUs and 4 GB RAM on Linux and Windows with these license prices. IBM's Express Edition appears to have been an attempt to counter Oracle Database Standard Edition One—introduced more than a year ago—that carries a license fee of $149 per user. In my opinion, DB2 Personal Developer Edition is likely to become a victim of the "free database" wars as a result of its single-user, desktop-database stigma. The future of Oracle Standard Edition One, which was aimed primarily at workgroups, also is open to question in the current environment. In the meantime, check out the SQL Server 2005 Express Edition specs. The last-minute demise of SQL Server Express Manager for Express database management in favor of a promised restriced version of SQL Server 2005 Management Studio might dissuade developers from adopting SQL Server 2005 Express Edition until Microsoft releases a replacement set of visual management tools. Update: The November 2005 Community Technical Preview of SQL Server Management Studio Express (SSMSE) is now available for download. --rj Technorati:

Saturday, October 22, 2005

OpenOffice Base 2.0 vs. Microsoft Access 2003 - Part II HSQLDB Tables

As I mentioned in Part I of this series, OOo Base is a desktop-database application that can perform CRUD (create, retrieve, update, and delete) operations on tables from almost any commercial or open-source RDBMS (relational database management system) that has ODBC, JDBC, or OLE DB drivers. OOo Base supports Jet (Microsoft Office Access 2003 and earlier), MySQL, PostgreSQL, Oracle, DB2, SQL Server, SQL Server Express, and MSDE databases, as well as many others.

Note: This and succeeding parts of the series were written based on the OOo 2.0 release version of 10/20/2005.

OOo/SO users and developers who have existing production databases will find that registering the database with—a.k.a, attaching database tables to—a new DatabaseName.odb file is considerably easier than adding and populating new, embedded HSQLDB tables. However, OOo 2.0/SO 8 will be many users' first introduction to PC databases. OOo considers HSQLDB to be "targeted at unexperienced users" or for the small-office/home-office (SOHO) market. Most of the targe market won't have existing databases, relational or otherwise. Thus, the capbility to easily create a useful database from scratch is an important feature of any desktop database application.

Why Use the Embedded HSQLDB Database?

The justifications proposed for moving from proprietary (vendor-specific binary or XML) file formats to a single standards-based schema—Relax NG instead of W3C-standard XML Schema 1.0, unfortunately—clearly apply to relational databases. Vendor lock-in by proprietary RDBMS file storage, metadata, and extensions to ANSI-standard SQL has an exclusionary effect that's identical to the use of Microsoft Word, Excel, and PowerPoint binary files. OOo 2.0/SO 8 with HSQLDB eliminate platform (i.e., Windows) lock-in; OOo 2.0/SO 8 versions are available for Windows, Linux (x86), Solaris (x86), and Solaris (SPARC). OOo 2.0 offers versions for Linux (PPC), FreeBSD, and Macintosh OS X also. (Some operations described in this post might not work with the OOo X11 release for the Mac).

The primary market-related difference between office software suites and RDBMSs is that no RDBMS (including Access) has a market share that's even close to that of Microsoft Office Standard edition. The Yankee Group says Microsoft Office has 90% of the market, according to Computer Reseller News. eWeek's Lisa Vaas reported in March 2005 that Oracle had 41.3% of the RDBMS market, followed by IBM with 30.6% and Microsoft SQL Server with 13.4%. Oracle's RDBMS market share is less than half Microsoft Office's share of the overall office suite market. However, it's generally accepted that Access—included with Microsoft Office Professional and higher—has a virtual lock on the Windows desktop database category. The cost differential between Office 2003 Standard and Professional purchased from Internet merchants ranges from US$21 to US$166 for the standard SKU and US$58 to US$90 for the upgrade SKU.

Filemaker Pro 8 Advanced is Access's only serious desktop-database competition on the PC, but costs substantially more than the Office Standard-to-Pro increment. (Plain Filemaker Pro 8 abandons many Access developer features to achieve simplicity.)

Currently, Microsoft Office Access 2003 is the standard of comparison for all desktop database applications, as noted in the lead paragraph of Lisa Vaas's October 21, 2005 eWeek article. Access's default Jet (*.mdb) binary format for storing all database objects—tables, query definitions, form and report designs, and VBA code modules—is proprietary and, of course, Windows-only. Many organizations upsize Access applications to take advantage of more robust client/server databases—such as Microsoft SQL Server 2000 or the no-charge MSDE 2000—by linking (attaching) tables to a Jet file or connecting directly to the database with Access data projects (*.adp).

Jet databases are suited for multi-user applications; HSQLDB isn't capable of concurrent multi-user access. Jet has ODBC and OLE DB drivers, but HSQLDB has no ODBC or JDBC drivers—the current JDBC implementation doesn't support *.odb files. Disregarding OOo Base's form, report, and programming limitations, HSQLDB's single-user restriction, questions about data security and user permissions, and lack of drivers for data stored in OpenDocument files are issues that are likely to dissuade most potential adopters from using the built-in HSQLSDB database. Almost all commercial OOo Base implementations will undoubtedly use a "commercial" open-source database—such as MySQL (with the Oracle-owned InnoBase engine) or PostgreSQL. If you must adopt the OASIS Open Document Format for Office Applications (OpenDocument) 1.0 specification for your single-user database projects, as well as for word-processing, spreadsheet, and presentation applications, the embedded HSQLDB RDBMS appears to be your only choice.

Only OOo 2.0/SO 8 with HSQLDB—to my knowledge—claim to store local table data in OpenDocument 1.0's all-encompassing Package format. The OpenDocument 1.0 specification's Section 6.5 defines Database Fields, section 8.6 specifies Database Ranges, section 11 decrees Form Content, and section 17 describes the ZIP-compressed Package format. Database Fields and Database Ranges refer to databases in Calc spreadsheets, not Base databases.

Files in a Base Package (*.odb File) for an HSQLDB Embedded Database

A DatabaseName.odb file for a Base project with an embedded HSQLDB file contains the following files:

  • manifest.xml provides a list of all files the in the package; the media type of the files, if applicable; and the information to decrypt the file, if it's encrypted.
  • context.xml contains data source settings and related database information, including forms and reports (if applicable).
  • settings.xml contains application configuration data.
  • mimetype is an uncompressed text file that contains application/vnd.sun.xml.base.
  • data contains all cached table data in compressed binary format.
  • properties contains is a compressed list of default property values for the database, such as hsqldb.default_table_type=cached.
  • script contains compressed, readable SQL DDL statements that define the database and its tables. The script file is executed when starting the database.
  • backup is a compressed snapshot of the data created when starting the database.
  • log is a redo log that consists of an SQL script for all database changes (DDL statements) executed since updating the backup file; the log file is deleted after re-creating the backup file.
Here's WinZip's window for an HSQLDB clone of the Northwind.mdb sample database:

You can use WinZip or any other Windows-based ZIP utility to view the XML files in Internet Explorer. You can view the other files in Notepad or the utility's ASCII viewer. The data, properties, script, backup, and log files are specific to HSQLDB and thus should be considered proprietary. However, section 17 of the OpenDocument 1.0 specification doesn't appear to impose any limitation on the content or structure of binary files.

Open Issue: If *.odb files that contain HSQLDB files with proprietary binary formats qualify as OpenDocument 1.0-compliant, embedding other proprietary, binary-formatted files should receive equal treatment. Replacing HSQLDB with an embedded, compressed Jet *.mdb file might provide many missing features. Adding compressed Excel, Word, or PowerPoint binaries to *.ods files might provide better performance and interesting new features.

HSQLDB Does NOT Store Data in XML Files

Several members of the trade press, including bloggers, have reported that HSQLDB stores table data as XML files. The latest instance is by Lisa Voss in eWeek's "OpenOffice Packs a Powerful New Database Punch" article (October 21, 2005, page 2): "The full version of the HSQL database engine included in OOo Base stores data in XML files." HSQLDB stores data in a proprietary binary format, as noted in the preceding section's description of the data file that's stored in a compressed (ZIP) file.

Here's a capture of the first few records of the Customers table from the 1.5-MB decompressed data file:

Stating that HSQLDB stores data in XML files seems to me to be a gratuitous attempt to represent HSQLDB storage as similar to the OpenDocument 1.0 XML format for other OOo documents. There is no way that anyone who had investigated HSQLDB's storage format could confuse the binary format with XML.

Generating HSQLDB Tables with the Wizard After you create and register a new HSQLDB database, click the Tables and Use Wizard to Create Table icons to start the Table Wizard. This Wizard—a knock-off of Access's Table Wizard—offers 22 business-oriented and 15 personal-style table designs from which to select.

Make sure to set the order of the Selected fields before clicking Next. It's almost impossible to change the field sequence in Table Edit mode.

The next step is to change data types and other field properties as required, and add or delete fields:

HSQLDB supports most SQL-92 data types, as shown in this list:

HSQLDB strings and [VAR]CHAR data types are Unicode by default.

The next Wizard dialog lets you add a primary key or select one of the existing fields as the primary key:

The final step offers the option of taking you directly to the data entry grid or editing the table design. Unfortunately, existing column properties are read-only, so your only option is to add new columns and delete the old ones. Another Access feature that's missing in the HSQLDB table design grid is the ability to alter the ordinal columns position by dragging column icons.

After you've added the tables you need for an applicaton, choose Tools, Relations to establish relationships between primary and foreign keys in the tables. The later "Establishing Relationships between Tables" section describes the Relations window.

Creating and Populating HSQLDB Tables with DDL Batches

You can define and add tables to an HSQLDB database with conventional ANSI SQL DDL statements. If you want to include lower-case letters or special characters in table and field names, you must use quoted identifiers. Open the HSQL Database Engine Documentation pages and click the SQL Syntax (Chapter 9) link to review HSQLDB's SQL-9x implementation. Choose Tools, SQL to open the Execute SQL Statement dialog to execute non-SELECT (a.k.a. "action") queries or commands. Here's an example of an SQL-92 batch to clone the Northwind Orders table:

CREATE CACHED TABLE "Orders"(
  "OrderID" INTEGER IDENTITY PRIMARY KEY,
  "CustomerID" CHAR(5) NULL,
  "EmployeeID" INTEGER NULL,
  "OrderDate" DATE NULL,
  "RequiredDate" DATE NULL,
  "ShippedDate" DATE NULL,
  "ShipVia" INTEGER NULL,
  "Freight" DECIMAL NULL,
  "ShipName" VARCHAR(40) NULL,
  "ShipAddress" VARCHAR(60) NULL,
  "ShipCity" VARCHAR(15) NULL,
  "ShipRegion" VARCHAR(15) NULL,
  "ShipPostalCode" VARCHAR(10) NULL,
  "ShipCountry" VARCHAR(15) NULL,
CONSTRAINT "PK_Orders" UNIQUE ("OrderID"))
You then write conventional INSERT statements to add rows to the table. HSQLDB expects DATE or TIMESTAMP literals in the 'YYYY-MM-DD [HH:MM:MM.######]' format. Obviously, the preceding approach won't appeal to "unexperienced" database users who, by definition, aren't likely to know how to write SQL DDL/DML commands. If you replace the optional (default) CACHED modifier with TEXT, you (theoretically) can specify a TableName.csv file in the same folder as the data source to populate the table when the database loads with the following instruction:
SET TABLE "Orders" SOURCE "Orders;ignore_first=true"

Chapter 6, "Text Tables" describes the syntax for the CREATE TABLE and SET TABLE instructions. The preceding syntax doesn't add rows to the table, despite the appearance of a "Command successfully executed" message. You must manually modify most source *.csv files, use an undocumented SET TABLE syntax, and abandon UTF-8 or Unicode for text fields. The following grid demonstrates how text tables replace accented characters with a symbol (diamond and embedded ?):

Creating and Populating HSQLDB Tables from CSV Files It's a common business task to replace multiple, large spreadsheets with desktop database applications that consist of relational tables, data-entry forms and reports. You can import CSV files into an OOo Calc spreadsheet or open a *.ods file. In either case, the first row of the spreadsheet must contain the column names. You create the table structure and populate it by dragging a Database range from the spreadsheet to an empty area of the Base Tables window. This process is much simpler than the SQL batch approach described in the preceding section.

The drag-and-drop technique is straightforward, if your CSV files don't contain dates. Importing to HSQLDB tables date from CSV files is a can of worms.

Here's the drill for tables with no DATE, TIME, or TIMESTAMP fields:

1. Create and register a new HSQLDB database, if you haven't already done so, and click the Tables icon.

2. Open With OpenOpenOffice.org the *.csv or *.ods file in Windows Explorer. If you open a *.csv file, click OK to accept the default CSV separator, text delimiter, and cell data types:

3. Select the upper-left cell of the Database range, usually A1:

4. Choose Data, Define Range to open the Define Database Range dialog. Type a name for the range, verify the upper-left and lower-right cell coordinates, and click OK.

5. With the entire Database range selected in the spreadsheet, drag the selection to the Base window's Tables pane to open the Copy Table Wizard's first dialog.

6. Change the Table Name from Table1 to the new table name, accept the remaining defaults, and click to open the Apply Columns dialog. (Do not add a new primary key column.)

7. Click the >> buttton to add all columns from the Existing Columns list, and click Next.

8. Set the columns' data types and other properties in the Column Information controls. Note that accepting the default IDENTITY modifier as the Auto-Increment Statement doesn't define ProductID as an AutoField column.

9. Click Create to add the table and data to the current database. If the table doesn't appear in the Tables list, choose View, Refresh tables.

Setting the Primary Key and Specifying an Auto-Increment (Auto-Value) Column

I found that setting a primary key and specifying the column as an identity (Auto-Increment or AutoField) column requires executing two DDL statements, such as these for the Products table, in the Execute SQL Statement dialog:

ALTER TABLE "Products" ADD PRIMARY KEY("ProductID"); ALTER TABLE "Products" ALTER COLUMN "ProductID" INTEGER IDENTITY PRIMARY KEY;

To verify the primary key, choose View, Refresh Tables, right-click the table name in the Tables list, choose Edit, and verify that the row for the primary key column contains a miniature key icon. To verify that the column is an IDENTITY column, open the table in datasheet view, click the new row button and look for the <AutoValue> entry in the otherwise-empty row.

Importing CSV Dates, Times, or Both

OOo 2.0 appears to have problems importing CSV date, time, or date/time values to HSQLDB DATE, TIME, or TIMESTAMP columns. Excel and OOo/SO Spreadsheets store dates as integers (1 = 1/1/1900), time as a decimal fraction (of 24 hours), and date/time data as a decimal number. As an example, 1.5 represents 1/1/1900 12:00:00 PM. Regardless of the date format you apply in Excel or OOo/SO, the value stored in the CSV file is the unformatted numeric value that represents the date and/or time.

HSQLDB expects DATE values to be supplied in 'yyyy-mm-dd' (java.sql.Date) format and TIME values as 'hh:mm:ss' (java.sql.Time) strings. TIMESTAMP fields require 'yyyy-mm-dd hh:mm:ss' format or 'yyyy-mm-dd hh:mm:ss.fffffffff' for microsecond resolution(java.sql.Timestamp). Leading zeros for hour (hh) and nine fractional digits (f) for seconds are required.

Generating Date-Compatible CSV files with Access

Access's Export Text Wizard's Export Specification dialog lets you specify the Date Order and Date Delimiter for CSV or fixed-width text file:

The preceding settings (YMD as the Date Order and - as the Date Delimiter) result in the following partial CSV file:

As far as I've been able to determine, neither Excel nor OOo/SO Calc will export a CSV in the preceding format. However, even this format won't import correctly into an OOo spreadsheet. You must add text delimiters ("s) to the date[time] fields. The yyyy-mm-dd [h]h:mm:ss format makes search and replace to add the "s to the fields.

To ready the CSV file for import to an HSQLDB table, open the CSV file in Notepad, replace ,19 with ,"19 and :00, with :00", to change the CSV field to text. If you don't need time values, replace [space]0:00:00 with nothing. If you need a date/time (TIMESTAMP) value, replace [space]0: with [space]00: and :00" with :00.000000000", as shown here:

Follow the "Creating and Populating HSQLDB Tables from CSV Files" sections' steps 2 through 9, except as follows:

1. In step 2, be sure to change the Fields Column Type from Standard to Text for each of the date[time] fields you've specified.

2. In step 6, substitute TableName1 for Table1 as the table name. You must copy this table to a modified final version.

3. In step 9, accept the default VARCHAR type for date fields.

4. Proceed with the "Make the Final HSQLDB Table Copy" section.

Generating Date-Compatible Fields with Excel or OOo Spreadsheet

An alternative to Access's Text Export Wizard is to create the CSV file with any spreadsheet or database application that offers export-to-CSV capability, and then open the CSV file in OOo Calc, add a text field for each date or date/time column, and then use the TEXT(CellValue, "date format") expression to populate the text column's cells. This method is limited to processing a maximum of 64K rows per table.

[Details on this method and following topics to come later.]

Make the Final HSQLDB Table Copy

Establishing Relationships between Tables

Adding Indexes to Tables

Exporting Table Data

If you use the embedded HSQLDB database and want to export its data to another RDBMS, you must copy each table to an individual Calc worksheet and save the worksheet as a CSV file. Then import the CSV file to the new database. Whether you need to jump through the same hoops required to import CSV files with date or date/time fields to HSQLDB depends on the RDBMS you select.

Technorati:

Saturday, October 15, 2005

OpenOffice Base 2.0 vs. Microsoft Access 2003 - Part I

The Google-Sun Microsystems non-event I discussed earlier this month had one significant side-effect: I decided to download and test-drive the current release candidate (RC2) of OpenOffice[.org] (OOo) 2.0.

Although I spend a substantial part of my working days in Word 2003, I don't consider myself a Word expert. I use Excel less often and PowerPoint only occasionally. I assume that OOo Writer, Calc, and Impress 2.0 have features that satisfy the majority—or at least a significant percentage—of Word, Excel and PowerPoint users, and can open and save documents in current Microsoft Office native formats: *.doc, *.xls, and *.ppt. If not, OOo 2.0—and StarOffice (SO) 8—would be doomed to the purgatory of Office-also-rans, and the Commonwealth of Massachusetts probably wouldn't have mandated the OASIS Open Document 1.0 format that only OOo 2.0 and SO 8 support (so far).

I'm about to start on the 11th edition of my 1,500-pp Special Edition Using Microsoft [Office] Access tome for QUE Publishing, which now has close to one million English copies in print. (The previous 10 editions are: Access 1.0, 1.1, 2.0, 95, 97 (3 editions), 2000, 2002, and 2003.) Obviously, I'm interested to learn what potential effect OOo 2.0/SO 8's upgraded database (Base) feature might have on the sales of Office 12 SKUs that include Access 12 and, thus, the future success of Access books. This post is the first of what might end up as four parts:

  • Part 1 - Database Access, Queries and Simple Reports (this post)
  • Part 2 - Table Design, Datasheet Data Entry, and Data Import/Export
  • Part 3 - Multi-Table Data-Entry Forms and Subforms
  • Part 4 - Multi-Table Reports and Subreports (if possible)

Several reviewers have called OOo Base an "Access clone." Some OOo Base windows and most of the wizards I tried resemble those in Access 2000 or later. However, the term "Access clone" is a stretch—to be charitable. When the Office 12 Beta 1 version becomes available, I'll change from Access 2003 to Access 12 and update previous comparisons where necessary.

Database Access

OOo 2.0's default (embedded) database is HSQLDB 1.8.0.1, an open-source (BSD license) Java database engine based on Thomas Mueller's original Hypersonic SQL Project. Using OOo's Base feature requires installing the Java 2.0 Runtime Engine (which is not open-source) or the presence of an alternative open-source engine, such as Kaffe. The HSQLDB 1.8.0 documentation on SourceForge states the following regarding SQL conformance:

HSQLDB 1.8.0 supports the dialect of SQL defined by SQL standards 92, 99 and 2003. This means where a feature of the standard is supported, e.g. left outer join, the syntax is that specified by the standard text. Many features of SQL92 and 99 up to Advanced Level are supported and here is support for most of SQL 2003 Foundation and several optional features of this standard. However, certain features of the Standards are not supported so no claim is made for full support of any level of the standards.

OOo supports HSQLDB as an embedded database, which means that table, view, query, form, and report definitions, as well as table data, are stored as compressed (ZIPped) content.xml files in the DatabaseName.odb file. Thus OOo *.odb files are similar in concept to Jet *.mdb files and are related to Office 12's new compressed, multiple XML file structure. OOo RC2 doesn't include an HSQLDB sample database with multiple, related tables—such as Access's Northwind.mdb or SQL Server 2005's AdventureWorks—so it's not easy to test multi-table query performance with embedded HSQLDB. Creating HSQLDB tables from existing flat files (text data) isn't a piece of cake, as you'll see in Part II. However OOo Base running under Windows can attach any database for which a current ODBC, OLE DB, or JDBC driver is available. The Database Wizard lets you choose from these data sources:

Selecting ADO, and then clicking Next and Browse opens the standard OLE DB Data Link Properties dialog from which you can select an OLE DB provider and generate a SQL Native Client connection string for SQL Server 2005, such as this:

Provider=SQLNCLI.1;Integrated Security=SSPI;Persist Security Info=False;Initial Catalog=AdventureWorks;Data Source=localhost

However, you receive a "The data content could not be loaded. Invalid Object Name 'TableName'." error message when attempting to open SQL Server 2005 AdventureWorks tables or change from the SQL view to design view of a query. Northwind tables added to an SQL Server 2005 instance don't throw an error, but display multiple copies of the first row. You must specify the SQL Native Client ODBC provider to handle two-part (Schema.Table) table names, which appear like this in the Tables window:

Query Design and Processing

To test OOo Base's query design and processing capabilities, I copied the query that I used to create the report in my recent "Microsoft Staff Moonlighting as Adventure Works Employees" post to the Query window's SQL pane. The only change required was to replace [Group] with "Group", as shown here:

Clicking the Switch Design View (On/Off) button open a three-pane query design window that's almost identical to Access's da Vinci toolset for Access data projects:

Registering (by saving) the *.odb file incorporates a copy of the OLE DB connection string or ODBC data-source name (DSN), query SQL statements, and other OOo objects, such as form and report definitions. Thus *.odb files for databases with attached or linked tables are similar to *.adp Access data project or Jet *.mdb files with linked tables. The query designer adds double-quotes to all table names and aliases, whether or not they're needed.

Creating a Report from a Query

I anticipated that OOo reports would be similar to reports created with Visual Studio 2005's and SQL Server 2005 Reporting Service's HTML table-based Report Designer, rather than Access's banded style. I was surprised to find that the only option in the Reports window was Use Wizard to Create Report. (Unlike the Forms window, there's no Create Report in Design View option.)

You can add grouping levels to a report, but I found grouping on the Group field resulted in a 150+ page report. Apparently, the data source must be ordered first by the grouped field; however, the Wizard's sorting option is disabled when you specify a query as the data source for the report. Sorting on the grouped field is automatic when you use a table as the report's data source, but the resulting report has a first page that only contains a header and footer only with no group data. A page break occurs after each group.

Following is an example of an edited wizard report generated from the ungrouped Adventure Works Employee Roster:

The Report Wizard lets you apply minor variations to a fixed style. However, I wasn't able to top-align the EmployeeID value with the first line of the other fields. You can make design changes to the report by a selection in the last Wizard dialog or by closing the report, right-clicking it's name, and choosing Edit.

Reports as Writer Text Documents

An alternative to OOo Base reports is to generate reports as tables, fields or text in Writer documents. This was the only method to create a report with earlier OOo versions. Open a new Writer window, press F4 to display a list of available data sources, select the data source, and drag the table or query to the document pane, which opens this Insert Database Columns dialog:

Select a text type option—Table, Fields, or Text—add the table or query columns to display, and click OK to copy the table at the document's insertion point:

The process is similar to copying and pasting an Access table or query datasheet to a Word document. The new OOo mail merge feature uses a similar approach with address and related merge data as fields.

Conclusion - Queries and Simple Reports

OOo Base's SQL entry pane and query designer appear to me to be adequate for most projects, assuming that the user has prior experience with writing SQL SELECT queries, using graphical query designers, or both. Microsoft Office's abysmal help system is better—by at least an order of magnitude—than OOo's. OOo Base error messages are cryptic, and clicking the More button offers no more useful assistance with the problem.

I'm mystified by the Report Wizard's approach to designing basic tabular reports, surprised by the lack of sorting options with query data sources, and underwhelmed by the lack of grouping capability with unsorted query result sets. I see no obvious way to add subreports, and OOo Base appears to be missing a Wizard option for creating crosstab (matrix) reports. My conclusion is that OOo Base would be adequate for mail merge and printing simple lists but not reports that have any degree of complexity. OOo Base's Report Wizard isn't close to being a clone of its Acces 2000+ counterpart. On the other hand, I might not be using the report designer window correctly. Suggestions and assistance from experienced OOo users are welcome as comments.

Bear in mind that Microsoft Office Access 2003 as a standalone SKU sells for US$174 to $229 and the upgrade SKU goes for $85 to $114. Even if you buy into OOo Writer, Calc, and Impress, you'll undoubtedly find that using Access rather than OOo Base to create database front ends will save your investment in an Access 2003 license many times over.

Alternatively, spend $99 for VB Express and use the ReportViewer control to quickly create simple tabular and crosstab reports and graphs. VB Express also simplifies creating data entry forms—compared with VS 2002/3—but Access is even simpler. The obvious advantage of the VB Express approach is that distributing the project to multiple users doesn't require the Office Developer Edition's runtime Access license.

Technorati:

Thursday, October 13, 2005

Microsoft Staff Moonlighting as Adventure Works Employees

A bit of trivia (for a change) ... I needed to create a sample data source for a simple smart-client (a.k.a., Windows form) demonstration of the new ReportViewer control that contained more records than Northwind's Employees table. So I executed the following T-SQL query against the AdventureWorks sample database, which returns rows for 295 employees (one employee row is repeated for an additional department):

SELECT Empl.EmployeeID, Cont.LastName,
  Cont.FirstName,
  LEFT(Cont.MiddleName, 1) AS MiddleInitial,
  Dept.GroupName AS [Group],
  Dept.Name AS Department,
  Empl.Title, Cont.Phone,
  Cont.EmailAddress
FROM HumanResources.Employee AS Empl
  INNER JOIN Person.Contact AS Cont
    ON Empl.ContactID = Cont.ContactID
  INNER JOIN HumanResources._
      EmployeeDepartmentHistory AS HREDH
    ON Empl.EmployeeID = HREDH.EmployeeID
  INNER JOIN HumanResources.Department AS Dept
    ON HREDH.DepartmentID = Dept.DepartmentID
ORDER BY Cont.LastName, Cont.FirstName,
  MiddleInitial
Here's the first few rows of a simple report, which uses VBScript expressions to modify and format a simple list report that's grouped by AdventureWorks GroupName categories:

The following simple expressions in the ReportName.rdlc Report Designer file's details row text boxes change the Phone and Email Address field formats:

=IIf(Left(Fields!Phone.Value, 2) = "1 ",
  Fields!Phone.Value,
  "(" & Left(Fields!Phone.Value, 3) &
  ") " & Mid(Fields!Phone.Value, 5))
=Replace(Fields!EmailAddress.Value,
  "@adventure-works", "@a-w")
As I worked on the report layout and experimented with the expressions, I began to recognize some of the names as members of Microsoft's SQL Server group. Here's the end of the Sales and Marketing Group roster:

First to catch my eye was Brian Welcker, Microsoft’s Group Program Manager for SQL Server Reporting Services, moonlighting as Adventure Works' Vice-President of Sales. Syed Abbas is a consultant with Microsoft Consulting Services in New York and A-W's Pacific Sales Manager. Ovidiu Craciun is a member of the SQL Server Management Tools team and, appropriately, A-W's Senior Tool Designer. Stephen (Yuan) Jiang from the SQL Server Storage Engine group is A-W's North American Sales Manager. Tete Mensa-Annan is both an SQL Server Program Manager and A-W Sales Representative. Tsvi Reiter, another A-W Sales Representative, relocated from Redmond to work for Microsoft Business Solutions in Vedbaek, Denmark. Jae Pak, an A-W Sales Representative, is a member of the Performance Monitoring and Analysis Group of the Programmer Productivity Research Center at Microsoft Research. Ramesh Mayyappan is an A-W Information Specialist and the author of Microsoft's "Step-by-Step Guide to Log Shipping for SAP Customers using Microsoft SQL Server 2000." Christian Kleinerman is an A-W Maintenance Supervisor and Program Manager in the SQL Engine team who works on the data access provider (System.Data.SqlServer). Ashvini Sharma is an A-W Network Administrator and Development Lead for SQL Server Integration Services (formerly DTS).

With the exception of Tsvi Reiter and Jae Pak, all the less common names I tested were associated with the SQL Server 2005 team. --rj

Technorati:

P.S. Stay tuned for links to a future Visual Studio Magazine article on programming VS 2005's new ReportViewer control and the downloadable source code for sample VB 2005 project.

Wednesday, October 12, 2005

Watch VSLive! Orlando Keynote and Session Videos

If you've signed up for Fawcette Technical Publications .NETInsight newsletter, you've received a VSLive! Show Daily issue from the Orlando conference on October 11, 12, and 13, 2005. Each issue includes a video of the preceding day's keynote address and videos or slides of selected breakout sessions.

If you aren't a subscriber, go here, scroll to the Newsletters topic in the right column, type your e-mail address, and click Go to get on the .NETInsight and, optionally, other FTP newsletter mailing lists.

If you missed the Show Daily issues, they're available from the following three links (free registration required, subscriptions are optional):

  • Day One—Monday, October 10, 2005: Keynote: "Visual Studio 2005: More Power with Less Code Now" by BJ Holtgrewe, lead product manager for Visual Studio Session: "Building Mission-Critical Software" by Eric Lee, Product Manager, Developer Tools Division, Microsoft Corp. Slides: ".NET Windows Forms Tips and Tricks" by Ken Getz, Senior Consultant, MCW Technologies. Session: "Security Changes in .NET 2.0" by Robert Hurlbut, owner of Hurlbut Consulting Inc. and Microsoft MVP for Security Development. Session: "Advanced VSTO Programming and Deployment" by BJ Holtgrewe, who's also responsible for product planning and marketing of Visual Studio Tools for the Microsoft Office System.

  • Day Two—Tuesday, October 11, 2005: Keynote: "Next-Generation Database Application Development" by Bill Baker, Microsoft's general manager for SQL Server Business Intelligence Session: "AJAX-Style Development With ASP.NET 2.0" by Vishwas Lele, Principal Architect, Applied Information Sciences, and Microsoft Regional Director for the Washington D.C. area. Slides: "Leverage New Features in C#, Framework 2.0" by Richard Hale Shaw, Founder, Richard Hale Shaw Group Session: "What's New in ADO.NET 2.0" by Jennifer Perret, Program Manager for Data Access, SQL Server Product Group, Microsoft Corp. Slides: "Diagnostics and Health Monitoring in ASP.NET 2.0" by Fritz Onion, Founding Partner, Pluralsight Session: "Connect to SQL Server Instances" by Bill Vaughn, president, Beta V Corp.

  • Day Three—Wednesday, October 12, 2005: Keynote 1: "Introducing the Windows Communication Foundation" by Richard Turner, Web Services Strategy, Microsoft Corp. Session: "Add Pizzazz to Your Web Pages With GDI+" by Walt Ritscher, Scandiasoft Slides: "Programming Windows Communication Foundation" by Payam Shodjai, Product Manager, Web Services Strategy Group, Microsoft Corp. Session: "Write High-Performance Stored Procedures" by Bill Vaughn, president, Beta V Corp. Session: "Smart Data Clients 2.0" by Walt Rischer, Scandiasoft.

Videos range in length from about 00:45 to 01:15 or so. Enjoy the sessions.

--rj

Technorati:

Tuesday, October 11, 2005

New SQL Server 2005 XQuery Papers Available

Shankar Pal and 10 other Microsoft folks (including Michael Rys) authored the "XQuery Implementation in a Relational Database" technical paper, which was presented at the 31st Very Large Data Bases Conference in Trondheim, Norway on August 30, 2005. If you've wondered what the XmlOp_Select, XmlOp_Path, XmlOp_Apply, XmlOp_Compare, XmlOp_Constant, XmlOp_Construct, and XmlOp_Function operators do, this paper will satisfy your curiousity. The paper also covers XML operator mapping, plus related and future XQuery work. Bob Beauchemin, who abandoned Developmentor for SQL Server maven Kimberly Tripp's SQLskills.com has written an "XML Indexes in SQL Server 2005" whitepaper for MSDN. Bob's explanations of PRIMARY, PATH, VALUE, and PROPERTY indexes for the xml data type supplement those in my "Exploit Yukon's XML Data Type" article from Visual Studio Magazine. Bob's article is a substantially easier read than Shankar Pal, et al.'s "Indexing XML Data Stored in a Relational Database" paper from last year's Very Large Data Bases symposium. You can download the VB 2005 code for my XQuery.sln WinForms project, which adds two xml columns—one simple and one complex—to the Northwind Orders table and lets you add an XML schema for the documents in each column. Then you can measure the time to create primary or multiple XML indexes on the columns, as shown here: You also can test the effect of indexes on the execution of XQuery expressions by comparing execution times for various index combinations, as illustrated here:

The app contains a high-resolution timer class but accurate comparative tests of index performance require more than the initial 830 rows of the Northwind Orders table.

Chapter 12, "Exploring the xml Data Type" of my Expert One-on-One Visual Basic 2005 Database Programming book has a more sophisticated version of the XQuery project and a utility program for creating an Orders table clone that has a number of basic and namespace-qualified document instances limited only by your available diskspace, patience, or both. --rj

Technorati:

P.S. Thanks to Karen Watterson for the link to Bob Beauchemin's article.

Monday, October 10, 2005

More on Visual Basic 9.0 and LINQ versus SQL

eWeek's Darryl K. Taft takes on the Visual Basic .NET naysayers in his October 10, 2005 "Will VB 9 Win Over the VB 6 Faithful?" piece. (The original headline was "Visual Basic Gets Complicated.") After reviewing VB.COM developers' unsuccessful petition drive to reincarnate VB6 from it's legacy-platform purgatory, Taft makes the following unattributed observation:

Moreover, with the Language Integrated Query framework, which lets users program data and access databases for data without having to know or use SQL, Microsoft is delivering even more to developers. [Emphasis added.]
The preceding statement might be true for C# programmers for whom Anders Hejlsberg has created a set of mysterious LINQ query keywords that appear intended to mask their SQL counterparts. However, the VB query-language implementation is drawn directly from SQL, as illustrated in my September 26, 2005 "Comparing C# 3.0 and VB 9.0 LINQ Syntax" post. There is no way that VB programmers—or C# developers, for that matter—can write effective database queries without SQL competency, whether they adopt LINQ and DLinq or not. --rj Update 10/11/2005: Paul Vick takes Darryl to task for the reference to VB 9, which Paul calls "almost-entirely-vaporware" in his "Forest vs. trees" entry of the same date. There's no question that VB 9.0 is more vaporous than C# 3.0; for example, as mentioned earlier the current VB 9.0 implementation doesn't support DLinq.

Technorati:

Thursday, October 06, 2005

Sun and Google PR Hype: Much Ado About Nothing

Other than a few online journalists-cum-analysts who posted breathless previews of the highly over-hyped Sun Microsystems/Google press conference of 10/4/2005, most experienced tech reporters and bloggers shared ZDNet's Jennifer Guevin's opinion: "Google and Sun deal: That's It?" In other words, "Where's the Meat?" Here's a link to a five-minute video of the non-event, which features an uncomfortable Scott McNealy and a wary Eric Schmidt.

Note: One of the linked posts reports that "Google doesn't even think it's big enough to warrant putting out their own press release." Heres the link to the Google press release.

Full Disclosure: I make my living by consulting on, programming with, and writing about Microsoft operating systems, relational database management systems (including Microsoft Office Access 2003 and earlier), programming platforms (.NET), and languages (Transact-SQL, VB.NET, VB.COM, and VBA).

Network, Cable and Satellite TV Media

What suprised me was the amount of network, cable, and satellite TV preview coverage of this technological non-event. Media coverage of the lack of the press conference's substance or significance in follow-up pieces was conspicious by its absence in most of the following stories:

BBC News gushed: "Google and Sun want Office users," despite no indication that I can find so far that Google has a particular hankering for Microsoft Office users; Google wants all Internet and desktop users. Here's the BBC story's lead:

Google and Sun Microsystems have joined forces to challenge the dominance of Microsoft's Office software.

Google aims to "explore opportunities to promote" Sun's OpenOffice software. Those downloading Sun's Java program will be offered Google's toolbar.

A BBC "Google-Sun alliance hints at future deals" analysis by Jorn Madslien begins wth this overblown conclusion from unnamed sources (a.k.a. "some quarters"):
The fresh alliance between Google and Sun Microsystems is seen in some quarters as perhaps the toughest threat yet to Microsoft's dominance as the world's leader in the personal computer software market. Under the agreement, Sun is being paid an undisclosed sum to add Google's browser toolbar to its Java software. Sun will also buy Google advertising space.

Is Google chief Schmidt just being nice to his former boss? Under the agreement, Sun is being paid an undisclosed sum to add Google's browser toolbar to its Java software. Sun will also buy Google advertising space. Google, in return, has vowed to "explore opportunities to promote and enhance" the Sun-backed open-source software OpenOffice and Sun's Java software, as well as buying some Sun hardware and software.

Google revenue comes from selling advertising space. Sun gains no income from the Java runtime or OpenOffice. Google, who's been on a PR roll since it's original IPO, wins again: Sun might try to convert the heathen, but Google will sell the bibles. One sure-fire revenue source is selling Google Toolbars with customized options to download software—free or otherwise. Sun's only revenue potential is upgrades from OpenOffice to SunOffice subscriptions. Considering Microsoft's problems enticing Office users to upgrade, Jonathan Schwartz, Sun's CEO, has his work cut out for him. Madslien takes a breath with this observation, presumably from "other quarters:"

Tuesday's announcement fell far short of offering a real alternative to Office, which accounts for more than 40% of Microsoft's operating profit - mainly because Google failed to go the whole way and offer OpenOffice as an internet service or as a download for the 80 million people who use Google each month.

... So far, this possibility remains purely hypothetical. Critics observe that Google's chief executive Eric Schmidt was notably keen to stress that Tuesday's agreement did not involve such cooperation with Sun.

Imagine your online-hosted "G[oogle]Office" documents adorned with sponsored links to advertisers of products related to the content of your Open Document Format XML file.

MSNBC offers a link to an AP release with a typically misleading title: "Google steps up offensive against Microsoft: Web-search giant teams with Sun Micro to offer alternative to Office suite." MSNBC's other link to the Seattle Post Intelligencer article—"New alliance sees opportunities vs. Microsoft"—includes this lead:

Google and Sun Microsystems caused a stir Tuesday with the mere hint that they could be positioning themselves to jointly challenge the dominance of Microsoft Corp.'s Office programs. The companies announced that they would "explore opportunities" to promote and distribute each other's technologies -- including Google's search toolbar and OpenOffice.org, an open-source Microsoft Office rival based on Sun technology.
I haven't been able to find any evidence that Google intends to distribute anything from OpenOffice.org.

KNTV, the San Francisco Bay Area NBC affiliate, ran a preview story on the October 3, 2005 morning news and provides a link to an earlier AP article that was updated for share prices. On October 4, a morning news reporter read a sound-byte that mentioned the lack of substance of the press conference.

CBS News offers an abbreviated version of a later AP piece as "Google, Sun Challenge Microsoft's Office". Fox News has a similar version.

KPIX, the San Francisco Bay Areal CBS Affliate, drank the AP KoolAid and classified the wire service's longer "Google Teaming Up With Sun To Take On Microsoft" story as "Top News." The KPIX local news team ran a 3 to 4 minute segment at about 6:40 pm on October 5, 2005 that lead with: "In a few years you won't use Microsoft Office. The reporter went on to say that Google's no longer a just a search engine, but is moving in on Microsoft and posing a threat with the potential to distribute software. In an interview, Tim Miller—vice-president of the 451 Group—opined that Google is the "favorite of young users", "Microsoft takes years to to create a new operating system, and Google does it in months." Miller concluded that "Users will drop Windows and run Google applications from the Internet." The reporter signed off with the conclusion that the Google/Sun agreement will cause "lots of new job to move into the Bay Area."

ABC News posted a brief "The World Today" transcript: "Google-Sun Coalition challenges Microsoft" and a Reuters wirestory: "Sun, Google in software distribution pact." If you call including the Google Toolbar with Java runtime downloads "software distibution," the Reuters head gains a bit of credibility Bloomberg News added its contribution to the "distribution of Sun's OpenOffice.org" rumor in "Google, Sun to Challenge Microsoft With OpenOffice (Update5)." In the second paragraph the unnamed reporter says:

Google will distribute Sun's OpenOffice.org software for personal computers, the companies said at a press conference in Google's home town of Mountain View, California today. Some downloaded Sun programs also will include Google's toolbar.

This sentence appears later in the article: "He [Google CEO Schmidt] wouldn't say if Google's toolbar would link to OpenOffice.org."

CNN Money's Amanda Cantrell wrote an original story: "Google and Sun joining forces: Firms to forge partnership that will offer alternative to Microsoft products." Her well-researched story includes direct quotations from technology and financial analysts, but still carries the "replacement for Microsoft Office theme."

PBS's Robert X. Cringely's "Meeting Over a Cup of Java" column summed up the conference on October 6, 2005: "Sun and Google stood together in front of the press this week and said, well, not much at all beyond Sun bundling Google's toolbar with Java." PBS's Cringely—not InfoWorld's Cringely—went on to analyze the relative positions of Google, Sun, and Microsoft in a Web services-centric future.

Conventional "Old Media" - Trade and Financial Press, Newspapers

Back to traditional web, print, and blog "journalism."

ZDNet's Special Report page has links to mostly wild-eyed prognostication of the end of Microsoft Office's hegemony in the enterprise and Google/Sun "cutting off Microsoft's air supply." George Ou's "Who would actually use web based Office?" and Phil Wainewright's "What Google is really up to" blog posts appear to me to be the most interesting and well-thought-out of this bunch.

EWeek's David Coursey casts a similar jaundiced eye on the press conference in his "The World Wants a Challenging Google-Sun Alliance" op-ed piece. Follow the links in Coursey's piece for more analysis of the Google-Sun agreement. Linux & Open Source columnist Steven J. Vaughan-Nichols' "Google-Sun Partnership Falls Short of Hopes" piece carries this deck: "Opinion: We were hoping for the next great moment in office software. We got 'Google Toolbar as an option in its consumer downloads of the Java Runtime Environment.'"

InfoWorld's Elizabeth Montalbano (IDG New Service) wrote a balanced piece—Google, Sun tout software deal, hint at services—that emphasizes the lack of press conference details on plans for distribution of OpenOffice or any other activities aside from incorporating the Google Toolbar in JRE downloads. On the other hand, Ephraim Schwartz, who writes InfoWorld's "Reality Check" column loses touch with reality in his "Is a Web-based office suite on the way? piece, subtitled "A one-two punch of open source and on-demand from Google and Sun could knock out Microsoft Office." Schwartz quotes Burlington Coat Factory CIO Michael Prince: "Burlington Coat Factory has saved anywhere from 300 percent to 500 percent by using StarOffice instead of Microsoft Office, Prince says, and so far it has worked flawlessly, including reading macros from Excel, PowerPoint, and Word documents." 300 percent to 500 percent of what appears nowhere in the article. The only dollar amount in the piece is Burlington's $3 billion in (presumably annual) revenue. Does anyone copy-edit InfoWorld columnists' contributions? Is Schwartz performing a surreptitious reality check on his readers?

San Jose Mercury News writer Theresa Poletti quotes Forrester Research analyst John Rymer in her "Google, Sun form alliance" article: "The idea that Google will turn around and redistribute OpenOffice is nuts. When they do something, it has to be cool. It has to go further than Microsoft Office. It would have to address the way we work." Mike Langberg rehashes Scott McNealy's anti-Microsoft weltanschauung in "Hinting at a future without Microsoft." Dawn C. Chmielewski adds the seemingly obligatory OpenOffice spin in "Free OpenOffice isn't obscure anymore."

Canada's Globe and Mail reporter Matthew Ingram compares Google's threat to Microsoft with that from Netscape and Java in the late 1990s in "Taking on Microsoft" (Requires paid registration. However, you can read the entire story in the comments to PC World's "Google-Sun Webcast: No Google Office Today" blog item, or read Google's cached version.) Here's the last paragraph of Ingram's October 4, 2005 story:

Although the details of the Sun-Google partnership were less exciting than some of the speculation in advance of the announcement, the reality is that if anyone can make the Internet-based software application business work -- and thereby become the threat that Microsoft has worried about for 10 years -- it is Google. Where Netscape was a small startup trying to survive financially, Google has a market value approaching $100-billion (U.S.) and yet gives its main product away for free. And in contrast to the 1990s, broadband access to the Internet is now widely available, and users are more comfortable with Web-based applications. How does Microsoft compete?

I haven't found any evidence that "users are more comfortable with Web-based" word-processing, spreadsheet, presentation, or database applications. Neither, apparently, has open-source advocate Vaughan-Nichols. InfoWorld's "Microsoft adapting to Web platform, execs say" article quotes Microsoft Chief Technical Officer Ray Ozzie at the Web 2.0 Conference:

Moreover, it's wrong to think that, for example, Office in its entirety will be migrated to the Web, Ozzie said. Some applications lend themselves to a Web treatment, such as e-mail, while others don't, he said.

"What customers are trying to get is a really good user experience for what they're trying to accomplish. For some things -- mail is a great example -- the Web is actually a good mechanism for accomplishing what we might not have in the beginning envisioned might be possible. So some of those things are going to find their way out," he said.

But other applications that are very rich in functionality may not be good candidates for that. "I'm not a big believer that things are going to go all the way one way or all the way the other way," Ozzie said. "Office will change because of the presence of the Internet and its capabilities, but it will be [gradual]."

With Office, Microsoft will steadily try to understand what functionality can be offered via a Web browser and what requires richer client-side software, Ozzie said.

Microsoft undoubtedly will continue to complete, presumably with a substantially upgraded Office 12 suite running under Windows XP or Vista.

Forbes' brief "Sun, Google Deal Announcement Seen As 'Overblown'" piece hits the nail on the head by quoting Standard & Poor's Equity Research: "We believe that this heavily promoted [Sun and Google] announcement is overblown, and does not represent a real definitive product threat to Microsoft's Office." According to Forbes, S&P reiterated a "strong buy" on Microsoft's stock.

New York Times' Dan Mitchell writes "No News Is Good Blogging" and quotes John Battelle as saying the press conference was "a lot of hand-waving about sharing and working together." As another hand-waving example, Java Industry News carried an interview with Sun's Director of Java Desktop Engineering, Thorsten Laux.

The Register ran "Google Sun (Office) not a threat, says MS" and The Scotsman's Bill Magee wrote "Microsoft queries Google-Sun link".

Dave Winer chimes in with a post—"Editorial: Google diversifies, opportunity for upstarts"—that likens Google's current diversification program to repeating Alta Vista's missteps that lead to Google's success. Dave also takes Google to task for Blogster's support of Atom rather than RSS.

Om Malik offers this succinct summary—"Cheap Publicity Ploy:"

The Google-Sun announcement in the end turned out to be nothing…. or as someone just said on the IM. “Let’s do something useless so we can say we did something so Sun can get some press because we’re dying slowly…” Harsh but true, given all the speculation, which was nothing more than chatter of the worst kind. All of us are looking for shapes in shadows, including me….. sigh! call me slow, but I am not sure how AP is coming up with the Open Office conclusion with this JRE announcement.
I'm not sure either, Om, but plenty of media news sites based their stories on the AP piece. I heard only one local Bay Area TV station point out the lack of substance in the non-event of the month. None of my computers have the Sun JRE installed. I haven't found the need to install the JRE, nor have I encountered problems with any sites I visit regularly.

Technorati:

Last updated: November 18, 2005