Monday, February 22, 2010

Who’s Ready to Develop LINQ to Esent?

Ayende Rahien’s Hidden Windows Gems: Extensible Storage Engine post of 12/23/2008 spreads the word that Microsoft has released a managed library for Esent.dll, a robust ISAM (indexed sequential access method) database engine that’s embedded in Windows 2000 and later and is the foundation for the storage engine of Microsoft Exchange.

Update 2/22/2010: Laurion Burchall released the ESENT Managed Interface v1.5 (Stable) to CodePlex on 9/12/2009 under the Microsoft Public License (Ms-PL). The release includes:

  • ManagedEsent. A .NET interop API for ESENT. Use this if you want to write an ESENT application.
  • PersistentDictionary. A persistent, generic, .NET dictionary built on top of ESENT. Use this if you want simple data persistence that is compatible with the existing Dictionary/SortedDictionary classes.
  • esedb. A dbm-like module for IronPython that uses ESENT databases. Use this if you are looking for dbm-like storage for IronPython on Windows.
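
For readers unfamiliar with the term, "dbm-like" storage means a persistent store that behaves like a dictionary of keys and values. The sketch below uses Python's standard `dbm.dumb` module as a stand-in. It does not use esedb or ESENT, and whether esedb mirrors this interface exactly is an assumption; the file name and sample keys are made up for illustration.

```python
import dbm.dumb
import os
import tempfile

# Stand-in for a dbm-like store: Python's portable dbm.dumb, NOT esedb.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "demo")  # hypothetical database name

# 'c' opens the database for reading and writing, creating it if needed.
with dbm.dumb.open(path, "c") as db:
    db[b"esent"] = b"Extensible Storage Engine"
    db[b"jet"] = b"Joint Engine Technology"

# Reopen read-only to show that the values survived closing the store.
with dbm.dumb.open(path, "r") as db:
    stored = {key: db[key] for key in db.keys()}

print(stored)
```

PersistentDictionary presumably offers the same idea to .NET code: ordinary dictionary syntax, with the data kept on disk.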

Here are Laurion’s Release Notes:

The 1.5 release of ManagedEsent includes:

  • New API support
    • JetCreateInstance2
    • JetInit2
    • JetIdle
    • JetSetColumns
    • JetCreateTemporaryTable
    • JetGetThreadStats
  • Grbits for new Windows 7 features
  • Bugfixes
  • Performance enhancements.
  • Breaking API change
    • GetColumnDictionary now returns an IDictionary, not a Dictionary.

Laurion writes in his 6 New ESENT features in Windows 7 post of 8/18/2009 to the ESE/ESENT Database Stuff blog:

A quick look at some of the new features that are available in the Windows 7 version of ESENT. You'll need esent.h from the Windows 7 SDK to see these definitions:

  1. Column Compression
  2. 32kb and 16kb page support
  3. JetPrereadKeys
  4. Space Hints
  5. JET_paramWaypointLatency and JET_bitReplayIgnoreLostLogs
  6. JET_bitTermDirty

Laurion includes detailed descriptions of each new feature.

An anonymous commenter on the original OakLeaf post pointed out on 2/22/2010 the beginnings of a LINQ to Esent implementation on GitHub:

The early, rough beginnings, from a weekend of hacking...
http://github.com/capo/Linq-2-Esent

Apparently, the commenter is Capo because he added readme content a few minutes after I requested it in a reply. He sounds serious about developing a LINQ to Esent provider.

Matt Honeycutt’s Alternatives to Relational DBs – ESENT post to the Try-Catch-FAIL blog of 9/11/2009 claims:

Using ESENT from .NET is very easy.  There’s a managed API available here.  Unfortunately, the API does resemble the underlying C API very closely, but that appears to be by design.  Still, the API is quite usable, just a bit verbose.

and his When to consider ESENT post of 9/14/2009 suggests using ESENT:

When I need blazing-fast access

Relational databases operate in the millisecond range (if you are lucky). That sounds fast, but compared to most operations in your application, it’s probably the slowest thing you are doing. On the other hand, ESENT operates in the microsecond range, and it does so while still providing a lot of the core functionality you would expect from a data store: ACID compliance, transactions, etc. When milliseconds are unacceptable, ESENT might be a good fit.

When I need to store large objects

Relational databases suck at storing blobs. It doesn’t matter if they’re blobs of binary data or blobs of text data, they aren’t good at it. People have resorted to a variety of hacks, the most popular of which is to serialize the data to the file system and store only the path in the DB. I don’t like this approach because you lose some of the benefits of a relational DB and now have to make two requests to load your data: one to the DB, and then one to the filesystem.

ESENT is quite good at storing large objects.  Individual columns can have values of up to 2 GB, and it has been used in applications that have terabyte-sized datasets.

Matt’s Announcing Esenterate – a clean .NET API for ESENT post of 9/22/2009 reports:

I just opened the Esenterate project on Google Code.  The purpose of Esenterate is to provide a clean, .NET-friendly API around ESENT that allows developers to focus on their application instead of persistence.  Eventually I plan for the API to support all major ESENT functionality, but the first release will target simple key/value storage.  No code has been committed yet, but I am working on designing the API.  

As of 2/22/2010, there’s still no code or other content in the Google Code project. Matt’s Relational databases – the hammer-to-a-screw of software development? post of 9/2/2009 indicates he’s a member of the currently fashionable NoSQL movement.

Original Post of 12/23/2008:

You can download the 12/21/2008 beta version of the library from CodePlex. Here’s Esent’s feature list from Laurion Burchall’s ESENT (Extensible Storage Engine) API in the Windows SDK post of 10/23/2008 to the Windows SDK blog:

  • ACID transactions with savepoints, lazy commits and robust crash recovery.
  • Snapshot isolation.
  • Record-level locking — multi-versioning provides non-blocking reads.
  • Highly concurrent database access.
  • Flexible meta-data (tens of thousands of columns, tables and indexes are possible).
  • Indexing support for integer, floating point, ASCII, Unicode and binary columns.
  • Sophisticated index types including conditional, tuple and multi-valued.
  • Individual columns can be up to 2GB in size. A database can be up to 16TB in size.
  • Can be configured for high performance or low resource usage.
  • No administration required (even the database cache size can adjust itself automatically).
  • No download. Your application uses the Esent.dll which comes with the operating system.
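
The claim that multi-versioning provides non-blocking reads can be illustrated with a toy model. The sketch below is not how ESENT implements snapshot isolation; the class and method names are invented for illustration. It only shows the idea: each commit creates a new version, and a reader pinned to an older version keeps seeing that version while writers carry on.

```python
class VersionedStore:
    """Toy multi-version store: every commit gets a new version number."""

    def __init__(self):
        self._versions = {}  # key -> list of (commit_version, value), oldest first
        self._current = 0    # version number of the latest commit

    def commit(self, changes):
        """Apply a dict of changes as one new version and return its number."""
        self._current += 1
        for key, value in changes.items():
            self._versions.setdefault(key, []).append((self._current, value))
        return self._current

    def snapshot(self):
        """Return a read-only view pinned to the latest committed version."""
        return Snapshot(self, self._current)

    def read_as_of(self, key, version):
        """Return the newest value committed at or before `version`, else None."""
        result = None
        for commit_version, value in self._versions.get(key, []):
            if commit_version > version:
                break
            result = value
        return result


class Snapshot:
    """Reader view that always answers as of one fixed version."""

    def __init__(self, store, version):
        self._store = store
        self._version = version

    def get(self, key):
        return self._store.read_as_of(key, self._version)


store = VersionedStore()
store.commit({"name": "Red JET"})
reader = store.snapshot()            # reader pins version 1
store.commit({"name": "Blue JET"})   # writer proceeds without waiting on the reader

print(reader.get("name"))            # the reader still sees its snapshot
print(store.snapshot().get("name"))  # a new snapshot sees the later commit
```

Old versions are never overwritten in this model, which is why the reader needs no lock; a real engine also has to clean up versions that no reader can still see.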

C Documentation and .NET Samples

The Extensible Storage Engine white paper documents the C API. Using Extensible Storage Engine explains how to use the API’s main elements.

Wikipedia’s Extensible Storage Engine entry appears to me to be authoritative and written by someone having substantial experience with the product. The entry includes a brief list of differences between Red and Blue JET.

Ayende’s post includes sample code for storing JSON documents.

A Bit of History

Esent (Extensible Storage Engine for Windows NT) was originally known as "Blue JET" (the Access relational database was "Red JET"). To learn more than you probably want to know about the two JETs, see my Red vs. Blue JET Database API Confusion post of 7/23/2006. Access abandoned the “Jet” moniker in favor of “ACE” (for Access Connectivity Engine, although the Access team insists that ACE stands for nothing). Earlier Access teams were equally insistent that initial-letter-capped “Jet” meant nothing.

Why both databases were called JET (originally an acronym for Joint Engine Technology) is a question that probably only Adam Bosworth (a.k.a. the "Father of Microsoft Access", formerly with Google, and starting a new health-related venture) or Tod Nielsen (early Access marketing honcho, now Borland CEO) can answer.

A Call to Action

What's needed now is LINQ to Esent because Blue Jet doesn't have a query processor. Is Matt Warren listening?

Let’s hear from volunteers who will write a LINQ implementation for Blue Jet! How about an Entity Framework data provider for Esent? I tag Danny Simmons.
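
To make the missing piece concrete, here is a rough sketch of what a query layer has to do on top of an engine that offers only index seeks and cursor movement. It is a toy in Python rather than a LINQ provider, it never touches ESENT, and every name in it (`IndexedTable`, `scan_range`, `query`, the sample rows) is hypothetical.

```python
import bisect


class IndexedTable:
    """Toy ISAM-style table: rows kept sorted by one key, navigated by seek and scan."""

    def __init__(self, key_field, rows):
        self.key_field = key_field
        self.rows = sorted(rows, key=lambda row: row[key_field])
        self.keys = [row[key_field] for row in self.rows]

    def scan_range(self, low, high):
        """Seek to the first key >= low, then move forward while key <= high."""
        position = bisect.bisect_left(self.keys, low)
        while position < len(self.rows) and self.keys[position] <= high:
            yield self.rows[position]
            position += 1


def query(table, low, high, predicate, outputs):
    """Mini 'query processor': index range scan, residual filter, then projection."""
    for row in table.scan_range(low, high):
        if predicate(row):
            yield {name: row[name] for name in outputs}


people = IndexedTable("code", [
    {"code": 3, "name": "Carol", "category": "B"},
    {"code": 1, "name": "Alice", "category": "A"},
    {"code": 2, "name": "Bob", "category": "B"},
])

results = list(query(people, 2, 3, lambda row: row["category"] == "B", ["code", "name"]))
print(results)
```

A LINQ provider would have to derive the index range and the leftover filter from an expression tree instead of taking them as arguments, which is where most of the work lies.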

Updated 12/23/2008 1:00 PM PST: Wikipedia item added, minor edits and additions

10 comments:

Ayende Rahien said...

Don't even go there.
Seriously.

What we need first is a robust .net API for Esent that doesn't resemble the C api too closely.

Once we have that, we can talk

Anonymous said...

I would like to see a Linq interface on top of Esent. Developing a good object model that lends itself to Linq seems to be a first step.

I'm stunned by how much you know about ESE/Jet Red/Jet Blue history. I've worked on ese for over 10 years and I'm learning things from your blog.

Roger Jennings (--rj) said...

@Laurion:

Thanks for the kind words.

I've worked with Jet since the Cirrus beta and had to explain the difference between Red and Blue Jet, especially when RED was part of a DLL name.

A good object model for Esent would be much appreciated.

Anonymous said...

The early, rough beginnings, from a weekend of hacking...

http://github.com/capo/Linq-2-Esent

Roger Jennings (--rj) said...

@Anon,

If you're Capo, I'd like to see a bit more (than none) content in the Readme on Github.

k?thxbi,

Roger Jennings (--rj) said...

@Capo,

Thanks for the added ReadMe information.

Anonymous said...

Pleasure...

S Mac said...

I've been foolish enough to give this a go!

I have built a layer on top of ESENT that allows adhoc querying, has a rudimentary OO style model for manipulating schema objects. (It actually became quite clear why the old ADO model looked like it did)

The query engine exposes an IDataReader, that has enough schema definition to drive a DataTable population, so I could easily test results with binding to a WPF datagrid...

Still VERY basic! The query plan is far from perfect, but at least you can do n-way joins! I've been lazy with the datatypes etc. etc.

Here's an example query:
new Query("Person", "p")
{
    Outputs = { "p.Code", "p.Name", "c.Description" },
    Joins =
    {
        new InnerJoin("Category", "c")
        {
            On = { new Match("p.Category", "c.Code") }
        }
    }
}


I've stuck it on sourceforge, so if anyone fancies a play, be my guest!!

https://sourceforge.net/projects/esqlnt/

Anonymous said...

I've developed ESENT serialization .NET class library. Take a look, it's freeware and well-documented:
http://const.me/source-code/esent-serialization/

It has some LINQ-like functionality (see the documentation on Recordset generic class, the methods starting with "filter"), and it's OK to use the real LINQ on the IEnumerable returned by all() method.

Anonymous said...

And yeah, my class library is an entity framework data provider, too.