Daily Archives: February 11, 2010

Don’t Get CLOBbered

The subject of direct I/Os came up in an unexpected phone call today.  This was from a friend who had long ago recovered from CLOB abuse and was concerned that he had somehow fallen off the wagon again.

Often new OO developers are tempted to treat the database as little more than a file system.  With wonderful things like CLOBs, BLOBs, and long varchars, who would actually parse their objects into columns?  Why bother with ORM frameworks?  Why take the car apart to put it in the garage?

The answer, of course, lies in the system requirements.  If all you need are LOBs, you probably don’t need a relational database.  And if you need a relational database, you probably shouldn’t use LOBs.

It’s not only an important design issue, but a big performance issue as well.  In DB2, LOBs and other large objects do not go in the bufferpools.  This means every read and write is physical – DB2 must go directly to disk.  Enabling file system caching on large object tablespaces helps some.  But, even with that, heavy use of LOBs is one of the quickest ways to kill OLTP performance.

There is a place in this world for LOBs and other large objects, used carefully and sparingly.  But far too often they’re used for object storage shortcuts.  I’ve had to tune a few systems where CLOBs factored heavily into the design and were among the most frequently-accessed data in the system.  LOB abuse has become so common that the snapshot counters for direct reads and direct writes are among the first I check when measuring a new system.  Sometimes the fix is simple: convert the CLOB types to varchars, with for bit data, if needed. Sometimes, though, the data is so large or the process so entrenched that deeper surgery is required.

This post-op friend of mine had long ago eradicated direct I/Os in the billions, but was now seeing counts in the few hundred thousands range after several hours of heavy system activity.  A quick check of the tablespace snapshot for syscatspace found the expected smoking gun: nearly all of these were in the system catalog.

Several of the system catalog tables (such as syscat.tables, columns, and indexes) do use LOBS.  They are strategically accessed, but can (especially after changes) cause some direct I/O spikes.  There are some tricks you can play to minimize this, but these direct I/Os are typically unimportant in the grand scheme of things.  Just subtract the syscatspace numbers from the DB-level totals and see if what’s left demands a closer look.  This will help you distinguish true LOB abuse symptoms from flashbacks.