Monthly Archives: May 2010

The Friday Fragment

It’s Friday, and time again for a new Fragment: my weekly programming-related puzzle.

This Week’s Fragment

This week’s programming puzzle was sent in by Joel Odom:

Write C code or pseudo-code to swap the values of two integer (int) variables, a and b, without declaring or using any other variables. Do not use any syntactical language tricks that implicitly copy values out of a or b into other memory or registers.  You can use C operators, control flow, and similar statements.

If you want to “play along”, post the solution as a comment or send it via email.   To avoid “spoilers”, simply don’t expand comments for this post.

Last Week’s Fragment – Solution

Last week’s puzzle was the following programming-related cryptogram:

Baae jaanklb xpnrv xksr.  Ki gaq pwr sper xa cpkx, kx kv xa vrwhr gaq urxxrw, ple xa ftrpvr gaq.

Congratulations to Luke for solving this; the plaintext answer is:

Good cooking takes time.  If you are made to wait, it is to serve you better, and to please you.

This is a chapter intro quote in Fred Brooks’ classic The Mythical Man Month, taken from a French restaurant menu.  It reminds us that quality software also requires time and careful preparation.

Go For It

I’ve been exploring new concurrent programming approaches lately, and decided to give Google’s new Go programming language a try.  It’s fast, straightforward, and a good fit wherever you might use C or C++,  such as systems programming.  It borrows a lot from C (including syntax, pointers, and the gcc compiler), but adds significant improvements, including garbage collection and its own flavor of coroutines called goroutines.  “Goroutines” is certainly clever, but I’ll never understand why Google gave its creation the decidedly ungoogleable name of “go” (hence the alternate name, golang: so you can find it).

To get started, I had to check out sources from mercurial and build them, but the instructions were clear and complete (“cookbook”, as we say).  They were written for Linux (complete with sudo apt-get installs for gcc, bison, and other pre-reqs), so I used Ubuntu.  I suppose it’s possible to develop under Windows with Cygwin, but I didn’t try it.

The emphasis on clean, lightweight, and terse code is clear throughout.  For example, it takes a simple approach to common OO mainstays: it uses structs, types, and interfaces instead of classes and inheritance (like C#, structs can have methods); new and make allocators with automatic reference counting; and a built-in map syntax instead of dictionary or hash table classes.  It provides syntactic sugar for common shorthands, such as := to declare and initialize a variable in one step, permission to omit semicolons and parentheses, and code-reducing conveniences like returning multiple values from a function.  It uses Java-style packages and imports for organizing code and supports Python-inspired slices for array references.

Goroutines are surprisingly easy to use.  Any function call can be invoked as a goroutine (forked) simply by prefixing the call with the keyword go.  The function itself can be inner, like a block or anonymous function.  Channels provide the communication and rendezvous vehicle (like a future and queue) via a unique and simple syntax.

There’s a lot missing from the language (much of that deliberate, like classes and inheritance), and it lacks maturity and production readiness.  But it’s easy to appreciate how its primary inventions (goroutines, channels, maps, slices, return lists, and implied interfaces) work together to form a clean, cohesive toolbox for fast concurrent programming.  While I don’t foresee using go for “real work”, I hope to see it influence other environments which often make parallel programming harder than it has to be.

The Friday Fragment

It’s Friday, and time again for a new Fragment: my weekly programming-related puzzle.

This Week’s Fragment

This one’s for Luke, who enjoyed our last cryptogram.  Solve the following programming-related cipher:

Baae jaanklb xpnrv xksr.  Ki gaq pwr sper xa cpkx, kx kv xa vrwhr gaq urxxrw, ple xa ftrpvr gaq.

This uses simple substitution, so just plug at it with pencil and paper, building up the substitution table as you go.  For extra credit, tell us the source of this quote.  If you’re ambitious, try coding a cracker, or at least a cryptogram generator.

If you want to “play along”, post the solution as a comment or send it via email.   To avoid “spoilers”, simply don’t expand comments for this post.

Last Week’s Fragment – Solution

Last week’s challenge was to write code to solve a recent Car Talk Puzzler; that is:

Write code to find two (two-digit) temperatures where the (rounded) Celsius and Fahrenheit readings have the same digits, only reversed.

The conversion to Fahrenheit is 1.8 (or 9/5) times Celsius plus 32.  If you round the result, you find that 16 Celsius is 61 Fahrenheit and 28 Celsius is 82 Fahrenheit: handy numbers to remember when you want a quick approximate conversion from the current temp.

I invited folks to code it in multiple languages to compare, and even try esoteric languages.  Joel coded it in Python and demonstrated the strength of its range function when used with iterators.  He also showed off reversing a string via a “slice with inverse stride”, which sounds like my golf technique.  Wayne’s Smalltalk example demonstrated how it can elegantly iterate over a range using blocks/closures, a technique and style so nice that Ruby ripped it off.  Spencer coded it in Shakespeare but, alas, had problems with the interpreter (it was not to be).  I tried LOLCODE , and likewise had interpreter problems, so I did a quick PHP implementation also.  Gee, I wonder why there aren’t any production-ready LOLCODE or Shakespeare compilers?

Thanks to all for posting; code fragments like these provide good opportunities to compare language features and learn new approaches.  For all these implementations, see the comments in last week’s post.

CLPMinus

There are many great tools for running DB2 commands; I often use TOAD, Control Center (db2ce), and a CLI-based tool I wrote.  And with DB2 9.7, I’ve enjoyed experimenting with the new CLPPlus, IBM’s answer to SQL*Plus for Oracle refugees.  But for those quick commands, I usually just pop open a DB2 Command Window (Windows) or bash shell (Linux) and type “db2” followed by the command.  It works great nearly all the time.  Emphasis on nearly.

Today, Wayne reported how this command taken verbatim from the DB2 documentation (the “File type modifiers for export” section) choked:

db2 export to delfile2 of del modified by timestampformat=”yyyy.mm.dd hh:mm tt” select * from schedule

It failed with the error: SQL3192N  In the filetmod a user specified format “TIMESTAMPFORMAT” beginning with the string “yyyy” is not valid.  And no value for timestampformat worked.

This is a case where bugs in CLP’s command-line parsing (particularly with double quotes) get in the way.  The command works from inside CLP, Command Editor, and other tools, so you can just type db2 or db2ce and hit enter and then run the command (without “db2” in front) from there.  Using admin_cmd from a CLI/ODBC tool also works, like so:

call admin_cmd(‘ export to delfile2 of del modified by timestampformat=”yyyy.mm.dd hh:mm tt” select * from schedule’)

Bugs like this have survived in the db2 command line for awhile (this fails even in 9.7).  I’ll report it, but since CLPPlus is the new kid in town, the old db2 command line probably won’t get as much attention.

The Friday Fragment

It’s Friday, and time again for a new Fragment: my weekly programming-related puzzle.

This Week’s Fragment

A recent Car Talk Puzzler challenged listeners to find two temperatures where the Celsius and Fahrenheit readings had the same two digits, but reversed.  For example, 28 Celsius is 82 Fahrenheit.  It’s an easy problem to solve by hand, but what fun is that?  This is one where writing a bit of code will yield an answer faster than working it out by hand.  Or, better still, offer a chance to try some new coding approaches.  So, our programming version of this puzzle is:

Write code to find two (two-digit) temperatures where the (rounded) Celsius and Fahrenheit readings have the same digits, only reversed.

Use any language you’d like; you may want to try it in multiple languages to compare.  You might even try an esoteric language like LOLCODE or Shakespeare.

If you want to “play along”, post the solution as a comment or send it via email.   To avoid “spoilers”, simply don’t expand comments for this post.  That is, unless you want to see what languages others used to deliberately try something different.  Language wars can be such good sport.

Last Week’s Fragment – Solution

Last week’s fragment was one of those puzzles sometimes given as interview questions for programming and engineering jobs:

You have eight balls; seven weigh the same, but one is lighter than the others.  Using a balance, how can you find the “oddball” in only two weighings?

Congratulations to Wayne who solved it swimmingly, and described the solution nicely:

Weigh three balls against another three balls. If they weigh the same, you simply weigh the other two to find the lighter one. If one set of three balls was lighter than the other though, weigh two of the balls in the lighter set of three against each other. If one is lighter, that’s it, else it’s the third one you didn’t weigh the second time.

I guess that means you’re hired!

DML Insurance

In a chat today, we discussed how to protect data when formulating DML, particularly those SQL update and delete statements that can destroy perfectly good data if done incorrectly.  You may find yourself having to run such SQLs against large data sets (and, yes, even production data), so no amount of care is too much.

One option is to hold a transaction open, run the DML, select the (uncommitted) results, and then commit if good or rollback if not.  That’s clever, but it can hold a lot of locks for a long time, is error-prone, and doesn’t work for all cases.  Another is to keep full backups handy and restore the table(s) if something goes wrong.  This can be time consuming and can harm availability during restore if the table(s) are large.

A better approach is to create exports of exactly the data you will be changing or deleting.  That is, run an export command whose where clause is exactly the same as that in your update or delete statement.  That way, you can restore (via import) just the data you changed.  For DB2, I use IXF exports since they’re the richest.  So, for example, before you:

update company.employees set salary=100000000 where name=’Derek’

You should:

export to empderek.ixf of ixf select * from company.employees where name=’Derek’

If you realized maybe you shouldn’t have done that, you can put it back with:

import from empderek.ixf of ixf insert_update into company.employees

This can be used to recover from deletes as well (hence insert_update).  For example, before you:

delete from wikipedia.articles where title=’Malamanteau’

You should:

export to articles.ixf of ixf select * from from wikipedia.articles where title=’Malamanteau’

MySQL has fewer export/import options, but mysqldump can definitely help.  For example, export with:

mysqldump wikipedia articles –compact –where=”title=’Malamanteau'” > articles.sql

And, after the above delete, you can put it back with:

mysql wikipedia < articles.sql

For ease of reference, I often give these “backup” IXF and SQL files unique, descriptive names. It’s helpful to include the date and some tag that identifies what the data is.  In the above example, I could have used articles-Malamanteau-12-May-2010.ixf, although Malamanteau is hard to spell and probably isn’t even a real word.

Perhaps this simple technique of “export first, DML second” may rescue you from the occasional “oops moment.”

Sorting it Out

In prior posts, I described some of the benefits of DB2’s Self-Tuning Memory Manager (STMM), along with a few caveats.  One caution was around automated sort heaps, since sort requirements can often be unpredictable and confusing.

High-volume OLTP systems can, at times, experience heavy loads of small sorts or sort-like activity (hash joins, grpby’s, etc.).  It’s not uncommon for many of these to overflow (spill), or for a sub-optimal access plan to arise, even when more than enough memory is available for sorts.  Like the proverbial bad check writer, you’re left thinking, “I can’t be overflowing, I still have memory left!”

Sort overflow symptoms include high I/O and CPU utilization, poor response times, high sort overflow snapshot counts, long sort times, and large numbers of rows written on read-only SQLs like SELECTs.  Sort heap tuning is often one of the “last frontiers” in performance work (after fixing low-hanging fruit like SQLs, indexes, and bufferpools), and can be a stubborn one.

From my experience, most sort overflow problems in DB2 9.x fall into these categories:

  1. SQL.  In high-volume OLTP systems, it’s important to avoid sorting as much as possible.  Developers usually recognize that order by clauses require sorts, but can miss more subtle uses of the sort heap.  Count, distinct, and group by clauses also often require sort heap space, as do hash joins.  Statement snapshots and explains will reveal these.  The first step should be to try to rewrite problem SQLs to eliminate the sorts.  But wherever that isn’t possible, try step 2:
  2. DDL.  Adding or changing indexes and clustering may eliminate sorts.  If an existing index is used but sorting still occurs, check to see if adding index or include columns will help, and verify that allow reverse scans is specified when needed.  Again, the only way to know for sure is to measure by snapshots and explains.  In some cases, MDCs can help, but check it first by “doing the math” or running db2advis.  Sometimes the problem is as simple as outdated statistics (among other things, outdated statistics can cause DB2 to request too small a sort heap), so make sure your runstats is occurring as frequently as needed.
  3. Configuration.  And here we come to the STMM benefits and caveats.  You should “go automatic” with sort heap sizes, but be prepared to monitor and adjust as needed.

STMM only tunes shared sort memory, so to enable self-tuning sort memory, you must:

  • Set the instance (DBM)-level sheapthres to zero (0).  You can verify this with: db2 get dbm cfg  | grep -i sheap
  • Set the database-level sheapthres_shr (total amount of sort memory) and sortheap (memory available for each sort) to automatic.  You can verify this with: db2 get db cfg for <database> | grep -i sort

In some cases (such as when you’re experimenting and don’t want to wait on STMM’s tuning cycles), you may want to set an initial value for sortheap.  To do this, specify it alongside the automatic parameter, like so: db2 update db cfg using sortheap <size> automatic immediate.   But the initial value matters little after STMM “warms up” and then saves its settings.

When monitoring a database with sort overflow problems, keep an eye on the percentage of overflows (under 3% is a good target) and on the current sortheap and sheapthres_shr sizes.  You can view these with the snapshot monitors (db2 get snapshot for all on <database> | grep -i sort) and database configuration (db2 get db cfg for <database> show detail | grep -i sort).   But you may find, as I often have, that frequent, small overflows are occurring even when the sort memory areas have stretched to more than enough space.

Why this happens is a mystery to me.  A parallel symptom I often see is that the sort heap size reported in explain (db2exfmt) outputs is often far smaller than the current sortheap size.  At this point, the only choices are to leave sort heaps at automatic and accept some overflows, or abandon automatic and set a fixed size.  When setting large fixed sizes, remember that an overflowed sort writes the entire sort heap to tempspace.  So, a very large sort heap may coax the optimizer into relying more on sorts, only to have some of these very large sorts spill everything.

IBM is considering allowing ceiling and/or floor values for certain automatic parameters.  I do hope this gets added in a future DB2, and that sort heaps are among the first that can be configured this way.  Until then, DBAs will be left to sort this out on their own.

The Friday Fragment

It’s Friday, and time again for a new Fragment: my weekly programming-related puzzle.

This Week’s Fragment

This week’s fragment follows the same line as last week; that is, it’s a simple puzzle sometimes given as an interview question for programming and engineering jobs.  I once worked at a company that used this one.  Such puzzles are usually very poor predictors of job performance, but they’re interesting nonetheless.  So here goes…

You have eight balls; seven weigh the same, but one is lighter than the others.  Using a balance, how can you find the “oddball” in only two weighings?

If you want to “play along”, post the solution as a comment or send it via email.   To avoid “spoilers”, simply don’t expand comments for this post.  It should be a ball.

Last Week’s Fragment – Solution

Last week’s fragment was borrowed from a recent Car Talk episode.  A listener there wrote how he didn’t get a programming job partly because he failed to solve the following interview puzzle within the allotted five minutes:

You have a four ounce glass, a nine ounce glass and an unlimited supply of water, which can be poured and dumped as much as needed.  Can you measure exactly six ounces of water in as few steps as possible?

Kudos to both Spencer and Luke for solving this one quickly (within the five minutes) by just talking through it.  It can be done in eight steps, and the key is recognizing that the six ounces clearly must end up in the nine-ounce glass.  So you have to work toward pouring exactly three ounces out of that glass.  Get one ounce in the four-ounce glass and you have your three-ounce destination.  Here are the steps.

  1. Fill the nine-ounce glass.
  2. Fill the four-ounce glass from the nine-ounce glass.  That leaves five ounces in the niner.
  3. Dump the four-ounce glass.
  4. Fill the four-ounce glass again from the nine-ounce glass.  That leaves one ounce in the nine-ounce glass.
  5. Again dump the four-ounce glass.
  6. Pour that one ounce from the nine-ounce glass into the four-ounce glass.
  7. Fill the nine-ounce glass again.
  8. Fill the four-ounce glass from the nine-ounce glass.   You will have poured off three ounces leaving six ounces.  Voila!

Yes You May

I’ve whined a bit lately about having to jump through syntactic hoops to get past new security restrictions.  Yet one of today’s DB2 barriers was a functional change, and not security-related.

I have this long-standing habit of using IXF export and import to quickly copy tables and shuffle data around.  So much so that commands like the following just flow from the subconscious:

db2 import from myfile.ixf of ixf create into mytable

Trouble is, create (and its variants) have been deprecated since DB2 9.5, so this command fails with an SQL3311 error in my DB2 9.7 environments.  The help text for that error (db2 ? sql3311) provides the work-around: add the forcecreate modifier.  That is:

db2 import from myfile.ixf of ixf modified by forcecreate create into mytable

I understand the reasoning for the change: new features like XML, MDC, and table partitioning have outgrown IXF.  But breaking compatibility of frequently-used commands is just plain cruel punishment toward old guys like me.  Yet since “modified by forcecreate” is the new “please”, I’m sure I’ll eventually learn to say it the first time.