Among all the measures used to compare languages and platforms, parallel processing doesn’t often top the list. True, in most programming tasks, multiprocessing is either handled under the hood or is unnecessary. Machines are usually fast enough that the user isn’t waiting, or the thing being waited on (such as I/O) is external, so asynchronous call-outs will do just fine.
But occasionally you really have to beat the clock, and when that happens, good multi-thread support becomes a must-have. Often in this case, green threads don’t go far enough: you really need native OS threads to make the most of those CPU cores just sitting there.
Such was the case last night with a data load task. This code of mine does its share of I/O, but also significant data munging over some large data sets. Running it against even a much smaller subset of the data took 279 seconds: not a good thing. Fortunately, I had coded it in Ruby, so splitting the work across multiple worker threads was fairly easy. Elapsed run time of the threaded version: 294 seconds. Oops, wrong direction.
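The split itself really is a few lines of Ruby. Here's a minimal sketch of the worker-thread pattern, with `parallel_map` and the work items as stand-ins (my actual job names and munging code aren't shown):

```ruby
# Fan work items out to a fixed number of threads via a shared Queue,
# collect results, and join. Queue is thread-safe out of the box.
def parallel_map(items, workers: 4)
  queue = Queue.new
  items.each { |item| queue << item }
  workers.times { queue << :done } # one sentinel per worker
  # (assumes :done never appears as a real work item)

  results = Queue.new
  threads = Array.new(workers) do
    Thread.new do
      until (item = queue.pop) == :done
        results << yield(item)
      end
    end
  end
  threads.each(&:join)

  out = []
  out << results.pop until results.empty?
  out
end

# Results arrive in whatever order the workers finish, hence the sort:
parallel_map(1..8, workers: 3) { |x| x * x }.sort
```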
The standard Ruby runtime uses green threads, and I could see from Task Manager that only one CPU was busy. So my guess is that all threading added was internal task-switching overhead to what remained a single-threaded, CPU-intensive process.
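You can see the effect with a toy benchmark. The `busy` method below is a hypothetical stand-in for CPU-bound munging; the timings are illustrative, not measurements from my data load:

```ruby
require 'benchmark'

# Hypothetical CPU-bound work: no I/O, nothing to yield on.
def busy(n)
  sum = 0
  n.times { |i| sum += i }
  sum
end

serial   = Benchmark.realtime { 4.times { busy(500_000) } }
threaded = Benchmark.realtime do
  4.times.map { Thread.new { busy(500_000) } }.each(&:join)
end

puts format('serial: %.2fs  threaded: %.2fs', serial, threaded)
```

On a runtime that can't schedule Ruby threads across cores, the threaded time comes out about the same as the serial time (or slightly worse, from the extra switching); on a native-threaded runtime like JRuby it can approach serial divided by the core count.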
But JRuby now uses native threads, so I downloaded the latest version and ran my code (unmodified!) against it. Elapsed time: 20 seconds, keeping all 4 CPUs busy during that time. Just wow. Since that’s far better than a 4x improvement, I suspect JRuby is generally faster across the board, not just for threading.
I believe dynamic languages are superior to static ones, but too many dynamic language environments lack robust native thread support. Fortunately, JRuby has native threads, giving a quick path to parallel programming, and to real thread speed, when it's needed.