Tag Archives: JRuby

The Friday Fragment

It’s Friday, and time again for the Friday Fragment: our weekly programming-related puzzle.

This Week’s Fragment?

There’s no new fragment for this week.  The Friday Fragment will take a break while a couple of interesting projects keep me busy.  But somehow, some way, some Friday, it’ll be back to remind us all what just a fragment of code can accomplish.

Last Week’s Fragment – Solution

Last week’s challenge was to outsmart a call center math PhD who was just trying to get rid of me for a while:

Write code to do ten threaded runs of the basic Leibniz formula to calculate pi to nine correct digits (one billion iterations) in significantly less time than ten successive runs.

I offered my single-threaded Ruby implementation as a starting point.

Threading this in Ruby was quick: just change the single-threaded (successive) calls to this:

     threads = []
     10.times do
       threads << Thread.new do
         puts Pi.new(1000000000).calculate
       end
     end
     threads.each { |thr| thr.join }

The updated Pi class is here.
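For readers without the linked file, here is a minimal sketch of what a Leibniz-series Pi class could look like. It is not the actual class from the link: the constructor argument and the calculate method are inferred from the call above, and everything inside them is an assumption.

```ruby
# Hypothetical sketch of a Leibniz-series Pi class (not the linked original).
class Pi
  def initialize(iterations)
    @iterations = iterations
  end

  # Leibniz formula: pi/4 = 1 - 1/3 + 1/5 - 1/7 + ...
  # Only local variables are updated in the loop, so separate instances
  # can run in separate threads without sharing any state.
  def calculate
    sum = 0.0
    sign = 1.0
    denominator = 1.0
    @iterations.times do
      sum += sign / denominator
      sign = -sign
      denominator += 2.0
    end
    4.0 * sum
  end
end

# A much smaller iteration count than the post's one billion, so it finishes quickly.
puts Pi.new(1_000_000).calculate
```

The series converges slowly (the error shrinks roughly in proportion to the number of terms), which is why nine correct digits needs on the order of a billion iterations.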

I added some timing code and compared this to the single-threaded version, both running on my quad core machine.  A typical single-threaded run under the standard VM took over 166 minutes.
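The timing code itself isn't shown here, so the following is only a sketch of one way to make the comparison, using Benchmark.realtime from Ruby's standard library. The leibniz_pi and compare helpers are hypothetical stand-ins, and the iteration count is deliberately tiny compared with the one billion used in the post.

```ruby
require 'benchmark'

# Hypothetical stand-in for Pi#calculate: sums the Leibniz series.
def leibniz_pi(iterations)
  sum = 0.0
  sign = 1.0
  iterations.times do |k|
    sum += sign / (2 * k + 1)
    sign = -sign
  end
  4.0 * sum
end

# Times `runs` successive calls, then `runs` calls on separate threads.
# Returns wall-clock seconds for each approach.
def compare(runs, iterations)
  sequential = Benchmark.realtime do
    runs.times { leibniz_pi(iterations) }
  end

  threaded = Benchmark.realtime do
    threads = Array.new(runs) { Thread.new { leibniz_pi(iterations) } }
    threads.each(&:join)
  end

  { sequential: sequential, threaded: threaded }
end

timings = compare(10, 200_000)
puts format('sequential: %.3f s', timings[:sequential])
puts format('threaded:   %.3f s', timings[:threaded])
```

Which of the two numbers comes out smaller depends on the runtime the script is run under, which is the point of the comparison that follows.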

As I wrote before, multi-threaded code doesn’t buy much when run under the standard (green threads) VM. It’s easy to see why – it doesn’t keep my four CPUs busy.

And so it actually ran slower than the single-threaded version: 185.9 minutes.

To get the real benefits of threading, we have to run in a native threads environment.  The threaded version running under JRuby kept all four CPUs busy until complete.

This completed in 23.4 minutes, leaving me time to call back before the end of my lunch break.

This simple fragment does a nice job of demonstrating the benefits of native threads and multi-core machines in general, and JRuby in particular.

Threadspeed

Among all the measures used to compare languages and platforms, parallel processing doesn’t often top the list.  True, in most programming tasks, multiprocessing is either handled under the hood or is unnecessary.  Machines are usually fast enough that the user isn’t waiting, or the thing being waited on (such as I/O) is external, so asynchronous call-outs will do just fine.

But occasionally you really have to beat the clock, and when that happens, good multithreading support becomes a must-have.  Often in this case, green threads don’t go far enough: you really need native OS threads to make the most of those CPU cores just sitting there.

Such was the case last night with a data load task.  This code of mine does its share of I/O, but also significant data munging over some large data sets.  Running it against a much smaller subset of the data required nearly 279 seconds: not a good thing.  Fortunately, I had coded it in Ruby, so splitting up the work across multiple worker threads was fairly easy.  Elapsed run time of the threaded version: 294 seconds.  Oops, wrong direction.
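The data load code isn't reproduced here, but splitting work across worker threads in Ruby can be sketched roughly as follows. The munge method, the sample records, and the worker count are all hypothetical placeholders for whatever the real task does.

```ruby
# Hypothetical per-record transformation standing in for the real data munging.
def munge(record)
  record.to_s.strip.upcase
end

# Splits records into roughly equal slices, processes each slice on its own
# thread, and returns the results in the original order.
def process_in_threads(records, worker_count)
  return [] if records.empty?

  slice_size = (records.size / worker_count.to_f).ceil
  threads = records.each_slice(slice_size).map do |slice|
    Thread.new { slice.map { |record| munge(record) } }
  end

  # Thread#value waits for the thread to finish and returns its block's result.
  threads.flat_map(&:value)
end

records = [' alpha', 'beta ', ' gamma ', 'delta']
p process_in_threads(records, 2)
# prints ["ALPHA", "BETA", "GAMMA", "DELTA"]
```

Each thread works only on its own slice and returns a fresh array, so no locking is needed in this sketch; code that writes to shared structures would need a Mutex or similar.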

The standard Ruby runtime uses green threads, and I could see from Task Manager that only one CPU was busy. So my guess is that all threading did was add task-switching overhead to what was effectively still a single-threaded, CPU-intensive process.

But JRuby now uses native threads, so I downloaded the latest version and ran my code (unmodified!) against it.  Elapsed time: 20 seconds, keeping all 4 CPUs busy during that time.  Just wow. Since that’s roughly a 14x improvement over the original 279 seconds, far better than the 4x that four cores alone could explain, I suspect JRuby is generally faster across the board, not just for threading.

I believe dynamic languages are superior to static ones, but too many dynamic language environments lack robust native thread support.  Fortunately, JRuby has native threads, giving a quick path to parallel programming and threadspeed when it’s needed.