Tag Archives: Ruby

The Friday Fragment

It’s Friday, and time again for the Friday Fragment: our weekly programming-related puzzle.

This Week’s Fragment?

There’s no new fragment for this week.  The Friday Fragment will take a break while a couple of interesting projects keep me busy.  But somehow, some way, some Friday, it’ll be back to remind us all what just a fragment of code can accomplish.

Last Week’s Fragment – Solution

Last week’s challenge was to outsmart a call center math PhD who was just trying to get rid of me for awhile:

Write code to do ten threaded runs of the basic Leibniz formula to calculate pi to nine correct digits (one billion iterations) in significantly less time than ten successive runs.

I offered my single-threaded Ruby implementation as a starting point.

Threading this in Ruby was quick: just change the single-threaded (successive) calls to this:

     threads = []
     10.times do
       threads << Thread.new do
         puts Pi.new(1000000000).calculate
       end
     end
     threads.each { |thr| thr.join }

The updated Pi class is here.

I added some timing code and compared this to the single-threaded version, both running on my quad core machine.  A typical single-threaded run under the standard VM took over 166 minutes.

As I wrote before, multi-threaded code doesn’t buy much when run under the standard (green threads) VM. It’s easy to see why – it doesn’t keep my four CPUs busy:

And so it actually ran slower than the single-threaded version: 185.9 minutes.

To get the real benefits of threading, we have to run in a native threads environment.  The threaded version running under JRuby kept all four CPUs busy until complete:

This completed in 23.4 minutes, leaving me time to call back before the end of my lunch break.

This simple fragment does a nice job of demonstrating the benefits of native threads and multi-core machines in general, and JRuby in particular.

The Friday Fragment

It’s Friday, and time again for the Friday Fragment: our weekly programming-related puzzle.

This Week’s Fragment

It was my Friday lunch break, and since I wanted get a problem with a recent online order corrected before the weekend, I called the merchant’s 800 number.  After a long hold time, I finally got a representative on the phone who, horror of horrors, was a mathematics PhD working a call center job.  As I cranked up the phone volume, she told me she was swamped with calls and the correction I needed was going to take awhile.  She’d prefer I just go away.

“Can you call back later?” she asked.
“Sure. When?”
“In a while.”
“How long?”
“A while.”
“Can you give me an idea?” I asked.
“Tell ya’ what.  Calculate pi using the Leibniz formula to nine correct digits.  Use the basic algorithm with no series accelerations or transforms.  Do it 10 times.  When you’re done, call me back.”
“OK,” I mumbled and hung up.

What to do?  The coding part was easy, and there are even Leibniz implementations all over the net.  But nine correct digits would take good big number support and a billion iterations of basic Leibniz.  Ten successive runs of that would go well past the end of my lunch break.

Fortunately, I have a quad core machine, had seen first-hand the benefits of threading things such as this, and even wrote about it just yesterday.  So I wrapped some threads around a basic implementation, ran it, and called back.  Problem solved.

You never know when you, too, might have a math PhD on the other end of a customer service call.  So to help you prepare for that, I invite you to this exercise:

Write code to do ten threaded runs of the basic Leibniz formula to calculate pi to nine correct digits (one billion iterations) in significantly less time than ten successive runs.

To play along, post your code as a comment or send it via email.  If you’d like, you can build atop this single-threaded Ruby implementation.

Last Week’s Fragment – Solution

This week, we complete a genetic algorithm solution to our recent pet store problem:

You’re given $100 and told to spend it all purchasing exactly 100 animals at the pet store, buying at least one of each animal.  Dogs cost $15, cats cost $1, and mice are 25 cents each.  Finish implementing our genetic algorithm solution by doing the natural selection and termination.

Natural selection can be done simply by sorting the current population by fitness and taking the top N individuals, like so:

     /**
      * Compare fitness levels for sorting.
      */
     public function compareFitness($purchase1, $purchase2) {
         if ($purchase1->getFitness() == $purchase2->getFitness())
             return 0;
         return ($purchase1->getFitness() > $purchase2->getFitness()) ? -1 : 1;
     }
 
     /**
      * Select a subset of the given population for breeding a new generation.
      * Return this subset ordered by the most fit first.
      */
     public static function select($population, $size) {
         usort($population, array('PetPurchase', 'compareFitness'));
         return array_slice($population, 0, $size);
     }

And we terminate by breeding until we either find a perfect fit or hit a generation cap:

     /**
      * Find the most fit solution by breeding the given number of generations,
      * each with the specified population size.
      * Return an array with the most fit and the number of generations created.
      */
     public static function findMostFit($generations, $populationSize) {
         $population = self::createPopulation($populationSize);
         $mostFit = new PetPurchase();
         for ($i=0; $i<$generations; $i++) {
             $population = self::select($population, $populationSize);
             if ($population[0]->getFitness() > $mostFit->getFitness())
                 $mostFit = $population[0];
             if ($mostFit->getFitness() == 100)
                 break;
             $population = self::breed($population);
         }
         return array($mostFit, $i);
      }

My completed PetPurchase class is here.  I also coded a Ruby version here.  Here’s an example run:

      $result = PetPurchase::findMostFit(20000, 30);
      $result[0]->printMe();
      echo $result[1].' generations';

With a result:

Dogs Cats Mice Count Cost Fitness
3 41 56 100 100.000 100.000

1510 generations

As in this case, genetic algorithms often aren’t the fastest or most efficient way to get a result.  They’re really a (slightly) directed random walk, and they’re not good for well-understood solutions that don’t justify the overhead of a genetic approach.  But for cases where fitness is well-understood but the solution isn’t, genetic algorithms can be a valuable tool.

Hopefully this simple example clearly demonstrates the evolutionary programming approach behind cool things like BoxCar 2D.

Threadspeed

Among all the measures used to compare languages and platforms, parallel processing doesn’t often top the list.  True, in most programming tasks, multiprocessing is either handled under the hood or is unnecessary.  Machines are usually fast enough that the user isn’t waiting, or the thing being waited on (such as I/O) is external, so asynchronous call-outs will do just fine.

But occasionally you really have to beat the clock, and when that happens, good multi-thread support becomes a must-have.  Often in this case, green threads don’t go far enough: you really need native OS threads to make the most of those CPU cores just sitting there.

Such was the case last night with a data load task.  This code of mine does its share of I/O, but also significant data munging over some large data sets.  Running it against a much smaller subset of the data required nearly 279 seconds: not a good thing.  Fortunately, I had coded it in Ruby, so splitting up the work across multiple worker threads was fairly easy.  Elapsed run time of the threaded version: 294 seconds.  Oops, wrong direction.

The standard Ruby runtime uses green threads, and I could see from Task Manager that only one CPU was busy. So my guess is all threading did was add internal task switch latency to a single-threaded, CPU-intensive process.

But JRuby now uses native threads, so I downloaded the latest version and ran my code (unmodified!) against it.  Elapsed time: 20 seconds, keeping all 4 CPUs busy during that time.  Just wow. Since that’s far better than a 4x improvement, I suspect JRuby is generally faster across the board, not just for threading.

I believe dynamic languages are superior to static ones, but too many dynamic language environments lack robust native thread support.  Fortunately, JRuby has native threads, giving a quick path to parallel programming and threadspeed when it’s needed.

PHP Ecterators

“…When in the why and the wherefore is neither rhyme nor reason?”

Shakespeare, The Comedy of Errors II.2

When it comes to the common workhorse task of iterating over a collection, there are nearly as many approaches as there are languages. And these varied implementations often demonstrate the relative strengths or weaknesses of each platform.  In languages with dynamic typing and closures, iterating over collections is typically succinct and elegant. Lacking these benefits, you get code-bloat train wrecks like C++ iterators and templates.

Fortunately, good approaches often get reused. Both Ruby and Smalltalk have what I alone call the ecterators: those venerable iterator methods whose names, inspired by Alice’s Restaurant, end in “ect”: select, reject, collect, inject, and detect.  In a single line of code, you can apply a code block across a collection and get the result.  I often think in terms of the ecterators, so I miss them when working in a language that doesn’t have them.

Such was the case recently when working in PHP.  Modern versions of PHP have helpful functions like array_filter, array_reduce, and array_map, and with PHP 5.3’s lambdas/blocks and closures, you can create a Ruby-like feel.  But those non-rhyming names and inconsistent parameters just don’t do it for me.  You might say PHP lacks some reason and a lot of rhyme.  So I took a moment to write some simple alias functions, like so:

        function select($array, $callback) {
            return array_filter($array, $callback);
        };
        function collect($array, $callback) {
            return array_map($callback, $array);
        };
        function inject($array, $initial, $callback) {
            return array_reduce($array, $callback, $initial);
        };

So now I can write familiar code:

        $numbers = array(1, 2, 3, 4, 5);
        $even_numbers = select($numbers, function ($ea) { return $ea % 2 == 0; });
        $squares = collect($numbers, function ($ea) { return $ea * $ea; });
        $sum = inject($numbers, 0, function ($sum, $ea) { return $sum + $ea; });

There, reason and rhyme.