Monthly Archives: March 2011

Trolling

Every couple of years or so yet another patent troll impacts my work.  Whether it’s Data Treasury, LML, or other “companies” I can’t mention, these folks patent the obvious early enough (often while an idea is half-baked) and sit on it.  They then wait for others to build and/or buy products remotely similar to the vague patent, and then sue in East Texas courts to shake down licensing fees.

It’s a ridiculously bad situation that could actually get worse as the Supreme Court considers a case that might hold a programmer liable for writing code while being vaguely aware of neighboring patents.  Since so many obvious and trivial ideas in software have already been patented, the only safe harbor for some may be to stop programming.  For many of us, that’s like giving up breathing.

Hence the constant cries for software patent reform.  Congress is currently taking up the issue but, as in past attempts, the strong lobbies pushing it in the wrong direction may kill it.  And since legislative and judicial wheels turn at a glacial pace compared to the rate of technological advance, don’t hold your breath.

A better way to light a candle in this darkness is simply to publish new ideas defensively before a troll can patent them.  This has the extra benefit of sharing ideas that really shouldn’t be regarded as competitive advantage.  The downside, of course, is that you may have to publish something you’d rather keep as a trade secret.  But that’s the price you pay to avoid the risk of trolls.

There’s clearly a place for those (rare) worthy and innovative software patents that offer a unique competitive advantage.  But you won’t find those with patent trolls or in East Texas courts.

I’ve often said (and firmly believe) that an active, effective software developer produces at least one patentable idea each week.  It’s just what we do, but that’s also a commentary on the relatively low standard for US software patents. One primary reason we don’t have billions of software patents at the USPTO is that filing for patent protection is an expensive and inefficient process.  So a company’s patent portfolio is usually more a measure of its rapacity or the size of its legal staff than the degree of innovation.

I’m hopeful that the current woeful US software patent situation will one day be corrected, and the end result will be better protection, innovation, and collaboration: even beyond what’s currently found in FOSS work and licensing.  Until then, I’m afraid we’ll have to endure some stagnation while the trolls have their way.

The Friday Fragment

It’s Friday, and time again for the Friday Fragment: our weekly programming-related puzzle.

This Week’s Fragment

There’s no new fragment for this week.  The Friday Fragment will take a break while a couple of interesting projects keep me busy.  But somehow, some way, some Friday, it’ll be back to remind us all what just a fragment of code can accomplish.

Last Week’s Fragment – Solution

Last week’s challenge was to outsmart a call center math PhD who was just trying to get rid of me for a while:

Write code to do ten threaded runs of the basic Leibniz formula to calculate pi to nine correct digits (one billion iterations) in significantly less time than ten successive runs.

I offered my single-threaded Ruby implementation as a starting point.
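
That starting point isn’t reproduced in this archive, so here’s a minimal single-threaded sketch of the basic series (plain floats for brevity — the actual class may well lean on big-number types; the class name Pi matches the threaded snippet below):

```ruby
# Hypothetical reconstruction of a single-threaded Pi class:
# pi/4 = 1 - 1/3 + 1/5 - 1/7 + ...  (basic Leibniz, no acceleration)
class Pi
  def initialize(iterations)
    @iterations = iterations
  end

  def calculate
    sum  = 0.0
    sign = 1.0
    @iterations.times do |k|
      sum  += sign / (2 * k + 1)
      sign  = -sign
    end
    4.0 * sum
  end
end
```

After N terms the series error is roughly 1/(2N), which is why it takes a billion iterations to get nine correct digits.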

Threading this in Ruby was quick: just change the single-threaded (successive) calls to this:

     threads = []
     10.times do
       threads << Thread.new do
         puts Pi.new(1_000_000_000).calculate
       end
     end
     threads.each(&:join)

The updated Pi class is here.

I added some timing code and compared this to the single-threaded version, both running on my quad core machine.  A typical single-threaded run under the standard VM took over 166 minutes.

As I wrote before, multi-threaded code doesn’t buy much when run under the standard (green threads) VM.  It’s easy to see why: the CPU monitor showed only one of my four CPUs busy.

And so it actually ran slower than the single-threaded version: 185.9 minutes.

To get the real benefits of threading, we have to run in a native threads environment.  The threaded version running under JRuby kept all four CPUs busy until it completed.

This completed in 23.4 minutes, leaving me time to call back before the end of my lunch break.

This simple fragment does a nice job of demonstrating the benefits of native threads and multi-core machines in general, and JRuby in particular.

Out of Whack

Another programmer gone bad story:

Man Arrested for ‘Whac-A-Mole’ Sabotage

Mallet-wielding children in arcades and amusement parks could be holding their weapons over mole holes in vain, waiting for a weasel that just won’t pop. If you run across a Whac-A-Mole arcade cabinet that becomes a dud out of nowhere, it might not be the arcade’s fault. A game programmer in Florida has been running a scheme since 2008, infecting Whac-A-Moles with a computer virus.

A Florida local news video on the story is available here.

This sort of thing can happen just about anywhere.  How to avoid it?  More on that later.

Berry Half Mary

I was so impressed with last year’s Berry race (I ran the 10K then) that I decided to return this year and run the half marathon.  I registered in December to get the discount and guarantee a spot but, of course, had no way of knowing then what conditions would be like.

I woke repeatedly through the night from a virus and, around 3 am, checked the latest forecast and weather conditions.  45 degrees, 100% chance of rain (about an inch over the duration of the race), 10-15 MPH winds, and stomach funk.  This was not going to be easy, so I began adjusting expectations.  My goal was sub-2 hours and, since I’d done 1:51 through 1:53 in training (and even 1:46 for 13.1 on a treadmill), sub-2 was well within reach.  But not under these conditions.  I talked myself into 2:10 and tried to get some sleep.

Rain and winds accompanied me throughout the dark drive to Rome, but the weather calmed as I approached.  I took the shuttle to packet pickup inside the Ford Dining Hall and then made my way to the starting line.  The rain began diminishing until, finally, a few minutes after the start of the race, it stopped entirely and didn’t return until well after the race!  Awesome!

Although this year’s races were full/sold out, turnout didn’t look anywhere near the 2,000 runners of last year.  I’m not good at estimating crowds, but I suspect some folks stayed home expecting pouring rain.

The things that make this race great were well in place: beautiful college campus (the world’s largest) with stone buildings and tree-lined roads, plenty of volunteers to keep things flowing smoothly, off-site parking and shuttle buses that ran like clockwork, students and volunteers along the course to offer encouragement and humor, and top-notch equipment.  The motivational signs were back and well-placed.  For example, just before mile 10 was this gem: “There’s no such thing as bad weather, only soft people,” and mile 11 had: “Remember, running is fun!”

There were some rough spots in the half marathon course, like large rocks embedded in the gravel roads that made for bad footing.  Hopefully, they can re-route to avoid these places when WinShape construction is complete next year.

At last, I made it to the end and got the nice finisher’s medal.  I came in at 2:05:56, well within my adjusted goal, but leaving plenty of room for some future sub-2s.  It was nice to see some familiar faces cross the line, including Ray Rhodes, who ran a couple of miles beyond the finish to meet his marathon training schedule.

I didn’t see Jeff Galloway (who was to start the race and hand out awards), nor the deer.  Perhaps both were too fast for me or had sense enough to stay out of the rain.  Unfortunately, I couldn’t stick around for the awards ceremony to hear Jeff speak.  If he returns next year, I’ll try to make the pre-race dinner to hear him.

Early March can be a dicey time for racing (“in like a lion, out like a lamb”), but this Berry race is a must-do event.  Count me in for 2012, and I’ll watch for the early registration late this year.

The Friday Fragment

It’s Friday, and time again for the Friday Fragment: our weekly programming-related puzzle.

This Week’s Fragment

It was my Friday lunch break, and since I wanted to get a problem with a recent online order corrected before the weekend, I called the merchant’s 800 number.  After a long hold time, I finally got a representative on the phone who, horror of horrors, was a mathematics PhD working a call center job.  As I cranked up the phone volume, she told me she was swamped with calls and the correction I needed was going to take a while.  She’d prefer I just go away.

“Can you call back later?” she asked.
“Sure. When?”
“In a while.”
“How long?”
“A while.”
“Can you give me an idea?” I asked.
“Tell ya’ what.  Calculate pi using the Leibniz formula to nine correct digits.  Use the basic algorithm with no series accelerations or transforms.  Do it 10 times.  When you’re done, call me back.”
“OK,” I mumbled and hung up.

What to do?  The coding part was easy, and there are even Leibniz implementations all over the net.  But nine correct digits would take solid big-number support and a billion iterations of basic Leibniz.  Ten successive runs of that would go well past the end of my lunch break.

Fortunately, I have a quad core machine, had seen first-hand the benefits of threading things such as this, and even wrote about it just yesterday.  So I wrapped some threads around a basic implementation, ran it, and called back.  Problem solved.

You never know when you, too, might have a math PhD on the other end of a customer service call.  So to help you prepare for that, I invite you to this exercise:

Write code to do ten threaded runs of the basic Leibniz formula to calculate pi to nine correct digits (one billion iterations) in significantly less time than ten successive runs.

To play along, post your code as a comment or send it via email.  If you’d like, you can build atop this single-threaded Ruby implementation.

Last Week’s Fragment – Solution

This week, we complete a genetic algorithm solution to our recent pet store problem:

You’re given $100 and told to spend it all purchasing exactly 100 animals at the pet store, buying at least one of each animal.  Dogs cost $15, cats cost $1, and mice are 25 cents each.  Finish implementing our genetic algorithm solution by doing the natural selection and termination.
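
The solution code scores candidates with a getFitness() method whose internals the post doesn’t show.  Here’s one plausible scheme, sketched in Ruby (the exact penalty formula is an assumption): a purchase of exactly 100 animals costing exactly $100, with at least one of each, scores a perfect 100.

```ruby
# Hypothetical fitness for the pet store problem.  Prices come from the
# puzzle; the penalty scheme (distance from the two targets) is a guess.
DOG_PRICE, CAT_PRICE, MOUSE_PRICE = 15.0, 1.0, 0.25

def fitness(dogs, cats, mice)
  return 0.0 if [dogs, cats, mice].min < 1   # must buy at least one of each
  count = dogs + cats + mice
  cost  = DOG_PRICE * dogs + CAT_PRICE * cats + MOUSE_PRICE * mice
  100.0 - (count - 100).abs - (cost - 100.0).abs
end
```

The known solution of 3 dogs, 41 cats, and 56 mice scores 100 under this scheme.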

Natural selection can be done simply by sorting the current population by fitness and taking the top N individuals, like so:

     /**
      * Compare fitness levels for sorting.
      */
     public static function compareFitness($purchase1, $purchase2) {
         if ($purchase1->getFitness() == $purchase2->getFitness())
             return 0;
         return ($purchase1->getFitness() > $purchase2->getFitness()) ? -1 : 1;
     }
 
     /**
      * Select a subset of the given population for breeding a new generation.
      * Return this subset ordered by the most fit first.
      */
     public static function select($population, $size) {
         usort($population, array('PetPurchase', 'compareFitness'));
         return array_slice($population, 0, $size);
     }

And we terminate by breeding until we either find a perfect fit or hit a generation cap:

     /**
      * Find the most fit solution by breeding the given number of generations,
      * each with the specified population size.
      * Return an array with the most fit and the number of generations created.
      */
     public static function findMostFit($generations, $populationSize) {
         $population = self::createPopulation($populationSize);
         $mostFit = new PetPurchase();
         for ($i=0; $i<$generations; $i++) {
             $population = self::select($population, $populationSize);
             if ($population[0]->getFitness() > $mostFit->getFitness())
                 $mostFit = $population[0];
             if ($mostFit->getFitness() == 100)
                 break;
             $population = self::breed($population);
         }
         return array($mostFit, $i);
     }
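
findMostFit() also depends on breed() and createPopulation(), which the excerpt doesn’t show.  Here’s a hedged Ruby sketch of the breed step, treating each individual as a [dogs, cats, mice] gene array (uniform crossover and the mutation details are assumptions, not necessarily the actual operators):

```ruby
# Hypothetical breed step for the genetic algorithm: pair up survivors,
# cross their genes over uniformly, and occasionally mutate a gene.
def breed(population, mutation_rate: 0.05)
  offspring = []
  population.each_slice(2) do |a, b|
    b ||= a                        # an odd individual out pairs with itself
    2.times do
      child = a.zip(b).map { |x, y| rand < 0.5 ? x : y }  # uniform crossover
      offspring << child.map { |g| rand < mutation_rate ? rand(0..100) : g }
    end
  end
  population + offspring           # parents survive to the next selection
end
```

Keeping the parents in the pool means selection can never lose the best individual found so far.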

My completed PetPurchase class is here.  I also coded a Ruby version here.  Here’s an example run:

      $result = PetPurchase::findMostFit(20000, 30);
      $result[0]->printMe();
      echo $result[1].' generations';

With a result:

Dogs  Cats  Mice  Count  Cost     Fitness
   3    41    56    100  100.000  100.000

1510 generations

As in this case, genetic algorithms often aren’t the fastest or most efficient way to get a result.  They’re really a (slightly) directed random walk, and they’re not good for well-understood solutions that don’t justify the overhead of a genetic approach.  But for cases where fitness is well-understood but the solution isn’t, genetic algorithms can be a valuable tool.

Hopefully this simple example clearly demonstrates the evolutionary programming approach behind cool things like BoxCar 2D.

Threadspeed

Among all the measures used to compare languages and platforms, parallel processing doesn’t often top the list.  True, in most programming tasks, multiprocessing is either handled under the hood or is unnecessary.  Machines are usually fast enough that the user isn’t waiting, or the thing being waited on (such as I/O) is external, so asynchronous call-outs will do just fine.

But occasionally you really have to beat the clock, and when that happens, good multi-thread support becomes a must-have.  Often in this case, green threads don’t go far enough: you really need native OS threads to make the most of those CPU cores just sitting there.

Such was the case last night with a data load task.  This code of mine does its share of I/O, but also significant data munging over some large data sets.  Running it against a much smaller subset of the data required nearly 279 seconds: not a good thing.  Fortunately, I had coded it in Ruby, so splitting up the work across multiple worker threads was fairly easy.  Elapsed run time of the threaded version: 294 seconds.  Oops, wrong direction.
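
The split itself is the standard partition-per-worker pattern; a minimal sketch (munge is a hypothetical stand-in for the real data transformation, which isn’t shown):

```ruby
# Partition the records and hand each worker thread its own slice.
# Under the standard VM's green threads this won't speed up CPU-bound
# work (the point of this post); under JRuby's native threads it
# actually runs in parallel.
def munge(record)
  record * 2                        # placeholder for the real munging
end

def process_in_parallel(records, workers: 4)
  results = Array.new(workers) { [] }
  slice   = (records.size / workers.to_f).ceil
  threads = records.each_slice(slice).each_with_index.map do |chunk, i|
    Thread.new { results[i] = chunk.map { |r| munge(r) } }
  end
  threads.each(&:join)
  results.flatten(1)                # indexed slots keep the original order
end
```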

The standard Ruby runtime uses green threads, and I could see from Task Manager that only one CPU was busy.  So my guess is that all the threading did was add internal task-switch latency to a single-threaded, CPU-intensive process.

But JRuby now uses native threads, so I downloaded the latest version and ran my code (unmodified!) against it.  Elapsed time: 20 seconds, keeping all 4 CPUs busy during that time.  Just wow. Since that’s far better than a 4x improvement, I suspect JRuby is generally faster across the board, not just for threading.

I believe dynamic languages are superior to static ones, but too many dynamic language environments lack robust native thread support.  Fortunately, JRuby has native threads, giving a quick path to parallel programming and threadspeed when it’s needed.