Tag Archives: PHP

The Pareto Lamp

I was asked over the Thanksgiving break to create an online store for soccer fundraising.  This planned Christmas Catalog was behind schedule and needed something fast.  With turkey on deck, there was no time to code anything from scratch, so I jumped into Softaculous and installed the highly-rated OpenCart. Configuration was quick, and I soon had the 25 product store online and ready for business.

It can be just that easy in the LAMP Stack world.

That’s largely because most of what we do on the web has already been done countless times before. There’s rarely good reason to code a custom solution when open source options are readily available. That’s a very good thing, because an occupational hazard for code monkeys like me is that we often get asked to help with “computer things” like this.  We like to be able to knock out these common problems quickly and reserve most of our time for the uncharted territories that require more invention and programming horsepower.

I did have to browse the OpenCart source, but that was only to answer my questions about how to configure some things (like sales tax) that didn’t work quite right at first.  Since it follows a familiar MVC structure, navigating the PHP was straightforward.  In the end, I didn’t have to write a single line of code.

Nearly 80% of the server-side web is written in PHP.  And certainly less than 20% of overall programming work goes toward maintaining it.  I’m thankful for the LAMP stack and the Pareto principle that powers it, and will keep OpenCart and other ready Softaculous solutions handy.

The Friday Fragment

It’s Friday, and time again for the Friday Fragment: our weekly programming-related puzzle.

This Week’s Fragment

It was my Friday lunch break, and since I wanted get a problem with a recent online order corrected before the weekend, I called the merchant’s 800 number.  After a long hold time, I finally got a representative on the phone who, horror of horrors, was a mathematics PhD working a call center job.  As I cranked up the phone volume, she told me she was swamped with calls and the correction I needed was going to take awhile.  She’d prefer I just go away.

“Can you call back later?” she asked.
“Sure. When?”
“In a while.”
“How long?”
“A while.”
“Can you give me an idea?” I asked.
“Tell ya’ what.  Calculate pi using the Leibniz formula to nine correct digits.  Use the basic algorithm with no series accelerations or transforms.  Do it 10 times.  When you’re done, call me back.”
“OK,” I mumbled and hung up.

What to do?  The coding part was easy, and there are even Leibniz implementations all over the net.  But nine correct digits would take good big number support and a billion iterations of basic Leibniz.  Ten successive runs of that would go well past the end of my lunch break.

Fortunately, I have a quad core machine, had seen first-hand the benefits of threading things such as this, and even wrote about it just yesterday.  So I wrapped some threads around a basic implementation, ran it, and called back.  Problem solved.

You never know when you, too, might have a math PhD on the other end of a customer service call.  So to help you prepare for that, I invite you to this exercise:

Write code to do ten threaded runs of the basic Leibniz formula to calculate pi to nine correct digits (one billion iterations) in significantly less time than ten successive runs.

To play along, post your code as a comment or send it via email.  If you’d like, you can build atop this single-threaded Ruby implementation.

Last Week’s Fragment – Solution

This week, we complete a genetic algorithm solution to our recent pet store problem:

You’re given $100 and told to spend it all purchasing exactly 100 animals at the pet store, buying at least one of each animal.  Dogs cost $15, cats cost $1, and mice are 25 cents each.  Finish implementing our genetic algorithm solution by doing the natural selection and termination.

Natural selection can be done simply by sorting the current population by fitness and taking the top N individuals, like so:

     /**
      * Compare fitness levels for sorting.
      */
     public function compareFitness($purchase1, $purchase2) {
         if ($purchase1->getFitness() == $purchase2->getFitness())
             return 0;
         return ($purchase1->getFitness() > $purchase2->getFitness()) ? -1 : 1;
     }
 
     /**
      * Select a subset of the given population for breeding a new generation.
      * Return this subset ordered by the most fit first.
      */
     public static function select($population, $size) {
         usort($population, array('PetPurchase', 'compareFitness'));
         return array_slice($population, 0, $size);
     }

And we terminate by breeding until we either find a perfect fit or hit a generation cap:

     /**
      * Find the most fit solution by breeding the given number of generations,
      * each with the specified population size.
      * Return an array with the most fit and the number of generations created.
      */
     public static function findMostFit($generations, $populationSize) {
         $population = self::createPopulation($populationSize);
         $mostFit = new PetPurchase();
         for ($i=0; $i<$generations; $i++) {
             $population = self::select($population, $populationSize);
             if ($population[0]->getFitness() > $mostFit->getFitness())
                 $mostFit = $population[0];
             if ($mostFit->getFitness() == 100)
                 break;
             $population = self::breed($population);
         }
         return array($mostFit, $i);
      }

My completed PetPurchase class is here.  I also coded a Ruby version here.  Here’s an example run:

      $result = PetPurchase::findMostFit(20000, 30);
      $result[0]->printMe();
      echo $result[1].' generations';

With a result:

Dogs Cats Mice Count Cost Fitness
3 41 56 100 100.000 100.000

1510 generations

As in this case, genetic algorithms often aren’t the fastest or most efficient way to get a result.  They’re really a (slightly) directed random walk, and they’re not good for well-understood solutions that don’t justify the overhead of a genetic approach.  But for cases where fitness is well-understood but the solution isn’t, genetic algorithms can be a valuable tool.

Hopefully this simple example clearly demonstrates the evolutionary programming approach behind cool things like BoxCar 2D.

SMS

Whenever I serve as coach or team manager, I find it helpful to send out text messages for quick, last minute updates or reminders. To do this, I create email distribution lists with addresses based on the SMS gateways that each carrier supports (for example, Verizon’s is vtext.com).  Each new season brings new cell phone numbers that must be converted to email addresses.

The only challenge is determining which carrier “owns” a given cell phone number.  I’ve used the brute-force approach of trying all carriers for each number, but it’s much better to nail down the right one to begin with.  At times, I’ve asked folks to tell me their carriers, but that’s somewhat a pain.  Lately, I’ve relied on web sites like fonefinder.net to do the lookup for me.  Due to local number portability (LNP) and carrier changes, this is imperfect, but it’s close enough, and the rejected emails can be tweaked as needed.

I wrote a quick front-end to Fone Finder to convert a batch of phone numbers.  If you’re interested, check it out.  Due to the large number of crawlers and form spammers that hit this blog site, I’ll probably disable it soon, but feel free to grab the source code and run it yourself.

Rewire, then Plugin

I thought it would be nice to add a Nike+ widget to this blog, so I did a bit of searching.  It didn’t take long to find the cool Nike+ WordPress Plugin by Mark Rickert.  However, upon installing and configuring it, I found it failed during the curl authentication calls with:

Can not retrieve data from Nike.com.
Error: there is no user information in the request

No worries, my Nike+ data is publicly accessible, so this type of authentication is really unnecessary and only slows the retrieval time. So I edited the plugin code (nikePlus.php) to change from using up-front authentication to passing my ID on each call.

While I was in there, I also did a bit of customization and cleanup: removed unimportant details, tweaked heading levels, trimmed some things to improve performance, etc.  But I deliberately kept it simple.  That’s the beauty of these plug-ins: they’re easy to customize and tune.  Scroll down to the “Nike+ Stats” and “Recent Runs” in the right-hand column and you’ll see it.

Hats off to Mark for a handy plugin which not only was a nice add to this site, but sparked my interest in possible further Nike+ coding.

The Friday Fragment

It’s Friday, and time again for the Friday Fragment: our weekly programming-related puzzle.

This Week’s Fragment

It’s NFL wildcard weekend, and analysts are all over the place in their predictions.  It’s clear who the top NFL teams are (Falcons, Pats), but less clear for teams with fewer wins, and when comparing teams that haven’t played each other in awhile.  There’s no RPI for football, but a dominance-directed graph (a.k.a., tournament graph) can provide some basic rankings.  Over the next two weeks, we’ll see if we can build such a ranking system.  Let’s start with the approach:

Propose an algorithm to use a tournament graph to compute power rankings.  Start with simple list of wins and losses (Team A beat Team B, Team B beat Team C, etc.) and propose a way to rank the teams.

To play along, write a description of the solution or pseudo code and post it as a comment or send it via email.  To avoid “spoilers”, simply don’t expand comments for this post.

Last Week’s Fragment – Solution

Last week’s puzzle continued with our new year’s calendar programming:

Write code to print out a month’s calendar given a month and year.

There are many implementations already available (including C code for the unix cal utility), but this was an opportunity to work it out for yourself.  I coded a Q&D web implementation in PHP; here’s the gist:

   STATIC $day_names = array('Sun', 'Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat');
   $days_in_month = date('t', mktime(0, 0, 0, $month, 1, $year));
   $day = 1 - date('w', mktime(0, 0, 0, $month, 1, $year));
   echo '<table border="1">';
   for ($i=0; $i<7; $i++) {
      echo '<tr border="1">';
      for ($j=0; $j<7; $j++) {
         if ($i==0)
            printf('<td border="1"><b>%s</b></td>', $day_names[$j]);
         else {
            printf('<td border="1">%s</td>',
                     ($day > 0 && $day <= $days_in_month) ? $day : '&nbsp;');
            $day++;
         }
      }
      echo '</tr>';
   }
   echo '</table><p>&nbsp;</p>';

I used the classic PHP date function tricks to get the days in month and starting day of week.  It’s running at http://derekwilliams.us/fragments/cal.php and the complete source code is here.

PHP Ecterators

“…When in the why and the wherefore is neither rhyme nor reason?”

Shakespeare, The Comedy of Errors II.2

When it comes to the common workhorse task of iterating over a collection, there are nearly as many approaches as there are languages. And these varied implementations often demonstrate the relative strengths or weaknesses of each platform.  In languages with dynamic typing and closures, iterating over collections is typically succinct and elegant. Lacking these benefits, you get code-bloat train wrecks like C++ iterators and templates.

Fortunately, good approaches often get reused. Both Ruby and Smalltalk have what I alone call the ecterators: those venerable iterator methods whose names, inspired by Alice’s Restaurant, end in “ect”: select, reject, collect, inject, and detect.  In a single line of code, you can apply a code block across a collection and get the result.  I often think in terms of the ecterators, so I miss them when working in a language that doesn’t have them.

Such was the case recently when working in PHP.  Modern versions of PHP have helpful functions like array_filter, array_reduce, and array_map, and with PHP 5.3’s lambdas/blocks and closures, you can create a Ruby-like feel.  But those non-rhyming names and inconsistent parameters just don’t do it for me.  You might say PHP lacks some reason and a lot of rhyme.  So I took a moment to write some simple alias functions, like so:

        function select($array, $callback) {
            return array_filter($array, $callback);
        };
        function collect($array, $callback) {
            return array_map($callback, $array);
        };
        function inject($array, $initial, $callback) {
            return array_reduce($array, $callback, $initial);
        };

So now I can write familiar code:

        $numbers = array(1, 2, 3, 4, 5);
        $even_numbers = select($numbers, function ($ea) { return $ea % 2 == 0; });
        $squares = collect($numbers, function ($ea) { return $ea * $ea; });
        $sum = inject($numbers, 0, function ($sum, $ea) { return $sum + $ea; });

There, reason and rhyme.

Defaulting Less Security

Today, a friend reported that one of the apps I provide as a community service was down.  Its WebCalendar component complained that magic_quotes_gpc was no longer enabled, which I quickly confirmed by dropping in a phpinfo() call.  The remedy was also quick and easy: add a local php.ini with:

magic_quotes_gpc = On

This automatically adds slashes to escape quotes and other characters in GET, POST, and Cookie strings (hence “gpc”).  The PHP code then removes these via stripslashes and similar techniques.

Magic_quotes_gpc is no longer considered a good way to guard against SQL injection attacks, so the PHP Security Consortium and others now recommend against it.  I suspect this is why my hosting service changed their global setting to off, but security-wise, that was a step in the wrong direction.

Many PHP apps that support magic quotes are coded to work even if it’s turned off, and the get_magic_quotes_gpc()  ? stripslashes template for doing this is seemingly everywhere.  Fortunately, WebCalendar checks for this, but many apps don’t.  Disabling magic quotes was probably done to force apps to change, but it’s more likely apps will continue to work and developers and admins won’t realize that they are suddenly far more susceptible to injection attacks.  A better approach would have been to just let it die with the 5.3 upgrade.

The Right Retool for the Job

Facebook’s publication of the HipHop transformer and runtime raised the question of whether PHP is really the right language for a web site that has scaled to massive volumes.   The same was asked of Twitter when parts of its Ruby code were rewritten in Scala.

Of course it is.

Proper language choice is based on “the best tool for the job.”  In the case of Facebook, Twitter, and many others like it, “the job” was to get a web site up fast and very quickly evolve it.  Without PHP, Facebook probably wouldn’t have been created.  Productivity, web-orientation, innovation, and even fun trumped raw performance, so PHP and Ruby won the day for those two sites.  No-one would code a web site in C++, just as no-one would code a real-time system in PHP.

If productivity and agility are your primary concerns, pick a language and platform that makes it fast and fun.  I recommend a dynamic, object-oriented language.  If it’s a web app, consider Seaside.  If productivity isn’t important but commodity skill sets and ubiquity are your primary concerns, you might choose Java: the COBOL of the 21st century.  Just don’t be a lemming, letting TPCI ranking become a deciding factor.

And don’t make the decision based on hearsay and perceptions of performance.  Today’s performance tuning and scaling tools (profilers, monitors, load balancers, and even static analyzers) make easy to identify the true bottlenecks and fix those, leaving the non-time-critical majority of the system as is.  Speculative performance tuning can be counter-productive, so if it ain’t broke, don’t fix it.

HipHop is a welcome addition to the PHP universe: the ease and productivity of PHP with the speed of C++.  Hopefully it will mean that PHP developers no longer have the specter of scalability issues hanging over them.  Now, if we could just get some overdue cleanup and deprecation (eliminating redundancy in favor of cleaner object-oriented design), life would be grand.  But that’s another story.

I’d like to try out HipHop, but frankly, I don’t need to.  With the possible exception of a CiviCRM site that I pushed to the limit, my PHP sites just don’t need additional horsepower, and I certainly don’t have to worry about reducing CPU utilization on a few thousand servers.  Obscurity isn’t always a bad thing.