Monthly Archives: August 2010

Going to Xtreams

Perhaps a blog named “WriteStreams of Consciousness” has an implied obligation to cover I/O streams.  And since streams are among the most essential programming building blocks, I’m always interested in better mousetraps there.

Smalltalk is home to perhaps the first truly elegant streams implementation, particularly when compared to other approaches developed around that time, such as those in C and C++.  You can’t get much simpler than 'myfile.txt' asFilename readStream contents to open and read a file, and yet there’s significant power in the classes behind that.  But new designs for stream libraries have since followed, including pluggable/chainable I/O stream architectures such as those found in Java and C#, and even advanced parallel stream processing frameworks.

Not to be outdone, Michael Lucas-Smith and Martin Kobetic have recently developed a new pluggable stream framework for Smalltalk called Xtreams, and I couldn’t resist giving it a try.  In the authors’ own words,

Xtreams is an abstract producer/consumer pipeline over arbitrary source and destination types… you get a unified API for accessing files, sockets, pipes, strings, collections and many many other kinds of things.

It’s a Google Code project, but the code is in the Cincom Public Repository: just three mouse clicks away after downloading and starting VisualWorks.  It consists of several packages as described on the project page, along with SUnit test cases.  I loaded the base set of packages and walked through the examples in Michael’s basic primer and in the various package comments, while reading the wiki documentation.

Immediately, I found the basic API improvements (over conventional streams) refreshing.  But the real beauty lies in “stacking” streams, collections, and even block closures.  For example, this stacks an encoding stream atop a write stream atop a byte array to convert your string to UTF-8 encoded bytes:

(ByteArray new writing encoding: #utf8) write: 'Hello'; conclusion

Or, to do Base-64 encoding (for sending binary data as text):

String new writing encodingBase64 write: myBytes
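
For readers outside Smalltalk, here is a rough Python analogue of the two encoding stacks above. It is only a sketch of the layering idea using Python's standard io and base64 modules, not a port of Xtreams; the sample string and bytes are my own.

```python
import base64
import io

# Stack a UTF-8 text encoder on top of an in-memory byte buffer,
# roughly mirroring "encoding stream atop write stream atop byte array".
raw = io.BytesIO()
text_layer = io.TextIOWrapper(raw, encoding="utf-8")
text_layer.write("H\u00e9llo")   # sample text with one accented letter
text_layer.flush()               # push the encoded bytes down to the buffer
utf8_bytes = raw.getvalue()

# Base-64: binary data in, ASCII text out.
b64_text = base64.b64encode(b"\x00\xff\x10").decode("ascii")

print(utf8_bytes, b64_text)
```

The flush matters here: the text layer buffers, so the bytes only reach the underlying buffer once it is flushed or closed.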

For arbitrary transformations, you can provide your own transforming: block, as in this example:

"Multiply each pair of input elements together and return the result"
((Array withAll: (1 to: 20)) reading transforming: [ :in :out | out put: in get * in get ]) rest
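
To make the shape of that transform concrete, here is a small Python generator that does the same pairwise multiplication. The function name pair_products is hypothetical, and stopping quietly on an odd leftover element is my own choice for the sketch, not something taken from Xtreams.

```python
def pair_products(source):
    """Consume two items at a time from source and yield their product."""
    it = iter(source)
    for first in it:
        try:
            second = next(it)
        except StopIteration:
            return  # odd element left over: stop (a sketch-level decision)
        yield first * second


# Same input as the Smalltalk example: the integers 1 through 20.
products = list(pair_products(range(1, 21)))
print(products)
```

As in the Smalltalk block, the transform decides for itself how many input elements to pull for each output element.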

And why limit ourselves to strings and arrays on the inside (as terminals)?  Here’s streaming over a collection:

(1 to: 10000) reading ++ 1000; read: 5

… and a block closure:

| a b | a := 0. b := 1.   "Fibonacci"
[ | x | x := a. a := b. b := x + a. x ] reading read: 20
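
The last two sources (a collection and a block closure) have loose Python counterparts in ranges and generators. This sketch assumes that ++ 1000 skips 1000 elements before the read; fib_source is a hypothetical name.

```python
from itertools import islice


def fib_source():
    """Endless Fibonacci source, playing the role of the block closure."""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b


# Skip 1000 elements of 1..10000, then read the next 5.
skipped = list(islice(range(1, 10001), 1000, 1005))

# Read the first 20 values from the closure-like source.
fibs = list(islice(fib_source(), 20))

print(skipped)
print(fibs)
```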

Stacking sources, terminals, and transforms: it’s nearly as much fun as the Mousetrap or Incredible Machine games, but without the Rube Goldberg chunkiness.

I only scratched the surface in playing around, but it’s a nice package, which is being further refined and extended.  If you’re a VisualWorks Smalltalker, give it a try.

While Smalltalk is an important incubator for significant technology innovation, it’s certainly not a “by the masses” language.  So if you’re hesitant to jump into a Smalltalk image and code away, just read the summary of what Xtreams does and ask, “can my streams framework do that?”  Hopefully we’ll soon see this kind of power and design elegance in other languages.

DB2fer

Code page issues have become unexpectedly common now that recent versions of DB2 LUW default to UTF-8 / 1208 for XML data type support.  In recent days, two separate projects hit errors like the following:

SQL0302N The value of a host variable in the EXECUTE or OPEN statement is out of range for its corresponding use.

The root cause was that, with code page 1208, certain extended ASCII values were each converted from one byte to two: “two bytes for the price of one.”  This stretched data overflowed columns sized to expect one byte per character.
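
You can see the stretching outside the database with a few lines of Python. This is only an illustration using Python's cp1252 and utf-8 codecs; the sample characters are my own picks. Note that a handful of 1252 characters, such as the euro sign, stretch to three bytes rather than two.

```python
# Characters chosen for illustration: plain ASCII, two accented letters,
# and the euro sign.
samples = ["A", "\u00e9", "\u00fc", "\u20ac"]

sizes = {}
for ch in samples:
    single = len(ch.encode("cp1252"))   # bytes under code page 1252
    multi = len(ch.encode("utf-8"))     # bytes under UTF-8 (1208)
    sizes[ch] = (single, multi)
    print(f"U+{ord(ch):04X}: cp1252={single}, utf-8={multi}")
```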

So if you get SQL0302N or similar errors, you can easily check the code page with: db2 get db cfg for DBNAME | grep -i code (or, on Windows, db2 get db cfg for DBNAME | find /i "code").  The quick fix is to specify a code page like 1252 during database creation: create database DBNAME using codeset ibm-1252 territory us.  I do not recommend changing the db2codepage registry variable for this problem.

However, code page 1252 prevents you from using XML data types.  So if this is an issue, there are at least two other options:

  • If the data you’re storing is really binary data, define the column with for bit data.  No codepage conversion will occur, and the data will typically come back to your application as raw bytes, not encoded strings.
  • Expand the size of the column to accommodate some “two for one” conversions.  Only a few extended ASCII characters get this conversion, but unless you make the column at least twice as large, you are taking a managed risk on how many of them will show up in a given value.
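
If you go the second route, one way to size the risk is to measure sample data before resizing. A minimal sketch, assuming you can pull representative values into a list; overflowing_values and the sample names are hypothetical.

```python
def overflowing_values(values, column_bytes, encoding="utf-8"):
    """Return (value, encoded_length) pairs that exceed column_bytes once encoded."""
    too_long = []
    for value in values:
        encoded_len = len(value.encode(encoding))
        if encoded_len > column_bytes:
            too_long.append((value, encoded_len))
    return too_long


# Made-up sample data checked against a column sized for 10 bytes.
sample = ["Smith", "Zo\u00eb M\u00fcller", "O'Brien"]
report = overflowing_values(sample, column_bytes=10)
print(report)
```

Here the second name is exactly 10 characters, so it fits a 10-byte column under code page 1252 but needs 12 bytes once encoded as UTF-8.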

Where Was I?

I hate meta-posts (things like blogging about blogging).  Here comes one anyway.

A few astute readers noticed the sound of silence here: that I have neglected this blog for a month now.

Where was I?

In a word: soccer.  Club soccer is in full swing now for my two kids, with practices four nights a week, team manager duties, and pre-season tournaments every weekend (gone are the good old days when August was still a summer break month).  Family activities and summer reading filled the remaining free hours nicely.  Blog-worthy things continued to happen, but lost was the time in the evenings and weekends to distill and post them.

Where was I?

As I recall, this blog is at least partly about technical problems I encounter and their solutions.  Yet absence makes the heart grow forgetful, especially against the backdrop of a (justified) social media pendulum swing.  Leo Laporte’s Twitter feed stopped working and no one noticed.  Frank Ryan drove off a cliff tweeting while driving.  Matt Richtel’s head hurts.  And I didn’t feel so good (about social media) myself.

Many questions answer themselves just by being asked.  Does the entire intertube and Library of Congress need to know what I’m having for dinner?  Ya’ think?  I quickly abandoned Twitter and cut back my Facebook posts because of my low signal-to-noise ratio.  Perhaps it’s just that, in Proverbs 10:19 fashion, as we get older, we “edit ourselves” more (yes, knowledge speaks, wisdom listens).  But, taken in proper measure, there is value in the “reality web.”

For example, I learn about cool things done by far-flung friends.  I get meaningful technology details that never make it to those one page summaries on corporate sites or in glossy print.  Even subtle, simple things like Garmin streams from fellow runners provide balance and motivation.  And when googling for references and fixes, the best ones now typically come from blogs.  Who needs you, Experts Exchange?

Perhaps it’s that debt of gratitude that motivates most.  After gleaning so much helpful information from other blogs and streams, justice demands giving back a little.  And the emails I get in response to even the most unlikely posts help.  So I continue, knowing that WriteStreams sometimes are read.