Brian Foote, University of Illinois
In Praise of "Cut 'n' Paste"
"Cut 'n' Paste", the practice of creating a new software component or application by making a source copy of an existing body of code and "beating on it" mercilessly until it meets some new set of requirements, is often treated with disdain and contempt by the traditional software engineering community and (post-)modern object-oriented methodologists alike. Despite this, the ubiquity and enduring popularity of this approach begs the question: What are these people doing right?
Evidence suggests that cut 'n' paste is a nearly essential, and nearly universal, practice when new framework applications are constructed, be they for GUI code, Eclipse plug-ins, or what have you. Adaptations to complex applications such as compilers often demand an incremental, test-as-you-go approach as well.
The advantages to this approach, naturally, are the complements of those associated with factoring out duplication. A copy provides a safe sandbox off to the side of a vital development stream. Changes made to copies cannot disrupt production code, and like duplicate genes, copies can evolve new functionality without jeopardizing existing functionality in the original. Mere copies can avoid, for a time, the intricacies associated with engineering away duplicate functionality.
Indeed, expedient code copying has long been observed to be characteristic of an early, exploratory phase in the evolution of object-oriented components and systems, but its virtual indispensability to framework developers has been less widely acknowledged.
Big Bucket of Glue
Back in 1994, a small group of undergraduates toiling in a round, windowless room at the National Center of Computing Applications at the University of Illinois at Urbana-Champaign produced what was certainly the decade's, and perhaps the century's most revolutionary application: Mosaic. It's praises have by-and-large been duly sung, but one of its lesser known, and most remarkable attributes was its architecture: it is was what I call a "Big Bucket of Glue". The Mosaic application actually contained very little new code. It was, instead, constructed of a thin veneer of C adhesive code that bound together a rag-tag collection of existing applications. It was a harbinger of things to come.
With the increasing popularity of scripting languages with names beginning with letters like "P", this style / school of design has become one of our current era's most frequently deployed systems level architectures. Drawing on an increasingly broad, if not rich legacy of existing applications and components of all sizes, shapes and varieties, architects increasing slather on the glue, and ship these "Big Buckets of Glue" while their competitors are still inking high-road cartoons on their whiteboards.
While the "Big Ball of Mud" has remained the design of choice for application developers, the "Big Bucket of Glue" is increasingly the weapon of choice for system's level integrators. Indeed, the wildly popular Eclipse platform can be seen as a collection of black-box Java components bound together with reflective XML glue.
The End-to-End Principle and Programming Language Design
Earlier this year, I finally came across one of the greatest "classic" papers I'd never read: "End-to-End Arguments in System Design" by Saltzer, Reed, and Clark. The end-to-end argument was developed in the realm of internet protocol design. Succinctly put (and this format allows for no alternative) it argues that facilities, such as error checking, which one might be tempted to place at the lower levels of a system or network, might actually be redundant, and therefore useless as the lower levels, when they must be included in the upper levels of the system as well.
We argue that this argument applies as well to programming language design as it does to networking. For instance, facilities that some languages provide for security, protection, synchronization, namespace management, or type checking must often be re-implemented in applications programs in a variety of different ways, many at odds with the manner in which these facilities were provided in these underlying languages. Such duplication raises the question of why these now redundant facilities need be present in these programming languages at all.
This argument suggests a simpler, more dynamic approach to programming language design, based on an end-to-end analysis of the kinds of facilities users really use, may be better suited to the demands of our clustered, networked, threaded multi-core 21st world.