Clever ideas that failed – Software Engineering Tips

The cleverness of an idea is proportionate to its odds of failure. Brian Kernighan once said that “debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it.” I have built many things that I wasn’t smart enough to debug, so now I record them to remind myself not to get carried away with something for the sake of being “clever”. A few of my best disasters are below.

1. The distributed scheduled job system with process migration

The problem: Lots of small housekeeping jobs that had to run periodically.

The idea: Write each job as a .NET assembly, use Attributes to flag methods as either “Generators” or “Processors”, execute the “Generator” on the machine closest to the source of the data, then migrate the process and dataset to the machine closest to where the data would be used and run the “Processor” there. For example, the “Generator” method was executed on the database server to get a mailing list, and then the “Processor” was run on the mail server to send newsletters.

Icing on the cake: Applications that required housekeeping tasks could include the task’s code in their own binary, and the scheduler would use .NET reflection to spelunk through everything installed on the machine to find them.
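The tag-and-discover mechanism can be sketched roughly as follows. This is a Python analogy rather than the original .NET code: decorators stand in for Attributes and the inspect module stands in for reflection. The names generator, processor, discover_jobs and the newsletter job are all invented for illustration.

```python
import inspect
import types


def generator(func):
    """Tag a function as the stage that produces the dataset (hypothetical tag)."""
    func.job_role = "generator"
    return func


def processor(func):
    """Tag a function as the stage that consumes the dataset (hypothetical tag)."""
    func.job_role = "processor"
    return func


def discover_jobs(module):
    """Scan a module and group the names of tagged functions by role."""
    found = {"generator": [], "processor": []}
    for name, obj in inspect.getmembers(module, inspect.isfunction):
        role = getattr(obj, "job_role", None)
        if role in found:
            found[role].append(name)
    return found


# Stand-in for an application binary that embeds its own housekeeping job.
newsletter = types.ModuleType("newsletter")


@generator
def build_mailing_list():
    return ["a@example.com", "b@example.com"]


@processor
def send_newsletters(addresses):
    return [f"sent to {address}" for address in addresses]


newsletter.build_mailing_list = build_mailing_list
newsletter.send_newsletters = send_newsletters

jobs = discover_jobs(newsletter)
```

Even in this toy form the cost is visible: nothing in the application says where or when send_newsletters runs, which is the “hiding code” sin listed below.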

Number of implementations written from scratch: 2. The first was completely distributed and used Multicast IP to synchronize each redundant slave. The second used a central dispatcher that controlled slave machines with WCF.

Current status: The scheduler and dispatcher program (called “Master Control Program”) has to be restarted every couple of days because of a race condition that I am too stupid to debug. The slave service (called “Sark”, see if you catch the reference) that takes jobs from the MCP and runs them on the “Game Grid” has a memory leak that makes it gobble up to 4 GB of RAM after a few days. The profiling code that was supposed to discover the best place to run each stage of a job by statistical analysis was never completed. Early experiments with profiling indicated that the scheduler would always favor the machine with the fastest CPU anyway.

What I should have done: Just write small console apps for each job and run them from cron/Windows Scheduled Tasks. My idea was a classic case of over-thinking the problem. There was no benefit to be had from running a “Generator” on the machine closest to the data and the “Processor” on the machine closest to where it’d be used.
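For contrast, here is a minimal sketch of what one such job could look like as a standalone script. It is written in Python rather than .NET, and the job itself (purging expired sessions), the function names and the sample data are all assumptions made up for illustration.

```python
import time


def purge_expired(sessions, now):
    """Return only the sessions whose expiry timestamp is still in the future."""
    return {sid: expires for sid, expires in sessions.items() if expires > now}


def main(now=None):
    # Use the wall clock unless a timestamp is supplied, which keeps the job testable.
    now = time.time() if now is None else now
    # Stand-in data; a real job would read from its own data source.
    sessions = {"a": 100, "b": 200, "c": 300}
    kept = purge_expired(sessions, now)
    print(f"kept {len(kept)} of {len(sessions)} sessions")
    return kept


if __name__ == "__main__":
    main()
```

Scheduling is then one line in the crontab, for example `15 3 * * * /usr/bin/python3 /opt/jobs/purge_sessions.py` (the path is hypothetical), or an equivalent Windows Scheduled Tasks entry. There is no dispatcher to restart and no service left running to leak memory.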

Sins committed:

Premature optimization
False economy
Hiding code
Writing my own Scheduled Tasks service
Writing my own platform

2. The Cache-scalar-values-from-the-database-but-auto-refresh-them class

The problem: We stored lookup tables in a database for values that changed infrequently, and I believed it would improve performance if we cached their values on the client.

The idea: Create a class that wrapped a single scalar value along with the table, row and column it came from. Each instance auto-subscribed to a singleton “Manager” class that polled the database every X seconds for changes, and wrote the value back to the database whenever the programmer assigned a new value to the wrapper.

Current status: Written, unit tested, debugged, and completely unused.

What I should have done: Just populate a DataTable and be done with it. Even better, just store those lookup values in a formatted, version-controlled text file.
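The text-file option might look something like the sketch below. It assumes JSON as the file format and invents the file name, the table names and their values; the point is only that the lookups are read once at startup, with no polling and no manager class.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical contents of a lookups.json file kept under version control.
SAMPLE = """{
  "lease_status": {"A": "Active", "T": "Terminated"},
  "payment_method": {"1": "Cheque", "2": "Direct debit"}
}"""


def load_lookups(path):
    """Read every lookup table from one JSON file."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


# Write the sample to a temporary file so the sketch is runnable as-is.
with tempfile.TemporaryDirectory() as tmp:
    lookup_file = Path(tmp) / "lookups.json"
    lookup_file.write_text(SAMPLE, encoding="utf-8")
    LOOKUPS = load_lookups(lookup_file)

print(LOOKUPS["lease_status"]["T"])  # prints "Terminated"
```

A change to a lookup value then becomes an ordinary commit that can be reviewed and diffed, instead of something discovered by polling.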

Sins committed:

Using the Singleton pattern
“Magic” functionality
Polling a database for seasonal changes

3. Teh Ultimate Data Migration/Manipulation Tool

The problem: We had an upcoming project to replace a legacy app, and we’d have to migrate data from one to the other.

The idea: Define an API that abstracted data and the methods to transform it from one structure to another, implement the API in any program that had its own persisted data structure, then write a Universal Migration Tool that loaded these implementations dynamically by inspecting .NET assemblies.

Icing on the cake: Along the way I somehow decided that “migrating data to the printer” was a worthwhile abstraction, and mandated that all programs that had printed output had to implement the API.

Current status: Finally finished stripping this cockamamie dependency out of all the programs it unjustly contaminated.

What I should have done: Write a standalone migration tool for just one pair of systems at a time. Defining a mapping between two different systems is hard even when the information you want to migrate is the same in kind, so abstracting the concept of migration into a neat little API is harder still. Yet I went ahead and defined the API before the new app had even been developed, so I didn’t know what both sides of the problem were. Then trying to abstract printed reports into this scheme just made it reach out and slime some unrelated projects that should never have been involved.
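A one-pair migration can be as plain as a single mapping function, written once both schemas actually exist. The sketch below is illustrative only: the legacy column names (FNAME, LNAME, EMAIL, STATUS), the NewTenant record and the sample rows are all invented.

```python
from dataclasses import dataclass


@dataclass
class NewTenant:
    """Record shape in the (hypothetical) replacement system."""
    full_name: str
    email: str
    active: bool


def migrate_tenant(legacy_row):
    """Map one legacy row (a dict of column name to value) to the new record."""
    full_name = f"{legacy_row['FNAME'].strip()} {legacy_row['LNAME'].strip()}"
    return NewTenant(
        full_name=full_name,
        email=legacy_row["EMAIL"].strip().lower(),
        # Assumption: the legacy app used "A" to mean an active tenant.
        active=legacy_row["STATUS"] == "A",
    )


legacy_rows = [
    {"FNAME": "Ada ", "LNAME": "Lovelace", "EMAIL": "ADA@EXAMPLE.COM", "STATUS": "A"},
    {"FNAME": "Alan", "LNAME": "Turing", "EMAIL": "alan@example.com", "STATUS": "X"},
]
migrated = [migrate_tenant(row) for row in legacy_rows]
```

Every quirk of the legacy data gets handled in one visible place, and the whole tool can be thrown away once the migration is done.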

Sins committed:

Architecture astronomy
Conglomeration of unrelated functions
Defining an API before the problem was known

4. Replace “Undo” with time travel

The problem: A property-management application had time as a major data type, making it something like a fusion of calendar and accounting package. Or at least that’s how I saw it.

The idea: A major UI element was the date, which was editable. By default it showed the current date, but if you changed it to any time in the past or the future the UI would display an editable version of the system’s state as it was then or was predicted to be. If you wanted to undo something you’d already entered then you’d revert the date to that time and delete it there. If you wanted to terminate a lease at the end of its term you’d set the date to the future and delete the lease there. The concept of “undo” and “what if” would both be put under the same abstraction.

Icing on the cake: I brought this abstraction down to the database level, writing dozens of PostgreSQL functions to return computed table views of the past or the future.

Current status: Abandoned “beta”

What I should have done: As Sarah Connor would say: the future is what we make it, and it’s not a good idea to leave it to the machines. There’s nothing wrong with providing simulation or prediction tools in an accounting package as long as the UI is unmistakably explicit about what you’re looking at and edits made to those predictions don’t affect the real data. Also, “Undo” is not the same concept as going back in time and terminating an unwanted lease/paragraph/mother of the resistance leader/etc. One of the problems with the application was that it tried to predict what your income would be if you set the date to the future, but that isn’t what happens in real life; income is affected by a thousand different things that happen in the present and are not predictable. That’s why we have accounting software in the first place.
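One way to keep predictions separate from fact is to run every “what if” against a labelled copy of the data. This is a sketch under assumptions, not how the application worked: the Ledger class, the is_projection flag and the sample leases are all hypothetical.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Ledger:
    leases: dict = field(default_factory=dict)  # lease id -> monthly rent
    is_projection: bool = False  # the UI would key its warnings off this flag

    def monthly_income(self):
        return sum(self.leases.values())


def what_if(ledger, terminate=()):
    """Return a labelled copy with some leases removed; the original is untouched."""
    scenario = copy.deepcopy(ledger)
    scenario.is_projection = True
    for lease_id in terminate:
        scenario.leases.pop(lease_id, None)
    return scenario


real = Ledger(leases={"unit-1": 1200, "unit-2": 950})
scenario = what_if(real, terminate=["unit-2"])

print(real.monthly_income(), scenario.monthly_income())  # prints "2150 1200"
```

Terminating a lease for real would then be a separate, explicit operation on the real ledger, not a side effect of browsing a projected date.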

What I’m proud of: Predicting the user interface for Apple’s “Time Machine” six years earlier.

Sins committed:

Paradigm shifts without a clutch
Conflating predictions with fact
Extending the database beyond storage