How to get a night's sleep without shit happening

  • Train and 'splain
    Show other people how to restart servers and services, document common fixes bullet-point style and in large type, explain how things work, and make sure everyone's mobile phone gets alerts. More important than anything else is to braindump until they understand why and how. Then they stand a better chance of figuring it out on their own before they wake you up

  • Never create a service to do the same thing as Scheduled Tasks or Cron
    Your version will crash

  • Don't create Catch-n-Restart contraptions
    If your service is crashing because of a bug then don't create or use a contraption that detects a crash and restarts the service. It is better to just keep fixing bugs until it doesn't crash all the time anymore. That said, however...

  • Don't forget to set "Restart the Service"
    In the Windows Services console (services.msc), it's on the Recovery tab of the service's properties

  • Don't call it a backup server, call it a distributed platform
    If the "backup" server isn't in constant use, you'll never know it rusted up months ago. Hardware failures should be seen as reductions of capacity, not the start of a fail-over sequence. Nobody respects backup servers. A service isn't a service unless it can spread its load

  • Validate every input
    Even if it's coming from a trusted component. Especially if it's coming from a user. Favor whitelists over blacklists. Try to define what you're expecting, and preemptively reject anything that doesn't fit. This may mean lots of 2am calls in the beginning as you discover legitimate edge cases the hard way, but the long-term payoff is worth it. Use contracts or assertions. When in doubt, throw exceptions
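
    A minimal sketch of the whitelist idea, in Python. The field names and patterns below are made up for illustration; the point is only that the code spells out what it expects and throws on everything else.

```python
import re

# Hypothetical whitelist for an order message. The field names and patterns
# are illustrative assumptions, not taken from any real system.
ALLOWED_FIELDS = {
    "order_id": re.compile(r"[A-Z]{2}\d{6}"),
    "quantity": re.compile(r"[1-9]\d{0,3}"),
    "country": re.compile(r"US|CA|GB"),
}


def validate_order(fields: dict) -> dict:
    """Return the fields unchanged if they all match the whitelist, else raise."""
    unexpected = set(fields) - set(ALLOWED_FIELDS)
    if unexpected:
        raise ValueError(f"unexpected fields: {sorted(unexpected)}")
    missing = set(ALLOWED_FIELDS) - set(fields)
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    for name, pattern in ALLOWED_FIELDS.items():
        value = fields[name]
        # fullmatch rejects values with leading or trailing junk.
        if not isinstance(value, str) or not pattern.fullmatch(value):
            raise ValueError(f"bad value for {name}: {value!r}")
    return fields
```

    Calling validate_order() at every boundary, including the ones between your own components, is what turns a silent bad record into a loud exception.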

  • Obey the Pauli Exclusion Principle
    Do not mix devel, test, and production data together in the same store

  • Favor third-party caching
    Don't muck about with caching techniques within your application; they will make your code fragile. Move that out to a vendor or open-source component, such as by using REST-style services and instances of Squid, Memcached, and so on

  • Write to a temp path first, then move to the final location
    Use Path.GetTempFileName() and write to that, close it, then File.Move() it to its proper name and location
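
    The same write-close-move sequence, sketched in Python rather than with the .NET calls named above. The function name is hypothetical, and creating the temp file next to the destination (instead of in the system temp directory) is an assumption made here so the move stays on one volume.

```python
import os
import tempfile


def write_then_move(final_path: str, data: bytes) -> None:
    """Write data to a temp file, close it, then move it to final_path."""
    directory = os.path.dirname(os.path.abspath(final_path))
    # Assumption: temp file lives beside the destination so the final
    # rename never crosses volumes.
    fd, temp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
            handle.flush()
            os.fsync(handle.fileno())
        # The file is closed here; now give it its proper name.
        os.replace(temp_path, final_path)  # overwrites any existing file
    except BaseException:
        # Clean up the temp file if the write or the move failed.
        if os.path.exists(temp_path):
            os.remove(temp_path)
        raise
```

    Anything watching the final location sees either the old file or the complete new one, never a half-written file.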

  • Database connections are like cupboard doors
    You should close them as soon as you've got what you want. Most frameworks will do automatic connection pooling for you to reduce the overhead to almost nothing. Leaving them open gets you into all kinds of grief when transactions and DataReaders get entangled
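
    A small sketch of open-fetch-close, in Python. sqlite3 stands in for whatever database you really use, and the orders table and function name are assumptions for illustration.

```python
import sqlite3
from contextlib import closing


def count_orders(db_path: str) -> int:
    """Open a connection, fetch one value, and close before returning."""
    # closing() guarantees conn.close() runs even if the query raises.
    with closing(sqlite3.connect(db_path)) as conn:
        row = conn.execute("SELECT COUNT(*) FROM orders").fetchone()
    return row[0]
```

    The connection never outlives the function call, so nothing else can get tangled up in it.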

  • Don't stash database connections up your ass
    Your data model classes don't need their own connection, or even a handle to a big shared connection. Give them methods for importing DataRows, to either create a new instance or update the state of an existing instance
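
    One way this might look, sketched in Python with a plain mapping standing in for a DataRow. The Customer class and its columns are hypothetical.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass
class Customer:
    """Hypothetical model class: holds state only, never a connection."""
    customer_id: int
    name: str
    email: str

    @classmethod
    def from_row(cls, row: Mapping) -> "Customer":
        """Create a new instance from one fetched row."""
        return cls(row["customer_id"], row["name"], row["email"])

    def update_from_row(self, row: Mapping) -> None:
        """Refresh an existing instance from a newer copy of its row."""
        if row["customer_id"] != self.customer_id:
            raise ValueError("row belongs to a different customer")
        self.name = row["name"]
        self.email = row["email"]
```

    Whoever fetched the rows owns the connection and closes it; the model only ever sees data.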

  • Rotate your log files daily
    Use "Hostname.Date.txt"-style filenames, use a logging class that starts a new file every midnight, and organize them into separate directories per service or major function
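
    A bare-bones sketch of that naming scheme in Python. The function names are hypothetical, and a real logging class would do more; this only shows how the per-service directory and the Hostname.Date.txt name fall out of the current date.

```python
import datetime
import os
import socket


def log_path(root: str, service: str, when: datetime.date, host: str = "") -> str:
    """Build <root>/<service>/<Hostname>.<Date>.txt for the given day."""
    host = host or socket.gethostname()
    return os.path.join(root, service, f"{host}.{when.isoformat()}.txt")


def write_log(root: str, service: str, message: str) -> str:
    """Append one line to today's file; the first write after midnight starts a new one."""
    now = datetime.datetime.now()
    path = log_path(root, service, now.date())
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(f"{now.isoformat(timespec='seconds')} {message}\n")
    return path
```

    Because the path is recomputed on every write, rotation needs no timer or restart. Python's standard logging.handlers.TimedRotatingFileHandler with when="midnight" is a ready-made alternative, though it names its files differently.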

  • One program, one function
    Convergence is for printers and fax machines. If you have one program that both settles the credit card sale and issues refunds, then you'll get woken up at 2am because a bug in the refund code is preventing the company from settling

  • Use Queues
    Such as MSMQ or Amazon SQS. Throw all of your business events in the queue (orders, ship notifications, calendar appointments, etc). Have something pull notifications off the queue from within a transaction, so if the process crashes then the notification goes back into the queue to be tried again
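
    To show the pull-inside-a-transaction idea without needing MSMQ or SQS, here is a sketch in Python that uses a SQLite table as a stand-in queue. Everything named here (the table, open_queue, enqueue, process_next) is an assumption for illustration, not how either product's API works.

```python
import sqlite3


def open_queue(path: str) -> sqlite3.Connection:
    """Open (and create if needed) the stand-in queue table."""
    # isolation_level=None: we issue BEGIN/COMMIT/ROLLBACK ourselves.
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS queue (id INTEGER PRIMARY KEY, body TEXT NOT NULL)"
    )
    return conn


def enqueue(conn: sqlite3.Connection, body: str) -> None:
    """Add one business event to the queue."""
    conn.execute("INSERT INTO queue (body) VALUES (?)", (body,))


def process_next(conn: sqlite3.Connection, handler) -> bool:
    """Take the oldest message inside a transaction; False if the queue is empty."""
    conn.execute("BEGIN IMMEDIATE")
    try:
        row = conn.execute(
            "SELECT id, body FROM queue ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            conn.execute("ROLLBACK")
            return False
        conn.execute("DELETE FROM queue WHERE id = ?", (row[0],))
        handler(row[1])  # if this raises, the delete below is never committed
        conn.execute("COMMIT")
        return True
    except BaseException:
        # Undo the delete so the message is back in the queue for a retry.
        conn.execute("ROLLBACK")
        raise
```

    If the handler blows up, the message is still sitting in the queue the next time process_next() runs.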

  • Save everything that comes off the queue
    Serialize each message to plain, escaped-and-delimited text and save it to its own file. Either use a timestamp in the filename or some other unique identifier. Now write a program that can read a whole folder full of these files and re-submit them to the same queue for reprocessing
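
    A rough sketch of both halves in Python: saving each message to its own file, and the program that feeds a folder of them back in. JSON is swapped in here for the escaped-and-delimited text described above, and the function names and the enqueue callback are hypothetical.

```python
import json
import os
import time
import uuid


def save_message(folder: str, message: dict) -> str:
    """Serialize one message to its own text file and return the path."""
    os.makedirs(folder, exist_ok=True)
    # Timestamp plus a random suffix keeps names unique and roughly sortable.
    name = f"{time.strftime('%Y%m%dT%H%M%S')}-{uuid.uuid4().hex}.json"
    path = os.path.join(folder, name)
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(message, handle, sort_keys=True)
    return path


def resubmit_folder(folder: str, enqueue) -> int:
    """Re-send every saved message in filename order; return the count."""
    count = 0
    for name in sorted(os.listdir(folder)):
        if not name.endswith(".json"):
            continue
        with open(os.path.join(folder, name), encoding="utf-8") as handle:
            enqueue(json.load(handle))
        count += 1
    return count
```

    Pass resubmit_folder() whatever function puts a message on your real queue, and point it at the folder for the day that went wrong.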

  • Pipes are for command lines, not business processes
    When a business process has stages (fetch, import, validate, charge, print, ship, confirm, etc), then save state at each stage to a file or a database. Make each stage run as a scheduled task every 10 minutes, do its thing, and dump its results in another location for the next stage to pick up. Do not try to pipe everything around in memory; it will crash and you won't be able to recover what it dropped. Also, do not use a fancy trigger mechanism to initiate the next stage of a process; it will break and nobody will know how to nudge it
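
    One stage of such a process might be sketched like this in Python. The inbox/outbox folder layout and the run_stage name are assumptions; the scheduler (cron or Scheduled Tasks) is what would call it every 10 minutes.

```python
import os


def run_stage(inbox: str, outbox: str, transform) -> int:
    """Process every file waiting in inbox and hand the results to outbox."""
    os.makedirs(inbox, exist_ok=True)
    os.makedirs(outbox, exist_ok=True)
    done = 0
    for name in sorted(os.listdir(inbox)):
        source = os.path.join(inbox, name)
        if not os.path.isfile(source) or name.endswith(".tmp"):
            continue
        with open(source, encoding="utf-8") as handle:
            result = transform(handle.read())
        # Write under a temporary name so the next stage never sees half a file.
        partial = os.path.join(outbox, name + ".tmp")
        with open(partial, "w", encoding="utf-8") as handle:
            handle.write(result)
        os.replace(partial, os.path.join(outbox, name))
        os.remove(source)  # only after the result is safely in place
        done += 1
    return done
```

    If the stage dies halfway, the unprocessed files are still sitting in the inbox, and nudging it is just a matter of running it again. A crash between the move and the delete means one file gets processed twice, so each stage should tolerate repeats.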

  • Don't go more than 1 level of abstraction above the framework
    Languages and frameworks keep evolving newer and better abstractions, and you should keep moving up with them. But don't build more than 1 layer above the tools. If 1 layer is not enough, then look for a third-party abstraction from a reliable vendor. If that's still not enough then buy an insurance policy before you go ahead and build it anyway

  • When in Rome, don't speak Latinized Japanese
    If you are faking your favorite features from some other language--even if that foreign language is superior--then you are begging to be woken up every night. Lambdas, anonymous methods, even OOP may be unsupported in your project language, but do not try to simulate them unless you want to explain how your homebrew implementation works to every maintenance programmer who inherits it

  • Save all of your trouble tickets and turn them into unit tests
    Even if you don't build failing tests before you write code, you should at least build a catalog of inputs that crashed your system in the past. Set these up to run on your devel copy automatically after each source check-in (i.e., set up a build server). If nothing else, you don't want the ignominy of being called out twice for the same bug. This can be easy if you use a queue to submit all major inputs to the system: use that queue-resubmission program to torture your QA copy with the same messages that crashed production
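
    A sketch of the catalog idea in Python, using the standard unittest module. It assumes each past crash is saved as one text file in a folder and that handler is whatever function processes one input; both the layout and the function names are hypothetical.

```python
import io
import os
import unittest


def build_regression_suite(folder: str, handler) -> unittest.TestSuite:
    """Make one test per saved input; a test fails if handler raises."""
    suite = unittest.TestSuite()
    for name in sorted(os.listdir(folder)):
        with open(os.path.join(folder, name), encoding="utf-8") as handle:
            payload = handle.read()
        # The default argument binds this file's payload to this test.
        suite.addTest(
            unittest.FunctionTestCase(lambda p=payload: handler(p), description=name)
        )
    return suite


def run_regressions(folder: str, handler) -> unittest.TestResult:
    """Run the catalog quietly and return the result for the build to inspect."""
    runner = unittest.TextTestRunner(stream=io.StringIO())
    return runner.run(build_regression_suite(folder, handler))
```

    The build server would call run_regressions() after each check-in and fail the build unless wasSuccessful() is true. Closing a ticket then means dropping the offending input into the folder.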