Event-oriented database

Problem: The state of the database needs to be replicated under non-realtime conditions where the order of updates can't be guaranteed. This might occur in Master-to-Master replication scenarios with delayed reconciliation, or when using a database manager that doesn't support replication, or recovery scenarios that must cope with missing logs or the delayed delivery of logs ("best possible recovery"). Database may also need to track the full history of changes to each entity

Pattern: Use a highly normalized schema oriented around the principle of storing events rather than states. The tables are all INSERT-only, permission is denied for UPDATE and DELETE. A meta_transactions table tracks when each record was inserted, and all records are submitted via a queue.

Example: An example database schema
  • orders
    • order_id
    • soldToCustomer_id
    • orderDate
    • transaction_id

  • orderStates
    • orderState_id
    • order_id
    • version
    • soldToAddress_id
    • shipMethod
    • taxRate
    • totalDue
    • transaction_id

  • orderStateLines
    • orderStateLine_id
    • orderState_id
    • sku
    • quantity
    • price
    • transaction_id

  • payments
    • payment_id
    • order_id
    • payment_type
    • amount (unsigned decimal)
    • transaction_id

  • ledgerEntries
    • ledgerEntry_id
    • order_id
    • payment_id (nullable)
    • amount (signed decimal)
    • transaction_id

  • orderReleases
    • orderRelease_id
    • orderState_id
    • releaseTime
    • transaction_id

  • orderShipments
    • orderShipment_id
    • orderRelease_id
    • employee
    • shipmentTime
    • transaction_id

  • trackingNumbers
    • orderShipment_id
    • carrier
    • trackingNumber
    • generatedTime
    • transaction_id

  • cancellations
    • cancellation_id
    • order_id
    • cancelTime
    • transaction_id

  • meta_transactions
    • transaction_id
    • tableAffected
    • serverTime

Discussion: Replication of databases is often accomplished with log shipping, but requires that the order of updates is guaranteed. If a "DELETE" or "UPDATE" command is replicated before the "INSERT", for example, then the replica will be left in an inconsistent or faulted state. This can make it impossible to recover the "best possible" state of a system if any log files are missing.

 The above example illustrates a basic order management system with a means of handling changes to the state of an order in an INSERT-only environment. Since we can't UPDATE the order total and DELETE line items, we use an intermediate table called orderStates to record new versions of the order. Payments are recorded in a separate table and ledgerEntries are written so that the balance of an order can be recorded without having to UPDATE a "balanceDue" column. Tables like orderReleases and orderShipments illustrate the principle of recording events rather than updating state: state has to be calculated, so an order with an orderRelease record but no shipment record would be considered in the "processing" state, and the creation of an orderShipment record is evidence that the order is now in the "shipped" state.

 Because records can't be deleted, a principle of "write more facts" has to be used instead. So rather than delete an order, a record is written to "cancellations". The design must also seek to minimize the need to delete records, which the emphasis on recording events rather than state helps with: everything that happens is a historical fact, even the decision to reverse a decision is an event.

 Views or triggers can be used to simplify the reading of the order's state at any given time, but each insertion must force the state to be re-calculated from scratch.

 All records are submitted to the database via a queue, such as MSMQ, and inserted by a process that can also create/ship logs for replication. The meta_transactions table is maintained by the queue-reading process, and provides some administrative support. Polling or using a mechanism like SqlDependency to subscribe to updates on the meta_transactions table can provide a simple way for a client application to display updates to the database in near realtime.

 Primary keys are either natural keys or GUIDs, but never Identity/Autoincrement columns because their values are dependent on the order of insertion, and that order can't be guaranteed in our problem scenario.
Comments