Waterfall queries

Problem: The criteria and arguments for a frequently used query change often, but are difficult to comprehend when expressed in SQL

Pattern: Query is defined as a series of nested filters and data sources, expressed in XML or other text-based configuration language. The progression from one filter to another resembles a waterfall, as the query adds distinct conditions one level at a time.

Example:

<CatalogView Name="Bob's Discount Outlet">
    <ExcludeOutOfStock>
        <UnderreportInventory Margin="3 units">
            <Discount Amount="10%">
                <NamedProductGroup ID="Specials"/>
            </Discount>

            <LimitByProfit Margin="5%">
                <ExcludeBrand BrandName="Nike">
                    <ProductLine ID="Shoes"/>
                    <ProductLine ID="T-Shirts"/>
                </ExcludeBrand>

                <ProductLine ID="Shorts"/>
                <ProductLine ID="Socks"/>
            </LimitByProfit>
        </UnderreportInventory>
    </ExcludeOutOfStock>
</CatalogView>

Discussion: Each filter is defined in code as an expression tree. Using LINQ-to-SQL, the "Discount" filter would look like this:

public class Discount : AModule
{
    public Discount(XElement configNode) 
        : base(configNode)
    {
        Trace.Assert(configNode.Attribute("Amount") != null, "Amount attribute must be set");

        _amount = (100 - decimal.Parse(configNode.Attribute("Amount").Value.Replace("%",""))) / 100;
    }

    private decimal _amount;

    public IEnumerable<InventoryLine> Lines()
    {
        return from line in Source.Lines()
               select new InventoryLine {
                   sku = line.sku,
                   quantity = line.quantity,
                   price = line.price * _amount
               };
    }
}

 As the configuration for the query is parsed, modules like these are chained together. The Source member--inherited from AModule and populated by the base constructor--refers to the next stage above the current, or null in the case of a data source.

 AModule would be something like this:

public abstract class AModule
{
    public AModule(XElement configNode)
    {
        this.Config = configNode;

        if (this.Elements().Count() > 0)
            this.Source = new CatalogVIew(this.Elements());
        else
            this.Source = FindModuleImplementation(this.Element().Name, this.Element());
    }

    public XElement Config { get; private set; }

    public AModule Source { get; private set; }

    public abstract IEnumerable<InventoryLine> Lines();
}

 The FindModuleImplementation method may use reflection or some other type resolving mechanism to create an instance of a specific filter.

 With LINQ-To-SQL, the waterfall query can be translated into a single SQL query, but if LINQ-To-SQL isn't available then the same pattern can be implemented with continuations rather than expression trees and still be effective. For example, the Lines() function from the Discount module can be written as follows.

public IEnumerable<InventoryLine> Lines()
{
    foreach (InventoryLine line in Source.Lines())
        yield return new InventoryLine {
            sku = line.sku,
            quantity = line.quantity,
            price = line.price * _amount
        };
}

 By using yield return the function behaves as a coroutine, allowing partial evaluation to be performed.
Comments