A Day In The Lyf

…the lyf so short, the craft so longe to lerne

Cohesion

Wherefore Design?

When I first started programming, I naturally assumed that programming was the hardest part of programming. What a fool I was.

There are a number of things that good developers do that, at first glance, appear to slow down how fast they write code. A great deal of time is spent communicating with users and other developers; tests are written; code that already appears to work is refactored. And a great deal of time is spent in the ivory tower world of design. Perhaps not up-front (I prefer the agile motto of designing all the time), but design is an activity that takes time. If all these tasks take so much time away from writing code (which, on the surface of things, seems to be the job of developers), then why bother?

It turns out that programming is the easiest part of programming. It’s the same mistake managers make when they revolt against their developers pairing up to solve a problem. Writing code is not about typing. It’s not even so much about writing code. It’s about designing a system to meet the customers’ needs, and it’s about ensuring that the system will continue to be able to meet the customers’ needs in the future.

Unfortunately, the first large system I wrote was with a team where everybody, myself included, thought that programming was about writing code. I got an enormous amount of code written in a very short amount of time. A year and a half later, the system still had not gone live, and I had completely lost confidence in my ability to make a change without breaking some essential functionality. This is a common phenomenon, which I can informally depict as a productivity curve:

I followed the red line for the system. I think most developers probably do. The problem is that initial surge of productivity. It’s intoxicating.

Avoiding the red curve is what separates good developers from everybody else.

Design is one of those extra things good developers do in an attempt to follow the black line in the graph above. You miss out on the initial addictive surge, but good developers recognize it as fools’ gold anyhow.

Remember Alice? It’s a song about Alice.

In his study of structural design, Larry Constantine identified cohesion as the central metric in creating good designs. The object-oriented evolution did little to change this fact—indeed, object-orientation is a natural result of trying to increase cohesion over procedural code. If you want good design, then you need to understand cohesion. A system without cohesion is a real Nugger-Tugger

Non-cohesive designs ramble, and they say the same thing over and over again, just in different places. They never really do what they’re supposed to. It’s like listening to Arlo Guthrie sing Alice’s Restaraunt. You really can get anything you want at Alice’s restaurant.

Uncle Bob helped restate cohesion for the OO world as the Single Responsibility Principle, which basically states that a class should have only one reason to change. Constantine’s definition included any module, so it’s fair to say the same thing about methods.

Cohesion is about minding your own business, and it is the design principle that explains why Feature Envy stinks. For example, imagine an order needing to calculate a subtotal. Here’s one way to do it:

public class Order
{
    // ...

    public Money Subtotal
    {
        get
        {
            Money subtotal = Money.Zero;
            foreach (LineItem item in items)
            {
                subtotal += item.Product.UnitPrice * item.Quantity;
            }
            return subtotal;
        }
    }
}

There are a couple of things wrong with this code. First the feature envy—the Subtotal property seems more interested in LineItem’s data than its own class’s data. Clearly, Order has more than one reason to change. It has to change if any of Order’s responsibilities change, and it just might have to change if LineItem’s responsibilities change.

Second, notice how the Subtotal property is actually reaching into the Product’s data as well—yet another reason to change. This is an example of violating the Law of Demeter. LoD purists like to spout out a very precise definition for Demeter. I prefer to think of it as the “two dot” rule. If I see two dots in a single expression, I probably need to rethink what I’m doing. I’ve now coupled myself both to the LineItem class (which is ok, since Order contains LineItem objects) as well as the Product class (which is unnecessary).

This is important: increasing cohesion reduces coupling. Constantine saw them as two sides of the same coin.

Let’s try again:

public class Order
{
    // ...

    public Money Subtotal
    {
        get
        {
            Money subtotal = Money.Zero;
            foreach (LineItem item in items)
            {
                subtotal += item.Subtotal;
            }
            return subtotal;
        }
    }
}

public class LineItem
{
    // ...

    public Money Subtotal
    {
        get { return Product.UnitPrice * Quantity; }
    }
}

Now Order is minding its own business. In programming, spreading ignorance is A Good Thing.

Different levels of cohesion

Rather than speaking of modules which either have or lack cohesion, Constantine identified different levels of cohesion. While I don’t consider it too important to know them by name, I do find it helpful to reflect on a few of my mistakes in reference to those concepts.

Logical cohesion

Sometimes, we tend to group functionality into a class simply because it naively seems to go together. In that first system I mentioned above, I wrote a class called Validation. As you may expect, Validation has many reponsibilities. in this case, it validated addresses, phone numbers, accounts, and a host of other unrelated things. Validation suffers from what Constantine called logical cohesion.

More common (and less damaging) examples of logical cohesion are found in languages like C# and Java, and are hard to get around due to language limitations. Consider the System.Math class in .NET. What’s its purpose? Well, it calculates arc-cosines and square roots and ceilings and exponents, as well as rounding and truncating. The same thing is true of the utility or helper classes that predominate in the mainstream world. Find yourself doing the same thing over and over again with text? Create a StringUtils class that does it for you.

Those methods really belong on the string and float classes, not tucked away in some utility classes. More powerful languages allow you to do this. Want to add a natural log method to Float? Here’s Ruby code that does just that:

class Float
  def ln
    # ... code that performs a natural logarithm
  end
end

Float already exists—it ships with Ruby. That’s ok; Ruby has no problem with you adding methods to classes that already exist. Rumor has it that the next version of C# may let you do the same thing. It will be interesting to gauge the mainstream response to this feature.

Temporal and Procedural cohesion

Constantine identified two levels of cohesion, temporal and procedural, that have to do with putting unrelated tasks together simply because they happen to occur at more or less the same time. The only difference between the two is that procedural cohesion requires some procedural relationship between the two elements. When I first started programming, I was extraordinarily guilty of creating unnecessary procedural cohesion, due to some delusional belief that minimizing the number of loops would improve performance.

As an example of procedural cohesion, let’s look at an alternative implementation of Order:

public class Order
{
    private Money subtotal;
    private Money tax;
    private Money shipping;
    private double taxPercent;
    private Money total;

    // ...

    public void CalculateTotals()
    {
        subtotal = Money.Zero;
        tax = Money.Zero;

        foreach (LineItem item in items)
        {
            subtotal += item.Subtotal;
            tax += item.Tax;
        }

        shipping = ShippingFor(subtotal);
        if (ChargeTaxOnShipping(state))
        {
            tax += shipping * taxPercent;
        }
        total = subtotal + shipping + tax;
    }

    public Money Subtotal
    {
        get { return subtotal; }
    }

    public Money Tax
    {
        get { return tax; }
    }

    public Money Shipping
    {
        get { return shipping; }
    }

    public Money Total
    {
        get { return total; }
    }
}

The code in CalculateTotals is very procedural in nature. What’s worse, the method does not have a single responsibility, and suffers from procedural cohesion as a result (notice how we’re now talking about the method’s cohesion, instead of the class’s—Constantine was writing before object-orientation was all the rave). Notice in particular the ugly dependence between tax and shipping. We’re calculating some of the tax in the loop along with the subtotal, but we may have to add more after the loop if the state charges tax on shipping.

Getting rid of the cohesion problem is easy:

public class Order
{
    // ...

    public Money Subtotal
    {
        get
        {
            Money subtotal = Money.Zero;
            foreach (LineItem item in items)
            {
                subtotal += item.Subtotal;
            }
            return subtotal;
        }
    }

    public Money Tax
    {
        get
        {
            Money tax = Money.Zero;
            foreach (LineItem item in items)
            {
                tax += item.Tax;
            }
            if (ChargeTaxOnShipping(state))
            {
                tax += Shipping * TaxPercent;
            }
            return tax;
        }
    }

    public Money Shipping
    {
        get { return ShippingFor(Subtotal); }
    }

    public Money Total
    {
        get { return Subtotal + Shipping + Tax; }
    }
}

Much nicer. Now each method has only one responsibility, and is easy to understand. However, it’s annoying to have to duplicate those loops everywhere. Some of the nicer languages have internal iterators or closures to help. C# 2.0 has similar constructs, but they’re a bit clumsy to use because of all the static typing noise:

public class Order
{
    // ...

    public Money Subtotal
    {
        get
        {
            return Sum(items, delegate(LineItem item)
                { return item.Subtotal; });
        }
    }

    public Money Tax
    {
        get
        {
            Money tax = Sum(items, delegate(LineItem item)
                { return item.Tax; });
            if (ChargeTaxOnShipping(state))
            {
                tax += Shipping * TaxPercent;
            }
            return tax;
        }
    }

    public Money PaymentTotal
    {
        get
        {
            return Sum(payments, delegate(Payment payment)
                { return payment.Amount; });
        }
    }

    private delegate Money MonetaryPropertyDelegate<T>(T item);

    private Money Sum<T>(ICollection<T> collection,
        MonetaryPropertyDelegate<T> moneyGetter)
    {
        Money result = Money.Zero;
        foreach (T item in collection)
        {
            result += moneyGetter(item);
        }
        return result;
    }
}
Advertisements

Written by Brandon Byars

February 21, 2007 at 3:32 pm

Posted in Design

Tagged with

%d bloggers like this: