A Day In The Lyf

…the lyf so short, the craft so longe to lerne

Archive for the ‘.NET’ Category

Throw Out Those Utility Classes

How many times have you written an xxxUtils class, where xxx is some framework-supplied class that you can't extend or subclass? I always seem to end up with several in any decent-sized project: StringUtils, DateUtils, DictionaryUtils, etc. In most cases, these classes are the result of language limitations. In Ruby and Smalltalk, for example, what would be the point of a StringUtils class when you could simply add methods to the String class directly? But C# and Java make String sealed (final), so you can't even subclass it.

Utility classes like these tend to suffer from logical cohesion. In spite of the friendly-sounding name, logical cohesion is actually a fairly weak form of cohesion; it’s just a loose jumble of functions that have something in common. It can in no way be considered object-oriented.

Our DictionaryUtils makes an interesting case study because it was small. It only did two things: it compared two dictionaries key-by-key for equality (useful in testing), and it converted the entries to a Ruby-esque string. That last method made me a little jealous of how convenient Hashes are in Ruby:

middlestate:~ bhbyars$ irb
>> {'a' => 1, 'b' => 2, 'c' => 3}
=> {"a"=>1, "b"=>2, "c"=>3}

For the non-Ruby readers, I just created a 3-element Hash in one line. The command-line interpreter spit out a string representation. Our DictionaryUtils.ConvertToText could manage that last bit, but I wanted to be able to create hashtables as easily in C# as I could in Ruby. Naturally, that meant a third method on DictionaryUtils. Or did it?

C# on Hash

DictionaryUtils.Create seemed bloviated and ugly as soon as I first wrote it, so I quickly scratched it out and started a new class:

public class Hash
{
    public static Hashtable New(params object[] keysAndValues)
    {
        if (keysAndValues.Length % 2 != 0)
            throw new ArgumentException("Hash.New requires an even number of parameters");

        Hashtable hash = new Hashtable();
        for (int i = 0; i < keysAndValues.Length; i += 2)
        {
            hash[keysAndValues[i]] = keysAndValues[i + 1];
        }
        return hash;
    }
}
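
For example (the keys and values here are purely illustrative):

// Roughly the C# equivalent of the Ruby one-liner above
Hashtable hash = Hash.New("a", 1, "b", 2, "c", 3);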

This allowed me to create small, pre-loaded Hashtables in one line, which was convenient, especially for test methods (although the syntax isn't as explicit as Ruby's). I then decided to merge the static DictionaryUtils methods into Hash, as instance methods. First, of course, I had to make Hash an actual dictionary implementation. This was trivial:

private IDictionary proxiedHash;

public Hash(IDictionary dictionary)
{
    proxiedHash = dictionary;
}

public bool Contains(object key)
{
    return proxiedHash.Contains(key);
}

public void Add(object key, object value)
{
    proxiedHash.Add(key, value);
}

public void Clear()
{
    proxiedHash.Clear();
}

// etc…

Then I changed the return value of Hash.New to a Hash instead of a Hashtable. The last line became return new Hash(hash) instead of return hash.
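
In other words, the factory method now looks like this (only the return type and that last line changed from the version above):

public static Hash New(params object[] keysAndValues)
{
    if (keysAndValues.Length % 2 != 0)
        throw new ArgumentException("Hash.New requires an even number of parameters");

    Hashtable hash = new Hashtable();
    for (int i = 0; i < keysAndValues.Length; i += 2)
    {
        hash[keysAndValues[i]] = keysAndValues[i + 1];
    }
    return new Hash(hash);
}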

Next I moved the ConvertToText method, which, as an instance method, conveniently mapped to ToString.

public override string ToString()
{
    SeparatedStringBuilder builder = new SeparatedStringBuilder(", ");
    ICollection keys = CollectionUtils.TryToSort(Keys);
    foreach (object key in keys)
    {
        builder.AppendFormat("{0} => {1}", Encode(key), Encode(this[key]));
    }
    return "{" + builder.ToString() + "}";
}

private object Encode(object value)
{
    if (value == null)
        return "<NULL>";

    IDictionary dictionary = value as IDictionary;
    if (dictionary != null)
        return new Hash(dictionary).ToString();

    if (value is string)
        return "\"" + value + "\"";

    return value;
}

The SeparatedStringBuilder class is a StringBuilder that adds a custom separator between each string. It's very convenient whenever you're building a comma-separated list, as above, and it's proven handy in a variety of other situations. For example, I've used it to build a SQL WHERE clause by making " AND " the separator. It's included with the code download at the bottom of this article.
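
The class isn't shown in this post, but a minimal sketch of the idea looks something like this (the exact constructor and methods are my guesses at the shape; see the download for the real implementation):

public class SeparatedStringBuilder
{
    private readonly StringBuilder builder = new StringBuilder();
    private readonly string separator;

    public SeparatedStringBuilder(string separator)
    {
        this.separator = separator;
    }

    public SeparatedStringBuilder Append(object value)
    {
        // Only add the separator between items, never before the first one
        if (builder.Length > 0)
            builder.Append(separator);

        builder.Append(value);
        return this;
    }

    public SeparatedStringBuilder AppendFormat(string format, params object[] args)
    {
        return Append(string.Format(format, args));
    }

    public override string ToString()
    {
        return builder.ToString();
    }
}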

Notice, also, that we’re still using a CollectionUtils class. Ah, well. I’ve got to have something to look forward to fixing tomorrow…

The DictionaryUtils.AreEqual method conveniently maps to an instance level Equals method:

public override bool Equals(object obj)
{
    IDictionary other = obj as IDictionary;
    if (other == null) return false;
    Hash hash = new Hash(other);
    return hash.ToString() == ToString();
}

public override int GetHashCode()
{
    return proxiedHash.GetHashCode();
}

The syntax is much cleaner than the old DictionaryUtils class. It’s nicely encapsulated, fits conveniently into the framework overrides, and is object-oriented, allowing us to add other utility methods to the Hash class easily. It’s especially nice for testing, since the Equals method will work against any dictionary implementation, not just Hashes:

Assert.AreEqual(Hash.New("address", customer.Address), propertyBag);

The approach was simple, relying on proxying to fulfill the IDictionary implementation (I'm probably abusing the word “proxying,” since we're not doing anything with the interception; really, this is nothing more than the Decorator design pattern). That was easy only because the framework actually provided an interface to subtype; the same isn't true of String and Date. Nor is it true of StringBuilder: if you look at the code, SeparatedStringBuilder looks like a StringBuilder, talks like a StringBuilder, and quacks like a StringBuilder, but there is no syntactic relationship between them, since StringBuilder is sealed and doesn't implement an interface. While the need for SeparatedStringBuilder may be a special case, I'd still rather create similar-looking objects of my own than rely on a framework-provided xxx paired with a custom-built xxxUtils class. Proxying, as used by Hash, generally makes such implementations trivial and clean, leaving you to spend your time developing what you really want without making the API unnecessarily ugly.

All the code needed to compile and test the Hash class can be found here.


Written by Brandon Byars

August 28, 2007 at 11:40 pm

Posted in .NET, Design


Using Higher Order Functions in Windows Forms Applications

My wife is in the middle of a research project comparing diet to the age of reproduction in African house snakes. She has to collect quite a bit of data, and when I finally looked at the spreadsheets she was maintaining, I was ashamed that I had not written something for her earlier.

This was really the first Windows Forms application that I've had the opportunity to write in years (my UIs aren't very inspiring). However, I have to maintain a couple at work that were primarily written by former colleagues, and I've always been a bit dismayed at the enormous amount of duplication that the standard event-driven application generates.

Despite the fact that the application I wrote for my wife was nothing more than a one-off, the kind you don't expect to have to maintain, I focused on eliminating the duplication I see in the Windows applications at work. The result isn't something that I would even begin to consider done for a corporate application, but I found the duplication-removal techniques worth writing about. The code can be found here.

The biggest gains in removing duplication, and the ones most readers are likely to be least familiar with, came from using higher order functions. My impression is that most C# developers aren't very comfortable with higher order functions. Actually, I think that's probably true for most developers working within mainstream, commercially developed (Microsoft, Borland, Sun) languages. They're simply not emphasized enough.

For example, all the forms had a ListView to display the data. All of them had to define the column header names and the data that goes in each row. It looked something like this:

protected override void AddHeaders()
{
    AddHeader("Weight");
    AddHeader("Length");
    AddHeader("HL");
    AddHeader("HW");
}

protected override void AddCells()
{
    AddCell(Weight);
    AddCell(Length);
    AddCell(HeadLength);
    AddCell(HeadWidth);
}

Having the subclass define the column header names and the data that goes in each row didn't bother me. What did bother me was having to specify the order that the headers and data needed to be shown in two different places. However, while the header names were static, the data would be different for each invocation. The solution was to specify the order only once, in an associative array (I used .NET 2.0's generic Dictionary, which seemed to maintain the order I entered the items). The key would be the column name, and the value would be a function to retrieve the data value.

// The superclass for all Forms…
public class SnakeForm : Form
{
    protected delegate object GetterDelegate(object value);

    private IDictionary<string, GetterDelegate> associations;

    protected virtual void AddListViewAssociations(
        IDictionary<string, GetterDelegate> associations)
    {
        throw new NotImplementedException("Override…");
    }

    protected virtual IEnumerable ListViewHeaders
    {
        get
        {
            foreach (string header in associations.Keys)
                yield return header;
        }
    }

    protected virtual IEnumerable ListViewValues(object value)
    {
        foreach (GetterDelegate getter in associations.Values)
            yield return getter(value);
    }

    protected virtual void AddCells(object source)
    {
        foreach (object value in ListViewValues(source))
            AddCell(value);
    }

    private void SnakeForm_Load(object sender, EventArgs e)
    {
        associations = new Dictionary<string, GetterDelegate>();
        AddListViewAssociations(associations);
        AddHeaders();
    }

    private void AddHeaders()
    {
        foreach (string header in ListViewHeaders)
            AddHeader(header);
    }

    private void AddHeader(string name)
    {
        ColumnHeader header = new ColumnHeader();
        header.Text = name;
        lvData.Columns.Add(header);
    }
}

The important thing to note is that the subclass is passed a collecting parameter, associations, through a template method; each entry represents a column name along with a way of retrieving the value for a row in that column. The delegate used to retrieve the value accepts a single state parameter, which the report forms need in order to pass in the source object for each row. Given that information, the superclass can manage most of the work. (AddListViewAssociations would have been abstract, except that Visual Studio's designer doesn't much care for abstract classes.)

For example, here is the override for the measurement form that was shown at the beginning of this post:

protected override void AddListViewAssociations(
    IDictionary<string, GetterDelegate> associations)
{
    associations.Add("Weight", delegate { return Weight; });
    associations.Add("Length", delegate { return Length; });
    associations.Add("HL", delegate { return HeadLength; });
    associations.Add("HW", delegate { return HeadWidth; });
}

One of the benefits of removing the ordering duplication is that the column names now sit beside the functions for retrieving the values, making it easier to understand. Notice that the GetterDelegate definition actually accepts an object parameter. C#’s anonymous delegate syntax lets you ignore unused parameters, making for a somewhat more readable line.

One of the forms shows feeding information per snake, and needs that parameter. Below is the entire implementation of the form (aside from the designer-generated code).

// ReportForm is a subclass of SnakeForm
public partial class FeedingBySnakeReport : ReportForm
{
    public FeedingBySnakeReport()
    {
        InitializeComponent();
    }

    protected override void AddListViewAssociations(
        IDictionary<string, GetterDelegate> associations)
    {
        associations.Add("Snake", delegate(object obj)
            { return ((FeedingReportDto)obj).Snake; });
        associations.Add("Diet", delegate(object obj)
            { return ((FeedingReportDto)obj).Diet; });
        associations.Add("Date", delegate(object obj)
            { return ((FeedingReportDto)obj).Date; });
        associations.Add("Weight", delegate(object obj)
            { return ((FeedingReportDto)obj).Weight; });
        associations.Add("Food Weight", delegate(object obj)
            { return ((FeedingReportDto)obj).FoodWeight; });
        associations.Add("Ate?", delegate(object obj)
            { return ((FeedingReportDto)obj).Ate; });
        associations.Add("%BM", delegate(object obj)
            { return ((FeedingReportDto)obj).PercentBodyMass; });
        associations.Add("Comments", delegate(object obj)
            { return ((FeedingReportDto)obj).Comments; });
    }

    protected override IEnumerable GetReportValues()
    {
        FeedRepository repository = new FeedRepository();
        return repository.FeedingsBySnake(Snake);
    }
}

In case you’re wondering what this form does, it allows you to select a snake, or all snakes, and see the feeding information in the ListView. It also lets you export all the data to a CSV file. Not bad for 30 lines of code.
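
The CSV export isn't shown above, but it falls out of the same associations: the base class can walk ListViewHeaders and ListViewValues and write one line per record. Here's a rough sketch of what that could look like in SnakeForm (the method names, the naive quoting, and the file handling are my assumptions, not necessarily what's in the download):

protected virtual void ExportCsv(string fileName, IEnumerable records)
{
    using (StreamWriter writer = new StreamWriter(fileName))
    {
        // The header row comes straight from the association keys
        writer.WriteLine(CsvLine(ListViewHeaders));

        // Each data row reuses the same getter delegates as the ListView
        foreach (object record in records)
            writer.WriteLine(CsvLine(ListViewValues(record)));
    }
}

private string CsvLine(IEnumerable values)
{
    // Naive quoting: surrounding quotes handle embedded commas, but
    // embedded quotes aren't escaped
    List<string> fields = new List<string>();
    foreach (object value in values)
        fields.Add("\"" + value + "\"");
    return string.Join(",", fields.ToArray());
}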

Another thing that bothered me about all the event handlers was how similar they looked. The workflow was abstracted in the superclass into a HandleEvent method:

protected delegate void EventHandlerDelegate();

protected virtual void HandleEvent(EventHandlerDelegate handler)
{
    Cursor = Cursors.WaitCursor;
    try
    {
        handler();
    }
    catch (Exception ex)
    {
        ShowError(ex.Message);
    }
    finally
    {
        Cursor = Cursors.Default;
    }
}

HandleEvent takes a function that handles the meat of the event handler and wraps it within the code that's common to all event handlers. Here are a couple of examples:

// In DataEntryForm, an abstract superclass, and subclass of SnakeForm
private void btnSave_Click(object sender, EventArgs e)
{
    HandleEvent(delegate
    {
        if (!IsOkToSave())
            return;

        Save();
        AddRow(null);
        FinishListViewUpdate();
        Reset();
    });
}

// In ReportForm, an abstract superclass, and subclass of SnakeForm
private void btnShow_Click(object sender, EventArgs e)
{
    HandleEvent(delegate
    {
        lvData.Items.Clear();

        // GetReportValues() is a template method defined in the subclasses.
        IEnumerable reportValues = GetReportValues();
        foreach (object record in reportValues)
            AddRow(record);
    });
}

Managing the ListView proved to be fertile territory for removing duplication through higher order functions. For example, I used the first row’s data to set the column alignments automatically—if it looked like a number or date, right-align the data; otherwise left-align it.

private void SetAlignments(object record)
{
    int i = 0;

    // A bit hackish, but the report dtos currently provide strings only…
    foreach (object value in ListViewValues(record))
    {
        if (IsNumber(value) || IsDate(value))
            lvData.Columns[i].TextAlign = HorizontalAlignment.Right;
        else
            lvData.Columns[i].TextAlign = HorizontalAlignment.Left;

        i += 1;
    }
}

private bool IsNumber(object value)
{
    try
    {
        double.Parse(value.ToString().Replace("%", ""));
        return true;
    }
    catch
    {
        return false;
    }
}

private bool IsDate(object value)
{
    try
    {
        DateTime.Parse(value.ToString());
        return true;
    }
    catch
    {
        return false;
    }
}

Notice how alike IsNumber and IsDate look. We can simplify:

private delegate void ParseDelegate(string text);

private bool IsNumber(object value)
{
    return CanParse(value, delegate(string text)
        { double.Parse(text.Replace("%", "")); });
}

private bool IsDate(object value)
{
    return CanParse(value, delegate(string text) { DateTime.Parse(text); });
}

private bool CanParse(object value, ParseDelegate parser)
{
    try
    {
        parser(value.ToString());
        return true;
    }
    catch
    {
        return false;
    }
}

I used a similar trick to auto-size the column widths in the ListView based on the width of the largest item. Here’s the refactored code:

private delegate string GetTextDelegate(int index);

private void AutoSizeListView()
{
    int[] widths = new int[lvData.Columns.Count];
    FillSizes(widths, delegate(int i) { return lvData.Columns[i].Text; });

    foreach (ListViewItem item in lvData.Items)
    {
        FillSizes(widths, delegate(int i) { return item.SubItems[i].Text; });
    }

    for (int i = 0; i < lvData.Columns.Count; i++)
    {
        if (!IsHidden(lvData.Columns[i]))
        {
            lvData.Columns[i].Width = widths[i] + 12;
        }
    }
}

private void FillSizes(int[] widths, GetTextDelegate text)
{
    using (Graphics graphics = CreateGraphics())
    {
        for (int i = 0; i < lvData.Columns.Count; i++)
        {
            SizeF size = graphics.MeasureString(text(i), lvData.Font);
            if (size.Width > widths[i])
                widths[i] = (int)size.Width;
        }
    }
}

private bool IsHidden(ColumnHeader header)
{
    return header.Width == 0;
}

If this were a more long-lived application, I really should have bitten the bullet and created my own ListView subclass. The methods above reek of Feature Envy.
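
For what it's worth, here's a hypothetical sketch of the direction that refactoring might take; the class name is made up and this isn't code from the download. The sizing logic moves onto a ListView subclass so it operates on its own Columns, Items, and Font instead of reaching into someone else's:

public class AutoSizingListView : ListView
{
    public void AutoSizeColumns()
    {
        using (Graphics graphics = CreateGraphics())
        {
            for (int i = 0; i < Columns.Count; i++)
            {
                if (Columns[i].Width == 0)  // width 0 means the column is hidden
                    continue;

                // Start with the header width, then widen to fit the largest item
                float width = graphics.MeasureString(Columns[i].Text, Font).Width;
                foreach (ListViewItem item in Items)
                {
                    float itemWidth = graphics.MeasureString(item.SubItems[i].Text, Font).Width;
                    if (itemWidth > width)
                        width = itemWidth;
                }
                Columns[i].Width = (int)width + 12;
            }
        }
    }
}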

Being able to treat functions as first-class objects is extremely useful. For some reason, it doesn't get the attention it deserves in most development books, and it's often obscured by intimidating-sounding names like “lambda expressions,” thanks to its roots in the lambda calculus. However, much of what I was able to do in this application was possible only because I could treat functions as data and pass them as parameters. It also helped that I didn't have to explicitly define each function as a method; I could create them anonymously, like any other data object (although C#'s anonymous delegate syntax is somewhat obscured by the static typing).

Written by Brandon Byars

July 17, 2007 at 12:22 am

Posted in .NET, Design


C# Execute Around Method

Kent Beck called one of the patterns in Smalltalk Best Practice Patterns “Execute Around Method.” It's a useful pattern for removing duplication in code that requires boilerplate to run both before and after the code you really want to write. It's a much lighter-weight approach than a template method (no subclassing required), which can accomplish the same goal.

As an example, I’ve written the following boilerplate ADO.NET code countless times:

public DataTable GetTable(string query, IDictionary parameters)
{
    using (SqlConnection connection = new SqlConnection(this.connectionString))
    {
        using (SqlCommand command = new SqlCommand(query, connection))
        {
            connection.Open();
            foreach (DictionaryEntry parameter in parameters)
            {
                command.Parameters.AddWithValue(
                    parameter.Key.ToString(), parameter.Value);
            }

            SqlDataAdapter adapter = new SqlDataAdapter(command);
            using (DataSet dataset = new DataSet())
            {
                adapter.Fill(dataset);
                return dataset.Tables[0];
            }
        }
    }
}

public void Exec(string query, IDictionary parameters)
{
    using (SqlConnection connection = new SqlConnection(this.connectionString))
    {
        using (SqlCommand command = new SqlCommand(query, connection))
        {
            connection.Open();
            foreach (DictionaryEntry parameter in parameters)
            {
                command.Parameters.AddWithValue(
                    parameter.Key.ToString(), parameter.Value);
            }

            command.ExecuteNonQuery();
        }
    }
}

Notice that the connection and parameter management overwhelms the actual code that each method is trying to get to. And the duplication means I have multiple places to change when I decide to do something differently. However, since the using blocks enclose the code that differs between the two methods, a simple Extract Method refactoring isn't an obvious fix.

Here’s the result of applying an Execute Around Method pattern to it.

private delegate object SqlCommandDelegate(SqlCommand command);

public DataTable GetTable(string query, IDictionary parameters)
{
    return (DataTable)ExecSql(query, parameters, delegate(SqlCommand command)
    {
        SqlDataAdapter adapter = new SqlDataAdapter(command);
        using (DataSet dataset = new DataSet())
        {
            adapter.Fill(dataset);
            return dataset.Tables[0];
        }
    });
}

public void Exec(string query, IDictionary parameters)
{
    ExecSql(query, parameters, delegate(SqlCommand command)
    {
        return command.ExecuteNonQuery();
    });
}

private object ExecSql(string query, IDictionary parameters,
    SqlCommandDelegate action)
{
    using (SqlConnection connection = new SqlConnection(this.connectionString))
    {
        using (SqlCommand command = new SqlCommand(query, connection))
        {
            connection.Open();
            foreach (DictionaryEntry parameter in parameters)
            {
                command.Parameters.AddWithValue(
                    parameter.Key.ToString(), parameter.Value);
            }

            return action(command);
        }
    }
}

Much nicer, no?
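
Calling code stays as simple as before. Usage looks something like this (the query, table, and parameter names are made up for illustration, and I'm assuming the calls happen inside whatever class holds these methods):

// Hypothetical usage of the refactored methods
IDictionary parameters = new Hashtable();
parameters["@state"] = "TX";

DataTable customers = GetTable(
    "SELECT * FROM Customers WHERE State = @state", parameters);

Exec("UPDATE Customers SET Active = 1 WHERE State = @state", parameters);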

Written by Brandon Byars

June 11, 2007 at 11:46 pm

Posted in .NET, Design Patterns


.NET Database Migrations

Pramod Sadalage and Scott Ambler have suggested using a series of numbered change scripts to version your database. Start with a base schema, and every subsequent change gets its own change script, grabbing the next number. That version number is stored in a table in the database, which makes it easy to update—you just run all change scripts, in order, greater than the version stored in your database.
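
The update algorithm itself is only a few lines. Here's a rough C# sketch of the idea (the delegate and method names are mine, not part of any real tool; the file naming follows the 1.comment.sql convention my task uses, described below):

private delegate void RunScriptDelegate(string sql, int version);

// Apply every numbered script greater than the version stored in the database,
// strictly in version order. RunScriptDelegate stands in for whatever executes
// the SQL and updates the version table.
private void ApplyMigrations(string directory, int currentVersion, RunScriptDelegate runScript)
{
    SortedList<int, string> pending = new SortedList<int, string>();
    foreach (string file in Directory.GetFiles(directory, "*.sql"))
    {
        // The version number is the piece of the file name before the first dot
        string name = Path.GetFileName(file);
        int version = int.Parse(name.Substring(0, name.IndexOf('.')));
        if (version > currentVersion)
            pending.Add(version, file);
    }

    foreach (KeyValuePair<int, string> migration in pending)
        runScript(File.ReadAllText(migration.Value), migration.Key);
}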

The Ruby on Rails team implemented this technique in their migrations code. It’s quite elegant. This blog uses a Rails application called Typo; here’s one of its migrations:

class AddArticleUserId < ActiveRecord::Migration
  def self.up
    add_column :articles, :user_id, :integer

    puts "Linking article authors to users"
    Article.find(:all).each do |a|
      u=User.find_by_name(a.author)
      if(u)
        a.user=u
        a.save
      end
    end
  end

  def self.down
    remove_column :articles, :user_id
  end
end

That migration is called 3_add_article_user_id.rb, where 3 is the version number. Notice that it's written in Ruby, not in SQL. It adds a column called user_id to the articles table and updates the data. The data update is particularly interesting—we get to use the ActiveRecord O/RM code instead of having to do it in SQL (although you can use SQL if you need to). The Rails migration code can also roll back changes; that's what the down method is for.

The problem I’ve always had with this scheme is that we have many database objects that I’d like to version in their own files in our source control system. For example, here’s our directory structure:

db/
  functions/
  migrations/
  procedures/
  triggers/
  views/

We have several files in each directory, and it's convenient to keep them that way so we can easily check the Subversion log and see the history of changes for each database object. For us to use the migrations scheme above, we'd have to create a stored procedure in one migration and later alter it in a separate migration. Since the two migrations would be in separate files, our source control wouldn't give us a version history of that stored procedure.

We came up with a hybrid solution. Schema changes to the tables use a migration scheme like Rails. Database objects are versioned in separate files. Both the schema migrations and the peripheral database object changes are applied when we update the database.

For this to work, we have to be a little careful with how we create the database objects. We want them to work regardless of whether we’re creating them for the first time or updating them, which means ALTER statements won’t work. The solution is simply to drop the object if it exists, and then create it. This is a fairly common pattern.

I wrote an NAnt and MSBuild task to do the dirty work. It runs both the schema migrations and the database object updates. Both are optional, so if migrations are all you want, that's all you need to use. It expects all migrations to be in the same directory, and to match the pattern 1.comment.sql, where 1 is the version number. The version is stored in a database table whose default name is SchemaVersion, with the following structure:

CREATE TABLE SchemaVersion (
  Version int,
  MigrationDate datetime,
  Comment varchar(255)
)

I’ve only tested it on SQL Server, but I think the task should work for other DBMSs as well (it uses OLEDB). Migrations can contain batches (using the SQL Server GO command) and are run transactionally. Unlike the Rails example, the .NET migrations use SQL, and I don’t yet have any rollback functionality.

You can include any extra SQL files you want in the DatabaseObjects property. Both NAnt and MSBuild have convenient ways to recursively add all files matching an extension.

Here’s an NAnt example:

<target name="migrate" description="Update the database">
    <loadtasks assembly="Migrations.dll" />
    <migrateDatabase
        connectionString="${connectionString}"
        migrationsDirectory="db/migrations"
        commandTimeout="600"
        batchSeparator="go">
        <fileset>
            <include name="db/functions/**/*.sql"/>
            <include name="db/procedures/**/*.sql"/>
            <include name="db/triggers/**/*.sql"/>
            <include name="db/views/**/*.sql"/>
        </fileset>
    </migrateDatabase>
</target>

And here it is using MSBuild:

<ItemGroup>
    <DatabaseObjects Include="db/functions/**/*.sql"/>
    <DatabaseObjects Include="db/procedures/**/*.sql"/>
    <DatabaseObjects Include="db/triggers/**/*.sql"/>
    <DatabaseObjects Include="db/views/**/*.sql"/>
</ItemGroup>

<Target Name="dbMigrate">
    <MigrateDatabase 
        ConnectionString="$(ConnectionString)"
        MigrationsDirectory="db/migrations"
        DatabaseObjects="@(DatabaseObjects)"
        CommandTimeout="600"
        TableName="version_info" />
</Target>

The source code and binaries can be found here.

Written by Brandon Byars

April 14, 2007 at 10:35 pm