A Day In The Lyf

…the lyf so short, the craft so longe to lerne

Posts Tagged ‘Ruby’

Orthogonality

Orthogonality means that features can be used in any combination, that the combinations all make sense, and that the meaning of a given feature is consistent, regardless of the other features with which it is combined. The name is meant to draw an explicit analogy to orthogonal vectors in linear algebra: none of the vectors in an orthogonal set depends on (or can be expressed in terms of) the others, and all are needed in order to describe the vector space as a whole.

– Michael Scott (Programming Language Pragmatics)

I’ve used Delphi and Visual Basic at previous jobs. I disliked both of them. VB has this annoying distinction between objects and primitives, so I’d consistently type

Dim server
server = "localhost"

Dim catalog
catalog = CreateObject("MyCompany.MyObject")

…only to be greeted by an ugly Windows popup box reading Object doesn't support this property or method. The problem, as all you grizzled VB programmers no doubt spotted immediately, is that the last line should start with the keyword Set. VB requires you to prefix assignments to “objects” with Set. But if you try to put a Set in front of an assignment to what VB considers a primitive (like the first assignment above), you get an error that reads Object required: '[string: "localhost"]'.
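For contrast, here is roughly the same pair of assignments in Ruby (a sketch; Object.new stands in for the COM object). Because everything is an object, both assignments use identical syntax, and nothing like Set exists:

```ruby
# Strings and "real" objects are assigned exactly the same way.
server  = "localhost"   # a String object
catalog = Object.new    # any other object: same syntax

server.class   # => String
catalog.class  # => Object
```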

Delphi likewise frustrated me with mundane annoyances:

DoOneThing();

if value = goal then
    DoSomething();
else
    DoSomethingElse();

The code above doesn’t compile, and it won’t compile until you remove the semicolon from the call to DoSomething(). Semicolons complete a statement, and the statement is the entire if-else clause.

These problems in VB and Delphi are related to the concept of orthogonality mentioned in the opening quote. VB doesn’t let you compose assignment to objects the same way it lets you compose assignment to primitives. Delphi doesn’t let you end an if clause the same way it lets you end an else clause. These inconsistencies encourage even experienced programmers to make silly syntax mistakes and make the language harder to use.

What is orthogonality?

The key principles I extracted from Michael Scott’s quote listed above are consistency and composability. Composability means that features can be combined, and consistency stands in for the Principle of Least Surprise—features act how you expect they would, regardless of how they’re being combined. VB’s assignments lack consistency. Delphi’s semicolon parsing doesn’t act consistently when composed within a surrounding statement.

Scott claims that a highly orthogonal language is easier to understand and easier to use. Nowhere does he mention that it’s easier to implement. I’m sure the Delphi grammar was simplified considerably by refusing to allow the DoSomething statement to end in a semicolon when contained within an outer if-else clause. It’s also likely that the implementation of Visual Basic was simplified by having the programmer tell the compiler whether an assignment referred to an object or a primitive.

I suspect many non-orthogonal aspects of languages are there to make them easier to implement. However, some languages that are trivially easy to implement can also be amazingly orthogonal. Lisp is a prime example; it is a highly orthogonal language, and yet an entire Lisp interpreter written in Lisp fits on just one page of the Lisp 1.5 Programmer’s Manual.

I found it instructive to list out syntactic constructs that make languages less orthogonal. It’s amazing how mainstream most of them are:

Statements

Statements aren’t necessary, and a number of languages avoid them altogether. Having only expressions makes the language easier to work in. Compare the inconsistency of Delphi’s statement parsing to the composability of Ruby’s expressions:

puts(if x == 0
  "Zero"
else
  "Not Zero"
end)

The if clause is an expression; it returns the last value evaluated (i.e., either “Zero” or “Not Zero”), and that value can be composed within another expression.

So what happens when you don’t have a reasonable return value? Smalltalk always returns self, which is convenient because it allows method chaining. In the Ruby example above, the entire expression returns the result of the call to puts, which happens to be nil.

The beauty of expressions is that the composability doesn’t just go one level deep:

x = if a == 1; 1; else; 2 end

y = x = if a == 1; 1; else; 2; end

puts y = x = if a == 1; 1; else; 2; end

puts "returns nil" if (puts y = x = if a == 1; 1; else; 2; end).nil?

As the example above shows, orthogonality doesn’t guarantee good code. However, it allows the language to be used in unanticipated ways, which is A Good Thing. Moreover, since everything is an expression, you can put expressions where you wouldn’t normally expect them. For example, in Ruby, the < operator represents inheritance when the receiver is a class, but the superclass could be an expression:

class Test < if x == 1; OneClass; else; TwoClass; end
end

Pushing a language like this, only to find that it’s turtles all the way down, is a signpost for orthogonality.

Primitives

We already saw the clumsiness of VB primitives, but most mainstream languages share a similar problem. Java, for example, has a confusing dichotomy of longs and Longs, the first a primitive and the second a full-fledged object. C has stack-allocated variables, which are freed automatically when they fall out of scope, and heap-allocated memory, which you have to free yourself. C# has value types, which force another abstraction – boxing – into the programmer’s lap.

private static object DBNullIf(int value, int nullValue)
{

    return (value != nullValue) ? (object)value : DBNull.Value;

}

The code above should be a head-scratcher. Isn’t object the superclass of all other types? And if so, why do we have to explicitly cast our int variable to an object to make this code compile? Why don’t we have to do the same for the DBNull?

I mentioned above that ease of implementation is a common source of non-orthogonality. With primitives, we can see another, more legitimate reason: performance. There is a cost to keeping primitives out of the language. Below, we’ll see several more non-orthogonal language features that make performance easier to optimize.

Nulls

Nulls have been problematic enough that an entire design pattern has been created to avoid them. Access violations and object reference exceptions are some of the most common programmer errors. Most languages divide object references into two types: those that point to an object, and those that point to nothing. There’s nothing intrinsically non-orthogonal about that division, except that in most languages the references that point to nothing work differently. Instead of returning a value, they throw an exception when dereferenced.

In Ruby, the division is still more or less there — some references point to objects, and some point to nothing — but both types of references still return a value. In effect, Ruby has built the Null Object pattern into the language, as the references that point to nothing return a nil value. But, like everything else in Ruby, nil is an object (of type NilClass), and can be used in expressions:

1.nil?      # returns false
nil.nil?    # returns true

You never get a NullReferenceException in Ruby. Instead, you get a NoMethodError when you call a method on nil that nil doesn’t define, which is exactly the same error you’d get calling an undefined method on any other object.
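A quick sketch of that consistency:

```ruby
# nil is an ordinary object of class NilClass, so it responds
# to ordinary methods:
nil.to_s     # => ""
nil.inspect  # => "nil"

# Calling a method nil doesn't define raises the same NoMethodError
# that any other object would raise:
begin
  nil.upcase
rescue NoMethodError => e
  puts e.class  # NoMethodError
end
```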

Magic Functions

Most object-oriented languages have certain functions that aren’t really methods. Instead, they’re special extensions to the language that make the class-based approach work.

public class Magic
{
    public static void ClassMethod()
    {
    }

    public Magic()
    {
    }

    public void NormalMethod()
    {
    }
}

Magic.ClassMethod();
Magic magic = new Magic();
magic.NormalMethod();

Notice the context shift we make when constructing new objects. Instead of making method calls via the normal syntax (receiver.method()), we use the keyword new and give the class name. But what is the class but a factory for instances, and what is the constructor but a creation method? In Ruby, the constructor is just a normal class-level method:

regex = Regexp.new(phone_pattern)

Typically, the new class-level method allocates an instance and delegates to the newly created instance’s initialize method (which is what most Ruby programmers call the “constructor”). But, since new is just a normal method, if you really wanted to, you could override it and do something different:

class Test
  def self.new
    2 + 2
  end
end

Test.new    # returns 4!!
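Conceptually, the default new that got overridden above behaves something like the following sketch (allocate really is a class method Ruby exposes; the Widget class is hypothetical):

```ruby
class Widget
  attr_reader :size

  # A sketch of what the built-in new does by default: allocate a
  # blank instance, then delegate to its initialize method.
  def self.new(*args)
    instance = allocate
    instance.send(:initialize, *args)  # initialize is private, hence send
    instance
  end

  def initialize(size)
    @size = size
  end
end

Widget.new(3).size  # => 3
```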

Operators also tend to be magic methods in many languages. Ruby more or less treats them just like other methods, except the interpreter steps in to break the elegant consistency of everything-as-a-method to provide operator precedence. Smalltalk and Lisp, two other highly orthogonal languages, do not provide operator precedence. Operators are just normal methods (functions in Lisp), and work with the same precedence rules as any other method.

So here we have yet another reason, on top of ease of language implementation and performance, to add non-orthogonality to a language. Ruby adds operator precedence, even though it adds an element of inconsistency, presumably because it makes the language more intuitive. Since intuitiveness is one of the purported benefits of orthogonality, there is a legitimate conflict of interest here. I think I would prefer not having operator precedence, and leaving the language consistent, but it seems more of a question of style than anything else.
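You can see both sides of the trade-off in Ruby itself: operators are callable as ordinary methods, but the parser applies precedence when you use operator syntax:

```ruby
# Operator syntax: the parser gives * higher precedence than +.
1 + 2 * 3      # => 7

# The same operators written as explicit method calls evaluate
# strictly left to right, the way Smalltalk's binary messages do.
1.+(2).*(3)    # => 9
```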

Static and Sealed Methods

Class-based object-oriented languages impose a dichotomy between classes and instances of those classes. Most of them still allow behavior to exist on classes, but that behavior is treated differently than instance behavior. By declaring class methods as static, you’re telling the compiler that it’s free to compute the address of this function at compile-time, instead of allowing the dynamic binding that gives you polymorphism.

Ruby gives you some degree of polymorphism at the class level (although you can’t call superclass methods, for obvious reasons):

class Animal
  def self.description
    "kingdom Animalia"
  end
end

class Person < Animal
  def self.description
    "semi-evolved simians"
  end
end
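Calling the class methods shows the polymorphic dispatch (the definitions are repeated here so the snippet stands alone):

```ruby
class Animal
  def self.description
    "kingdom Animalia"
  end
end

class Person < Animal
  def self.description
    "semi-evolved simians"
  end
end

puts Animal.description  # kingdom Animalia
puts Person.description  # semi-evolved simians

# Subclasses also inherit class methods they don't override:
class Dog < Animal; end
puts Dog.description     # kingdom Animalia
```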

In most mainstream languages, not even all instance methods are polymorphic. They are by default in Java, although you can make them statically-bound by declaring them final. C# and C++ take a more extreme approach, forcing you to declare them virtual if you want to use them polymorphically.

Behavior cannot be combined consistently between virtual and sealed (or final) methods. It’s a common complaint when developers try to extend a framework only to find out that the relevant classes are sealed.

Instance Variables

Only the pure functionalists can get by without state; the rest of us need to remember things. But there’s no reason why the mechanism of retrieving stored values has to be treated differently from the mechanism of calling a parameter-less function that computes a value, nor is there a reason that the mechanism for storing a value has to be different from the mechanism of calling a setter function to store the value for you.

It is common OO dogma that state should be private, and if you need to expose it, you should do so through getters and setters. The evolution of the popular Rails framework recently reaffirmed this dogma. In prior versions, sessions were exposed to the controllers via the @session instance variable. When they needed to add some logic to storing and retrieving, they could no longer expose the simple variable, and refactored to a getter/setter attribute access. They were able to do so in a backward-compatible way, by making the @session variable a proxy to an object that managed the logic, but it was still a process that a more orthogonal language wouldn’t have required. The language should not force you to distinguish between field access and method access.

Both Lisp and Eiffel treat instance variables equivalently to function calls (at least when it comes to rvalues). Lisp simply looks up atoms in the environment, and if that atom is a function (lambda), then it can be called to retrieve the value no differently than if the atom is a variable containing the value. Eiffel, an object-oriented language, declares variables and methods using the same keyword (feature), and exposes them – both to the outside world and to the class itself – the same way (Bertrand Meyer called this the Uniform Access principle):

class POINT
feature
    x, y: REAL
            -- Abscissa and ordinate

    rho: REAL is
            -- Distance to origin (0,0)
        do
            Result := sqrt(x^2 + y^2)
        end

Once you get past Eiffel’s tradition of YELLING TYPES at you and hiding the actual code, Uniform Access makes a lot of sense. Instance variables are just like features without bodies. C# has properties, which provide similar benefits, but force you to explicitly declare them. Instead of a getRho and setRho method, you can have a property that allows clients to use the same syntax regardless of whether they’re using a property or a field. Because Ruby allows the = symbol as part of a method name, it allows a similar syntax.
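A Ruby rendering of the POINT example (a sketch): stored state exposed via attr_reader and a computed value defined as an ordinary method present exactly the same calling syntax to clients:

```ruby
class Point
  attr_reader :x, :y   # stored state, exposed as methods

  def initialize(x, y)
    @x, @y = x, y
  end

  # A computed value, called exactly like the stored ones.
  def rho
    Math.sqrt(x**2 + y**2)
  end
end

p = Point.new(3, 4)
p.x    # => 3   (stored)
p.rho  # => 5.0 (computed; the client can't tell the difference)
```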

However, the separation between variables and properties is superfluous. For example, there’s no need for them to have separate access levels. If other classes need the state exposed, then declare it public. If the language doesn’t offer instance variables, then you’re simply exposing a property method. If you run into the same problem that Rails ran into, and suddenly need to add behavior around exposed state, refactoring should be easy. Just add a private property method that is now the state, and leave the public property method.

So, in my hypothetical language, we might have the following:

class Controller
  public feature session
end

And when you feel you need to add behavior around the exposed session dictionary, it should be easy:

class Controller
  public feature session
    # added behavior goes here
    private_session
  end

  private feature private_session
end

One thing that this hypothetical syntax doesn’t allow is separate access levels for getting and setting, but it shows the basic idea.

Inconsistent Access Levels

Since we’re on the subject of access levels, Ruby’s private access level is not very orthogonal at all. Unlike C++-derived languages, Ruby’s private is object-level, not class-level, which means that even other instances of the same class can’t directly access a private method. That’s a reasonable constraint.

However, instead of making object-level private access orthogonal, the implementors simply disallowed specifying a receiver for private method calls. This undoubtedly made implementing object-level private access much easier. Unfortunately, it means that you can’t even use self as the receiver within the object itself, which makes moving a method from public or protected to private non-transparent, even if all references to the method are within the class itself:

class TestAccessLevels
  def say_my_name
    puts self.name
  end

  private
  def name
    "Snoopy"
  end
end

# The following line throws an exception
TestAccessLevels.new.say_my_name
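Dropping the explicit receiver makes the same code work, which is exactly the non-transparency being complained about (a sketch; recent Rubies have since relaxed the rule for a literal self receiver, but the implicit form works everywhere):

```ruby
class TestAccessLevels
  def say_my_name
    puts name   # implicit receiver: allowed for private methods
  end

  private

  def name
    "Snoopy"
  end
end

TestAccessLevels.new.say_my_name  # prints "Snoopy"
```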

Second class types

Having special types, like null, is a special case of having primitives, and it’s a common performance optimization. Being a first-class type has a well-defined meaning. Specifically, its instances can:

  • be passed to functions as parameters
  • be returned from functions
  • be created and stored in variables
  • be created anonymously
  • be created dynamically

It’s becoming increasingly common for modern languages to move towards first-class functions. In Execution in the Kingdom of Nouns, Steve Yegge parodied the typical transmogrifications Java programmers had accustomed themselves to in order to sidestep the language’s lack of first-class functions. Java’s cousin (descendant?), C#, has more or less had them since .NET 2.0 in the form of anonymous delegates.

What neither Java nor C# have are first-class classes. Both have reflection, and even allow you to create new types at runtime (painfully…). But, because both languages are statically typed, you can’t access these runtime-created types the same way you can access normal types. Assuming Foo is a runtime-created type, the following code won’t compile:

Foo foo = new Foo();

The only way to make it work is by heavy use of reflection. Ruby, on the other hand, makes it trivially easy to add types at runtime. In fact, all types are added at runtime, since it’s an interpreted language.
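In Ruby, the equivalent of the uncompilable C# line is trivial, because classes are themselves objects you can build and assign at runtime:

```ruby
# Create a class at runtime and bind it to a constant, exactly as if
# it had been written out by hand.
Foo = Class.new do
  def greet
    "hello from a runtime-created class"
  end
end

foo = Foo.new                 # used with the normal syntax; no reflection
puts foo.greet                # hello from a runtime-created class
puts foo.class                # Foo
puts Foo.instance_of?(Class)  # true: the class itself is an object
```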

Single Return Values

Since you can pass more than one argument to functions, why should you only be allowed to return one value from functions? Languages like ML add orthogonality by returning tuples. Ruby works similarly, unpacking arrays automatically for you if you use multiple lvalues in one expression.

def duplicate(value)
  [value, value]
end

first, second = duplicate("hi there")

This feature allows you to combine lvalues and rvalues more consistently.
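The same mechanism handles swaps and splats without temporary variables:

```ruby
a, b = 1, 2
a, b = b, a            # swap with no temp variable
# a is now 2, b is now 1

first, *rest = [1, 2, 3, 4]
# first is 1, rest is [2, 3, 4]
```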

Inconsistent Nesting

Nesting is the standard programming trick of limiting scope, and most languages provide blocks that you can nest indefinitely if you want to limit scope. For example, in C#:

public void TestNest()
{
    string outer = "outer";
    {
        string middle = "middle";
        {
            string inner = "inner";
        }
    }
}

In addition to blocks, Ruby allows you to nest functions, but the nesting doesn’t work in an orthogonal way:

def outer
  inner_var = "inner variable"

  def inner
    "inner method"
  end
end

outer
puts inner
puts inner_var

In this example, the last line will throw a NameError, complaining that inner_var is undefined. This is as we should expect – since it’s defined inside an inner scope from where we’re calling it, we should not be able to access it. However, the same is not true for the inner method defined in the call to outer. Despite the fact that it’s defined within a nested scope, it actually has the same scoping as outer.

Ruby’s scoping gets even weirder:

def outer
  begin
    inner_var = "inner"
  end

  inner_var
end

puts outer

This code works, printing “inner” to the console. It shouldn’t.

JavaScript similarly suffers from strange scoping rules. Because all variables have function scope, and not block scope, a variable is available everywhere within a function regardless of where it is defined:

function testScoping(predicate) {
  if (predicate) {
    var test = "in if block";    
  }
  alert(test);    // works, even though it's in an outer block
}

Fixing your mistakes

Speaking of JavaScript, few things bug me more about programming languages than those that try to fix your mistakes for you. JavaScript automatically inserts semicolons at the end of a line if you leave them out. Most of the time that works fine, but every now and then it creates a ridiculously hard bug to diagnose:

return
{
  value: 0
}

The return statement above looks like it returns an object with a single property. What it really does, though, is simply return. A semicolon was inserted on your behalf after the return keyword, turning the following three lines into dead code. It is for this reason that Douglas Crockford recommends always using K&R style braces in JavaScript: The Good Parts.

Summary

Orthogonality makes the language easier to extend in ways that the language implementors didn’t anticipate. For example, Ruby’s first-class types, combined with its duck typing, allow the popular mocking framework Mocha to have a very nice syntax:

logger = stub
logger.expects(:error).with("Execution error")

do_something(:blow_up, logger)

The fact that classes in Ruby are also objects means that the same syntax works for classes:

File.stubs(:exists?).returns(true)

I picked on JavaScript a couple of times, but it did a better job than any other language I know of with literals. JavaScript has literal numbers and strings, like most languages. It has literal regular expressions and lists, like Perl and Ruby. But where it really shines is in its literal object syntax. Objects are basically dictionaries, and Ruby and Perl have hash literals, but JavaScript’s object literals can include function literals:

function ajaxSort(columnName, sortDirection) {
    setNotice('Sorting...');
    new Ajax.Request("/reports/ajaxSort", {
            method: "get",
            parameters: { sort: columnName, sortDir: sortDirection },
            onSuccess: function (transport) { 
                $("dataTable").innerHTML = transport.responseText;
            },
            onComplete: function () { clearNotice(); }
        }
    );
}

In time, that syntax was leveraged to form the JSON format.

In a more dramatic example, when object-orientation became all the rage, Lisp was able to add object-orientation without changing the language. The Common Lisp Object System (CLOS) is written entirely in Lisp. The reason Lisp was able to absorb an entirely new paradigm is largely due to the language’s orthogonality. Not all function-like calls work the same way; some are called “special forms” because they act differently (for example, by providing lazy evaluation of the arguments). However, Lisp allows programmers to create their own special forms by writing macros, which are themselves written in Lisp.

It helps to have turtles all the way down.

Written by Brandon Byars

July 21, 2008 at 10:36 pm

Posted in Languages


Code Generation and Metaprogramming

I wanted to expand upon an idea that I first talked about in my previous post on Common Lisp. There is a common pattern between syntactic macros, runtime metaprogramming, and static code generation.

Runtime metaprogramming is code-generation. Just like C macros. Just like CL macros.

Ok, that’s a bit of an overstatement. Those three things aren’t really just like each other. But they are definitely related—they all write code that you’d rather not write yourself. Because it’s boring. And repetitious. And ugly.

In general, there are three points at which you can generate code in the development process, although the terminology leaves something to be desired: before compilation, during compilation (or interpretation), and during runtime. In the software development vernacular, only the first option is typically called code-generation (I’ll call it static code generation to avoid confusion). Code generation during compilation goes under the moniker of a ‘syntactic macro,’ and I’m calling runtime code generation ‘runtime metaprogramming.’

Since the “meta” in metaprogramming implies writing code that writes code, all three forms of code generation can be considered metaprogramming, which is why I snuck the “runtime” prefix into the third option above. Just in case you were wondering…

Static Code Generation

Static code generation is the easiest to understand and the weakest of the three options, but it’s often your only option due to language limitations. C macros are an example of static code generation, and it is the only metaprogramming option possible with C out-of-the box.

To take an example, on a previous project I generated code for lazy loading proxies in C#. A proxy, one of the standard GoF design patterns, sits in between a client and an object and intercepts messages that the client sends to the object. For lazy loading, this means that we can instantiate a proxy in place of a database-loaded object, and the client can use it without even knowing that it’s using a proxy. For performance reasons, the actual database object will only be loaded on first access of the proxy. Here’s a truncated example:

public class OrderProxy : IOrder
{
    private IOrder proxiedOrder = null;
    private long id;
    private bool isLoaded = false;

    public OrderProxy(long id)
    {
        this.id = id;
    }

    private void Load()
    {
        if (!isLoaded)
        {
           proxiedOrder = Find();
           isLoaded = true;
        }
    }

    private IOrder Find()
    {
        return FinderRegistry.OrderFinder.Find(id);
    }

    public string OrderNumber
    {
        get
        {
           Load();
           return proxiedOrder.OrderNumber;
        }
        set
        {
           Load();
           proxiedOrder.OrderNumber = value;
        }
    }

    public DateTime DateSubmitted
    {
        get
        {
           Load();
           return proxiedOrder.DateSubmitted;
        }
    }
}

This code is boring to write and boring to maintain. Every time the interface changes, a very repetitious change has to be made in the proxy. To make it worse, we have to do this for every database entity we’ll want to load (at least those we’re worried about lazy-loading). All I’d really like to say is “make this class implement the appropriate interface, and make it a lazy-loading proxy.” Fortunately, since the proxy is supposed to be a drop-in replacement for any other class implementing the same interface, we can use reflection to query the interface and statically generate the proxy.

There’s an important limitation to generating this code statically. Because we’re doing this before compilation, this approach requires a separated interfaces approach, where the binary containing the interfaces is separate from the assembly we’re generating the proxies for. We’ll have to compile the interfaces, use reflection on the compiled assembly to generate the source code for the proxies, and compile the newly generated source code.

But it’s do-able. Simply load the interface using reflection:

public static Type GetType(string name, string nameSpace, string assemblyFileName)
{
    if (!File.Exists(assemblyFileName))
        throw new IOException("No such file");

    Assembly assembly = Assembly.LoadFile(Path.GetFullPath(assemblyFileName));
    string qualifiedName = string.Format("{0}.{1}", nameSpace, name);
    return assembly.GetType(qualifiedName, true, true);
}

From there it’s pretty trivial to loop through the properties and methods and recreate the source code for them on the proxy, with a call to Load before delegating to the proxied object.

Runtime Metaprogramming

Now it turns out that when I wrote the code generation code above, there weren’t very many mature object-relational mappers in the .NET space. Fortunately, that’s changed, and the code above is no longer necessary. NHibernate will lazy-load for you, using a proxy approach similar to the one I used above, except that NHibernate writes the proxy code at runtime.

The mechanics of how this works are encapsulated in a nice little library called Castle.DynamicProxy. NHibernate uses reflection to read interfaces (or virtual classes) and calls DynamicProxy to generate code at runtime using the Reflection.Emit namespace. In C#, that’s a difficult thing to do, which is why I wouldn’t recommend attempting it without DynamicProxy.

This is a much more powerful technique than static code generation. For starters, you no longer need two assemblies, one for the interfaces, and one for the proxies. But the power of runtime metaprogramming extends well beyond saving you a simple .NET assembly.

Ruby makes metaprogramming much easier than C#. The standard Rails object-relational mapper also uses proxies to manage associations, but the metaprogramming applies even to the model classes themselves (which are equivalent to the classes that implement our .NET interfaces). The truncated IOrder implementation above showed 3 properties: Id, OrderNumber, and DateSubmitted. Assuming we have those columns in our orders table in the database, then the following Ruby class completely implements the same interface:

class Order < ActiveRecord::Base
end

At runtime, the ActiveRecord::Base superclass will load the schema of the orders table, and for each column, add a property to the Order class of the same name. Now we really see the power of metaprogramming: it helps us keep our code DRY. If it’s already specified in the database schema, why should we have to specify it in our application code as well?
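A toy version of the idea (a sketch, not ActiveRecord’s implementation: a hard-coded column list stands in for the schema that ActiveRecord reads from the database):

```ruby
class ToyRecord
  # For each "column", define a getter and a setter at class-definition
  # time, the way ActiveRecord does from the table schema.
  def self.columns(*names)
    names.each do |name|
      define_method(name)       { @attributes[name] }
      define_method("#{name}=") { |value| @attributes[name] = value }
    end
  end

  def initialize
    @attributes = {}
  end
end

class Order < ToyRecord
  columns :id, :order_number, :date_submitted
end

order = Order.new
order.order_number = "A-1001"
order.order_number  # => "A-1001"
```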

Syntactic Macros

It probably wouldn’t make much sense to generate lazy-loading proxies at compile time, but that doesn’t mean syntactic macros don’t have their place. Used appropriately, they can DRY up your code in ways that even runtime metaprogramming cannot.

Peter Seibel gives a good example of building a unit test framework in Common Lisp. The idea is that we’d like to assert certain code is true, but also show the asserted code in our report. For example:

pass ... (= (+ 1 2) 3)
pass ... (= (+ 1 2 3) 6)
pass ... (= (+ -1 -3) -4)

The code to make this work, assuming report-result is implemented correctly, looks like this:

(defun test-+ ()
  (report-result (= (+ 1 2) 3) '(= (+ 1 2) 3))
  (report-result (= (+ 1 2 3) 6) '(= (+ 1 2 3) 6))
  (report-result (= (+ -1 -3) -4) '(= (+ -1 -3) -4)))

Notice the ugly duplication in each call to report-result. We have the code that’s actually executed (the first parameter), and the quoted list to report (the second parameter). Runtime metaprogramming could not solve the problem because the first parameter will be evaluated before being passed to report-result. Static code-generation could remove the duplication, but would be ugly. We could DRY up the code at compile time, if only we had access to the abstract syntax tree. Fortunately, in CL, the source code is little more than a textual representation of the AST.

Here’s the macro that Seibel comes up with:

(defmacro check (&body forms)
  `(progn
    ,@(loop for f in forms collect `(report-result ,f ',f))))

Notice how the source code within the list (represented as the loop variable f) is both executed and quoted. The test now becomes much simpler:

(defun test-+ ()
  (check (= (+ 1 2) 3))
  (check (= (+ 1 2 3) 6))
  (check (= (+ -1 -3) -4)))

Summary

Finding ways to eliminate duplication is always A Good Thing. For a long time, if you were programming in a mainstream language, then static code generation was your only option when code generation was needed. Things changed with the advent of reflection based languages, particularly when Java and C# joined the list of mainstream languages. Even though their metaprogramming capability isn’t as powerful as languages like Smalltalk and Ruby, they at least introduced metaprogramming techniques to the masses.

Of course, Lisp has been around since the 1950s (I’m not sure how long macros have been around, however). Syntactic macros provide a very powerful way of generating code, even letting you change the language. But until more languages implement them, they will never become as popular as they should be.

Written by Brandon Byars

March 29, 2008 at 6:00 pm

Auto-merging fixes

Paul Gross recently blogged about a rake task to automatically merge changes to the trunk if the change was made in a branch. This seemed like a useful feature, even though we don’t use rake.

Fixing production bugs and merging is no fun, but why not take some of the pain out of the process? Depending on the developer and their mood, we either fix the bug in the trunk and merge to the branch, or fix it in the branch and merge to the trunk. Where it really gets ugly is when we have to merge two release branches back, because we make our release branch a few days before actually pushing it to production. Any urgent bug fix requiring a merge during that time has to be merged both to the new release branch and to the release branch currently in production.

Using Paul’s code as a base, I wrote automerge.rb, which, by default, either merges to the trunk (if you’re in a branch), or merges to the latest release branch (if you’re in the trunk). Alternatively, you can pass a release number, and automerge.rb will merge to that release branch. In all cases, you have to have the working copy checked out on your machine, and, if you’re on Windows, you need to make sure to put patch in your path.

The script assumes that your directory structure looks something like the following:

/trunk
 /branches
   /Release-2.8
   /Release-2.9

The Release-major-minor format is just our branch naming standard; it’s easy to change.

if !ARGV[0].nil?
  dest_dir = "branches/Release-#{ARGV[0]}"
  replace_pattern = /(trunk|branches).*$/i
elsif %x[svn info].include? "branches"
  dest_dir = "trunk"
  replace_pattern = /branches.*$/i
elsif %x[svn info].include? "trunk"
  pattern = /^Release-(\d+)\.(\d+)\/$/

  branches = %x[svn list http://devtools01:8080/svn/lazarus/branches]
  releases = branches.split.find_all {|branch| branch =~ pattern}

  # sort them in reverse order by major, minor
  releases.sort! do |first, second|
    first_major, first_minor = pattern.match(first)[1..2]
    second_major, second_minor = pattern.match(second)[1..2]

    if first_major != second_major
      second_major.to_i <=> first_major.to_i
    else
      second_minor.to_i <=> first_minor.to_i
    end
  end

  dest_dir = "branches/#{releases[0]}"
  replace_pattern = /trunk/
end

puts "Merging changes into #{dest_dir}.  Don't forget to check these in."
dest_path = Dir.pwd.gsub(replace_pattern, dest_dir)
puts dest_path
system "svn diff | patch -p0 -d #{dest_path}"

The only tricky part here is figuring out the latest release branch, which is done by running the svn list command and applying a custom sort to the results.
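The sort can be exercised on its own; here’s a self-contained sketch with a hard-coded branch listing standing in for the svn list output:

```ruby
# Standalone version of the latest-release-branch logic, with a
# hard-coded branch listing standing in for `svn list` output.
pattern = /^Release-(\d+)\.(\d+)\/$/

branches = "Release-2.8/\nRelease-2.10/\nRelease-2.9/\ntags/\n"
releases = branches.split.find_all { |branch| branch =~ pattern }

# Reverse sort by major, then minor, comparing numerically so that
# Release-2.10 outranks Release-2.9 (a plain string sort gets this wrong).
releases.sort! do |first, second|
  first_major, first_minor = pattern.match(first)[1..2]
  second_major, second_minor = pattern.match(second)[1..2]

  if first_major != second_major
    second_major.to_i <=> first_major.to_i
  else
    second_minor.to_i <=> first_minor.to_i
  end
end

puts releases.first # the latest release branch
```

Comparing the captured groups with to_i is the whole point of the custom sort; left to String comparison, "10" would sort before "9".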

Written by Brandon Byars

October 22, 2007 at 8:06 pm

Posted in Configuration Management, Ruby


C# Enum Generation

Ayende recently asked on the ALT.NET mailing list about the various methods developers use to provide lookup values, with the question framed as one between lookup tables and enums. My own preference is to use both, but keep it DRY with code generation.

To demonstrate the idea, I wrote a Ruby script that generates a C# enum file from some metadata. I much prefer Ruby to pure .NET solutions like CodeSmith—I find it easier and more powerful (I do think CodeSmith is excellent if there is no Ruby expertise on the team, however). The full source for this example can be grabbed here.

The idea is simple. I want a straightforward and extensible way to provide metadata for lookup values, following the Ruby Way of convention over configuration. XML is very popular in the .NET world, but the Ruby world views it as overly verbose, and prefers lighter markup languages like YAML. For my purposes, I decided not to mess with markup at all (although I’m still considering switching to YAML—the hash of hashes approach describes what I want well). Here’s some example metadata:

enums = {
  'OrderType' => {},
  'MethodOfPayment' => {:table => 'PaymentMethod',},
  'StateProvince' => {:table => 'StateProvinces',
                      :name_column => 'Abbreviation',
                      :id_column => 'StateProvinceId',
                      :transformer => lambda {|value| value.upcase},
                      :filter => lambda {|value| !value.empty?}}
}

That hash, which is valid Ruby code, describes three enums, which will be named OrderType, MethodOfPayment, and StateProvince. The intention is that, if you followed your database standards, you should usually be able to get by without adding any extra metadata, as shown in the OrderType example. The code generator will get the ids and enum names from the OrderType table (expecting the columns to be named OrderTypeId and Description) and create the enum from those values. As StateProvince shows, the table name and the two column names can all be overridden.

More interestingly, you can both transform and filter the enum names by passing lambdas (which are like anonymous delegates in C#). In the ‘StateProvince’ example above, each name is stripped of illegal characters and upper-cased, and any name that ends up as an empty string is filtered out.
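Pulled out of the generator, the two lambdas behave like this (the sample names are made up):

```ruby
# The :transformer and :filter options are ordinary Ruby lambdas.
# The generator cleans each name, applies the transformer, and then
# filters; the same sequence is applied by hand here to made-up data.
transformer = lambda { |value| value.upcase }
filter      = lambda { |value| !value.empty? }

raw_names = ['tx', 'ny', ''] # '' simulates a name that was all illegal characters

transformed = raw_names.map { |name| transformer.call(name) }
kept        = transformed.find_all { |name| filter.call(name) }

puts kept.inspect # => ["TX", "NY"]
```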

We use a pre-build event in our project to build the enum file. However, if you simply overwrite the file every time you build, you may slow down the build process considerably. MSBuild (used by Visual Studio) evidently sees that the timestamp has been updated, so it rebuilds the project, forcing a rebuild of all downstream dependent projects. A better solution is to only overwrite the file if there are changes:

require File.dirname(__FILE__) + '/enum_generator'

gen = EnumGenerator.new('localhost', 'database-name')
source = gen.generate_all('Namespace', enums)

filename = File.join(File.dirname(__FILE__), 'Enums.cs')
if Dir[filename].empty? || source != IO.read(filename)
  File.open(filename, 'w') {|file| file << source}
end

I define the basic templates straight in the EnumGenerator class, but allow them to be swapped out. In theory, the default name column and the default lambda for generating the id column name given the table name (or enum name) could be handled the same way. Below is the EnumGenerator code:

class EnumGenerator
  FILE_TEMPLATE = <<EOT
//------------------------------------------------------------------------------
// <auto-generated>
//     This code was generated by a tool from <%= catalog %> on <%= server %>.
//
//     Changes to this file may cause incorrect behavior and will be lost if
//     the code is regenerated.
// </auto-generated>
//------------------------------------------------------------------------------

namespace <%= namespace %>
{
    <%= enums %>
}
EOT

  ENUM_TEMPLATE = <<EOT
public enum <%= enum_name %>
{
<% values.keys.sort.each_with_index do |id, i| -%>
    <%= values[id] %> = <%= id %><%= ',' unless i == values.length - 1 %>
<% end -%>
}

EOT

  # Change the templates by calling these setters
  attr_accessor :enum_template, :file_template

  attr_reader :server, :catalog

  def initialize(server, catalog)
    @server, @catalog = server, catalog
    @enum_template, @file_template = ENUM_TEMPLATE, FILE_TEMPLATE
  end
end

The code generation uses erb, the standard Ruby templating language:

def transform(template, template_binding)
  erb = ERB.new(template, nil, '-')
  erb.result template_binding
end

template_binding describes the variables available to the template in much the same way that Castle Monorail’s PropertyBag describes the variables available to the views. The difference is that, because Ruby is dynamic, you don’t have to explicitly add values to the binding. The rest of the code is shown below:

def generate(enum_name, attributes)
  table = attributes[:table] || enum_name
  filter = attributes[:filter] || lambda {|value| true}
  values = enum_values(table, attributes)
  values.delete_if {|key, value| !filter.call(value)}
  transform enum_template, binding
end

def generate_all(namespace, metadata)
  enums = ''
  metadata.keys.sort.each {|enum_name| enums << generate(enum_name, metadata[enum_name])}
  enums = enums.gsub(/\n/m, "\n\t").strip
  transform file_template, binding
end

private
def enum_values(table, attributes)
  sql = get_sql table, attributes
  @dbh ||= DBI.connect("DBI:ADO:Provider=SQLNCLI;server=#{server};database=#{catalog};Integrated Security=SSPI")
  sth = @dbh.execute sql
  values = {}
  sth.each {|row| values[row['Id']] = clean(row['Name'], attributes[:transformer])}
  sth.finish

  values
end

def get_sql(table, attributes)
  id_column = attributes[:id_column] || "#{table}Id"
  name_column = attributes[:name_column] || "Description"
  "SELECT #{id_column} AS Id, #{name_column} AS Name FROM #{table} ORDER BY Id"
end

def clean(enum_value, transformer=nil)
  enum_value = '_' + enum_value if enum_value =~ /^\d/
  enum_value = enum_value.gsub /[^\w]/, ''
  transformer ||= lambda {|value| value}
  transformer.call enum_value
end

Caveat Emptor: I wrote this code from scratch today; it is not the same code we currently use in production. I think it’s better, but if you find a problem with it please let me know.

Written by Brandon Byars

October 21, 2007 at 9:54 pm

Posted in .NET, Code Generation, Ruby


TDD’ing a Markov Chain

In The Practice of Programming, Brian Kernighan and Rob Pike develop a simple algorithm that scrambles text while keeping the output from degenerating into gibberish. The idea is simple to understand:

  • Parse the input into a list of prefixes. A prefix of two words long seems to work well.
  • For each prefix in the input text, keep a list of suffixes, where a suffix is the word immediately following the prefix. If the same suffix exists in the input multiple times for a prefix, it should be listed multiple times in the suffix list.
  • To create the output text, starting with an initial prefix, randomly select a suffix for that prefix. Then slide your prefix over one word so that it now includes the last word of the previous prefix and the new suffix. Rinse, lather, repeat.

Kernighan and Pike suggest using sentinel values to help start and end the process. The algorithm is a simple example of a Markov chain algorithm.
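Before diving in, here’s a rough sketch (not the code developed in this post) of the prefix-to-suffix table the algorithm builds for a tiny input, with sentinels left out for brevity:

```ruby
# Build the prefix -> suffixes table at the heart of the algorithm
# for a tiny made-up input (sentinels omitted to keep it short).
words = 'the quick fox and the quick dog'.split
table = Hash.new { |hash, key| hash[key] = [] }

# Every consecutive (word1, word2, word3) triple contributes word3
# as a suffix of the two-word prefix "word1 word2".
words.each_cons(3) do |first, second, suffix|
  table["#{first} #{second}"] << suffix
end

table.each { |prefix, suffixes| puts "#{prefix.inspect} => #{suffixes.inspect}" }
# "the quick" occurs twice, so it collects both suffixes: ["fox", "dog"]
```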

I decided to give a Ruby implementation a go in a TDD fashion to see if I could learn something. I had the advantage of having already seen a few implementations by Kernighan and Pike, but to keep it somewhat fair, I didn’t consult the book during the implementation.

If you want to follow along with the code, you can download it here. To give you a sense of where we’re going, here’s the first stanza of Edgar Allan Poe’s The Raven, after scrambling:

Once upon a bust of Pallas just above my chamber door; -
This it is, and this mystery explore; -
‘Tis the wind and nothing more.

As always, the hardest part is just getting started. On string algorithms, I find starting with an empty string keeps me in my comfort zone:

class EmptyStringMarkovTest < Test::Unit::TestCase
  def setup
    @markov = Markov.new ''
  end

  def test_speak
    assert_equal '', @markov.speak
  end
end

Test::Unit, like most members of the xUnit family that work in reflection-equipped languages, relies on a simple naming convention—all methods starting with ‘test_’ are executed. It fails, of course, because Markov doesn’t exist yet. This should do the trick:

class Markov
  def initialize(text)
  end

  def speak
    ''
  end
end

In my experience, developers who struggle to understand TDD, or don’t want to understand TDD, look at the above code and wonder what the point is. I haven’t really tested anything, because I haven’t really implemented anything. However, I’ve designed the simplest thing I can think of that gets me what I want. Design is hard, and simple design is even harder (don’t confuse simple with easy).

The code above isn’t hard at all. This is the whole point of TDD, but many people struggle to grasp it. Some people, including those in the BDD camp, point out that many people misunderstand TDD because it has the word ‘test’ in it. Were I writing a TDD logo, it’d look something like this: test DRIVEN development.

I know I need to parse the input into a list of prefixes. I’ll try that next.

def test_prefixes
  assert_equal ["#{Markov::SENTINEL} #{Markov::SENTINEL}"],
    @markov.prefixes
end

Surrounding the original text by sentinel values is one of those tricks that, once you see it, makes perfect sense. Of course, having read Kernighan & Pike’s examples, I’d already seen the trick. Since I’m in a dynamic language, I can run the test right now and see it fail with a NoMethodError. However, I think it’s more valuable to go ahead and add the missing constant and method, and make it fail because the implementation is wrong, not just because it’s missing. I added the following to Markov:

attr_reader :prefixes
SENTINEL = ""

def initialize(text)
  @prefixes = []
end

If you’re not a Ruby programmer, the attr_reader method dynamically creates a prefixes getter method that returns the instance variable of the same name (the @ sign indicates an instance variable, and the [] indicates an empty array). This fails, of course, but it’s easy to fix using Ruby’s string interpolation:

def initialize(text)
  @prefixes = ["#{SENTINEL} #{SENTINEL}"]
end

Yes, I know, I’m cheating. However, my tests pass, and I’d prefer having the tests force me to move away from hard-coding results in the system under test. To do that, I probably need to move away from the empty string:

class OneWordMarkovTest < Test::Unit::TestCase
  def setup
    @markov = Markov.new 'word'
  end

  def test_speak
    assert_equal 'word', @markov.speak
  end
end

I gave this test its own fixture. This is something Dave Astels and the BDD community recommend: if your tests require different state than the tests you’ve already written, don’t hesitate to create a new class for them. Sticking rigidly to one test class per production class is a smell.

After giving it a moment’s thought, I realize I can keep cheating. Maybe this wasn’t such a great test after all. That’s OK; realizing that development is a herky-jerky activity is part of the strength of TDD. Here’s the changed Markov class:

def initialize(text)
  @original_text = text
  @prefixes = ["#{SENTINEL} #{SENTINEL}"]
end

def speak
  @original_text
end

Similarly, for the prefixes:

def test_prefixes
  assert_equal ["#{Markov::SENTINEL} word ", "word #{Markov::SENTINEL}"],
    @markov.prefixes
end

Notice that I’m including trailing whitespace as part of the prefix. I also intend to include trailing punctuation, so I don’t have to be too clever about parsing.

Finally my hard-coding ways have caught up with me. Now, for the first time, I actually have to think a little about implementation. I think that’s A Good Thing—interface first, implementation second. Here’s the changed code:

class Markov
  attr_reader :prefixes

  SENTINEL = ""
  WORDS_IN_PREFIX = 2

  def initialize(text)
    @original_text = "#{SENTINEL} #{text} #{SENTINEL}"
    build_prefixes
  end

  def speak
    @original_text.gsub Regexp.new("\\s*#{SENTINEL}\\s*"), ''
  end

  private
    def build_prefixes
      @prefixes = []
      words = @original_text.scan /\S+\s*/
      max_index = words.length - WORDS_IN_PREFIX
      (0..max_index).each do |i|
        prefix = words[i..(i+WORDS_IN_PREFIX-1)].join
        @prefixes << prefix unless @prefixes.include?(prefix)
      end
    end
end

I added the sentinels around the input text and parsed it into a list of two-word prefixes (The scan method, which you can read about in the Pickaxe book, returns an array of all non-overlapping substrings matching the regular expression. Regular expressions are delimited by slashes). The speak method now has to remove the sentinels. This fixes my new test, but breaks my empty string prefix test. An empty string will leave two spaces between the sentinels (look at the constructor above—there’s a sentinel+space before the text, and a space+sentinel after the text). I fix the test and get back to green.
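In case scan is unfamiliar, here’s what that pattern produces on a sample string (not taken from the tests above):

```ruby
# scan returns every non-overlapping match of /\S+\s*/: each word keeps
# its trailing punctuation and whitespace, which simplifies re-joining.
p 'the quick  brown fox!'.scan(/\S+\s*/)
# => ["the ", "quick  ", "brown ", "fox!"]
```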

It occurs to me that I probably overdid build_prefixes. While I think it’s right, there is some code in it, like checking for duplicate prefixes, that isn’t strictly dictated by the tests yet. TDD takes discipline to avoid adding code that isn’t demanded by failing tests. Figuring out how to do that isn’t always easy, but in this case it shouldn’t have been hard. I just got carried away.

I think it’s still too big of a jump to start expecting random output text according to our Markov chain algorithm, so I start focusing on collecting the suffixes:

# In EmptyStringMarkovTest
def test_suffixes
  suffixes = {"#{Markov::SENTINEL}  #{Markov::SENTINEL}"=> []}
  assert_equal suffixes, @markov.suffixes
end

The suffixes variable above represents a hashtable, with an empty array as the value for the prefix key. I can make this one work by hard-coding, and since parsing the suffixes is non-trivial, I think I will:

# In Markov
attr_reader :prefixes, :suffixes

def initialize(text)
  @original_text = "#{SENTINEL} #{text} #{SENTINEL}"
  build_prefixes
  @suffixes = {"#{SENTINEL} #{text} #{SENTINEL}" => []}
end

I’m using the same trick here I used for the prefix code above. I’m designing the code around the simplest example I can think of, and hard-coding a result until forced otherwise. That helps me focus on interface over implementation, without getting bogged down in tricky parsing code. Here, the interface is using a hashtable for suffixes. Now, let’s force some implementation:

# In OneWordMarkovTest
def test_suffixes
  suffixes = {
    "#{Markov::SENTINEL} word " => ["#{Markov::SENTINEL}"],
    "word #{Markov::SENTINEL}" => []
  }
  assert_equal suffixes, @markov.suffixes
end

This one requires some thought:

def initialize(text)
  @original_text = "#{SENTINEL} #{text} #{SENTINEL}"
  build_prefixes
  build_suffixes
end

private
def build_suffixes
  @suffixes = {}
  @prefixes.each do |prefix|
    @suffixes[prefix] = find_suffixes_for(prefix)
  end
end

You’ll see here the same “fake it until you make it” attitude that I used to hard-code test results. I don’t know how to find the suffixes for a prefix, so I’ll just pretend like the function to do it is already written. That makes the build_suffixes easy, even if it doesn’t work yet. Various authors in the Agile community, including Ron Jeffries and Brian Marick, have recommended using this approach, and I can attest to its effectiveness. Really, it’s the same principle behind divide and conquer algorithms.

Here’s my first attempt at find_suffixes_for:

def find_suffixes_for(prefix)
  pattern = Regexp.new("#{prefix}(\\S+\\s*)")
  suffixes = []
  text = @original_text

  loop do
    match = pattern.match(text)
    break if match.nil?
    suffixes << match[1]
    offset = match.offset(1)[0]
    text = text[offset..-1]
  end
  suffixes
end

This gets the test to pass. The match object returned by matching a regular expression returns the entire matched string as match[0], the first matched group (delimited by parentheses in the regular expression) as match[1], and so on. If we find a match, we add it to our suffixes array (using the << appending operator) and slice text so that it starts right after the first character of the previous match.
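A quick illustration of the MatchData pieces used above, on a made-up string:

```ruby
# match[0] is the entire match, match[1] the first parenthesized group,
# and offset(1) gives the [start, end] character indexes of that group.
match = /b (\w+)/.match('a b c b d')
p match[0]        # => "b c"
p match[1]        # => "c"
p match.offset(1) # => [4, 5]
```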

Up until now, I haven’t had to do any refactoring, but it is the crucial last step to each test-code-refactor iteration. The design benefits of TDD depend on it, as it is what enables the ‘evolution’ in ‘evolutionary design’. Let’s face it: everybody loves to refactor. But refactoring takes discipline. Refactoring without tests is a lot like wearing a chastity belt—expect an initial surge of excitement to be followed by a prolonged period of frustration.

Looking over the code, I’m a bit bothered by the duplicated regular expression used to define the words in the text. Looks like I need to extract a constant:

# In Markov
WORD_REGEXP = /\S+\s*/

def build_prefixes
  @prefixes = []
  words = @original_text.scan WORD_REGEXP
  max_index = words.length - WORDS_IN_PREFIX
  (0..max_index).each do |i|
    prefix = words[i..(i+WORDS_IN_PREFIX-1)].join
    @prefixes << prefix unless @prefixes.include?(prefix)
  end
end

def find_suffixes_for(prefix)
  pattern = Regexp.new("#{prefix}(#{WORD_REGEXP.source})")
  suffixes = []
  text = @original_text

  loop do
    match = pattern.match(text)
    break if match.nil?
    suffixes << match[1]
    offset = match.offset(1)[0]
    text = text[offset..-1]
  end
  suffixes
end

I’ve pretty much exhausted my options with empty strings and one word strings. I have a good start on parsing the prefixes and suffixes, so maybe it’s time to start with the Markov chain. First, I need some input that forces the result of the speak method to be different than the original text. Since suffixes are randomly chosen, using a small string may not provide enough data to statistically guarantee a different string.

class StatisticalMarkovTest < Test::Unit::TestCase
  def setup
    @text = ('a b c ' * 10) + 'a b d '
    @text *= 1000
    @text = @text.strip
    @markov = Markov.new @text
  end

  def test_prefixes
    prefixes = ["#{Markov::SENTINEL} a ", "a b ", "b c ",
      "c a ", "b d ", "d a ", "d #{Markov::SENTINEL}"]
    assert_equal prefixes, @markov.prefixes
  end

  def test_suffixes
    assert_equal 10000, suffixes_with("c ")
    assert_equal 1000, suffixes_with("d ")
  end

  def suffixes_with(suffix)
    @markov.suffixes["a b "].find_all {|s| s == suffix}.length
  end
end

I wrote the prefixes test just to make sure I understood what they were. It passed the first time. I wrote the suffixes test to give me a little more confidence in my parsing. It also passed the first time. I’m not entirely sure either test was really useful, and it’s quite possible that I was just procrastinating.

def test_weighted_random
  markov_text = @markov.speak
  assert_not_equal @text, markov_text
end

I’m not done with this test, but I’ll leave it like this until I get back to green. After quite a bit of experimenting (encompassing more time than I usually like to go between passing tests), I come up with the following:

class Markov
  attr_reader :prefixes, :suffixes

  SENTINEL = ""
  WORDS_IN_PREFIX = 2
  WORD_REGEXP = /\S+\s*/

  def initialize(text)
    @original_text = "#{SENTINEL} #{text} #{SENTINEL}"
    build_prefixes
    build_suffixes
  end

  def speak
    text = words(@original_text)[0..1].join
    text += random_suffix_for(text) until random_suffix_for(text).nil?
    text.gsub Regexp.new("\\s*#{SENTINEL}\\s*"), ''
  end

  private
  def build_prefixes
    @prefixes = []
    words = words(@original_text)
    max_index = words.length - WORDS_IN_PREFIX
    (0..max_index).each do |i|
      prefix = words[i..(i+WORDS_IN_PREFIX-1)].join
      @prefixes << prefix unless @prefixes.include?(prefix)
    end
  end

  def build_suffixes
    @suffixes = {}
    @prefixes.each do |prefix|
      @suffixes[prefix] = find_suffixes_for(prefix)
    end
  end

  def find_suffixes_for(prefix)
    pattern = Regexp.new("#{prefix}(#{WORD_REGEXP.source})")
    suffixes = []
    text = @original_text

    loop do
      match = pattern.match(text)
      break if match.nil?
      suffixes << match[1]
      offset = match.offset(1)[0]
      text = text[offset..-1]
    end
    suffixes
  end

  def random_suffix_for(text)
    words = words(text)
    prefix = words[(words.length-WORDS_IN_PREFIX)..-1].join
    suffixes = @suffixes[prefix]
    return nil if suffixes.nil? || suffixes.length == 0
    suffixes[rand(suffixes.length-1)]
  end

  def words(text)
    # We're including punctuation as part of the word
    text.scan(WORD_REGEXP)
  end
end

I refactored the scan call into a words method while I was working because I need it in several places. This code works, but slowly. That’s OK—I’ll deal with that later.

This was a big jump, maybe too big. I might have been better off trying to come up with smaller steps. But the deed is done; let’s see if we can clean it up. Gritting my teeth, I decide to ignore the performance issue and go for some refactoring. The first thing I see is that I can make parsing the prefixes a bit more intention-revealing:

def speak
  text = first_prefix(@original_text)
  text += random_suffix_for(text) until random_suffix_for(text).nil?
  text.gsub Regexp.new("\\s*#{SENTINEL}\\s*"), ''
end

def random_suffix_for(text)
  prefix = last_prefix(text)
  suffixes = @suffixes[prefix]
  return nil if suffixes.nil? || suffixes.length == 0
  suffixes[rand(suffixes.length-1)]
end

def last_prefix(text)
  words = words(text)
  words[(words.length-WORDS_IN_PREFIX)..-1].join
end

def first_prefix(text)
  words(text)[0..1].join
end

I make the change and run the tests. 106 seconds later, I see I’m still on green. Fine, I guess it’s time to deal with performance.

It’s important to note that this is the first time I’ve really considered performance in the implementation. As Kernighan and Pike say, the first rule of performance tuning is: don’t. Even more caustic is Knuth’s (or Hoare’s?) comment that “premature optimization is the root of all evil.” It’s an evil I’ve committed too often in my career, and every time the software has been the worse for it. Write your code as cleanly and intention-revealing as possible. If, and only if, there are performance problems should you start optimizing. Remember, about 4% of your program will account for about 50% of the runtime; optimizing just because you can means you’re probably wasting your time in the other 96% of the code.

I break out the profiler and run my slow test (ruby -r profile statistical_markov_test.rb -n weighted_random). Here’s a sample of the top of the output:

1 tests, 1 assertions, 0 failures, 0 errors
%   cumulative   self              self     total
time   seconds   seconds    calls  ms/call  ms/call  name
23.56   12.24     12.24        7  1748.57  2685.71  Kernel.loop
17.99   21.59      9.35     5668     1.65     1.65  String#scan
16.42   30.12      8.53        1  8530.00 18550.00  Range#each
8.95    34.77      4.65    33001     0.14     0.21  Array#include?
4.31    37.01      2.24   103016     0.02     0.02  String#==
3.50    38.83      1.82    77379     0.02     0.02  Array#[]
3.41    40.60      1.77     5665     0.31     2.44  Markov#random_suffix_for

It looks like the String#scan method is my main problem (Ruby documenters use the # character to indicate an instance method). That makes sense; it’s used in my words method, which is called every time a prefix is requested in random_suffix_for.

My first thought is to cache:

def words(text)
  # We're including punctuation as part of the word
  # This is the bottleneck of the algorithm, hence the caching
  @words ||= {}
  @words[text] ||= text.scan(WORD_REGEXP)
  @words[text]
end

The best thing I can say about this is that it’s not the stupidest thing I’ve ever done. After wrestling with my machine for several minutes to regain control because of the massive pagefile swapping, I’m finally able to kill the ruby process running my tests.

Looking at the code, I’m almost embarrassed to point out that the caching takes up memory but never actually gets used. The text used as the key into the hashtable is always different when getting suffixes. I’m shaking my head in shame. I really wish you had stopped me from making such a public fool of myself.

Now that I’m seeing things a little bit more clearly, I notice how the text that’s passed in is monotonically increasing in length, adding one suffix each time. Rather than scanning each time, we could scan once, and then add each suffix manually:

def speak
  @text = first_prefix(@original_text)
  @words = words(@text)
  while (suffix = random_suffix_for(@text)) != nil
    @text << suffix
    @words << suffix
  end
  @text.gsub Regexp.new("\\s*#{SENTINEL}\\s*"), ''
end

def random_suffix_for(text)
  prefix = last_prefix
  suffixes = @suffixes[prefix]
  return nil if suffixes.nil? || suffixes.length == 0
  suffixes[rand(suffixes.length-1)]
end

def last_prefix
  @words[(@words.length-WORDS_IN_PREFIX)..-1].join
end

This runs in a respectable 5 seconds, which, if memory serves, puts it roughly in the ballpark Kernighan and Pike had.

Notice how, by keeping more state, I’ve eliminated the parameter to last_prefix. I think I can do the same thing for random_suffix_for:

def speak
  @text = first_prefix(@original_text)
  @words = words(@text)
  while (suffix = random_suffix) != nil
    @text << suffix
    @words << suffix
  end
  @text.gsub Regexp.new("\\s*#{SENTINEL}\\s*"), ''
end

def random_suffix
  suffixes = @suffixes[last_prefix]
  return nil if suffixes.nil? || suffixes.length == 0
  suffixes[rand(suffixes.length-1)]
end

Scanning the rest of the code, I notice an egregious bit of duplication that I had somehow missed before:

def first_prefix(text)
  words(text)[0..1].join
end

Where did the 1 come from in the array slice? It’s really this:

def first_prefix(text)
  words(text)[0..(WORDS_IN_PREFIX-1)].join
end

I feel pretty close now, and I think it’s time to do some real testing. One of the things I remember Kernighan and Pike doing is verifying that, if ‘a b c ’ outnumbers ‘a b d ’ 10:1, then we should see 10 times as many ‘c’ suffixes as ‘d’ suffixes. This is where I was headed with the test_weighted_random method. However, I’ll need to count the number of occurrences of a substring, and I don’t see a library method to help me. No problem, this is Ruby after all. Here’s the code I ended up with after a couple TDD cycles, which have been compressed for your viewing convenience:

class StringTest < Test::Unit::TestCase
  def test_zero_occurrences
    assert_equal 0, 'the quick brown fox'.occurrences_of('cat')
  end

  def test_occurrences_in_empty_string
    assert_equal 0, ''.occurrences_of('a')
    assert_equal 1, ''.occurrences_of('')
  end

  def test_occurrences_of
    text = ('a b c ') * 10 + 'a b d'
    assert_equal 10, text.occurrences_of('a b c')
    assert_equal 1, text.occurrences_of('a b d')
  end
end

class String
  def occurrences_of(substring)
    count, i = 0, 0
    loop do
      i = self.index(substring, i)
      break if i.nil?
      count += 1
      i += 1
    end
    count
  end
end

Adding a method to an already existing class like this is known as aspect-oriented introduction. It’s a difficult, and sometimes impossible feat to do in many languages. In Ruby, it’s that easy.
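The same trick works on any class, including the core numeric ones; digits_sum below is an invented example:

```ruby
# Reopen a core class and introduce a new method, just like
# String#occurrences_of above. digits_sum is made up for illustration.
class Integer
  def digits_sum
    to_s.chars.map { |c| c.to_i }.sum
  end
end

puts 1234.digits_sum # => 10
```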

Now I can change my statistical test:

def test_weighted_random
  markov_text = @markov.speak
  assert_not_equal @text, markov_text
  count_bc = markov_text.occurrences_of('b c')
  count_bd = markov_text.occurrences_of('b d')
  ratio_bc_to_bd = count_bc / count_bd
  assert ratio_bc_to_bd.between?(9, 11),
    "there are #{ratio_bc_to_bd} times as many 'b c' fragments as 'b d' fragments (should be ~10)"
end

I’m pleased to see it work the first time. I decide to test with some real input now. I’d like to verify that every three-word combination (2-word prefix + suffix) in the output also exists in the source text. To do that, I’ll need access to the words method. I decide it probably belongs in the String class too, so I move it and write a few tests for it:

class String
  WORD_REGEXP = /\S+\s*/

  def words
    # We're including punctuation as part of the word
    scan(WORD_REGEXP)
  end
end

class StringTest < Test::Unit::TestCase
  def test_words_in_empty_string
    assert_equal [], ''.words
  end

  def test_words
    assert_equal ['the ', 'quick ', 'brown ', 'fox'],
      'the quick brown fox'.words
  end

  def test_words_include_punctuation
    assert_equal ['first, ', 'second!  ', 'third? ', 'fourth...'],
      'first, second!  third? fourth...'.words
  end
end

Now I grab some real input and give it a go:

class ComplexMarkovTest < Test::Unit::TestCase
  def setup
    @markov = Markov.new the_raven
  end

  def test_speak
    markov_text = @markov.speak
    puts markov_text
    assert_not_equal the_raven, markov_text, "should scramble text"

    # each (prefix + suffix) combo must also be in original text
    words = markov_text.words
    phrase_size = Markov::WORDS_IN_PREFIX + 1
    (0..words.length-phrase_size).each do |i|
      phrase = words[i..(i+phrase_size-1)].join
      assert the_raven.include?(phrase), "#{phrase} not in original"
    end
  end

  def the_raven
    # Returns the entire text of Edgar Allan Poe's "The Raven"
    # download the source code if you want to see the whole thing...
  end
end

I’m a bit surprised to see this one fail. After a good bit of playing around, I discover that I had an off-by-one error because I misunderstood how the library rand method worked. According to the documentation, if the parameter max is an integer greater than 0, then rand returns a random integer in the range [0, max-1]. I was already subtracting one, so I was always leaving off the last suffix.

That makes me wonder how it ever worked for prefixes that had only one suffix, which is the way all my earliest tests worked. If max is 0, rand returns a real number between 0.0 and 1.0. I’m not sure how this works for array indexes, so I try it out in irb:

middlestate:~/Development/markov_tdd/markov/test bhbyars$ irb
>> test = [1]
=> [1]
>> test[0.5]
=> 1
>> test[0.3]
=> 1
>> test[0.01]
=> 1
>> test[0.9]
=> 1

I just got lucky then (or unlucky); Ruby appears to always truncate the array index, masking my defect. Testers like to distinguish between faults and failures. I had a fault all along in my code, but until now it had not caused an observable failure. It’s an easy fix:

def random_suffix
  suffixes = @suffixes[last_prefix]
  return nil if suffixes.nil? || suffixes.length == 0
  suffixes[rand(suffixes.length)]
end

You may have noticed that I added a line in the speak test to echo the markov_text. This was primarily for my own curiosity; I wanted to see what the text looked like. But I noticed something odd—the text wasn’t ending on ‘nevermore!’, which is the last word of the poem. The use of the sentinel values should force it to end on the last word in the source text, since the only empty prefix should be the last word plus the sentinel value. Let me test that:

def test_all_prefixes_have_suffix_except_last
  keys = @markov.suffixes.keys.find_all {|key| @markov.suffixes[key] == []}
  assert_equal ["#{the_raven.words.last} #{Markov::SENTINEL}"], keys
end

Here’s the result:

1) Failure:
test_all_prefixes_have_suffix_except_last:10
 expected but was
.

I spot the error almost immediately. All the extra prefixes have a ? character in them, which has meaning to regular expressions. I need to escape the prefix when building the Regexp:

def find_suffixes_for(prefix)
  pattern = Regexp.new("#{Regexp.escape(prefix)}(#{String::WORD_REGEXP.source})")
  suffixes = []
  text = @original_text

  loop do
    match = pattern.match(text)
    break if match.nil?
    suffixes << match[1]
    offset = match.offset(1)[0]
    text = text[offset..-1]
  end
  suffixes
end

While thinking about the prefix-suffix relationship, I realized that I had an unnecessary nil check in random_suffix; all prefixes should have suffixes, and only the last one should be empty:

def random_suffix
  suffixes = @suffixes[last_prefix]
  return nil if suffixes.length == 0
  suffixes[rand(suffixes.length)]
end

Now all tests pass, and the output looks right. I note that while the simple unit tests helped me design the application, it took real tests with real input to catch some rather important defects.

And now I admire my creation:

Open here I flung the shutter, when, with many a quaint and curious volume of forgotten lore,
While I nodded, nearly napping, suddenly there came a tapping,
As of some one gently rapping, rapping at my chamber door -
only this, and nothing more.

Written by Brandon Byars

May 6, 2007 at 12:40 pm

Posted in Ruby, TDD


Using Rails Fixtures For Storing Test Data

Yesterday I wrote about trying to retrofit agility into a large enterprise database. Even after we got our database scripted so that every developer could use a localhost database, we still had to switch back to an integration database too often to make it worthwhile. The problem was test data; we simply didn’t have enough to be useful, and it was difficult to maintain.

At first we tried SQL scripts, but quickly found them too hard to read. After creating a script to add data for one table, nary another script was written (random rant: why can’t the SQL INSERT statement follow the syntax of the UPDATE statement? With big tables, the positional coupling between the column list and the VALUES list is just too hard to maintain). Next, we tried a CSV file, which worked somewhat nicely because we could edit it in Excel. However, the BULK INSERT command we used to load it caused too many problems. Any time a column was added, even if it was a nullable column, even if it was a computed column, you had to add it to the CSV file. And both the SQL and CSV files lacked the ability to add dynamic data. Many times we wanted to enter yesterday for a date, regardless of what actual date that happened to be. In other cases, we’d simply like to add 1000 rows, without regard to what data those rows contained.
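The dynamic-date case is exactly what templating solves; a quick sketch with ERB, the templating approach we eventually adopted (the column name here is made up):

```ruby
require 'erb'
require 'date'

# "yesterday", whatever day the tests happen to run
template = "OrderDate: <%= (Date.today - 1).strftime('%Y-%m-%d') %>"
ERB.new(template).result(binding)
```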

Rails Fixtures

In the end, we hijacked some functionality from the Ruby on Rails testing framework. The Fixtures class allows you to make the following call in your tests to delete and re-add test data before each test:

fixtures :orders, :order_items

That one line of code will, by default, delete the order_items and orders tables (in that order), and then add all data from the orders.yml and order_items.yml files (in that order). YAML is preferred in the Ruby world for its readability; XML, like so many “enterprisey” technologies, is considered bulky and clumsy by the Rails team. Even better, the Rails code first runs your YAML files through ERB (a Ruby templating engine) before sending them to the YAML parser, so you can add dynamic content.

It turns out that using the fixtures code outside of Rails is really quite easy. First, we need to set up our connection information and define the necessary ActiveRecord classes. We can define them all in one file (I’ll call it driver.rb):

require 'active_record'
require 'active_record/fixtures'

ActiveRecord::Base.establish_connection(
  :adapter  => "sqlserver",
  :database =>  "db_name",
  :username => "db_user",
  :password => "password"
)

ActiveRecord::Base.logger = Logger.new(File.dirname(__FILE__) + "/debug.log")

class DisplayCategory < ActiveRecord::Base
  set_primary_key "DisplayCategoryId"
  set_table_name "DisplayCategories"
end

ActiveRecord is the Rails object-relational mapping framework, and like everything else in Rails, it’s built around the principle of “convention over configuration.” Since we’re dealing with a legacy database, and one not intended to be consumed by a Rails application, we’ll have to settle for a little bit of configuration.

By default, ActiveRecord expects the table name to be the plural version of the class name, with underscores between words (display_categories), and the primary key to be an identity column called id. Our database has a different naming standard, with a healthy dose of standard violations, so we’ll have to add the calls to set_primary_key and set_table_name to all of our ActiveRecord classes. DisplayCategories has an identity column, but I’ll show you an example below that does not.
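As a rough illustration of the convention (a naive re-implementation, not the actual Rails inflector):

```ruby
# ActiveRecord's default mapping, approximated: CamelCase class name
# to underscored, pluralized table name.
def underscore(name)
  name.gsub(/([a-z\d])([A-Z])/, '\1_\2').downcase
end

def naive_table_name(class_name)
  base = underscore(class_name)
  # naive pluralization; Rails uses its Inflector for irregular words
  base.end_with?('y') ? base[0..-2] + 'ies' : base + 's'
end

naive_table_name('DisplayCategory')  # => "display_categories"
```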

When the script is parsed by Ruby, ActiveRecord will connect to the database and add a property for every column in the table. Metaprogramming is what allows ActiveRecord to be so DRY.

Then, we need the actual data. The Rails convention expects the YAML file to match the table name, so we’ll put the following in data/DisplayCategories.yml:

books:
  DisplayCategoryId: 1
  DisplayName: Books
  SortIndex: 1
fiction:
  DisplayCategoryId: 2
  DisplayName: Fiction
  SortIndex: 2
  ParentDisplayCategoryId: 1

You see how readable YAML is. The keys (books, fiction) are not inserted into the database. We’ll find a good use for them below for testing purposes, but for now we’ll simply use them to help us describe the record.

Let’s run it. The following code should work (in add_test_data.rb):

require 'driver.rb'
Fixtures.create_fixtures("data", [:DisplayCategories])

Unfortunately, we may get an error running add_test_data.rb. By default, the YAML parser simply dumps every record into a hashtable, with the identifier as the key. Since hashtables have no implicit ordering, we could run into a problem since our data requires that it be inserted in order (fiction has a ParentDisplayCategoryId foreign key pointing to books). If order doesn’t matter to you, use the above syntax. When it does matter, it’s a simple change:

--- !omap
- books:
    DisplayCategoryId: 1
    DisplayName: Books
    SortIndex: 1
- fiction:
    DisplayCategoryId: 2
    DisplayName: Fiction
    SortIndex: 2
    ParentDisplayCategoryId: 1

That omap syntax defines an ordered map, which solves our dependency problem. Now, running add_test_data.rb should work.

Dealing with Exceptions

DisplayCategories doesn’t quite meet the Rails conventions, but it’s close. What happens when we’re not close?

Let’s assume the following table definition:

CREATE TABLE PersonName (
  PersonNameId int NOT NULL
    CONSTRAINT PK_PersonName PRIMARY KEY,
  FirstName varchar(100) NOT NULL,
  LastName varchar(100) NOT NULL,
  FullName AS FirstName + ' ' + LastName
)

CREATE TRIGGER Audit_On_Update ON PersonName AFTER UPDATE
AS
INSERT INTO AuditPersonName(DateChanged, PersonNameId, FirstName, LastName)
SELECT GETDATE(), PersonNameId, FirstName, LastName
FROM deleted

Now, our configuration with ActiveRecord becomes more involved. There are four headaches we’ll run into here:

  • We don’t have an identity column
  • We have a computed column
  • Our table name isn’t plural
  • We have a trigger

Let’s tackle them one at a time. Instead of an identity column, let’s assume that we have to call a stored procedure called get_next_id to give us our next id as an output parameter. The following will work:

class PersonName < ActiveRecord::Base
  set_primary_key "PersonNameId"
  set_table_name "PersonName"

  def before_create
    sql = <<-EOT
      DECLARE @id int
      EXEC get_next_id 'PersonName', @id OUTPUT
      SELECT @id
    EOT
    self.id = connection.select_value(sql).to_i
  end
end

We’re using one of the built-in hooks ActiveRecord provides to change how it gets the ids for us. In case you’ve never seen that EOT business, it’s called a here document: a literal syntax for multi-line strings.
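If here documents are new to you, a minimal example:

```ruby
# <<-EOT lets the closing delimiter be indented; everything in
# between is an ordinary multi-line string.
sql = <<-EOT
  DECLARE @id int
  EXEC get_next_id 'PersonName', @id OUTPUT
EOT

sql.lines.count  # => 2
```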

However, we still won’t be able to use that ActiveRecord implementation because it doesn’t understand computed columns, and will try to add FullName to the insert list. To work around that problem, add the following code to your PersonName class:

class PersonName
  @@computed_columns = ['FullName']

  def initialize(attributes = nil)
    super
    @@computed_columns.each { |column| @attributes.delete(column) }
  end
end

This is a bit hackish, but it gets the job done. I discovered the @attributes instance variable in ActiveRecord::Base when browsing for a solution, and found that I could simply remove the offending columns from it in the constructor.

Speaking of hacks, we can fix the problem of PersonName being singular by changing the way Rails understands the English language. Add the following to driver.rb:

require 'active_support/inflector'

Inflector.inflections do |inflect|
  inflect.uncountable 'PersonName'
end

The uncountable method was intended for words like fish and money. That’s OK; cheating seems like a pragmatic solution here. Chad Fowler mentions this solution in his book Rails Recipes. Other options include:

Inflector.inflections do |inflect|
  inflect.plural /(ox)$/i, '\1en'
  inflect.singular /(ox)en/i, '\1'
  inflect.irregular 'person', 'people'
end

For more details, consult Fowler’s book, or dig through the source code. Now, add some test data and try to run it. We can add it to add_test_data.rb quite easily:

require 'driver.rb'
Fixtures.create_fixtures("data", [:DisplayCategories, :PersonName])

Try running it. If you’re using SQL Server, you should receive an exception that reads “Cannot create new connection because in manual or distributed transaction mode.” That’s because our trigger is getting in the way. Add SET NOCOUNT ON as the first line of your trigger, and try again. Everything should work now.
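For reference, the adjusted trigger from above would read as follows (only the SET NOCOUNT ON line is new):

```sql
CREATE TRIGGER Audit_On_Update ON PersonName AFTER UPDATE
AS
SET NOCOUNT ON
INSERT INTO AuditPersonName(DateChanged, PersonNameId, FirstName, LastName)
SELECT GETDATE(), PersonNameId, FirstName, LastName
FROM deleted
```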

Using dynamic fixtures

I mentioned above that Fixtures first runs the YAML file through ERB before sending it to the parser. That means we can clean up our data to get rid of most of the noise for columns we don’t care about. The following is a valid YAML fixture:

<%
def ignore_values
  <<EOT
  Column1: 1
  UselessDate: 1/1/2000
  SomeNullColumn:
EOT
end
%>
identifier1:
  ImportantColumn: 1
<%= ignore_values %>
identifier2:
  ImportantColumn: 2
<%= ignore_values %>

Now, at least we’re only writing the columns we care about for each record. Sometimes, however, we can do a lot better. Imagine an address table that stores both a main and a shipping address for each person. The following would be a valid fixture:

<%
@next_id = 0

def addresses_for(name, attributes)
  "#{address_for(name, 'main', attributes)}#{address_for(name, 'shipping', attributes)}"
end

def address_for(name, type, attributes)
  @next_id += 1
  <<EOT
#{name}_#{type}:
  AddressId: #{@next_id}
  PersonId: #{attributes[:PersonId]}
  AddressTypeId: #{type == 'main' ? 1 : 2}
  Geocode: 
  Address1: #{attributes[:Address1]}
  Address2: 
  Address3: 
  City: #{attributes[:City]}
  State: #{attributes[:State]}
  Zip: #{attributes[:Zip]}
  County: #{attributes[:County]}
EOT
end
%>
<%= addresses_for('person1', {
  :PersonId => 1,
  :City => 'Marshall',
  :Address1 => '123 Some Road',
  :State => 'TX',
  :Zip => '75672',
  :County => 'Harrison'
}) %>
<%= addresses_for('person2', {
  :PersonId => 2,
  :City => 'Marshall',
  :Address1 => 'P.O. BOX 123',
  :State => 'TX',
  :Zip => '75672',
  :County => 'Harrison'
}) %>

That fixture will create 4 records, person1_main, person1_shipping, person2_main, and person2_shipping. Since we don’t care about having a separate billing address and shipping address for our test data, we can reduce the amount of information we have to add. And if we ever do care, it’s easy to write it out without calling our addresses_for method.
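Outside of Rails, the whole pipeline is just ERB followed by the YAML parser; a condensed sketch:

```ruby
require 'erb'
require 'yaml'

# Generate two records from a loop, then parse the result as YAML.
template = <<-YAML
<% [1, 2].each do |i| %>
record_<%= i %>:
  ImportantColumn: <%= i %>
<% end %>
YAML

data = YAML.load(ERB.new(template).result(binding))
data.keys  # => ["record_1", "record_2"]
```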

Testing

The Fixtures class was written for testing, and it has some nice features that give you access to each loaded record, for example as an instance variable named after the record identifier (@person1, @fiction, etc.). I had trouble getting that part to work with SQL Server, but even if you don’t ask Fixtures to load each record for you, you can still access the records, which is quite useful if you’re using Ruby for any of your testing (for example, in Watir scripts). Let’s add the following to driver.rb:

require 'test/unit'

Test::Unit::TestCase.fixture_path = File.dirname(__FILE__) + "/data/"
$LOAD_PATH.unshift(Test::Unit::TestCase.fixture_path)

class Test::Unit::TestCase
  self.use_transactional_fixtures = false
  self.use_instantiated_fixtures = :no_instances
end

Now, we should be able to load the fixtures automatically in our test cases:

require 'test/unit'
require 'watir'
require 'driver'

class ProductBrowsingTest < Test::Unit::TestCase
  fixtures :DisplayCategories

  def test_something
    # create a browser instance, and navigate to the right page
    browser.link(:text, @DisplayCategories["fiction"]["DisplayName"]).click
  end
end

Written by Brandon Byars

March 18, 2007 at 6:53 pm

Posted in Database, Ruby


Ruby and SQL DMO

We have a rather large legacy system that we are trying to inject a healthy dose of agility into. One of the biggest challenges has always been versioning the database, which is large, clumsy, and cluttered. We managed to get the rest of the code under a continuous integration scheme, but every time somebody made a schema change, or even updated a stored procedure needed for some tests to work, manual intervention was needed.

Pramod Sadalage and Martin Fowler wrote one of the first articles on iterative database design, and Pramod later teamed up with Scott Ambler to collaborate on Refactoring Databases. The advice, adopted by the Ruby on Rails team, was to create a separate migration file for each schema change, and number them sequentially. For example, the first migration would be 1.sql, then 2.sql, and so on. You could store the latest migration file run on a database in a version table, which would make updating a database as easy as running every migration, in order, whose version number is greater than the one stored in your database table.
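A sketch of that numbering scheme (the file layout and method name are illustrative):

```ruby
# Given a directory of 1.sql, 2.sql, ... return the migrations that
# still need to run, in order, based on the version stored in the db.
def pending_migrations(dir, current_version)
  Dir.glob(File.join(dir, '*.sql'))
     .map     { |path| [File.basename(path, '.sql').to_i, path] }
     .select  { |version, _| version > current_version }
     .sort_by { |version, _| version }
     .map     { |_, path| path }
end
```

After running each returned file, the version table is updated to the highest number applied.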

We managed to do something similar, although it required a number of changes. But before it could all work, we needed a starting point—a base schema to create a new developer database. We use SQL Server 2000, so initially I simply had Enterprise Manager create a SQL script for me. Not only did it not work (I don’t think it sorted the dependencies right), it was a ghastly nightmare to look at.

Why do standard vendor-supplied code generation tools create such ugly code?

I decided to do the code generation myself using SQL DMO (the same COM interfaces that Enterprise Manager was using, just poorly). I’d successfully used Ruby and ERB for code generation before, and discovered I could make very nice looking code (Code Generation in Action by Jack Herrington describes the concept nicely). Within just a couple of hours, I had a SQL script that not only looked much nicer than anything Enterprise Manager spits out, but worked to boot.

First, I needed to connect to the database I wanted to dump into a SQL file:

require 'win32ole'

class SqlServer
  def initialize(server, catalog)
    sql = WIN32OLE.new("SQLDMO.SQLServer")
    sql.LoginSecure = 1
    sql.Connect(server)
    @db = sql.Databases(catalog)
  end
end

This uses Windows Authentication to connect to the appropriate SQL Server database. The DMO interface is one of those clumsy, wandering APIs commonly developed when VBScript is your target language (warning: the MSDN documentation for DMO only seems to work in IE, and poorly at that). I decided to wrap some of the classes to make it easier to use. First, a superclass:

class DBObject
  def initialize(object)
    @object = object
  end

  def name
    return @object.Name unless @object.Name =~ / /
    "[#{@object.Name}]"
  end

  def beautify(text)
    # strip brackets if they're not needed, and remove the .dbo prefix
    text.gsub!(/\[([^ ]*?)\]/, '\1')
    text.gsub(/dbo\./, '')
  end
end

Here I provide one of the keys to generating code that looks appealing. I really don’t like looking at all the noise of bracketing every table and column name, just in case the name contains a space. The following looks ugly to me:

SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
SET ANSI_PADDING ON
GO
CREATE TABLE [dbo].[tablename] (
    [id] [int] NOT NULL,
    [description] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL
) ON [PRIMARY]

That drives me wild. I don’t ever change any of the settings for ANSI nulls, etc., and I couldn’t care less what the collation is so long as it matches the database default. Nor do I care about the file group. It’s hard to see through all the noise.

Here’s what I want:

CREATE TABLE tablename (
    id int NOT NULL,
    description varchar(50) NOT NULL
)
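The beautify method above does this with two substitutions; as a standalone sketch:

```ruby
def beautify(text)
  # strip brackets around names that contain no space...
  text = text.gsub(/\[([^ ]*?)\]/, '\1')
  # ...then drop the dbo. schema prefix
  text.gsub(/dbo\./, '')
end

beautify('CREATE TABLE [dbo].[tablename] ([id] [int] NOT NULL)')
# => "CREATE TABLE tablename (id int NOT NULL)"
```

Names that genuinely contain a space keep their brackets, since the lazy `[^ ]*?` group refuses to match across a space.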

Our name and beautify methods will help us achieve prettier code. Here’s the most important subclass:

class Table < DBObject
  attr_reader :columns, :constraints, :keys, :indexes, :references

  def initialize(table, connection)
    super(table)
    @columns, @constraints, @keys = [], [], []
    table.Columns.each { |col| @columns << Column.new(col) }
    table.Checks.each { |check| @constraints << CheckConstraint.new(check) }
    table.Keys.each { |key| @keys << Key.new(key, connection) }
    get_indexes(table)
    get_references
  end

  private
  def get_references
    @references = []
    @keys.each do |key|
      if "" != key.referenced_table and name != key.referenced_table
        @references << key.referenced_table unless references.include?(key.referenced_table)
      end
    end
  end

  def get_indexes(table)
    @indexes = []
    # the Indexes collection includes objects already in Keys and statistics
    keys = @keys.map { |key| key.name }
    table.Indexes.each do |index|
      if not keys.include?(index.Name)
        if index.Type == 16 || index.Type == 0
          @indexes << Index.new(index)
        end
      end
    end
  end
end

You can find the classes it depends on by downloading all of the code here. Notice, however, that a database connection is needed for the Key constructor. As far as I could tell, there was no way, using nothing more than DMO, to find out if a key cascade deletes. I had to query the INFORMATION_SCHEMA views to find that information.
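The INFORMATION_SCHEMA query amounted to something like this (a hypothetical helper: cascade_delete?, and the connection object's select_value method, are my own names, not part of the DMO wrapper):

```ruby
# True when the named foreign key was declared ON DELETE CASCADE.
# INFORMATION_SCHEMA.REFERENTIAL_CONSTRAINTS exposes the DELETE_RULE
# column ('CASCADE' or 'NO ACTION') that DMO itself doesn't surface.
def cascade_delete?(connection, constraint_name)
  sql = <<-SQL
    SELECT DELETE_RULE
    FROM INFORMATION_SCHEMA.REFERENTIAL_CONSTRAINTS
    WHERE CONSTRAINT_NAME = '#{constraint_name}'
  SQL
  connection.select_value(sql) == 'CASCADE'
end
```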

For our script to work, we’re going to need to order our dependencies correctly. The SQL script will fail if we try to add a foreign key to a table that doesn’t yet exist. The following should do the trick:

class SqlServer
  # returns a topological sort with parent tables in front of child tables
  def self.topologically_sorted(tables)
    sorted = []

    # We need a hash to navigate the references field, which gives only names
    table_hash = {}
    tables.each { |table| table_hash[table.name] = table }

    # first add all root tables to sorted
    sorted += tables.find_all { |table| 0 == table.references.length }

    while sorted.length < tables.length
      sorted += tables.find_all do |table|
        if sorted.include?(table)
          result = false
        else
          # all dependencies must already be in sorted
          dependencies = table.references.map { |ref| table_hash[ref] }
          result = (nil == dependencies.find { |ref| not sorted.include?(ref) })
        end
        result
      end
    end
    sorted
  end
end

Now, our code is as simple as binding to some ERB templates:

require 'erb'
require 'db_object'

class SchemaWriter
  def initialize(server, catalog)
    @db = SqlServer.new server, catalog
  end

  def generate_create_script
    generate_code_for(@db.user_defined_datatypes, "create_udt")
    generate_code_for(@db.rules, "create_rule")
    generate_code_for(SqlServer.topologically_sorted(@db.user_tables), "create_table")
  end

  def generate_code_for(objects, template_name)
    file_name = template_name + ".erb"
    template = ""
    File.open(file_name) { |file| template = file.read }
    objects.each do |object|
      erb = ERB.new(template, nil, '-')
      puts erb.result(binding)
    end
  end
end

if $0 == __FILE__
  writer = SchemaWriter.new(ARGV[0], ARGV[1])
  writer.generate_create_script
end

As an example, here’s the create_table.erb template:

create table <%= object.name %>(
<% object.columns.each_with_index do |column, i| -%>
    <%= column.name %> <%= column.text %><%= "," unless i == object.columns.length - 1 %>
<% end -%>
)

<% object.keys.each do |key| -%>
alter table <%= object.name %> add constraint <%= key.name %> <%= key.text %>
<% end -%>
<% object.constraints.each do |constraint| -%>
alter table <%= object.name %> add constraint <%= constraint.name %> <%= constraint.text %>
<% end -%>
<% object.indexes.each do |index| -%>
<%= index.text %>
<% end -%>

Written by Brandon Byars

March 17, 2007 at 3:28 pm

Posted in Code Generation, Database, Ruby

