A Day In The Lyf

…the lyf so short, the craft so longe to lerne

Using Rails Fixtures For Storing Test Data

Yesterday I wrote about trying to retrofit agility into a large enterprise database. Even after we got our database scripted so that every developer could use a localhost database, we still had to switch back to an integration database too often to make it worthwhile. The problem was test data; we simply didn’t have enough to be useful, and it was difficult to maintain.

At first we tried SQL scripts, but quickly found them too hard to read. After creating a script to add data for one table, nary another script was written (random rant: why can’t the SQL INSERT statement follow the syntax of the UPDATE syntax? With big tables, the positional coupling between the column list and VALUES list is just too hard to maintain). Next, we tried a CSV file, which worked somewhat nicely because we could edit them in Excel. However, the BULK INSERT command we used to insert them caused too many problems. Any time a column was added, even if it was a nullable column, even if it was a computed column, you had to add it to the CSV file. And both the SQL file and the CSV files lacked the ability to add dynamic data. Many times we wanted to enter yesterday for a date, regardless of what actual date that happened to be. In some cases, we’d simply like to add 1000 rows, without regard to what data those rows contained.

Rails Fixtures

In the end, we hijacked some functionality from the Ruby on Rails testing framework. The Fixtures class allows you to make the following call in your tests to delete and re-add test data before each test:

fixtures :orders, :order_items

That one line of code will, by default, delete the order_items and orders tables (in that order), and then add all data from the orders.yml and order_items.yml file (in that order). YAML is preferred in the Ruby world for it’s readability; XML, like so many “enterprisey” technologies, is considered bulky and clumsy to the Rails team. Even better, the Rails code first runs your YAML files through ERB (a Ruby templating engine) before sending it to the YAML parser, so you can add dynamic content.

It turns out that using the fixtures code outside of Rails is really quite easy. First, we need to set up our connection information and define the necessary ActiveRecord classes. We can define all them in one file (I’ll call it driver.rb):

require 'active_record'
require 'active_record/fixtures'

ActiveRecord::Base.establish_connection(
  :adapter  => "sqlserver",
  :database =>  "db_name",
  :username => "db_user",
  :password => "password"
)

ActiveRecord::Base.logger = Logger.new(File.dirname(__FILE__) + "/debug.log")

class DisplayCategory < ActiveRecord::Base
  set_primary_key "DisplayCategoryId"
  set_table_name "DisplayCategories"
end

ActiveRecord is the Rails object-relational mapping framework, and like everything else in Rails, it’s built around the principle of “convention over configuration.” Since we’re dealing with a legacy database, and one not intended to be consumed by a Rails application, we’ll have to settle for a little bit of configuration.

By default, ActiveRecord expects the table name to be the plural version of the class name, with underscores between words (display_categories), and the primary key be an identity column called id. Our database has a different naming standard, with a healthy dose of standard violations, so we’ll have to add the call to set_primary_key and set_table_name to all of our ActiveRecord classes. DisplayCategories has an identity column, but I’ll show you an example below that does not.

When the script is parsed by Ruby, ActiveRecord will connect to the database and add a property for every column in the table. Metaprogramming is what allows ActiveRecord to be so DRY.

Then, we need the actual data. The Rails convention expects the YAML file to match the table name, so we’ll put the following in data/DisplayCategories.yml:

books:
  DisplayCategoryId: 1
  DisplayName: Books
  SortIndex: 1
fiction:
  DisplayCategoryId: 2
  DisplayName: Fiction
  SortIndex: 2
  ParentDisplayCategoryId: 1

You see how readable YAML is. The keys (books, fiction) are not inserted into the database. We’ll find a good use for them below for testing purposes, but for now we’ll simply use them to help us describe the record.

Let’s run it. The following code should work (in add_test_data.rb):

require 'driver.rb'
Fixtures.create_fixtures("data", [:DisplayCategories])

Unfortunately, we may get an error running add_test_data.rb. By default, the YAML parser simply dumps every record into a hashtable, with the identifier as the key. Since hashtables have no implicit ordering, we could run into a problem since our data requires that it be inserted in order (fiction has a ParentDisplayCategoryId foreign key pointing to books). If order doesn’t matter to you, use the above syntax. When it does matter, it’s a simple change:

--- !omap
- books:
    DisplayCategoryId: 1
    DisplayName: Books
    SortIndex: 1
- fiction:
    DisplayCategoryId: 2
    DisplayName: Fiction
    SortIndex: 2
    ParentDisplayCategoryId: 1

That omap syntax defines an ordered map, which solves our dependency problem. Now, running add_test_data.rb should work.

Dealing with Exceptions

DisplayCategories doesn’t quite meet the Rails conventions, but it’s close. What happens when we’re not close?

Let’s assume the following table definition:

CREATE TABLE PersonName (
  PersonNameId int NOT NULL
    CONSTRAINT PK_PersonName PRIMARY KEY,
  FirstName varchar(100) NOT NULL,
  LastName varchar(100) NOT NULL,
  FullName AS FirstName + ' ' + LastName
)

CREATE TRIGGER Audit_On_Update ON PersonName AFTER UPDATE
AS
INSERT INTO AuditPersonName(DateChanged, PersonNameId, FirstName, LastName)
SELECT GETDATE(), PersonNameId, FirstName, LastName
FROM deleted

Now, our configuration with ActiveRecord becomes more involved. There are four headaches we’ll run into here:

  • We don’t have an identity column
  • We have a computed column
  • Our table name isn’t plural
  • We have a trigger

Let’s tackle them one at a time. Instead of an identity column, let’s assume that we have to call a stored procedure called get_next_id to give us our next id as an output parameter. The following will work:

class PersonName < ActiveRecord::Base
  set_primary_key "PersonNameId"
  set_table_name "PersonName"

  def before_create
    sql <<-EOT
      DECLARE @id int
      EXEC get_next_id 'PersonName', @id OUTPUT
      SELECT @id
    EOT
    self.id = connection.select_value(sql).to_i
  end
end

We’re using one of the built-in hooks ActiveRecord provides to change how it gets the ids for us. In case you’ve never seen that EOT business, it’s called a here document; you can read about it here.

However, we still won’t be able to use that ActiveRecord implementation because it doesn’t understand computed columns, and will try to add FullName to the insert list. To work around that problem, add the following code to your PersonName class:

class PersonName
  @@computed_columns = ['FullName']

  def initialize(attributes = nil)
    super
    @@computed_columns.each { |column| @attributes.delete(column) }
  end
end

This is a bit hackish, but it gets the job done. I discovered the @attributes instance variable in ActiveRecord::Base when browsing for a solution, and found that I could simply remove the offending columns from it in the constructor.

Speaking of hacks, we can fix the problem of PersonName being singular by changing the way Rails understands the English language. Add the following to driver.rb:

require 'active_support/inflector'

Inflector.inflections do |inflect|
  inflect.uncountable 'PersonName'
end

The uncountable method was intended for words like fish and money. That’s ok, cheating seems like a pragmatic solution here. Chad Fowler mentions this solution in his book on Rails Recipes. Other options include:

Inflector.inflections do |inflect|
  inflect.plural /(ox)$/i, '\1en'
  inflect.singular /(ox)en/i, '\1'
  inflect.irregular 'person', 'people'
end

For more details, consult Fowler’s book, or dig through the source code. Now, add some test data and try to run it. We can add it to add_test_data.rb quite easily:

require 'driver.rb'
Fixtures.create_fixtures("data", [:DisplayCategories, :PersonName])

Try running it. If you’re using SQL Server, you should receive an exception that reads “Cannot create new connection because in manual or distributed transaction mode.” That’s because our trigger is getting in the way. Add SET NOCOUNT ON as the first line of your trigger, and try again. Everything should work now.

Using dynamic fixtures

I mentioned above that Fixtures first runs the YAML file through ERB before sending it to the parser. That means we can clean up our data to get rid of most of the noise for columns we don’t care about. The following is a valid YAML fixture:

<%
def ignore_values
  <<EOT
  Column1: 1
  UselessDate: 1/1/2000
  SomeNullColumn:
EOT
end
%>
identifier1:
  ImportantColumn: 1
<%= ignore_values %>
identifier2:
  ImportantColumn: 2
<%= ignore_values %>

Now, at least we’re only writing the columns we care about for each record. Sometimes, however, we can do a lot better. Imagine an address table that stores both a main and a shipping address for each person. The following would be a valid fixture:

<%
@next_id = 0

def addresses_for(name, attributes)
  "#{address_for(name, 'main', attributes)}#{address_for(name, 'shipping', attributes)}"
end

def address_for(name, type, attributes)
  @next_id += 1
  <<EOT
#{name}_#{type}:
  AddressId: #{@next_id}
  PersonId: #{attributes[:PersonId]}
  AddressTypeId: #{type == 'main' ? 1 : 2}
  Geocode: 
  Address1: #{attributes[:Address1]}
  Address2: 
  Address3: 
  City: #{attributes[:City]}
  State: #{attributes[:State]}
  Zip: #{attributes[:Zip]}
  County: #{attributes[:County]}
EOT
end
%>
<%= addresses_for('person1', {
  :PersonId => 1,
  :City => 'Marshall',
  :Address1 => '123 Some Road',
  :State => 'TX',
  :Zip => '75672',
  :County => 'Harrison'
}) %>
<%= addresses_for('person2', {
  :PersonId => 2,
  :City => 'Marshall',
  :Address1 => 'P.O. BOX 123',
  :State => 'TX',
  :Zip => '75672',
  :County => 'Harrison'
}) %>

That fixture will create 4 records, person1_main, person1_shipping, person2_main, and person2_shipping. Since we don’t care about having a separate billing address and shipping address for our test data, we can reduce the amount of information we have to add. And if we ever do care, it’s easy to write it out without calling our addresses_for method.

Testing

The Fixtures class was written for testing, and it has some nice features that give you access to each loaded record, for example, as an instance variable named after the record identifier (@person1, @fiction, etc). I had trouble getting that part to work with SQL Server, but even if you don’t ask Fixtures to load each record for you, you can still access them, which is quite useful if you’re using Ruby for any of your testing (for example, in Watir scripts). Let’s add the following to driver.rb:

require 'test/unit'

Test::Unit::TestCase.fixture_path = File.dirname(__FILE__) + "data/"
$LOAD_PATH.unshift(Test::Unit::TestCase.fixture_path)

class Test::Unit::TestCase
  self.use_transactional_fixtures = false
  self.use_instantiated_fixtures = :no_instances
end

Now, we should be able to load the fixtures automatically in our test cases:

require 'test/unit'
require 'watir'
require 'driver'

class ProductBrowsingTest < Test::Unit::TestCase
  fixtures :DisplayCategories

  def test_something
    # create a browser instance, and navigate to the right page
    browser.link(:text, @DisplayCategories["fiction"]["DisplayName"]).click
  end
end
Advertisements

Written by Brandon Byars

March 18, 2007 at 6:53 pm

Posted in Database, Ruby

Tagged with ,

%d bloggers like this: