A Day In The Lyf

…the lyf so short, the craft so longe to lerne

Posts Tagged ‘Java

Introducing httpmock

leave a comment »

node.js is simply too cool to ignore. I find the trick to learning any new cool framework is to find a new cool project to write, and use the framework to write it in. httpmock is my cool new project.

I tend to work in large’ish programs of work with lots of teams. One recurring pain point in such an environment is dealing with other teams. Sure, your code is fine, but what happens if you have to integrate with their code? These days, that tends to be through web services, and it’s never pretty. Your team writes integration tests, verify they work, and commit the changes. Three days later, for no apparent reason, your build goes red. It turns out that other team broke their code again.

httpmock is a network server stubbing tool. I say “network,” primarily because I’ve got my dreaming hat on. At the moment, it’s an HTTP stubbing tool, but I’d like to add other protocols, especially SMTP. httpmock exposes a RESTful API that lets you spin up HTTP servers that you can control, and that helpully record how they were used so you can make useful test assertions. In other words, httmock expects you to execute your code full-stack, including making a full TCP handshakes across the network. httpmock just lets you control both hands involved in the handshake, and it does so in a way transparent to your code; you just have to point your configuration to the stub servers httpmock creates. I find this quite useful for functional testing.

Don’t get me wrong. Integration tests are important, and you should still have a set of tests that exercise the actual system, but expect some fragility in those tests. If they break, it doesn’t necessarily mean you broke them. Therefore, they should be in a separate build, one that may not have the same “stop the line” implications when it goes red. httpmock should not be used only for builds that are entirely in control of your team..

When I say that httpmock exposes a RESTful API, I mean it’s a truly RESTful API. There’s only one published URL, and hypermedia drives the interaction from there. This makes it easy to evolve the server while maintaining a stable client-side API. At the moment, I have only a Java binding, but I have intentions for more (and I’d love some help on the issue!). The idea of the language bindings is to present to the developer a friendly API that approximates normal unit-testing mocking frameworks without having to write your tests in a language different from your production stack.

Examples


@Test
public void showVerificationAPI()
{
    StubServer stub = ControlServer.at("http://localhost:3000").setupPort(3001);
    SystemUnderTest yourCode = new SystemUnderTest();
    
    yourCode.doSomethingThatCallsAWebService();
    
    assertThat(stub, wasCalled("POST", "/endpoint")
                        .withHeader("Content-Type", "text/plain")
                        .withHeader("Accept", "text/plain")
                        .withBodyContaining("TEST"));
    stub.close();
}

The code above shows the Java assertions. The first line actually makes two network calls; one to talk to the well-known httpmock URL (in this case, http://localhost:3000) to get the hypermedia. The second one tells httpmock to set up a new server for stubbing purposes, which is then spun down on the last call in the method. For performance reasons, you may want to do some of these administrative tasks once for the test suite, particularly retrieving the initial hypermedia. The wasCalled method also talks to httpmock to perform the verification. The SystemUnderTest should expect to base its web service calls to http://localhost:3001.


@Test
public void showStubbingAPI()
{
    StubServer stub = ControlServer.at("http://localhost:3000").setupPort(3001);
    stub.on("GET", "/").returns(new Stub().withStatus(200).withBody("BODY"));
    SystemUnderTest yourCode = new SystemUnderTest();
    
    yourCode.doSomethingThatShouldCall("http://localhost:3001/");
    
    stub.close();
}

Like typical stubbing libraries, httpmock allows you to set up stubs that will be executed if the server receives the expected call.

Installing

You need node.js to run the server. You can download httpmock from the incredibly ugly downloads page. Untar the server, and run bin/httpmock start.

This will run the control server on port 3000. You can optionally pass a --port command line argument to change the port and a --pidfile argument to use a non-default pidfile (useful if you’re running multiple instances). If you run the server in a background job, bin/httpmock stop will kill it.

If you’re using npm, you can install httpmock with sudo npm install -g httpmock. Installing it globally allows you to simply say httpmock start on the command line, which feels quite sexy.

The client bindings will simply be a library. For Java, add httpmock.jar to your classpath.

Roadmap

The following are the features I’d like to add to httpmock:

  • Support for multiple protocols, at least for verification if stubbing doesn’t make sense (SMTP, FTP, perhaps even things like printer protocols). It appears at the moment easy enough to support a plugin mechanism where the same REST API will work for multiple protocols, but I won’t know for sure until getting around to a second protocol.
  • Support for HTML. Right now, there is a JSON-based content type used for all communication, which is great for progammatic manipulation. I think it would also be quite nice to allow QA’s to manually set up stubs and perform verifications through a web browser.
  • A more robust stubbing mechanism. It’d be nice, for example, to only conditionally execute stubs if a specific request header was set.
  • More language bindings.

Want to contribute? I’d love the help. Send me a pull request at github.

Written by Brandon Byars

May 30, 2011 at 9:48 pm

Posted in Testing

Tagged with , , ,

Orthogonality

Orthogonality means that features can be used in any combination, that the combinations all make sense, and that the meaning of a given feature is consistent, regardless of the other features with which it is combined. The name is meant to draw an explicit analogy to orthogonal vectors in linear algebra: none of the vectors in an orthogonal set depends on (or can be expressed in terms of) the others, and all are needed in order to describe the vector space as a whole.

– Michael Scott (Programming Language Pragmatics)

I’ve used Delphi and Visual Basic at previous jobs. I disliked both of them. VB has this annoying distinction between objects and primitives, so I’d consistently type

Dim server
server = "localhost"

Dim catalog
catalog = CreateObject("MyCompany.MyObject")

…only to be greeted by an ugly Windows popup box that Object doesn't support this property or method. The problem, as all you grizzled VB programmers no doubt spotted immediately, is the last line should start with the keyword Set. VB requires you to prefix assignments on “objects” with Set. But if you try to put a Set in front of assignments on what VB considers primitives (like the first assignment above), you get an error that reads Object required: '[string: "localhost"]'.

Delphi likewise frustrated me with mundane annoyances:

DoOneThing();

if value = goal then
    DoSomething();
else
    DoSomethingElse();

The code above doesn’t compile, and it won’t compile until you remove the semicolon from the call to DoSomething(). Semicolons complete a statement, and the statement is the entire if-else clause.

These problems in VB and Delphi are related to the concept of orthogonality mentioned in the opening quote. VB doesn’t let you compose assignment to objects the same way it lets you compose assignment to primitives. Delphi doesn’t let you end an if clause the same way it lets you end an else clause. These inconsistencies encourage even experienced programmers to make silly syntax mistakes and make the language harder to use.

What is orthogonality?

The key principles I extracted from Michael Scott’s quote listed above are consistency and composability. Composability means that features can be combined, and consistency stands in for the Principle of Least Surprise—features act how you expect they would, regardless of how they’re being combined. VB’s assignments lack consistency. Delphi’s semicolon parsing doesn’t act consistently when composed within a surrounding statement.

Scott claims that a highly orthogonal language is easier to understand and easier to use. Nowhere does he mention that it’s easier to implement. I’m sure the Delphi grammar was simplified considerably by refusing to allow the DoSomething statement to end in a semicolon when contained within an outer if-else clause. It’s also likely that the implementation of Visual Basic was simplified by having the programmer tell the compiler whether an assignment referred to an object or a primitive.

I suspect many non-orthogonal aspects of languages are there to make them easier to implement. However, some languages that are trivially easy to implement can also be amazingly orthogonal. Lisp is a prime example; it is a highly orthogonal language, and yet an entire Lisp interpreter written in Lisp fits on just one page of the Lisp 1.5 Programmer’s Manual.

I found it instructive to list out syntactic constructs that make languages less orthogonal. It’s amazing how mainstream most of them are:

Statements

Statements aren’t necessary, and a number of languages avoid them altogether. Having only expressions makes the language easier to work in. Compare the inconsistency of Delphi’s statement parsing to the composability of Ruby’s expressions:

puts if x == 0
  "Zero"
else
  "Not Zero"
end

The if clause is an expression; it returns the last value evaluated (e.g., either “Zero” or “Not Zero”). And that value can be composed within another expression.

So what happens when you don’t have a reasonable return value? Smalltalk always returns self, which is convenient because it allows method chaining. In the Ruby example above, the entire expression returns the result of the call to puts, which happens to be nil.

The beauty of expressions is that the composability doesn’t just go one level deep:

x = if a == 1; 1; else; 2 end

y = x = if a == 1; 1; else; 2; end

puts y = x = if a == 1; 1; else; 2; end

puts "returns nil" if (puts y = x = if a == 1; 1; else; 2; end).nil?

As the example above shows, orthogonality doesn’t guarantee good code. However, it allows the language to be used in unanticipated ways, which is A Good Thing. Moreover, since everything is an expression, you can put expressions where you wouldn’t normally expect them. For example, in Ruby, the < operator represents inheritance when the receiver is a class, but the superclass could be an expression:

class Test < if x == 1; OneClass; else; TwoClass
end

Pushing a language like this, only to find that it’s turtles all the way down, is a signpost for orthogonality.

Primitives

We already saw the clumsiness of VB primitives, but most mainstream languages share a similar problem. Java, for example, has a confusing dichotomy of longs and Longs, the first a primitive and the second a full-fledged object. C has stack-allocated primitives, which are freed automatically when they fall out of scope, and heap-allocated variables, which you have to free yourself. C# has value types, which force another abstraction – boxing – into the programmer’s lap.

private static object DBNullIf(int value, int nullValue)
{

    return (value != nullValue) ? (object)value : DBNull.Value;

}

The code above should be a head-scratcher. Isn’t object the superclass of all other types? And if so, why do we have to explicitly cast our int variable to an object to make this code compile? Why don’t we have to do the same for the DBNull?

I mentioned above that ease of implementation is a common source of non-orthogonality. With primitives, we can see another, more legitimate reason: performance. There is a cost to keeping primitives out of the language. Below, we’ll see several more non-orthogonal language features that make performance easier to optimize.

Nulls

Nulls have been problematic enough that an entire design pattern has been created to avoid them. Access violations and object reference exceptions are some of the most common programmer errors. Most languages divide object references into two types: those that point to an object, and those that point to nothing. There’s nothing intrinsically non-orthogonal about that division, except that in most languages the references that point to nothing work differently. Instead of returning a value, they throw an exception when dereferenced.

In Ruby, the division is still more or less there — some references point to objects, and some point to nothing — but both types of references still return a value. In effect, Ruby has built the Null Object pattern into the language, as the references that point to nothing return a nil value. But, like everything else in Ruby, nil is an object (of type NilClass), and can be used in expressions:

1.nil?      # returns false
nil.nil?    # returns true

You never get an NullReferenceException in Ruby. Instead, you get a NoMethodError when you try to call a method on nil that doesn’t exist, which is exactly the same error you’d get if you called a method on any object that didn’t exist.

Magic Functions

Most object-oriented languages have certain functions that aren’t really methods. Instead, they’re special extensions to the language that make the class-based approach work.

public class Magic
{
    public static void ClassMethod()
    {
    }

    public Magic()
    {
    }

    public void NormalMethod()
    {
    }
}

Magic.ClassMethod();
Magic magic = new Magic(5);
magic.NormalMethod();

Notice the context shift we make when constructing new objects. Instead of making method calls via the normal syntax (receiver.method()), we use the keyword new and give the class name. But what is the class but a factory for instances, and what is the constructor but a creation method? In Ruby, the constructor is just a normal class-level method:

regex = Regexp.new(phone_pattern)

Typically, the new class-level method allocates an instance and delegates to the newly created instance’s initialize method (which is what most Ruby programmers call the “constructor”). But, since new is just a normal method, if you really wanted to, you could override it and do something different:

class Test
  def self.new
    2 + 2
  end
end

Test.new    # returns 4!!

Operators also tend to be magic methods in many languages. Ruby more or less treats them just like other methods, except the interpreter steps in to break the elegant consistency of everything-as-a-method to provide operator precedence. Smalltalk and Lisp, two other highly orthogonal languages, do not provide operator precedence. Operators are just normal methods (functions in Lisp), and work with the same precedence rules as any other method.

So here we have yet another reason, on top of ease of language implementation and performance, to add non-orthogonality to a language. Ruby adds operator precedence, even though it adds an element of inconsistency, presumably because it makes the language more intuitive. Since intuitiveness is one of the purported benefits of orthogonality, there is a legitimate conflict of interest here. I think I would prefer not having operator precedence, and leaving the language consistent, but it seems more of a question of style than anything else.

Static and sealed Methods

Class-based object-oriented languages impose a dichotomy between classes and instances of those classes. Most of them still allow behavior to exist on classes, but that behavior is treated differently than instance behavior. By declaring class methods as static, you’re telling the compiler that it’s free to compute the address of this function at compile-time, instead of allowing the dynamic binding that gives you polymorphism.

Ruby gives you some degree of polymorphism at the class level (although you can’t call superclass methods, for obvious reasons):

class Animal
  def self.description
    "kingdom Animalia"
  end
end

class Person < Animal
  def self.description
    "semi-evolved simians"
  end
end

In most mainstream languages, not even all instance methods are polymorphic. They are by default in Java, although you can make them statically-bound by declaring them final. C# and C++ take a more extreme approach, forcing you to declare them virtual if you want to use them polymorphically.

Behavior cannot be combined consistently between virtual and sealed (or final) methods. It’s a common complaint when developers try to extend a framework only to find out that the relevant classes are sealed.

Instance Variables

Only the pure functionalists can get by without state; the rest of us need to remember things. But there’s no reason why the mechanism of retrieving stored values has to be treated differently from the mechanism of calling a parameter-less function that computes a value, nor is there a reason that the mechanism for storing a value has to be different from the mechanism of calling a setter function to store the value for you.

It is common OO dogma that state should be private, and if you need to expose it, you should do so through getters and setters. The evolution of the popular Rails framework recently reaffirmed this dogma. In prior versions, sessions were exposed to the controllers via the @session instance variable. When they needed to add some logic to storing and retrieving, they could no longer expose the simple variable, and refactored to a getter/setter attribute access. They were able to do so in a backward-compatible way, by making the @session variable a proxy to an object that managed the logic, but it was still a process that a more orthogonal language wouldn’t have required. The language should not force you to distinguish between field access and method access.

Both Lisp and Eiffel treat instance variables equivalently to function calls (at least when it comes to rvalues). Lisp simply looks up atoms in the environment, and if that atom is a function (lambda), then it can be called to retrieve the value no differently than if the atom is a variable containing the value. Eiffel, an object-oriented language, declares variables and methods using the same keyword (feature), and exposes them – both to the outside world and to the class itself – the same way (Bertrand Meyer called this the Uniform Access principle):

class POINT
feature
    x, y: REAL
            -- Abscissa and ordinate

    rho: REAL is
            -- Distance to origin (0,0)
        do
            Result := sqrt(x^2 + y^2)
        end

Once you get passed Eiffel’s tradition of YELLING TYPES at you and hiding the actual code, Uniform Access makes a lot of sense. Instance variables are just like features without bodies. C# has properties, which provide similar benefits, but force you to explicitly declare them. Instead of a getRho and setRho method, you can have a property that allows clients to use the same syntax regardless of whether they’re using a property or a field. Because Ruby allows the = symbol as part of a method name, it allows a similar syntax.

However, the separation between variables and properties is superfluous. For example, there’s no need for them to have separate access levels. If other classes need the state exposed, then declare it public. If the language doesn’t offer instance variables, then you’re simply exposing a property method. If you run into the same problem that Rails ran into, and suddenly need to add behavior around exposed state, refactoring should be easy. Just add a private property method that is now the state, and leave the public property method.

So, in my hypothetical language, we might have the following:

class Controller
  public feature session
end

And when you feel you need to add behavior around the exposed session dictionary, it should be easy:

class Controller
  public feature session
    # added behavior goes here
    private_session
  end

  private feature private_session
end

One thing that this hypothetical syntax doesn’t allow is separate access levels for getting and setting, but it shows the basic idea.

Inconsistent Access Levels

Since we’re on the subject of access levels, Ruby’s private access level is not very orthogonal at all. Unlike C++ derived languages, Ruby’s private is object-level, not class-level, which means that even other instances of the same class can’t directly access the private method. That’s a reasonable constraint.

However, instead of making object-level private access level orthogonal, the implementors simply disallowed developers to specify the receiver for private methods. This undoubtedly made implementing object-level private access much easier. Unfortunately, it means that you can’t even use self as the receiver within the object itself, which makes moving a method from public or protected to private non-transparent, even if all references to the method are within the class itself:

class TestAccessLevels
  def say_my_name
    puts self.name
  end

  private
  def name
    "Snoopy"
  end
end

# The following line throws an exception
TestAccessLevels.new.say_my_name

Second class types

Having special types, like null, is a special case of having primitives, and it’s a common performance optimization. Being a first-class type has a well-defined meaning. Specifically, its instances can:

  • be passed to functions as parameters
  • be returned from functions
  • be created and stored in variables
  • be created anonymously
  • be created dynamically

It’s becoming increasingly common for modern languages to move towards first-class functions. In the Execution in the Land of the Nouns, Steve Yegge parodied the typical transmogrifications Java programmers had accustomed themselves to in order to sidestep the language’s lack of first-class functions. Java’s cousin (descendant?), C#, has more or less had them since .NET 2.0 in the form of anonymous delegates.

What neither Java nor C# have are first-class classes. Both have reflection, and even allow you to create new types at runtime (painfully…). But, because both languages are statically typed, you can’t access these runtime-created types the same way you can access normal types. Assuming Foo is a runtime-created type, the following code won’t compile:

Foo foo = new Foo();

The only way to make it work is by heavy use of reflection. Ruby, on the other hand, makes it trivially easy to add types at runtime. In fact, all types are added at runtime, since it’s an interpreted language.

Single Return Values

Since you can pass more than one argument to functions, why should you only be allowed to return one value from functions? Languages like ML add orthogonality by returning tuples. Ruby works similarly, unpacking arrays automatically for you if you use multiple lvalues in one expression.

def duplicate(value)
  [value, value]
end

first, second = duplicate("hi there")

This feature allows you to combine lvalues and rvalues more consistently.

Inconsistent Nesting

Nesting is the standard programming trick of limiting scope, and most languages provide blocks that you can nest indefinitely if you want to limit scope. For example, in C#:

public void TestNest()
{
    string outer = "outer";
    {
        string middle = "middle";
        {
            string inner = "inner";
        }
    }
}

In addition to blocks, Ruby allows you to nest functions, but the nesting doesn’t work in an orthogonal way:

def outer
  inner_var = "inner variable"

  def inner
    "inner method"
  end
end

outer
puts inner
puts inner_var

In this example, the last line will throw a NameError, complaining that inner_var is undefined. This is as we should expect – since it’s defined inside an inner scope from where we’re calling it, we should not be able to access it. However, the same is not true for the inner method defined in the call to outer. Despite the fact that it’s defined within a nested scope, it actually has the same scoping as outer.

Ruby’s scoping gets even weirder:

def outer
  begin
    inner_var = "inner"
  end

  inner_var
end

puts outer

This code works, printing “inner” to the console. It shouldn’t.

JavaScript similarly suffers from strange scoping rules. Because all variables have function scope, and not block scope, a variable is available everywhere within a function regardless of where it is defined:

function testScoping(predicate) {
  if (predicate) {
    var test = "in if block";    
  }
  alert(test);    // works, even though it's in an outer block
}

Fixing your mistakes

Speaking of JavaScript, few things bug me more about programming languages than those that try to fix your mistakes for you. JavaScript automatically appends semicolons for you at the end of a line if you forget to. Most of the time, that works fine, but every now and then it creates a ridiculously hard bug to diagnose:

return
{
  value: 0
}

The return statement above looks like it returns an object with a single property. What it really does, though, is simply return. The semicolon was appended on your behalf at the end of the return keyword, turning the following three lines into dead code. It is for this reason that Douglas Crockford recommends always using K&R style braces in JavaScript in JavaScript: The Good Parts.

Summary

Orthogonality makes the language easier to extend in ways that the language implementors didn’t anticipate. For example, Ruby’s first-class types, combined with its duck-typing, allows the popular mocking framework Mocha to have a very nice syntax:

logger = stub
logger.expects(:error).with("Execution error")

do_something(:blow_up, logger)

The fact that classes in Ruby are also objects means that the same syntax works for classes:

File.stubs(:exists?).returns(true)

I picked on JavaScript a couple times, but it did a better job of any other language I know of with literals. JavaScript has literal numbers and strings, like most languages. It has literal regular expressions and lists like Perl and Ruby. But where it really shines is in it’s literal object syntax. Objects are basically dictionaries, and Ruby and Perl have hash literals, but JavaScripts objects include function literals:

function ajaxSort(columnName, sortDirection) {
    setNotice('Sorting...');
    new Ajax.Request("/reports/ajaxSort", {
            method: "get",
            parameters: { sort: columnName, sortDir: sortDirection },
            onSuccess: function (transport) { 
                $("dataTable").innerHTML = transport.responseText;
            },
            onComplete: function () { clearNotice(); }
        }
    );
}

In time, that syntax was leveraged to form the JSON format.

In a more dramatic example, when object-orientation became all the rage, Lisp was able to add object-orientation to the language without changing the language. The Common Lisp Object System (CLOS) is written entirely in Lisp. The reason Lisp was able absorb an entire new paradigm is largely due to the language’s orthogonality. Not all function-like calls work the same way; some are called “special forms” because they act differently (for example, by providing lazy evaluation of the arguments). However, Lisp allows the programmer to create their own special forms by writing macros, which are themselves written in Lisp.

It helps to have turtles all the way down.

Written by Brandon Byars

July 21, 2008 at 10:36 pm

Posted in Languages

Tagged with , , , , , , , ,

Follow

Get every new post delivered to your Inbox.