Heuristics for Code Readability
Code readability is hard to define. There are some things that everybody can agree on, like how Hungarian notation was a sin for which Microsoft will die a thousand deaths. Certainly, the biggest gains come from following simple rules, like using intention-revealing names and applying Extract Method refactorings freely to further clarify intent. But how do you decide on smaller matters, like abbreviations, or common prefixes such as the underscore used to denote private variables? The readability gains there are smaller, but perhaps they’re still worth discussing.
While this is a work in progress, I’m proposing the following three heuristics:
- In general, code is more readable if you can read it more easily than the alternative. (duh!)
- In general, code is more readable if your fellow developers can read it more easily than the alternative. (duh!)
- In general, code is more readable if your non-developer customer can read it more easily than the alternative. (huh?)
All three heuristics have exceptions. For example,
- While leaving comments and variable names in your native language is easiest for you to comprehend, none of your team members actually speak your native tongue.
- Your team is developing an API for clients that will have a different knowledge base. Using team conventions for the public interface may not be appropriate.
- Non-developers would find English sentences more readable than code.
However, I think finding the right balance between those three heuristics is a worthy goal. The first two go without saying, since you and your team will be the maintainers of the code. The third one is a bit more controversial, but I think it’s defensible. To an extent, it’s what Eric Evans was talking about in Domain-Driven Design when he described the Ubiquitous Language: as much as possible, use the same words in your code that you use when communicating with your customers.
But what about abbreviations and those ubiquitous underscore prefixes? I have a preference, but it’s not strong enough to be dogmatic about it. I think, in general, they decrease readability compared to the alternatives for the third heuristic above. Some developers may argue that, while customers may suffer reading the code a little, developers suffer by not using those underscores. That’s fine—heuristic 2 trumps heuristic 3. But in the absence of such a complaint, I think the default should be no underscores, and sparse use of abbreviations.
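To make the comparison concrete, here is a small sketch in Ruby. The invoice domain and every name in it are made up for illustration, and the underscore prefix is borrowed from the private-field convention of other languages rather than being typical Ruby. Both classes compute the same thing; the second simply uses words a customer might say out loud.

```ruby
# Hypothetical example: both classes compute the same invoice total.

# Abbreviated names, with an underscore prefix marking "private" state.
class Inv
  def initialize(itms)
    @_itms = itms
  end

  def tot
    @_itms.sum { |i| i[:qty] * i[:prc] }
  end
end

# Words a customer would recognize, with no prefixes.
class Invoice
  def initialize(line_items)
    @line_items = line_items
  end

  def total
    @line_items.sum { |item| item[:quantity] * item[:unit_price] }
  end
end
```

Neither version is wrong, and a team that finds the first one easier to scan has heuristic 2 on its side. The point is only that the second is the safer default when nobody has a strong preference.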
I prefer wording the third heuristic in the strong form given above, but in a weaker form it could also apply to programmers unfamiliar with a language. For example, a common Ruby idiom looks something like this:
The meaning behind the line above is often a necessary evil in Ruby, and it’s frequently written that way. I prefer a bit more verbose syntax, however. The following is equivalent:
While you still may not know exactly what that line of code does, I suspect one of them is easier to read than the other.
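One pair that fits this description is load-path manipulation, where Ruby offers both a terse global and a spelled-out alias for it. This is an assumed illustration of the kind of contrast meant here, not necessarily the exact lines in question.

```ruby
# Assumed illustration: `$:` and `$LOAD_PATH` are two names for the same
# array of directories that Ruby searches when you call `require`.

# Terse form: put this file's directory at the front of the load path.
$:.unshift File.dirname(__FILE__)

# More verbose form: exactly the same operation, spelled out.
$LOAD_PATH.unshift File.dirname(__FILE__)
```

A reader who has never seen Ruby can at least guess that the second line has something to do with a path for loading things, which is the weaker form of the third heuristic at work.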