Your Developers Aren't Slow?

A friend sent me a link to "Your developers aren't slow" today after I basically accused our dev team of being slow following an particularly unproductive sprint.

The article's points appear to be....

1. Developers average pretty consistent start-to-complete rates
2. Speccing out and prioritizing work has more volatility than actual development
3. Time from done to deployed is significant

Therefore, it's likely there are other causes to slower than desired deployment rates than deficient developers. They suggest things that slow down your deployment might include...

1. Unclear requirements
2. Changing requirements
3. Context switching

Then they say managers should...

1. Define a clear vision
2. Write well-designed user stories and bug reports
3. Reduce context switching

There's some redundancy there. There seem to be two main suggestions.

1. Write better work specifications
2. Prevent context switching

The first point is true of our team--many of our tickets are vague. It's not uncommon for us to fundamentally change the design after a job has been completely finished once. I have one in mind now--a UI element that I'm not happy with. I wasn't involved in the design, and I'm technically not on the QA path, but I have major issues with the look and feel. I would fail it and suggest a completely different UI design. I guess that does slow us down in the long run. Perhaps I should consider ways to track time lost due to post-mortem re-designs. Every release cycle has at least a few features that we completely rework once or twice. Some of that could be alleviated by better design up front.

But that wasn't the problem this sprint...for most of us. Though as I write, I realize that poor specifications have significantly slowed one of our new developers. He has basically had to teach himself the domain so he could design appropriate specifications himself. That has cost him significant development time.

Context switching is another great thing to consider. Joel Spolsky has some interesting thoughts on task switching. In a nutshell: don't do it (as a programmer). As a manager, don't let your programmers do it. Programmers will naturally avoid context switching, so really that last sentence should read "don't force your programmers to task switch." Context switching certainly contributed to some productivity problems last sprint, as our most senior developers were context-switched to non programming tasks much more than usual.

There's another aspect of context switching that is annoying but seems inevitable--code reviews. Our team has an ad-hoc review process. When you finish development, just ask anyone else on the team to review your work. However, our most talented developers are typically the ones asked to do the reviews. They also have a vested interest in keeping the code base up to their standards. But that can be a lot of context switching. I spent almost half my day doing a code review (and rebasing a branch on the side).

Another interesting idea in Joel's article is the consideration of work latency. If small tickets block on large tickets (or if they share the same person's time equally), the average completion time drops, even if the total number of completed tasks is equal. If we want to decrease average task time, we should front-load the short tickets. Of course, that's kind of gaming our metrics. Ultimately what matters is total number of completed tickets per release. And superseding raw counts is the goal of increasing product quality every release. There might be other benefits from front-loading easy work. It builds momentum. After a few days of productive coding, it feels easier to continue being productive. Seeing your work merged and feeling the accomplishment of completed work is satisfying and motivating.

I should look more critically through our tickets before our next sprint kick-off though, and make sure the details are specified sufficiently that developers don't have to go hunting and guessing, and that completed work doesn't need major overhauls.

Quantifying Good Code

When a team does code reviews, there are many opportunities for opinion-driven arguments.  These can be difficult to resolve without well-defined criteria that can be objectively evaluated.  When team members come from different backgrounds, they bring different style preferences and different ideas about what makes code good.  This post is an attempt to establish objective criteria, which will hopefully result in conversations about code that are driven more by data than opinion.

Criteria 1: Code should accomplish the intended objective

This probably seems obvious, but it's an objective metric and a good place to start. If the code doesn't do what it's supposed to, it shouldn't pass code review. Of course, assessing whether it works properly (and in all cases) is often not straightforward. This leads us to a next objective criteria of good code.

Criteria 2: Code should be tested

One of the most important tools for a developer are automated tests. Proving code works in all cases is difficult, and rigorously proving code works is incredibly challenging in most cases, so it's important to at least develop a body of evidence of the cases in which your code works. If code is not checked by automated tests, it probably shouldn't pass code review. There are times when it's difficult to test code using automated tools. Sometimes there are good reasons for this difficulty, but most of the time the cause of the difficulty is the structure of the code. This leads us to the next objective criteria.

Criteria 3: Code should be (easily) testable

Code that is testable is easily checked by automated tests. A good sign that code is testable is if the corresponding unit tests are straightforward--requiring only small amounts of simple setup code to check that the behavior of a function or class is correct. If unit tests must mock many objects or duplicate the behavior of collaborating units to check that a unit under test is behaving correctly, the code is not easily testable.

If unit tests are easy to write, programmers will be much more likely to write more of them. When code is easily testable, automated tests become the straightforward way for a programmer to run their code. The alternative is spinning up the whole application and manually running through some steps. That's a lot of work when an automated test that does the same thing can be written just a quickly.

Usually code that is easily tested is also broken down into small chunks. Classes are small, have few dependencies, and adhere to the single responsibility principle. Functions likewise are small, require few arguments, and avoid having side effects wherever possible. This leads to the next criteria.

Criteria 4: Code should be broken down into small units

Programming is all about breaking problems down into smaller, more tractable problems. When a code solution is broken into small pieces of functionality that come together to solve the larger problem, it is easier to test each piece in isolation. It also becomes easier to re-use common pieces when different parts of code require solutions to the same problems.

Long methods are particularly troublesome. They usually result in unclear coupling between all the variables in that function's scope. Usually long methods persist for this reason--it's difficult to untangle which sub-steps require which variables. When those dependencies are understood, they are often very difficult to separate without duplicating the setup code that establishes some of the variables in the first place.

Long methods do more than one thing, so proper behavior in all cases is difficult to check in tests. To check behavior of a middle step (say step 2 in a 3-step long-method process), you have to mock out all the behavior required for step 1 to properly execute before code even gets to step 2, then to check the results, you have to make sure all the dependencies behave sufficiently well to make it through step 3 with a return value that can be checked. Thus the tests for step 2 require detailed knowledge of steps 1 and 3 as well.

Sandi Metz defines four simple rules for developers to follow and produce good code.

1. Your class can be no longer than 100 lines of code.
2. Your methods can be no longer than five lines of code.
3. You can pass no more than four parameters and you can’t just make it one big hash.
4. When a call comes into your Rails controller, you can only instantiate one object to do whatever it is that needs to be done. And your view can only know about one instance variable.

You can break these rules if you can talk your pair into agreeing with you.

The rules are focused on ruby / rails development and the limits probably should be expanded for boiler-plate rich Java codebases, but the idea of small units is consistent with this smallness criteria.

Criteria 5: Code should be necessary

Every class, function, and interface should be necessary to accomplish the stated goal of the task. Code that is not being used (aka "dead code") complicates the reader's job of understanding a given section of code as they first need to learn that it is dead, and then actively ignore the code. Dead code adds more visual clutter when reading code, and more mental clutter when comprehending its purpose. Dead code also likely violates criteria 1 and 2 in that it likely doesn't work and likely isn't well tested. This becomes especially risky as future features may see the dead code and assume it works, use it, and later be forced to go fix the parts that are broken.

Unnecessary code can also be good code that does more than the stated objective. Sometimes this is good, and the stated objective should be expanded. Other times it would be more easily reviewed and tested as a separate code review, even if it is good code and a worthwhile feature.

It can be tempting to solve two or more problems in the same pull request "while I'm in there..." It's good for a team to have a guiding principle that encourages focus on the task at hand and small pull request / code reviews.  This allows the objective task to be quickly accomplished, and lowers the risk of code review quagmire. Less code to review means a faster review and less fodder for disagreement or nit-picking.

Criteria 6: Complexity should be necessary

Along the same lines as not allowing dead code, we should also avoid introducing unnecessary complexity. Classes and functions should not have more indirection, abstraction, or configurability than is required to accomplish the stated objective. It can be tempting to add extra abstraction and configurability in view of the future, thinking it's noble to do more work today some someone in the future can benefit.

However, this introduces two new problems. Firstly, until that future arrives, everyone who reads or modifies the code needs to deal with the extra abstractions and configuration options even though they aren't really necessary. Secondly, when the future actually arrives, it might not be exactly as expected (the future rarely is). Then the work that was done previously needs to be re-worked to accomplish the differing requirements of the actual future requirements. Typically it's easier to write fresh code from scratch than to read, understand, and modify existing code that was designed for different requirements. Because of these downsides, it's best to delay the extra complexity until it actually becomes required for today's coding objective.

Criteria 7: Code should be correct

When writing simple straightforward code it can be tempting to gloss over the corner cases and requirements that force additional boilerplate or complexity. For example, it can be easy to overlook checking for nulls and other bad inputs. Ignoring those details can make code look cleaner and read more clearly, but violates Criteria 1 and exposes potential for latent bugs.

Good tests can flush out problems of nulls and bad inputs, but there are other similar complications that are easy to overlook. For example, behavior in multi-threaded environments needs to be considered carefully. Usually the simple, straightforward, immutable solution will also work best in multi-threaded environments.

Another detail that's easy to overlook is object lifecycle when dependency injection is automated by some framework. JAX-RS / Jersey will automate injection of a Context into a Resource on each request, which can lead to unexpected behavior. These details need be dealt with, and covered in automated tests as much as possible.

Criteria 8: Code should be consistent with style standards

As basic and tedious as it may seem, it's important for a team to have coding standards and adhere to them. Basic code style is a tempting topic to argue over, and will come up continually until a team has a standard. The topic will continue to arise until that standard is universally enforced. These standards are by nature objective and enforcement is easy to automate. Every project should be set up to check for adherence to code standards as part of the build. If there are nits about style that regularly arise during reviews (like "add newline at end of file"), add that to the automated checks.

With automated checks, coders are forced to address style issues when their build fails. Then code reviews aren't bogged down in stylistic nit-picking, and the quality of discussion and quantity of good will toward co-workers is elevated. It's much better for team dynamics to have everyone get slightly annoyed at themselves when their build fails because they broke the style guidelines than to have people get annoyed at the team member who cares about consistency and continuously points out every style violation in code reviews (or at the team member who consistently violates the standards).

Summary

This is an attempt to collect a few objective metrics for assessing code quality. There is surely more that could be said about things like introducing new library dependencies, avoiding duplication within a codebase (DRY), rules of thumb for adopting frameworks (perhaps something like: "the amount of code required to interact with the framework should be less than the amount of code required to accomplish the same goal without the framework"), conventions for which libraries and classes to use for what purposes (eschewing java.util.Date for example), conventions for static methods and *Util classes (avoid them as much as possible), etc.

However, I think these criteria nail down some frequent concerns that arise when reviewing code that seem to devolve into subjective comparison of opinion. If I can convince my coworkers that these high level objectives are acceptable, then we can focus discussion on whether a particular aspect of code (for example) is strictly necessary or not, without having to re-hash the argument of whether code-for-the-future is a good idea (YAGNI).