Metrics and Money

I’ve been trying to stick together a few notes about what I’ve learnt about process design and implementation over the years.  The “short notes” just keep getting longer and longer.  However, this story is worth repeating and should illustrate one rule that everyone should know by now: never link a metric to cash.  It seems like the most sensible thing in the world to give out bonuses on the basis of deliverables.  It’s not: it will damage your department and your firm.  If you’re lucky, you’ll know how.  If you’re unlucky, you’ll never find out.

Let me give an example.  A friend of mine worked at this firm.  We’ll call him Joseph Developer.  Management, having finally ditched performance-based metrics a year previously (for reasons that would make a good post of their own), had decided that it was time for re-introduction.  With a new and popular CTO keen to make his mark, they sat down and came up with a new plan.

OK, I can hardly tell this story without laughing.  I’m actually laughing right now typing this.  I can assure you that it wasn’t as funny for anyone directly affected by this.  Let me just remind you that this was the best plan the CTO, the COO, the CFO and CEO locked up in a room could come up with.  This was what they pitched to the developers.

  • They were concerned about the quality of their systems.
  • So they thought about linking bonuses to the number of bugs assigned to someone.
  • But they knew that didn’t work.
  • So, rather than that, they decided to target the number of re-opened bugs.

I can pretty much pinpoint that as the point where some people’s nagging suspicion that the CTO wasn’t as good as they hoped transformed into an unshakeable certainty that he didn’t have a clue.  Here’s what they’d done: they’d taken a system they knew didn’t work, taken the first derivative and decided to implement that.

Labouring the point

Clearly, this announcement would never have occurred if the managers had actually understood the problem.  So let’s talk for a bit about why you shouldn’t incentivize people on the basis of bug count.

  • What is a bug and what isn’t is ill defined.  That isn’t a problem until you start making dollar amounts (which are about as concrete as you can get) dependent on the distinction.
  • You’ve just created an adversarial relationship between the bug reporter and the developer.  These can develop anyway, but they’re never productive.  By putting money on the line, you’ve guaranteed it.
  • You’ve re-incentivized people to work in a negative, rather than a positive way.
  • Your developers will figure out a way to game the system.

The first two points look like they’re unique to bug tracking, but they’re not.  All of the metrics that you’ve got are indicators of what’s going on, not the unvarnished truth.  All that the bug metric tells you is that a certain number of bugs have been entered into your system.  It’s not a measure of quality and it sure as hell isn’t a measure of productivity.  That’s not just true of bugs, it’s true of the actual P&L of your company.  Don’t believe me?  Ask an accountant.  Or a market analyst who has to decode earnings announcements.  Not even the bottom line is the bottom line.

People get misled by examples from manufacturing and construction, in which well-defined metrics produce well-defined outcomes.  Go and read that Joel article again.  I’d go a bit further and say that such incentives do work in very limited circumstances: where you want the guy to do that and only that.  My old project manager once killed a round of testing simply because it meant he would hit his bonus targets.  Fixing bugs would not.  Sales commissions are brilliant at getting salesmen to sell.  You’d better keep a fairly tight eye on exactly what they’re selling, though.

The adversarial point is equally general.  You’ve replaced a metric which helps you tell what’s going on with a salary negotiation mechanic.  I hope you’re not planning on that metric being used for anything else.  Such as, for instance, bug tracking.  I think there have been enough options scandals by now to emphasize that this is equally true of earnings numbers.

The final point is the one that should make you pause, even if the others didn’t.  Let’s take a look at our four guys locked up in a room.  They were employing more than 50 developers.  Bright as they were, they weren’t brighter than 50 guys with degrees and training in a profession that emphasizes logical thinking.  Actually, it doesn’t even matter that they were bright, anyone could have come up with a way of circumventing it.  Since you’re paying them to do so, you can be reasonably guaranteed they will.

So how many of those objections didn’t apply to the “penalize re-opened bugs” plan?  None of them.

  • What was an unfixed bug and what was a separate issue was ill-defined.
  • They made a fairly adversarial relationship between QA and development ten times worse. 
  • People regularly worked in a negative way.  Often, more time was spent arguing about the exact status of a bug than fixing it.

And as for gaming the system, well, that’s where the fun really began.

Gaming the system

When I was first told of this, it took me five seconds flat to figure out how to game it.  Just fix easy bugs.  If a label’s wrong, fix it.  Avoid anything involving a nasty interaction or an ill-defined behaviour.  Never mind that those are where the value is.  In fact, the truly pernicious thing about this whole process was that it actively penalized senior, responsible developers who took on hard problems.  Now, as I say, I was lucky, I didn’t have to put up with such a stupid system.  My friend did.

So, I met up with Joseph a couple of months after this had been put in place and asked him how the firm was going.  He told me that the atmosphere was very negative (not solely caused by this decision) and that he’d already given up on getting a bonus this year.  He was two months into the bonus cycle.  This was an extremely talented and conscientious guy, regarded as a star at the firm.  And he’d given up on getting a bonus, simply because he was behaving like a responsible developer and not gaming the system.

Two months later, he left and joined a much better firm.  Yes, eventually he figured out a way to circumvent the policy he was happy with: he quit.  He wasn’t the only one.  And, for the reasons I’ve already outlined, it was the best staff who jumped.

Now, management at this firm clearly forgot Evil Overlord Rule #12, but just because the example’s extreme it doesn’t mean that the point isn’t general.  Incentive structures distort behaviour and de-motivate staff.  Metrics-based incentive structures distort metrics as well.  At my current workplace, I receive a bonus based on how well the firm did and how well my managers think I did.  Yes, I still have targets, and I still try to hit them, but I don’t let my targets interfere with serving the business.

So how’s Joe now?  He’s still at his new firm and very happy.  I tried to tempt him to a job that would pay significantly better, but he’s not interested.  Now, that’s a firm that knows how to motivate its staff.


NHibernate Truncates Milliseconds when Using DateTime

It’s even documented, right here.  It’s a bit annoying, seeing as it violates the principle of least surprise, but it does make the DateTime type portable across database implementations.  However, if you want to just use your native precision, just declare it as having “Timestamp” type.  Problem solved.  No need to mess around with IUserType or worse.


Automated Deployments #2: Configuration Management

It’s amazing how much engineering time is spent on arguing about the difference in abstraction strategies, followed by someone saying “just copy the files up, but make sure not to touch the config”.  This is a recipe for disaster.  There are three common failure scenarios:

  • Someone takes a copy of the live system, runs some tests and accidentally enters the test data into the live system.  I once saw that happen with a stress test.  It wasn’t funny.  (In fairness, it’s pretty funny in retrospect.)
  • Someone uploads a debug environment, rendering the live system unstable.  (This is mostly a web-related scenario.)
  • A new version is correctly released, but it required a config change which never made it into the production config.

Now, most people run with this policy because “don’t touch the config” produces fewer failures than “touch the config”.  You could argue that most of these scenarios are associated with not carrying out the instructions to the letter.  However, this is to miss the point.

Successful processes minimize the chance of human error.

If someone forgot a step, and that guy is not a muppet, your process has too many steps.  Our release process has one.  Exactly one.  I loathe processes that seem to have as their principal benefit that you know who to blame when it goes wrong.  I would much rather things didn’t go wrong in the first place.  So, we’re looking for a process that guarantees that the environmental differences are respected, but that changes required by the code are propagated.

Types of Environmental Factor

Configuration management is a big and scary subject, and is the proper preserve of your IT department, not your developers.  However, if you concentrate on just the bits that matter to developers, it need not be that big an undertaking.  Let’s go back to basics.  In general terms, there are three common sorts of .NET application:

  • A standard windows client application.  This includes console and GUI apps. 
  • A windows service.
  • A web site or web service.

For standard windows applications, your environmental delta will usually be in the app.config file.  Unless you have multiple installs of the one service on a machine, it is unlikely you’ll have any environmental changes in windows services.  Web sites themselves are typically identical on all deployment environments.  The fact remains that nearly all installs of corporate applications can be summarized as follows:

  • Copy some files
  • Fix the config file
  • Set up the windows service
  • Set up the service in IIS.

Now, to produce a perfect install, you end up messing around with InstallShield or WiX or some such tool.  However, to cover 95% or more of environmental issues, all you really need is a way of fixing the config file.  I’ll remark at this point that since you have control over the entire ecosystem, you can ensure that your system doesn’t require weird settings you can’t handle.  Equally, I’d go out of your way to eliminate stuff like custom config sections.  They’re more trouble than they’re worth.

In practice, in our environment, we have a phenomenal number of programs, and the only config entries we ever change upon deployment are:

  • AppSettings
  • ConnectionStrings (We encrypt the connection strings when we apply the delta)
  • Setting compilation debug equal to false (I can’t stress how important this one is.)

I may one of these days publish the code we use for this (it’s a powershell cmdlet) but the fact remains, it’s easy enough to implement on your own.  Incidentally, I can highly recommend you don’t use the Enterprise Library solution.  It’s quite complex and has weird bugs (e.g. it won’t work on a network file).
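To give a feel for how little code is involved, here is a minimal C# sketch of applying a delta to a development config using System.Xml.Linq.  This is not the powershell cmdlet mentioned above: the ConfigDelta class, its Apply method and the idea of passing the delta as dictionaries are all assumptions made for the example, and connection string encryption is left out entirely.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper: none of these names come from the cmdlet described in the post.
public static class ConfigDelta
{
    // Returns a copy of the development config with the environment's
    // overrides applied and debug compilation switched off.
    public static XDocument Apply(
        XDocument master,
        IDictionary<string, string> appSettings,
        IDictionary<string, string> connectionStrings)
    {
        var result = new XDocument(master);
        var root = result.Root;

        Poke(root.Element("appSettings"), "key", "value", appSettings);
        Poke(root.Element("connectionStrings"), "name", "connectionString", connectionStrings);

        // Only web configs have this element; when present, always force debug off.
        var systemWeb = root.Element("system.web");
        var compilation = systemWeb == null ? null : systemWeb.Element("compilation");
        if (compilation != null)
            compilation.SetAttributeValue("debug", "false");

        return result;
    }

    // Overwrites the value attribute of each <add> entry named in the delta.
    // Fails loudly if the master lacks an entry, so a missed config change
    // is caught at deployment time rather than at run time.
    private static void Poke(
        XElement section,
        string idAttribute,
        string valueAttribute,
        IDictionary<string, string> overrides)
    {
        if (overrides == null || overrides.Count == 0) return;
        if (section == null)
            throw new InvalidOperationException("The master config is missing a section named in the delta.");

        foreach (var pair in overrides)
        {
            var entry = section.Elements("add")
                .FirstOrDefault(e => (string)e.Attribute(idAttribute) == pair.Key);
            if (entry == null)
                throw new InvalidOperationException("No entry named '" + pair.Key + "' in the master config.");
            entry.SetAttributeValue(valueAttribute, pair.Value);
        }
    }
}
```

The choice to throw when the master has no matching entry is deliberate in this sketch: it turns the third failure scenario above (a required config change that never reaches production) into a failed deployment instead of a broken live system.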

Storing Environmental Deltas

When we were designing this system, we considered the following models:

  • We use a model in which all deltas are stored in the same file with the master.  The program then determines which environment it needs to use.
  • The development config is the master, deltas are separate files applied to it upon deployment
  • Deltas are applied to produce all configs, including the development config.  The deltas are applied to a master.config.
  • We use the user settings features built into AppSettings.
  • We just have different configs for each environment.  The deployment process just copies up the right one.

There are die-hard fans of all approaches, but I’ll outline why I believe the second to be superior. 

The monolithic file approach is attractive at first because everything’s together but suffers from catastrophic unmanageability as you get a lot of settings (which is a problem you shouldn’t have, but may have).  Furthermore, there is the inelegance of having to deploy information for one environment to another (unless you write a post-processor, in which case you might as well have opted for alternatives 2 or 3).  The self-discovery aspect is attractive, and the monolithic file is easy enough to put into source control.  Just putting all of the configs into source control has its attractions as well, but suffers from the fact that 90% of the XML will be the same in each file, making it hard to track down the differences.  I prefer a model with explicit deltas.

The built-in features sound attractive, because it feels like Microsoft has already done the heavy lifting for you.  However, you’re pretty much guaranteed to be still modifying the web.config anyway, and you’ve split the config into multiple parts not only for management, but for the deployment environment too.  A lot of people practice this method by having the deltas only present

Finally, we’re left with the choice between having an abstract master file and having the master file be the local development config.  Here, I’d argue that the local development file will be edited by developers directly whether you like it or not.  Best to embrace that than have it as a failure point each time it happens.

Final Thoughts

The level of configuration management you need for .NET apps is pretty easy to implement, which makes it a pity very few people bother.  All you really need is a couple of xml pokes and you’re done.  One of the great benefits is that all of your environmental information is in source control (you can even make the program that applies the deltas encrypt the data if you regard that as desirable) which makes it much easier to check things in a large heterogeneous environment.  (Again, not a problem you should have, but a problem you may have.)

And yes, the first failure scenario mentioned at the top is also the reason you should have a firewall between development and production.  Next time I’ll talk about configuration that doesn’t appear in the .NET config file.

How AutoGen is different from Castle’s TypedFactory Facility

Mauricio Scheffer asked how AutoGen differs from the TypedFactory Facility already in Castle.  (An equally valid question is why the code is five times the size.)  The answer is that it doesn’t in essence, but it does in detail.  However, the details matter, in that AutoGen addresses common use cases, whereas the TypedFactoryFacility is only going to save you 6 lines of code.  The principal differences are:

  • Configuration
  • Constructor Arguments
  • Handling Keys
  • Disposal
  • Handling Multiple Methods

How TypedFactoryFacility is used

Let’s take a look at how you’d implement the example on the AutoGen home page using the TypedFactoryFacility:

public void TutorialConvertedToTypedFacility() {
    var container = new WindsorContainer();
    var facility = new TypedFactoryFacility();
    container.AddFacility("TypedFactory", facility);
    facility.AddTypedFactoryEntry(new FactoryEntry("X", typeof(IFactory), "CreateExample", null));
    var factory = container.Resolve<IFactory>();
    Assert.IsTrue(factory.CreateExample() is Example);
}

OK, let’s observe a couple of things.  First, we created the facility.  AutoGen is implemented the same way, as a facility called AutoGenFacility.  However, the next line is where the two begin to diverge.  With the TypedFactoryFacility, you add the interfaces you want directly to the facility, rather than by adding configuration attributes to the components.  This means that it has its own syntax, whilst AutoGen just expects you to add “ccAutoGen='true'” in your XML files, or “@ccAutoGen='true'” in your Binsor file.  This is the way the RemotingFacility works.  The downside of this approach is that it’s harder to use with fluent configuration, which is why I provided an extension method directly for that use case.

(The integration of the AutoGen syntax directly into the registration mechanism is, of course, via a mechanism (extension methods) that didn’t exist when the TypedFactoryFacility was created.)

So far, the two facilities are pretty similar, except for the fact that the TypedFactoryFacility requires you to tell it the create and release methods.


Supporting the Abstract Factory Pattern

There is, of course, a difference between a Factory pattern and an Abstract Factory pattern.  The factory pattern is basically just a method which creates an object.  Let’s remind ourselves of the GoF maze abstract factory:

  • MakeMaze()
  • MakeWall()
  • MakeRoom(int n)

There are two important observations here.  The first is that there is more than one creation method.  The typed factory facility can’t do this, and it can’t do it by the design of the API: by asking for a single creation method, it can’t support interfaces with more than one.  Second, the MakeRoom method takes a parameter.  Often when dealing with object creation, there are parameters that vary at runtime.  Castle supports this, but the typed factory does not.  The following code demonstrates this:

public void TypedFacilityCantImplementConstructorParameters() {
    var container = new WindsorContainer();
    var facility = new TypedFactoryFacility();
    container.AddFacility("TypedFactory", facility);
    facility.AddTypedFactoryEntry(new FactoryEntry("X", typeof(IFactory2), "CreateExample2", null));
    var factory = container.Resolve<IFactory2>();
    var result = factory.CreateExample2(999);
    Assert.That(result.Value, Is.EqualTo(999));
}

public void AutoGenCanImplementConstructorParameters() {
    var container = new WindsorContainer();
    var factory = container.AutoGen<IFactory2>();
    var result = factory.CreateExample2(999);
    Assert.That(result.Value, Is.EqualTo(999));
}

This isn’t tricky to support (Castle’s got all of the hooks you need), but it requires about five times the code of the existing TypedFactoryFacility, since you’ve got to support various use cases for how you’d like to map the parameters to constructor parameters.  Sadly, Castle’s support here isn’t quite as deep as I’d like, but that’s a subject for another day.

For reference, here is the IExample2 code:

public interface IExample2 { int Value { get; } }

public class Example2 : IExample2 {
    private readonly int value;

    public Example2(int value) {
        this.value = value;
    }

    public int Value {
        get { return value; }
    }
}

Supporting IDisposable

The TypedFactory facility does its job and leaves.  If you want to release the object from Castle, it’s got the ability to declare a release method, but that’s it.  AutoGen actually goes further, in two ways:

  • Any object returned that implements IDisposable will be released from Castle when the dispose method is called.
  • Any factory that implements IDisposable will release all objects it has generated when it is disposed.

To illustrate, here’s the test that a) Release is called and b) the original disposal is called.

public void DisposeReleasesASingleton() {
    var factory = container.Resolve<IFactory>();
    var singleton = factory.Create();
    Assert.That(factory.Create(), Is.EqualTo(singleton));
}

Of course, doing this means that you need to be careful with lifetimes, hence such tests as “SingletonsAreSingletonsWhenProxied”. 

Again, this makes the code longer and more detailed than the TypedFactoryFacility.


I hope I’ve managed to demonstrate why the AutoGen code is different from the TypedFactoryFacility.  I hope I’ve also described why I think the concept is important.  A valid question, however, is why I haven’t submitted the work to the Castle dev team yet.  The answer to that one is less good: I didn’t think there would be a lot of interest (or someone would have already done it).  I should, however, verify that.

As an aside, the Dynamic Proxy (without which none of this would work) code in Castle is extremely wonderful and powerful.  This was my first project that used it, but it won’t be the last.  I honestly do not believe it should be wasted on AOP.

Running MTA code in an STA thread

I don’t claim to understand the windows threading model that well.  For that matter, I don’t want to learn either.  But every so often you hit an error like this:  “WaitAll for multiple handles on a STA thread is not supported.”  Now, we’ll calmly step away from the car crash that is understanding the apartment threading model and skip straight to making the problem go away.

private void RunInMtaThread(ThreadStart action) {
    var thread = new Thread(action);
    thread.SetApartmentState(ApartmentState.MTA);
    thread.Start();
    thread.Join();
}

Incidentally, the SetApartmentState call isn’t actually necessary, since it’s the default.  I’m including it so that it’s obvious how to achieve the reverse.  As an aside, I’ve been spending a lot of time thinking about API design (Ayende’s opinionated API article started me on this road), and I can think of no good reason that the apartment state should not be an optional parameter of thread start, rather than a settable property.  It’s not like you can change the apartment state once it’s running.

AutoGen for Castle Windsor

I recently published a tool called AutoGen for Castle.  You can check it out at Google Code.  In essence, it auto-generates factory objects for you and adds them to your container.  This is an extremely useful thing to be able to do.  I do, however, find it a bit hard to explain, so bear with me.

A relatively good rule of thumb these days is that a class should instantiate objects or do something with objects but never both.  Miško Hevery has written a lot of good stuff on this subject, and I don’t propose to mangle his thinking here.  Now, if we go back to our old Gang of Four maze construction example, the abstract factory in that case produced objects that were closely related and interdependent.  However, that’s rarely the case these days if only because we design systems to have very few interdependent components.  It’s actually much more likely that the objects could have been produced using a DI framework such as StructureMap or Castle.

Now, when you start using dependency injection containers, you seem to end up putting “Resolve” or “GetInstance” all over your code.  This is an extremely bad idea, for two principal reasons:

  • Calling Resolve is conceptually as bad as calling new
  • Your code should not be taking a dependency upon its DI framework.

Now Jeremy Miller wrote an excellent article on the question of libraries taking a dependency upon IoC containers.  It’s a known problem and hard to deal with without Microsoft stepping up to the plate.  However, typically your code probably doesn’t need an interface as general as the one Jeremy proposed.  That’s only going to be useful for people building frameworks.  It’d be better if you could specify your own.

That’s what AutoGen does: it lets you specify an interface (or multiple interfaces) for how you interact with Castle.  Anything you like, really.  By default a parameter called “key” is the key, and anything else gets passed to the constructor.  (Obviously, it uses Castle’s semantics for doing this, there’s not a lot of control there.)  It even, if you so wished, allows you to implement Jeremy’s interface.  That won’t help you with standardization, however.

Ideally, this means that you can actually restrict your interaction with your container to your main method.  You only need one call to Resolve/GetInstance: the call that resolves your service class.  The rest of your code can now be container-agnostic.
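To make the shape of this concrete, here is a small C# sketch of what application code looks like when it depends on its own factory interface rather than on the container.  Every name in it (IReport, IReportFactory, DelegatingReportFactory, ReportRunner) is made up for the illustration.  The factory implementation is written by hand and takes a delegate standing in for the container’s resolve call; with a tool like AutoGen that class would be generated rather than written, and I’m not reproducing AutoGen’s own API here.

```csharp
using System;

public interface IReport
{
    string Title { get; }
}

public class SalesReport : IReport
{
    public SalesReport(string title) { Title = title; }
    public string Title { get; private set; }
}

// The application-specific factory interface.  No container types appear in it.
public interface IReportFactory
{
    IReport CreateReport(string title);
}

// Hand-written stand-in for a generated factory (hypothetical).
// The delegate plays the part of the container's parameterized resolve.
public class DelegatingReportFactory : IReportFactory
{
    private readonly Func<string, IReport> resolve;

    public DelegatingReportFactory(Func<string, IReport> resolve)
    {
        this.resolve = resolve;
    }

    public IReport CreateReport(string title)
    {
        return resolve(title);
    }
}

// Application code: depends only on IReportFactory, never on the container.
public class ReportRunner
{
    private readonly IReportFactory factory;

    public ReportRunner(IReportFactory factory)
    {
        this.factory = factory;
    }

    public string Run(string title)
    {
        return "Running " + factory.CreateReport(title).Title;
    }
}
```

In a test, ReportRunner can be handed a DelegatingReportFactory built from a lambda; in production, the only place that needs to mention the container is wherever the factory gets wired up.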

Anyway, if you’re interested, you can take a look here:

It depends upon Castle Core, Dynamic Proxy and (obviously) Castle Windsor.  The tests are written in NUnit 2.4.

Understanding Inversion of Control Containers

I thought I’d write a bit about how I understand the philosophy of IoC containers.  I realize I’m actually building up to something that I started talking about a while ago.  I’m probably not saying anything that Martin Fowler didn’t say before, but I have my own slant on it.  To start off with, I’d like to just review what we actually mean by various terms:

  • Inversion of Control (IoC) is a general name for the pattern where an object isn’t responsible for managing the lifecycle of the services it uses. 
  • The simplest way to implement this (in .NET) is passing services in through the constructor.  This is termed constructor injection.
  • Typically, services are passed in using interfaces, which eases testability.  However, Inversion of Control is not about testability.
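For anyone who hasn’t seen it spelled out, a minimal C# sketch of constructor injection follows.  The IClock, SystemClock, FixedClock and Greeter types are invented for the example; the point is only that Greeter receives its service through the constructor and never creates it.

```csharp
using System;

// The service, expressed as an interface.
public interface IClock
{
    DateTime Now { get; }
}

// The implementation used in a real deployment.
public class SystemClock : IClock
{
    public DateTime Now { get { return DateTime.Now; } }
}

// An alternative implementation, handy in tests.
public class FixedClock : IClock
{
    private readonly DateTime now;
    public FixedClock(DateTime now) { this.now = now; }
    public DateTime Now { get { return now; } }
}

// The consumer: it is handed an IClock and is not responsible
// for creating it or deciding how long it lives.
public class Greeter
{
    private readonly IClock clock;

    public Greeter(IClock clock)
    {
        this.clock = clock;
    }

    public string Greet()
    {
        return clock.Now.Hour < 12 ? "Good morning" : "Good afternoon";
    }
}
```

Whoever constructs the Greeter decides which clock it gets, for instance new Greeter(new SystemClock()) in the application and new Greeter(new FixedClock(...)) in a test.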

So what is an IoC container?  It’s a configuration tool.  That’s it.  Typically, it implements the constructor injection pattern like so:

  • For each object registered, you usually specify:
    • A name for the component
    • The interface it implements
    • The class that implements it.
  • For primitive values, you just say what the constructor parameter is and what the value should be.
  • For interfaces, you either don’t specify the implementation, in which case you get the default, or specify a particular component reference.

Actually, there is one other thing the container does: it handles lifecycles.  This is a major feature that people often take for granted.  The clue is in the name, really.  Containers are things that hold objects, not produce them.  Containers typically allow you to specify the lifecycle of the object e.g.

  • one implementation in the process (Singleton)
  • one implementation in the thread
  • one implementation in an HttpContext

This lifecycle management is crucial to the use of IoC containers in most environments.  The catch is that it can have side effects you do not expect.  For instance, if you call a parameterized resolve on an object with a singleton lifecycle, the object will only ever have the first set of parameters passed in.  Any others will be ignored (the moral of this story is to always use transient lifecycles when dealing with run-time parameters).
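The singleton side effect is easier to see in code.  The following C# sketch uses a toy container written just for this illustration (ToyContainer, Lifestyle and Greeting are all made-up names, and this is not how Castle is implemented); it only models the caching behaviour described above.

```csharp
using System;
using System.Collections.Generic;

public enum Lifestyle { Singleton, Transient }

public class Greeting
{
    public Greeting(string name) { Name = name; }
    public string Name { get; private set; }
}

// Toy container (hypothetical): one string argument per resolve,
// just enough to show what a singleton lifestyle does to it.
public class ToyContainer
{
    private readonly Dictionary<Type, Func<string, object>> factories =
        new Dictionary<Type, Func<string, object>>();
    private readonly Dictionary<Type, Lifestyle> lifestyles =
        new Dictionary<Type, Lifestyle>();
    private readonly Dictionary<Type, object> singletons =
        new Dictionary<Type, object>();

    public void Register<T>(Func<string, T> factory, Lifestyle lifestyle) where T : class
    {
        factories[typeof(T)] = argument => factory(argument);
        lifestyles[typeof(T)] = lifestyle;
    }

    public T Resolve<T>(string argument) where T : class
    {
        var isSingleton = lifestyles[typeof(T)] == Lifestyle.Singleton;

        object instance;
        if (isSingleton && singletons.TryGetValue(typeof(T), out instance))
        {
            // The instance already exists, so the argument is silently ignored.
            return (T)instance;
        }

        instance = factories[typeof(T)](argument);
        if (isSingleton)
        {
            singletons[typeof(T)] = instance;
        }
        return (T)instance;
    }
}
```

With Greeting registered as a singleton, Resolve<Greeting>("first") followed by Resolve<Greeting>("second") hands back the same object both times, and its Name is still "first".  Registered as transient, each call builds a new Greeting from the argument it was given.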

A fundamental part of the philosophy of IoC containers is that they should be extremely low footprint and non-invasive.  The code should not need to know it is running in a container.  Nor should the interfaces.  There are, however, a number of times that you do need to know about the container.  The obvious one is when reasoning about lifecycle management, however there are a number of times the abstraction gets broken.  Having the abstraction broken is not as painful as having no abstraction at all, but it can be a distraction.

Evaluation of Containers

There are, of course, a lot of subtleties about containers.  Quite a lot of people come to the conclusion that the libraries out there are too “heavy-weight” and that they would be better off rolling their own.  If you’re one of those people, hopefully after reading this list you will either decide to refocus your efforts on improving the existing libraries, or you will have a USP that merits the duplication of effort.  (Or you just want to have fun, which is always a valid reason for writing code.)  I’ve listed some of them out below.

Most of this is specific to Castle Windsor, since it’s the one I’ve worked with most, but many of these questions are common across implementations and are things you should watch out for when evaluating.  I will re-iterate that whilst it is easy to write a simple IoC container, writing a very good one such as Castle is a challenge.

Are Primitives Strings?

My personal bugbear is that IoC containers started out when XML was fashionable.  As a consequence, there’s a tendency in most of them to treat everything as a string.  Since these days there’s a move towards DSLs such as Binsor or fluent configuration, the requirement that parameters be strings is out of date.  There are a number of side effects of this.  Castle Windsor RC3, for instance, fails one of its unit tests in a UK environment due to different date formats.  Equally, adding a primitive type that isn’t easily expressed as a string is painful.  Custom Type Converters are a useful concept for dealing with text-based file formats, but seriously, why can’t you say

Parameter.ForKey("target").Eq(new Uri("")) ) );

The current way of achieving this is unnecessarily verbose.

How are Lists Handled?

If there is one thing I heartily dislike about Castle, it’s the list handling.  Ironically, in many ways, the list handling is strong: it’s relatively easy to register an array of arrays of strings, for instance.  However, once you leave primitives, it gets more ambitious.  If you create a constructor parameter of IEnumerable<IService>, it will by default pass in a list of all components that are registered with the IService interface.  There are a number of problems with this:

  • The worst is that it gets in the way of the second simplest use case of a list: one where you specified a list of component references yourself.  If you try this, you end up with a type conversion error.
  • It can’t handle super-interfaces, it will only ever do exact matches.
  • You can’t specify that you care about interfaces on the registered implementations.  Thus, requesting IEnumerable<IDisposable> wouldn’t return the “Right Thing” (all registered disposable objects) even if you could specify that you wanted super-interfaces.

I would advise anyone evaluating a container to pay particular attention to how you can specify lists of components, because it comes up a lot in real use cases.

What Issues are there with Open/Closed Generics?

There’s always a couple of bugs to do with open and closed generics.  Castle recently fixed another one.  In March of this year, it wasn’t possible to express this concept in StructureMap:


Indeed, this issue was pretty much why I moved to Castle in the first place.  These days you’ve got to come up with something fairly involved to run into a problem (e.g. an open generic type relying on a closed one).  However, if you’re using one of the many less-popular frameworks, or rolling your own, you need to watch out for this.

How does the Container Deal with Multiple Interfaces?

If you register the same class as the implementation of multiple interfaces, typically you will end up with multiple instances.  It’s possible to mitigate this by using explicit component references, but that’s not a perfect solution.  Sometimes you want a service that exposes different interfaces to different consumers.  Castle Windsor calls this feature “forwarding”.

How can you Inject your own code?

How good is the container at handling the case where it doesn’t create the object itself?  Can you say something like this?

.CreatedBy(() => ConnectionFactory.NewConnection() )

Windsor’s story here is rather painful, with two facilities defined which use reflection to run.  On the other hand, they support Dynamic Proxy out of the box, so intercepting method calls to the interfaces is pretty simple and powerful.

Can you Create a Derived Container?

I am, frankly, amazed this does not come up more often.  It should be relatively easy to create a container based upon another container, overriding and extending parts of the configuration.  This is actually extremely useful.  Binsor has the Extend keyword (you’ll need to check the unit tests for documentation) which achieves this, but frankly this is too important a feature to be left to the DSL, this should be a fundamental part of the container.  Certainly there’s no easy way to achieve this without using Binsor in Windsor.  I think there will probably be a whole separate post about this.
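To show what I mean by a derived container, here is a toy C# sketch.  ToyRegistry and its Derive method are invented for the illustration and have nothing to do with Windsor’s or Binsor’s actual implementation; the sketch only shows the behaviour I’d want, which is that a derived container overrides some registrations and falls back to its parent for the rest.

```csharp
using System;
using System.Collections.Generic;

// Toy registry (hypothetical), not a real container: it maps a type to a factory delegate.
public class ToyRegistry
{
    private readonly ToyRegistry parent;
    private readonly Dictionary<Type, Func<object>> registrations =
        new Dictionary<Type, Func<object>>();

    public ToyRegistry() : this(null) { }

    private ToyRegistry(ToyRegistry parent)
    {
        this.parent = parent;
    }

    // Creates a registry that inherits everything registered here.
    public ToyRegistry Derive()
    {
        return new ToyRegistry(this);
    }

    public void Register<T>(Func<T> factory) where T : class
    {
        registrations[typeof(T)] = () => factory();
    }

    public T Resolve<T>() where T : class
    {
        Func<object> factory;
        if (registrations.TryGetValue(typeof(T), out factory))
        {
            // A local registration overrides anything in the parent.
            return (T)factory();
        }
        if (parent != null)
        {
            // Otherwise fall back to the registry we were derived from.
            return parent.Resolve<T>();
        }
        throw new InvalidOperationException("No registration for " + typeof(T).Name);
    }
}
```

A test setup could then derive from the production registry and re-register only the one or two services it wants to replace, leaving the parent untouched.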