Going Cold Turkey on Static Methods

I recently had a rather incoherent rant about why Singleton is an anti-pattern.  Let’s say that you decided that static methods needed to be eliminated from your code base.  So you embark on the refactoring to end all refactorings.  That’s exactly the situation I’m in at the moment.  The irony is that the static methods I’m trying to eliminate are calls to StructureMap’s ObjectFactory.GetInstance.  Yes, riddled through the code are calls to services.  Nothing is denied to any object. 

If you start trying to replace static methods with proper instances, you’re going to be asking yourself a lot:

  • Why does that object need a connection to that system?
  • For crying out loud, why do I need to add an extra constructor parameter to support only one method?
  • Why exactly can’t I just store the current user somewhere?
  • But now I’ve got X, which has a dependency on Y, and Y has a dependency on X, does this make any sense at all?
  • How many methods am I going to have to change?
  • Does everything have to be an instance method?
  • Exactly how many levels do I need to pass this object down?
  • Why does this object take a dependency on every service?
  • How long is this going to take?
  • Is this really worth it?

The answer to the last question is yes.

Static methods are like crack: they’re convenient, they solve your problems quickly and you quickly become dependent upon them.  You’re so dependent you don’t even know you’ve got a problem.  You may even read an article like this and believe that it doesn’t apply to you, because you’ve got your static methods “under control”.

Now, if you’ve never experienced a code base that’s not riddled with static methods, you’ve never felt what it is to be clean, to be able to analyze your dependencies as easily as being able to examine your constructor, to be able to replace sections of your system and build new systems out of the components you’ve created without worrying about support functionality you’re not actually using.  It feels good, but before that, it’s going to feel really bad.  I don’t mean really bad in the “non-technical manager who always opposes refactoring” bad, I mean bad as in “This was a huge mistake and I’ve no idea how I’m going to get it working again” bad.

What’s worse, going cold turkey is something you have to do on your own.  It’s hard for other team members to help you with this one.  Branching isn’t going to help you this time; it’s best to just have a code freeze.  But I figure it won’t hurt to talk, so here are answers to the questions earlier:

  • Almost certainly, because your object design is wrong.  Just get it working.  We’ll worry about how to fix the design another time.
  • Well, you might not have to.  I’ll show you how to deal with that later.
  • You can, but that “somewhere” is going to be another constructor parameter e.g. ICurrentUserProvider. 
  • It doesn’t, and it never did.  It’s just that static methods allowed you to continue with this state of affairs.  You’re going to have to break the circular dependency.
  • About 50%.  I didn’t say it was going to be easy.
  • Yes, it pretty much does.  It doesn’t really perform any worse, so don’t worry about it.
  • Ridiculous numbers, especially if your object design is wrong or you’re not using a dependency injection tool.  Just live with it for now, we’ll come back to this another time.
  • Because it’s assaulting the single responsibility principle with an axe.  You’re going to want to restructure this at some point.  Not now.
  • It took me three hard slog days with pretty much the entire code base checked out.
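To make the refactoring concrete, here is a minimal sketch of a single step: a class that used to fetch the current user from the container now takes it as a constructor parameter.  ICurrentUserProvider is the name suggested above; InvoiceApprover, Invoice and FixedUserProvider are hypothetical names invented for the illustration, and the “before” call is shown only as a comment.

```csharp
using System;

// ICurrentUserProvider is the name suggested in the text; everything else
// here (InvoiceApprover, Invoice, FixedUserProvider) is hypothetical.
public interface ICurrentUserProvider
{
    string CurrentUserName { get; }
}

public class Invoice
{
    public string ApprovedBy { get; set; }
}

// Before: the dependency was invisible, fetched from the static locator:
//     var user = ObjectFactory.GetInstance<ICurrentUserProvider>();
// After: the dependency is declared in the constructor, where anyone can see it.
public class InvoiceApprover
{
    private readonly ICurrentUserProvider _currentUser;

    public InvoiceApprover(ICurrentUserProvider currentUser)
    {
        _currentUser = currentUser ?? throw new ArgumentNullException(nameof(currentUser));
    }

    public void Approve(Invoice invoice)
    {
        invoice.ApprovedBy = _currentUser.CurrentUserName;
    }
}

// A trivial implementation, handy in tests where no real login exists.
public class FixedUserProvider : ICurrentUserProvider
{
    public FixedUserProvider(string name) { CurrentUserName = name; }

    public string CurrentUserName { get; }
}
```

The pay-off is that a test can construct an InvoiceApprover with a FixedUserProvider and never touch the container at all.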

Reversibility Patterns: Memento and Command

These two patterns deal with times that you’ve got a requirement to be able to undo work.  There’s Memento, which is a pattern of limited use (if only because a lot of the classic implementations are already available for you to use), and Command, which is one of the most criminally under-used patterns in the whole book.

Memento

Let’s deal with Memento first.  Personally, I think most of the problem with Memento is its name.  It’s a checkpoint.  You store checkpoints as you’re working, and you restore back to a checkpoint if something doesn’t work.  A good design for a limited set of use cases.  Why aren’t you going to need this very often?

  • Because you’re using a transactional system such as a database and can use the transactions directly.
  • Because you’ve followed good design principles and avoided having a lot of mutable state in your code.
  • Because your reversibility concerns are handled by using the command pattern.

If you’re wondering how on earth you could implement this behaviour without mutable state, try taking a look at Eric Lippert’s post on immutable stacks.  Usually, an approach like this is likely to be a better solution.

An interesting example where it is (probably) used is Prince of Persia: Sands of Time.  In the game, you can hit a button that reverses time.  By storing the state of every actor in recent frames, it can just as easily rewind them.  The example also highlights the importance of keeping the mementos small: if it, for instance, chose to serialize the walls every frame, the game would grind to a halt. 
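As a rough sketch of the checkpoint idea (not how any real game engine does it), the following stores a small snapshot of an actor each frame and restores from it on rewind.  Actor, Checkpoint and RewindBuffer are all hypothetical names.

```csharp
using System.Collections.Generic;

// Hypothetical game actor; the names are illustrative, not from a real engine.
public class Actor
{
    public int X { get; private set; }
    public int Y { get; private set; }

    public void MoveBy(int dx, int dy) { X += dx; Y += dy; }

    // The memento: a small immutable snapshot of just the state that changes.
    public sealed class Checkpoint
    {
        internal Checkpoint(int x, int y) { X = x; Y = y; }
        internal int X { get; }
        internal int Y { get; }
    }

    public Checkpoint Save() => new Checkpoint(X, Y);

    public void Restore(Checkpoint checkpoint)
    {
        X = checkpoint.X;
        Y = checkpoint.Y;
    }
}

// Keeps one checkpoint per recorded frame so that time can be rewound.
public class RewindBuffer
{
    private readonly Stack<Actor.Checkpoint> _frames = new Stack<Actor.Checkpoint>();

    public void Record(Actor actor) => _frames.Push(actor.Save());

    // Rewinds the actor by up to 'frames' recorded frames.
    public void Rewind(Actor actor, int frames)
    {
        for (int i = 0; i < frames && _frames.Count > 0; i++)
        {
            actor.Restore(_frames.Pop());
        }
    }
}
```

Note that the snapshot holds two integers and nothing else, which is the “keep the mementos small” point in miniature.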

Command

Let’s be clear: the command pattern is stone cold brilliant.  I know I’ve just been going on about the Memento pattern.  Now it’s time to wake up.  Also every UI you create, every workflow you implement, has a Command pattern in it.  Well actually, it doesn’t, because you don’t know the Command pattern and it’s not obvious.  But it should.

Here’s the basic form of the command pattern:

interface ICommand {
    void PerformAction();
}

Now, in itself, this isn’t that interesting.  So far, we’ve managed to represent an action (code) as an object (data).  As such, it might as well just be a function delegate.  However, where things get interesting is in the principal variation to the pattern.

Undo Command

To add reversibility, we can extend the interface like this:

interface ICommand {
    void PerformAction();
    ICommand UndoCommand();
}

We’ve got something much more interesting.  All of a sudden we’ve got a pattern that you should be implementing in every GUI you ever write.  This is the answer to how Word allows you to undo multiple operations.  If we return to the Prince of Persia example, each monster would have a currently executing command, and the command object would need to provide additional information such as animation frames.  (Don’t worry, I’m not about to start blogging about how to write a computer game.)

The reason this is so hugely important is that you need it from Day One.  Adding undo features to a design that doesn’t implement the command pattern can be an exercise in frustration.  Obviously, the undo command of an undo command should be a redo command.  Generalizing further, you get to concepts such as the layers in PhotoShop, where you keep track of your action history on an object and can insert and replace actions. 

You’ll note that GMail generalizes this concept in a couple of interesting ways:

  • A “following command” concept.  This is usually undo, but can be something else.  For instance “send an email” has a following command of “view the email you just sent”.
  • Actually, sometimes there are multiple following commands.  This enables such functionality as “Invite this user to gmail”.
  • Certain commands cannot be undone, and the UI supports highlighting those exceptions.  (You could just implement this by returning null for the Undo command, but that would prevent you from specifying behaviours when reversibility is lost.)
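The basic undo and redo mechanics can be sketched as follows.  The ICommand interface is the one given earlier; AppendText, RemoveFromEnd and CommandHistory are hypothetical names, and the two-stack history is just one common way to arrange things.

```csharp
using System.Collections.Generic;
using System.Text;

public interface ICommand
{
    void PerformAction();
    ICommand UndoCommand();
}

// Hypothetical command: appends text to a document buffer.
public class AppendText : ICommand
{
    private readonly StringBuilder _document;
    private readonly string _text;

    public AppendText(StringBuilder document, string text)
    {
        _document = document;
        _text = text;
    }

    public void PerformAction() => _document.Append(_text);

    public ICommand UndoCommand() => new RemoveFromEnd(_document, _text);
}

// The inverse of AppendText; its own undo is the original append (a redo).
public class RemoveFromEnd : ICommand
{
    private readonly StringBuilder _document;
    private readonly string _text;

    public RemoveFromEnd(StringBuilder document, string text)
    {
        _document = document;
        _text = text;
    }

    public void PerformAction() =>
        _document.Remove(_document.Length - _text.Length, _text.Length);

    public ICommand UndoCommand() => new AppendText(_document, _text);
}

// Hypothetical history: two stacks give multi-level undo and redo.
public class CommandHistory
{
    private readonly Stack<ICommand> _undo = new Stack<ICommand>();
    private readonly Stack<ICommand> _redo = new Stack<ICommand>();

    public void Execute(ICommand command)
    {
        command.PerformAction();
        _undo.Push(command.UndoCommand());
        _redo.Clear(); // a fresh action invalidates anything that was undone
    }

    public void Undo() => Move(_undo, _redo);

    public void Redo() => Move(_redo, _undo);

    // Runs the top command of one stack and pushes its inverse onto the other.
    private static void Move(Stack<ICommand> from, Stack<ICommand> to)
    {
        if (from.Count == 0) return;
        ICommand command = from.Pop();
        command.PerformAction();
        to.Push(command.UndoCommand());
    }
}
```

Because the undo of an undo is a redo, the history needs no special redo logic: it just shuffles commands between the two stacks.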

Serializing Commands

It’s often an excellent idea to make your command objects serializable, both in the sense of supporting .NET serialization and also in the more general sense of being able to round-trip the object to the database.  You can then do the following:

  • Provide a complete audit log of your user’s actions.
  • Moreover, replay a user’s session.
  • Analyze usage patterns to spot common behaviours.
  • Distribute your application across multiple servers, probably by using a publish/subscribe network such as nServiceBus.

Using Commands to make Transactions

One obvious model for implementing rollback is the Memento pattern.  However, it’s also possible to use the Command pattern for this purpose.  The following code illustrates how:

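// Assumes ICommand also exposes PartialUndo(), returning a command that
// cleans up after an action that failed part-way through.
void PerformTransaction(IEnumerable<ICommand> commands)
{
    var executedCommands = new Stack<ICommand>();
    foreach (var command in commands)
    {
        try
        {
            command.PerformAction();
        }
        catch
        {
            // Clean up the command that failed, then roll back the rest.
            command.PartialUndo().PerformAction();
            // A Stack enumerates most-recent-first, so rollback runs in reverse order.
            foreach (var commandToRollback in executedCommands)
            {
                commandToRollback.UndoCommand().PerformAction();
            }
            throw;
        }
        executedCommands.Push(command);
    }
}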
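To see the rollback in action, here is a self-contained sketch.  It assumes an ICommand with a third member, PartialUndo, since the routine above calls it but the interface shown earlier does not declare it.  AddItem, RemoveItem and the Transactions class are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Assumed shape of the interface: the two members from the text plus
// PartialUndo, which the transaction routine uses but the text never defines.
public interface ICommand
{
    void PerformAction();
    ICommand UndoCommand();
    ICommand PartialUndo();
}

// Hypothetical command: adds an item, optionally failing after the change.
public class AddItem : ICommand
{
    private readonly List<string> _items;
    private readonly string _item;
    private readonly bool _fail;

    public AddItem(List<string> items, string item, bool fail = false)
    {
        _items = items;
        _item = item;
        _fail = fail;
    }

    public void PerformAction()
    {
        _items.Add(_item);
        if (_fail) throw new InvalidOperationException("Simulated failure after a partial change.");
    }

    public ICommand UndoCommand() => new RemoveItem(_items, _item);

    public ICommand PartialUndo() => new RemoveItem(_items, _item);
}

public class RemoveItem : ICommand
{
    private readonly List<string> _items;
    private readonly string _item;

    public RemoveItem(List<string> items, string item)
    {
        _items = items;
        _item = item;
    }

    public void PerformAction() => _items.Remove(_item);

    public ICommand UndoCommand() => new AddItem(_items, _item);

    public ICommand PartialUndo() => new AddItem(_items, _item);
}

public class Transactions
{
    // Same logic as the routine in the text.
    public void PerformTransaction(IEnumerable<ICommand> commands)
    {
        var executedCommands = new Stack<ICommand>();
        foreach (var command in commands)
        {
            try
            {
                command.PerformAction();
            }
            catch
            {
                command.PartialUndo().PerformAction();
                foreach (var commandToRollback in executedCommands)
                {
                    commandToRollback.UndoCommand().PerformAction();
                }
                throw;
            }
            executedCommands.Push(command);
        }
    }
}
```

If the third of three AddItem commands fails, the list ends up empty again and the original exception still reaches the caller.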

Summary

We’ve covered the two main patterns to implement reversibility.  In practice, the Memento pattern is often quite brittle.  Changes in the behaviour of related objects could lead to changes in what has to be stored.  This is less likely to happen with the Command pattern, since it separates out the responsibility for reversibility into the relevant transitions.  Furthermore, command objects can be extended in further directions, taking in permissions, interruptibility, batching, to name but a few.

So, why isn’t it more heavily used?  Why is it that I always see UIs that desperately try to solve the problems the Command pattern solves in ad hoc and incomplete ways?  I think the problem is that it’s quite hard to refactor to use this pattern:  there are just too many parts of your code affected by pulling the command concerns together.  This makes it all the more important that you design in Commands right from the start of your project.

Double Dispatch: The Visitor Pattern

I’ll be frank, I don’t like the Visitor Pattern.  It’s a hack.  It’s just a way of getting around a deficiency in the language.  Basically, extending the functionality of objects is what inheritance is for.  The whole reason the visitor pattern exists is to deal with the times that this model falls down.  Proxies and the decorator pattern could also be considered object extension mechanisms, but I’ve already dealt with them.  Another reason I don’t like it is that it relies fairly heavily on abusing function overloading, and it’s extremely brittle with regards to changes in your inheritance structure.

A note: it is often assumed that the visitor pattern has something to do with iteration and trees.  Whilst it can be used in such scenarios, it’s not really the point and often there’s a simpler solution.  So, what I’m going to talk about is double dispatch.

There are basically two limitations to virtual methods:

  • You can’t add them to an existing class.
  • Sometimes you don’t want to add them to an existing class.  This is usually because doing so would violate the single responsibility principle.

The visitor pattern is a way around this limitation, but it’s not elegant.  What’s worse, it requires you to be able to modify the target classes, so it doesn’t even fully address the first limitation. Whilst it can be useful, it’s always worth examining exactly why you need a visitor.  It can be a code smell.

Implementing the Visitor Pattern

Let’s say that we wish to write a routine that processes messages in a trading system.  We’ll say for the sake of argument there are order messages, execution messages and allocation messages.  Now, the “object orientated” way of doing this would be to add a processing method directly to each class, but that isn’t possible or desirable in C#.

So, we define a “visitor” interface

interface IMessageVisitor {
    void Visit(OrderMessage message);
    void Visit(AllocationMessage message);
    void Visit(ExecutionMessage message);
}

and we add a method to the IMessage interface.

void Visit(IMessageVisitor visitor); 

It’s probably worth defining an interface IVisitable<TVisitor> for this purpose. 

internal interface IVisitable<TVisitor> {
    void Visit(TVisitor visitor);
}

We now implement the following method in every one of our target classes:

public override void Visit(IMessageVisitor visitor) { visitor.Visit(this); }

Note that you can’t implement this code just once in a base class, because it won’t work:  the overload is chosen at compile time from the static type of this, which in a base class is only ever the base class.  What this code does is to abuse function overloading.  If the message is an allocation message, it will call the “Visit Allocation Message” routine.  If you’ve got an “Automated Allocation Message” that inherits from allocation message, it’ll call the same routine.  The same semantics, in other words, as a virtual function.

If, on the other hand, you wanted to specialize the “Automated Allocation Message”, you’d need to change the IMessageVisitor.  It’s not a perfect solution.
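Putting the pieces together, a minimal sketch of the double dispatch might look like this.  The interface and message names follow the text; LoggingVisitor is hypothetical, and for brevity the messages implement IMessage directly rather than overriding a base-class method.

```csharp
using System.Collections.Generic;

public interface IMessageVisitor
{
    void Visit(OrderMessage message);
    void Visit(AllocationMessage message);
    void Visit(ExecutionMessage message);
}

public interface IMessage
{
    void Visit(IMessageVisitor visitor);
}

public class OrderMessage : IMessage
{
    // 'this' is statically an OrderMessage, so Visit(OrderMessage) is chosen.
    public void Visit(IMessageVisitor visitor) { visitor.Visit(this); }
}

public class AllocationMessage : IMessage
{
    public void Visit(IMessageVisitor visitor) { visitor.Visit(this); }
}

public class ExecutionMessage : IMessage
{
    public void Visit(IMessageVisitor visitor) { visitor.Visit(this); }
}

// Inherits Visit from AllocationMessage, so it is handled as an allocation.
public class AutomatedAllocationMessage : AllocationMessage { }

// Hypothetical visitor that records which overload handled each message.
public class LoggingVisitor : IMessageVisitor
{
    public List<string> Log { get; } = new List<string>();

    public void Visit(OrderMessage message) => Log.Add("order");

    public void Visit(AllocationMessage message) => Log.Add("allocation");

    public void Visit(ExecutionMessage message) => Log.Add("execution");
}
```

The first dispatch is the virtual call to Visit on the message; the second is the overload picked inside it.  That is why the one-line method has to be repeated in each class.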

Alternatives to the Visitor Pattern

It’s worth noting that many modern “typeless” programming languages allow you to add methods directly to classes at runtime.  This provides a strong alternative to the visitor pattern.  It doesn’t violate the single responsibility principle as long as you segregate the scope of the routines.  If you can’t modify the target classes, a (carefully written) big ugly cascading if statement can be used instead of implementing IVisitable.  Finally, you could use a chain of responsibility instead, which is effectively a well-structured cascading if statement, but a lot more flexible. 

The catch is: none of the above solutions gets you the magic static type checking of the visitor pattern.
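For comparison, here is a sketch of the “big ugly cascading if” for message classes you can’t modify.  The message types here are stand-ins and MessageRouter is a hypothetical name; the point is where the type checking goes missing.

```csharp
using System;

// Stand-ins for third-party message types that we cannot modify.
public class OrderMessage { }
public class AllocationMessage { }
public class AutomatedAllocationMessage : AllocationMessage { }
public class ExecutionMessage { }

// Hypothetical router that dispatches on the runtime type of the message.
public class MessageRouter
{
    public string Describe(object message)
    {
        // Order matters: the most derived types must be tested first.
        if (message is AutomatedAllocationMessage) return "automated allocation";
        if (message is AllocationMessage) return "allocation";
        if (message is OrderMessage) return "order";
        if (message is ExecutionMessage) return "execution";

        // Nothing forces this method to be updated when a new message type
        // appears; the omission only shows up at runtime.
        throw new NotSupportedException(
            "Unhandled message type: " + (message?.GetType().Name ?? "null"));
    }
}
```

Unlike the visitor, specializing the automated allocation needs no interface change, but a forgotten case becomes an exception rather than a compile error.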

In general terms, if you’ve only got a small number of implementations of a given IVisitor class, you should probably just consider adding virtual functions directly to the calling classes.  If, on the other hand, you have fifty, the visitor pattern may be pretty much the only way to keep your problem space manageable.

Tree Walking

The classic gang of four example of tree walking is actually a mix of the visitor pattern and the composite pattern.  With this, the parent node’s visit method automatically calls visit on its children.  Seriously, just don’t do this unless you absolutely have to.  You’ve mixed the dispatch behaviour with iteration behaviour, there’s no way for the caller to figure out the structure of the tree and you can’t vary between depth and breadth first iteration.  Here’s some code that’s often more useful than doing the composite and visitor trick:

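// Requires: using System.Collections.Generic; using System.Linq;
interface INode<TNode> {
    IEnumerable<TNode> Children { get; }
}
// Concat rather than Union: Union would silently drop duplicate nodes.
IEnumerable<TNode> DepthFirst<TNode>(TNode root)
where TNode : INode<TNode>
{
    return new[] { root }.Concat(root.Children.SelectMany(c => DepthFirst(c)));
}
IEnumerable<TNode> BreadthFirst<TNode>(TNode root)
where TNode : INode<TNode>
{
    return BreadthFirst<TNode>(new[] { root });
}
// Emits one whole level, then recurses into the next level down.
IEnumerable<TNode> BreadthFirst<TNode>(IEnumerable<TNode> level)
where TNode : INode<TNode>
{
    var nodes = level.ToList();
    if (nodes.Count == 0) return nodes;
    return nodes.Concat(BreadthFirst<TNode>(nodes.SelectMany(c => c.Children)));
}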
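As a usage sketch, here is the same idea applied to a small concrete tree.  Node and TreeWalker are hypothetical names, and the traversals are written as iterators (with an explicit queue for breadth first), which is one of several reasonable ways to implement them.

```csharp
using System.Collections.Generic;
using System.Linq;

public interface INode<TNode>
{
    IEnumerable<TNode> Children { get; }
}

// Hypothetical concrete node type.
public class Node : INode<Node>
{
    private readonly List<Node> _children;

    public Node(string name, params Node[] children)
    {
        Name = name;
        _children = children.ToList();
    }

    public string Name { get; }

    public IEnumerable<Node> Children => _children;
}

public class TreeWalker
{
    // Yields a node, then everything beneath each child in turn.
    public IEnumerable<TNode> DepthFirst<TNode>(TNode root)
        where TNode : INode<TNode>
    {
        yield return root;
        foreach (var child in root.Children)
        {
            foreach (var descendant in DepthFirst(child))
            {
                yield return descendant;
            }
        }
    }

    // Yields nodes level by level, using a queue of nodes still to visit.
    public IEnumerable<TNode> BreadthFirst<TNode>(TNode root)
        where TNode : INode<TNode>
    {
        var queue = new Queue<TNode>();
        queue.Enqueue(root);
        while (queue.Count > 0)
        {
            var node = queue.Dequeue();
            yield return node;
            foreach (var child in node.Children)
            {
                queue.Enqueue(child);
            }
        }
    }
}
```

The caller picks the iteration order, and can then apply a visitor, a LINQ query or a plain foreach to the resulting sequence.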

UPDATE:  This article used to contain text about the composite pattern, which I’ve removed.  You can find my revised thoughts about composite pattern here, or the original text with editorial here.

Patterns Everyone Knows: Unhelpful Terminology

The patterns book is 15 years old, that’s about 150 in developer years.  All told, it’s amazing that more of the patterns aren’t out of date.  Here are some that I think could be safely retired.  Again, they’re not necessarily bad ideas, it’s just that they’re special cases of more general principles, and I favour understanding the principles.

Bridge

I’ll be honest, I’m not sure I get the point of the Bridge pattern.  A bridge is exemplified by the following example from finance:

  • You have an interface for a traded instrument.  Call it IInstrument.
  • Shares are one of the simplest kinds of instrument.  We extend the interface to IEquity.
  • We provide a base implementation of Instruments that developers can inherit from.  We call this InstrumentBase.
  • We implement shares using the concrete class Equity which inherits from InstrumentBase.

In this case, we could modify the IInstrument interface.  This would affect the InstrumentBase classes, but not the Equity class.  Equally, we could modify the implementation of Instrument without modifying the external interfaces.

Now, I have a problem with all of this.  Basically, as far as I can tell, the Bridge pattern is what I call object-orientated development.  There are two components of the pattern:

  • The use of interfaces to shield implementations from the consumers.  This is the Open/Closed principle.
  • The use of inheritance to specialize.  This is also known as coding in C#.

In short, Bridge doesn’t say anything other than “built according to good design principles”.

Template and Strategy

Template patterns are the same as Bridge patterns, only the emphasis is different.  Rather than specializing an entity, you’re specializing an algorithm.  Now, the interesting thing here is: there are two ways to create a general algorithm with specialization.

  • Inheritance (which the Template pattern requires)
  • Composition (in which case you call it a Strategy pattern)

Typically, I’d favour the latter, as would Gamma et al. 

Usually, in C#, you can generalize an algorithm just by passing in a function.  For instance, consider the following code:

gofPatterns.FirstOrDefault(pattern => pattern.Name == "Template")

FirstOrDefault is a parameterized algorithm.  The expression pattern => pattern.Name == "Template" specializes it.  Implementing the same algorithm through inheritance would just be plain painful.  In short, Bridge and Template are both names for specific cases of more general problems, and the specialization isn’t useful.  So, I wouldn’t really recommend using the terminology.  That’s not to say that you won’t use the pattern, just that you’ll probably not help anyone by telling them you used a template pattern when you did so.

In practice, I have been known to use the term “Strategy Pattern”, but usually I just mean “I outsourced some decision making”.  In general I think you’re better off just understanding that there are multiple ways of parameterizing an algorithm.
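To compare the two forms of parameterization side by side, here is a small sketch.  ReportTemplate, HexReport and Report are hypothetical names; both versions run the same trivial algorithm (format each value, join with commas).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Template: the algorithm lives in a base class and is specialized by overriding.
public abstract class ReportTemplate
{
    public string Render(IEnumerable<int> values) =>
        string.Join(", ", values.Select(v => Format(v)));

    protected abstract string Format(int value);
}

public class HexReport : ReportTemplate
{
    protected override string Format(int value) => value.ToString("X");
}

// Strategy by composition: the same algorithm, specialized by passing a function.
public class Report
{
    public string Render(IEnumerable<int> values, Func<int, string> format) =>
        string.Join(", ", values.Select(format));
}
```

The inheritance version needs a new class per variation; the composed version needs only a lambda such as v => v.ToString("X") at the call site.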

Flyweight

Basically, a flyweight object is a representation of an entity which strips out most of the information and just keeps the truly vital information.  This is done for memory conservation.  I’m including this under “obvious” because it will have occurred to anyone who has run into the problem it solves (and because I can’t think of a better place to put it).  You could also describe this as a “reference” pattern, since often the flyweight objects contain just enough information to find the true entity in a database if needed.

An interesting case is lazy loading in NHibernate.  NHibernate generates proxy objects which are themselves initially flyweight, but become heavyweight on their first access.

Again, I don’t think this is particularly useful terminology.  In the time since Design Patterns was written, the whole idea of a “true” object representation has declined in importance.  Finely-grained interfaces customized by consumer, pervasive proxying, presentation models are all part of a general principle that objects matter at point of use.  The world has moved on.

Summary

Bridge, Template and Strategy are also obvious patterns, but unlike Facade, Builder and Adapter, aren’t really useful for communication. 

  • In the case of Bridge, typically just saying that you’ve a) hidden something behind an interface or b) used inheritance to specialize is more clear. 
  • In the case of Template or Strategy, it’s better to refer to what you’re doing as a parameterized algorithm, irrespective of the form that parameterization takes.
  • “Strategy Pattern” is nonetheless a popular term, and I’m not advocating getting into a terminology war with someone who wants to call what they’re doing a strategy pattern.  But you might wish to introduce them to Erlang sometime…

The Flyweight pattern is part of a general principle that objects can have more than one representation.  I wouldn’t personally use the term.

Object Creation: Factory and Abstract Factory

Okay, we’re onto the first of what I described as the “stone cold brilliant patterns”.  This particular one looks obvious, but its benefits are actually quite subtle.  There are basically only two ways of creating an object in .NET:

  • Call new
  • Call a function that calls new

The factory pattern is basically just using the latter.  Now, this might strike you as an unnecessary level of indirection, and you might be right.  You’d be right if:

  • The class is a pure data transfer object.
  • The class doesn’t have a complex data structure.
  • The class has no functional logic.

I’d rather not talk about testing too much, but what this comes down to is “The object could be easily created in a test.”  Now, if you’re not that into testing (because, I don’t know, maybe you like bugs…) you might be wondering why I’m going on about testing.  Well, testing is re-use.  These objects are what I term “Leaf Objects”.  They’re the basic values in our system.  Everything else, you need a factory.

The whole point of the pattern is to avoid the basic drawback of new: that it isn’t polymorphic.  You can’t call new Employee() and receive a MicrosoftEmployee object.  And even if you don’t want that to happen, one day you probably will wish you could.

Incidentally, don’t create a static factory method.  If you do, you’ve missed the whole point.  The same goes for putting significant logic into your factory.  Logic mixed with object creation is inflexible.  In C++, you can overload the behaviour of the new operator.  This, sadly, isn’t as useful as it first appears, since again it’s a static operation and cannot be varied according to runtime context.  There’s no way to achieve the design benefits of the factory pattern that way.

The Difference between Factory and Abstract Factory

To be honest, I don’t really think the distinction between factory and abstract factory is useful:

  • Factory Method:  You declare a variable of type Func<>.  There’s not much more to it, really.
  • Abstract Factory:  You declare an interface in which every method returns or creates an object.

It’s the difference between one function and multiple functions.
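Here is a sketch of both forms, using the Employee and MicrosoftEmployee names from above.  Onboarding, Badge, IStaffFactory and MicrosoftStaffFactory are hypothetical, invented purely to give the factories something to do.

```csharp
using System;

public class Employee
{
    public virtual string Employer => "Unknown";
}

public class MicrosoftEmployee : Employee
{
    public override string Employer => "Microsoft";
}

public class Badge
{
    public Badge(string text) { Text = text; }

    public string Text { get; }
}

// Factory in the sense used above: the consumer holds a Func<> and never calls new.
public class Onboarding
{
    private readonly Func<Employee> _createEmployee;

    public Onboarding(Func<Employee> createEmployee)
    {
        _createEmployee = createEmployee;
    }

    public Employee Hire() => _createEmployee();
}

// Abstract factory: an interface in which every method creates an object.
public interface IStaffFactory
{
    Employee CreateEmployee();
    Badge CreateBadge(Employee employee);
}

public class MicrosoftStaffFactory : IStaffFactory
{
    public Employee CreateEmployee() => new MicrosoftEmployee();

    public Badge CreateBadge(Employee employee) => new Badge(employee.Employer + " staff");
}
```

Onboarding asks for “an Employee” and receives whatever the injected function decides to build, which is exactly the polymorphism that new can’t give you.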

Summary

I don’t honestly think I’ve done this pattern justice.  It’s probably the single most important design pattern we’ve got.  It falls into an interesting category: one where it’s obvious what you do, but much less obvious when you should use it.  It took me years to appreciate how important it was to well-designed code.  I started to understand it through reading Miško Hevery’s work and applying that to my own experience of where my own testing approach was falling down. 

Patterns Everyone Knows: Useful Terminology

I guarantee, you already know these patterns.  However, the patterns terminology is useful, if only to communicate the concepts quickly.

Builder

An object used to build another object.  The most obvious implementation of this pattern in the framework is StringBuilder.  It can be quite useful to have a builder in cases where what you’re constructing is complex and you don’t need to read from the constructed object as you’re going.  (If you do, just using the object’s own methods is often simpler.)  The Builder pattern is used in fluent interfaces to support method chaining.  In this case, the builder constructs the object internally, but returns itself.

Arguably, this pattern should be marked “Caution: Hazardous Material”.  Although everyone is familiar with it, every so often someone gets it into their head that every object should have a builder.  The code rapidly becomes a mass of useless indirection.

Adapter

You want to expose one interface from an object, but it exposes a different interface.  You write an adapter object to translate calls from one to the other.  Differs from the proxy pattern only in as much as the proxy pattern mandates that the exposed and the internal interface should be the same, but it’s definitely a proxy in the looser sense that we commonly use.

Adapters usually occur at sub-system or system boundaries.  Third-party libraries should typically be wrapped.
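A minimal sketch of an adapter at such a boundary follows.  VendorPriceFeed stands in for a third-party class we don’t control, IPriceSource is the interface our own code would rather see, and all three names are hypothetical.

```csharp
using System;

// Stand-in for a third-party API that we do not control.
public class VendorPriceFeed
{
    public double GetQuoteInPence(string ticker) => ticker == "ABC" ? 12345.0 : 0.0;
}

// The interface our own code wants to see.
public interface IPriceSource
{
    decimal GetPriceInPounds(string symbol);
}

// The adapter translates names, units and types between the two.
public class VendorPriceFeedAdapter : IPriceSource
{
    private readonly VendorPriceFeed _feed;

    public VendorPriceFeedAdapter(VendorPriceFeed feed)
    {
        _feed = feed;
    }

    public decimal GetPriceInPounds(string symbol) =>
        (decimal)_feed.GetQuoteInPence(symbol) / 100m;
}
```

The rest of the system depends only on IPriceSource, so swapping the vendor means rewriting one small class.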

Facade

Now, this pattern is so general it’s going to cover a lot of code you’ve written over time, but the terminology is actually useful, simply because we do actually need a word for it.  Say you’ve got an extremely complex trading system.  However, all that your code needs is a list of accounts and the cash in each.  A Facade is an interface that just exposes the bit you need. 

Summary

Facade, Builder and Adapter are amongst the most common patterns in software development.  Most developers will have independently come up with these solutions, since they’re pretty obvious.  The terminology can, however, be useful to communicate between developers.

Bridge and Template are also obvious, but unlike Facade, Builder and Adapter, aren’t really useful for communication. 

  • In the case of bridge, typically just saying that you’ve hidden something behind an interface is more clear. 
  • In the case of Template, it’s better to refer to what you’re doing as a parameterized algorithm, irrespective of the form that parameterization takes.

The Flyweight pattern is part of a general principle that objects can have more than one representation.  I wouldn’t personally use the term.

There’s one more “obvious” pattern: factory.  However, when you should use it isn’t as obvious, so it’s getting a post of its own.

Gang of Four Anti-Patterns

These are basically patterns where you’re better off avoiding them.  Seriously, just skip those parts of the book:

Singleton

All patterns are a trade-off.  The trade-off with singleton is reducing the number of parameters certain functions take and trading them for more static methods.  From the perspective of code agility, every method on a singleton might as well be a static method.  Not only that, but everything the singleton touches might as well be a static method.  The testing implications of the Singleton pattern are horrendous.  Steve Yegge has written far more than I intend to on the subject, so I’m really not going to spend any more time on telling you what a stupid, stupid idea it is.  Singletons, static methods and global variables all expose the same problem.  In fact, they’re pretty much effectively the same thing most of the time. 

So, what should you do instead of a singleton?  Well, let’s say, for instance, that you have a network connection (call it X) wrapped as an XConnection object.  There’s only one physical connection, and that needs to be accessed from various parts of the system.  Well, actually, the answer is very simple: you pass the object into the constructor of the objects that need it.  If some of those objects get created before the network connection gets called, you could pass in an object that creates the connection on the first call, or you could pass in a promise/lazy object/future/whatever it’s called in the language of your choice.  Basically, take the static methods you were thinking of, and turn them into instance methods.

Now, the technique that we’ve just described is called Dependency Injection (DI), and is made easier by using a DI container.  Ironically, you tend to tell the container that the connection object has a “Singleton” lifecycle.  But all that means is that there’s one connection referenced by the container, not one connection usable in the whole of your memory space.  The distinction may appear small, but it’s the difference between a flexible design and a brittle nightmare.  Much as certain extremely talented people may use the technique, don’t make your container a singleton.  That way lies madness.
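Here is a sketch of the hand-wired version, without a container.  XConnection is the name used above; PriceService, OrderService and ApplicationRoot are hypothetical, and Lazy<T> plays the part of the “lazy object” that connects on first use.

```csharp
using System;

// Wrapper around the single physical connection "X" (behaviour is made up).
public class XConnection
{
    public string Send(string request) => "ack:" + request;
}

// Hypothetical consumers: each declares the connection in its constructor.
public class PriceService
{
    private readonly Lazy<XConnection> _connection;

    public PriceService(Lazy<XConnection> connection) { _connection = connection; }

    public string RequestPrice(string symbol) => _connection.Value.Send("price " + symbol);
}

public class OrderService
{
    private readonly Lazy<XConnection> _connection;

    public OrderService(Lazy<XConnection> connection) { _connection = connection; }

    public string PlaceOrder(string symbol) => _connection.Value.Send("order " + symbol);
}

// Hypothetical composition root: the one place that knows the connection is shared.
public class ApplicationRoot
{
    public ApplicationRoot(Func<XConnection> connect)
    {
        // One Lazy instance is shared, so 'connect' runs at most once,
        // and only when some service first needs the connection.
        var shared = new Lazy<XConnection>(connect);
        Prices = new PriceService(shared);
        Orders = new OrderService(shared);
    }

    public PriceService Prices { get; }

    public OrderService Orders { get; }
}
```

There is still only one connection, but nothing stops a test, or a second subsystem, from building another ApplicationRoot with a different one.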

Singletons are, of course, a special case of the concept of shared state.  In general terms, state, and especially shared state, is problematic.  Designs that share very little state or, even better, have no state at all tend to be less prone to hard-to-analyze bugs and easier to repurpose when requirements change.

Prototype

This one basically says you shouldn’t create your objects, you should copy them.  We already know this concept in .NET, it’s called ICloneable.  There’s all sorts of problems with cloning objects, number one of which is that it’s not that well defined.  If X has Y as a property, when you clone X, do you clone Y?  Seriously, semantically, the prototype pattern is a mess.

The BCL team were talking about demising ICloneable five years ago.

The thing with prototype is: you can see how it might look sensible if you’re a C++ programmer.  Copy means something quite specific: a memory-based clone.  Any good C++ developer would know exactly what that would do.  It’d copy things declared as values and re-use things declared as pointers.  Let’s just think about that for a second: what happens when your data structure changes a value to a reference?  Well, the semantics of your copy change.  And, of course, you can’t copy anything that uses an external resource.
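The ambiguity is easy to demonstrate.  In this sketch, X and Y follow the naming in the text (the Child and Tags members are invented), and Clone uses MemberwiseClone, which copies values and re-uses references much like the C++ copy described above.

```csharp
using System;
using System.Collections.Generic;

public class Y
{
    public List<string> Tags { get; } = new List<string>();
}

public class X : ICloneable
{
    public int Count { get; set; }

    public Y Child { get; set; } = new Y();

    // Shallow copy: Count is copied, but Child still points at the same Y.
    public object Clone() => MemberwiseClone();
}
```

After cloning, changing Count on the copy leaves the original alone, but adding a tag through the copy’s Child changes the original too.  Nothing in the ICloneable contract tells the caller which of those behaviours to expect.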

Now, JavaScript is based around the idea of prototypes instead of classes.  These are not the same thing as the prototype pattern. 

On the other hand, it’s important to note that sometimes context is everything. 

Summary

The Singleton and Prototype patterns have no place in the arsenal of the educated C# developer.  Prototype just makes no real sense; Singleton will positively ruin your code.  Most people know that singleton is a bad idea, but are still enamoured with the concept of shared state.  On the other hand, I don’t think anyone is seriously attempting to use the prototype pattern in C# in the first place.

EDIT: Thanks to Svend Koustrup for pointing out a dangling sentence.