Thinking Again About the Decorator Pattern

Speaking to a colleague about the Decorator pattern, I think I was unduly harsh on decorators the last time I wrote on the subject.  I'll start off by re-iterating my opinion that writing pass-through methods is for the most part a horrible waste of time and a code smell to boot.  However, there are some cases where proxy/decorator functionality is useful, provided there are relatively few methods you need to proxy.  Ironically, one of these is my very first post, which employs the decorator pattern to give you a class that executes Retlang Commands but allows you to halt processing at a time of your choosing.  This kind of buffering is generally useful: you can do things like process a message based on the one that follows it (and if you know enough about Bloomberg feeds, you'll know why this isn't an academic example).

Some other examples of where the decorator pattern can come in handy:

  • Currency Conversion: here the converted amount can be a decorator on the underlying value.
  • A read-through cache: here the read-through functionality is added as a decorator on the underlying data source (there's a sketch of this after the list).
  • Parameter interception: mostly useful for testing, you log the parameters and then call through to an underlying implementation.  (I actually used to have a Rhino Mocks capture constraint that was a decorator around a proper constraint.)
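
To make the cache example concrete, here's a minimal sketch of a read-through cache decorator.  IQuoteSource, CachingQuoteSource and the rest are invented for illustration (they're not from the original post); the point is that there's exactly one method to delegate.

using System.Collections.Generic;

public interface IQuoteSource {
    decimal GetPrice(string symbol);
}

public class CachingQuoteSource : IQuoteSource {
    private readonly IQuoteSource underlying;
    private readonly Dictionary<string, decimal> cache = new Dictionary<string, decimal>();

    public CachingQuoteSource(IQuoteSource underlying) {
        this.underlying = underlying;
    }

    // The only method we need to proxy: serve from memory if we can,
    // otherwise read through to the underlying source and remember the result.
    public decimal GetPrice(string symbol) {
        decimal price;
        if (!cache.TryGetValue(symbol, out price)) {
            price = underlying.GetPrice(symbol);
            cache[symbol] = price;
        }
        return price;
    }
}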

A good rule of thumb for using the decorator pattern:

  • You have an underlying behaviour that you wish to leverage.  In the case of the haltable executor, I wanted to allow any underlying executor.
  • You want to change one particular aspect of the underlying behaviour.  Again, in the case of the haltable executor, I wanted to allow for halting (there's a sketch after this list).
  • There are very few methods that need delegating.
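
To illustrate those three points, here's roughly what a haltable executor can look like.  The ICommandExecutor interface below is a simplified stand-in rather than Retlang's actual API, but the shape is the same: one underlying behaviour, one aspect changed, one method delegated.

using System;
using System.Collections.Generic;

public interface ICommandExecutor {
    void Execute(Action command);
}

// Decorator: buffers commands while halted, otherwise passes straight through.
// (Locking omitted for brevity.)
public class HaltableExecutor : ICommandExecutor {
    private readonly ICommandExecutor underlying;
    private readonly Queue<Action> buffer = new Queue<Action>();
    private bool halted;

    public HaltableExecutor(ICommandExecutor underlying) {
        this.underlying = underlying;
    }

    public void Execute(Action command) {
        if (halted)
            buffer.Enqueue(command);        // hold on to work while halted
        else
            underlying.Execute(command);    // the one pass-through we need
    }

    public void Halt() {
        halted = true;
    }

    public void Resume() {
        halted = false;
        while (buffer.Count > 0)
            underlying.Execute(buffer.Dequeue());
    }
}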

Decorator vs Composite

The problem with decorator is that you’re constantly in danger of making a Law of Demeter violation.  You’ve got to ask yourself “do I really need the underlying object, or is it just convenient?”  Let’s take the example of the Rhino Mocks capture constraint.  In practice, I didn’t really have any underlying behaviour I wished to leverage.  I just wanted to deal with the fact that you can only have one constraint per parameter.  Problem is, we’ve already got a pattern for dealing with that situation: composite.  A proxy or decorator needs exactly one target. 

Now imagine you had a constraint that evaluated both constraints (it's not hard, it actually exists:  Rhino.Mocks.Constraints.And).  Now you can have the capture constraint just always return true.  The constraint you pass in becomes the composite of “the underlying constraint” and “the capture constraint”.
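
Here's a sketch of the difference, using a made-up constraint interface rather than the real Rhino Mocks types.  The capture constraint stops wrapping anything and just records what it sees; a composite in the spirit of Rhino.Mocks.Constraints.And combines it with the real constraint.

public interface IConstraint {
    bool Eval(object actual);
}

// Composite: passes only if every child constraint passes.
public class AndConstraint : IConstraint {
    private readonly IConstraint[] children;

    public AndConstraint(params IConstraint[] children) {
        this.children = children;
    }

    public bool Eval(object actual) {
        foreach (IConstraint child in children) {
            if (!child.Eval(actual)) return false;
        }
        return true;
    }
}

// The capture constraint has no target at all: it records the
// parameter and always returns true.
public class CaptureConstraint : IConstraint {
    public object Captured { get; private set; }

    public bool Eval(object actual) {
        Captured = actual;
        return true;
    }
}

// Usage: new AndConstraint(new CaptureConstraint(), underlyingConstraint)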

Decorator vs Chain of Responsibility

Some have argued that Chain of Responsibility, one of my personal favourite patterns, should be replaced wholesale with Decorator.  I seriously disagree with this.  First, let me point out that the version I used in the previous post on the subject isn't actually the classic Gang of Four pattern.  The original is a linked list, hence the “Chain” terminology.  The interface I gave has rules that don't need to know about other rules, which gives you a bit more flexibility in constructing different rule chains.  Provided the rules don't need to share information (which would usually cause problems of its own), linked-list behaviour or using a decorator is just a Law of Demeter violation.
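
To show the shape I mean, here's a rough sketch (the names are invented): the rules don't hold references to one another, and a separate evaluator walks whatever list of rules you hand it.

using System.Collections.Generic;

public class Request {
    // whatever data the rules need in order to make a decision
}

public interface IRule {
    // Returns true if this rule has handled the request.
    bool TryHandle(Request request);
}

public class RuleEvaluator {
    private readonly IEnumerable<IRule> rules;

    public RuleEvaluator(IEnumerable<IRule> rules) {
        this.rules = rules;
    }

    public void Process(Request request) {
        foreach (IRule rule in rules) {
            if (rule.TryHandle(request))
                return;    // first rule that handles the request wins
        }
    }
}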

Does Anyone Out There Understand The Microsoft Public License?

Seriously, I’ve raised the subject of this before, but it’s getting to be a bit of a joke.  Take a look at this recent discussion on the castle list.  Whereas the Free Software Foundation will tell anyone who cares to listen (and frankly, many who’d rather not) the minutiae of compliance with the various versions of the GPL, Microsoft can’t or won’t answer a straight question about what they regard as being compliant with a license they themselves drafted.  Rather laughably, they then point amateur open source projects at lawyers.  Lawyers who are going to tell them “well, it depends, there’s no case law on the subject”.

Believe me, I would love to see a good answer to this question, but as it is it’s hard to recommend the use of this licence to anyone.  I think people who do choose to use it are going to be receiving a lot of requests for dual licensing.


SOLID Principles: O is for Open / Closed

Any tour of SOLID should start with the Open/Closed principle.  The Open/Closed principle is actually different from the others.  All of the others are development practices.  OCP is a philosophical goal.  Here’s the classic statement:

Software entities should be open for extension but closed for modification.

A great phrase, but meaningless unless you’ve had it explained to you.  Now, the SOLID principles are all about improving project velocity.  What the Open/Closed Principle actually says is: you don’t want to be editing code to change its behaviour.  You should be changing a class to fix a bug, not to make it do something else.  Now, when this was originally formulated, TDD was in its infancy and changing code was regarded as being inherently fragile.  We’ve mitigated that with TDD, but actually it turns out that the same principles enable testing: you shouldn’t have a separate code path for testing.

Let’s take a look at some examples of code that fail the Open / Closed test.

public static void PrintLines(string[] lines) {
    foreach (string line in lines) {
        Console.WriteLine(line);
    }
}

Okay, let’s think about how we’ve violated the open closed principle.  First off, we’ve got a great big ugly static method.  I’ve talked a fair bit about these already.  Let’s talk about possible scenarios that could come up:

  • What happens if you want to write to a file?  You'd have to change the code.
  • What happens if you want to disable writing altogether?  Because the method isn't virtual, you'd have to change the code.
  • What happens if the lines are streaming from a database?  Passing them in as an array isn't ideal, so you'd have to change the code.

Let’s look at another example:

public static void PrintLines(string[] lines) {
    using (var writer = new StreamWriter(@"c:\x.txt")) {
        foreach (string line in lines) {
            writer.WriteLine(line);
        }
    }
}

Now, obviously many of the objections to the last code are valid again, but this one’s got some more to worry about:

  • You can’t change the filename.
  • Even assuming you only ever wanted to write to a file, you can’t choose to buffer the file.

Finally, consider this code:

public static void PrintLines(ILineProvider lineProvider) {
    using (var writer = new StreamWriter(@"c:\x.txt")) {
        foreach (string line in lineProvider.Lines) {
            writer.WriteLine(line);
        }
    }
}

This has an interface in it, so it must be better, right?  Sadly, it isn't.  This code is actually less flexible than the previous example: now you've got to implement ILineProvider just to use it, rather than passing in any old array of strings.  This is the sort of problem the Law of Demeter is about.  The Law of Demeter isn't explicitly mentioned in the SOLID principles, but it should be.  Maybe it could be SOLIDD…
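
For contrast, here's a minimal sketch of a version that fares rather better against the objections above (mine, not from the original post): the caller decides both where the lines come from and where they go, so a file, a buffered stream, a database reader or a test double can all be swapped in without touching the method.

// Needs System.Collections.Generic and System.IO.
public static void PrintLines(IEnumerable<string> lines, TextWriter writer) {
    foreach (string line in lines) {
        writer.WriteLine(line);
    }
}

// Console: PrintLines(lines, Console.Out);
// File:    using (var writer = new StreamWriter(@"c:\x.txt")) { PrintLines(lines, writer); }
// Test:    var captured = new StringWriter(); PrintLines(lines, captured);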

Danger Points

Just these few examples have given us some talking points, highlighting the places at which you're likely to violate the Open/Closed principle:

  • Static Methods
  • Non-virtual methods
  • Creating objects
  • Too-specific variable declarations
  • Hard-coded values
  • Using the wrong object (the Law of Demeter)

If there’s a summary to all of this, it is this: be careful what you depend upon.  Next, I’ll talk about how we actually go about achieving this.


Everything I Learned About Object Oriented Design Was Wrong

For all the kerfuffle about maintainability recently, it’s worth noting that actually the principal participants agree much more about best practices than they disagree.  The SOLID principles underpin pretty much every major open source .NET project I’ve examined.  (There’s a selection bias here, but I’m not going to worry about it.)  They are, in my opinion, much more important than the design patterns I’ve been talking about.  Sadly, they’re not half as well known as the Gang of Four book.  They should be, because they can be summarized as follows:  you need to change the way you develop.

Now, people are always saying this, and they're usually wrong.  AOP, workflow: any number of new development ideas have fallen by the wayside.  In practice, this one won't, mostly because the ideas are actually quite old.  They're just not widely understood.

When I was at University, I had to deliver a project in Turbo Pascal.  I think the Maths Department must have been allergic to curly braces.  Like a good little geek, I opened the user guide and started reading.  Now, most technical documentation is completely unreadable, but this was different.  It was, in fact, one of the single best bits of technical documentation I’ve ever read.  It introduced a whole new concept to Turbo Pascal: object-oriented development.  C++ had been around for a while, but the concepts were entirely new to me; programming for me was pretty much defined by Kernighan and Ritchie, as it was to most of the people I knew.

This new OOP concept was a revelation: it was, finally, a logical way of organizing your code built right into the language.  All you had to do was identify your objects, which behaved like actors in a play.  The actors would play their parts, and the program would come about as a combination of the actors.

Unfortunately, this was dead wrong.

Objects aren’t actors, a single business concept doesn’t necessarily correspond to one object.  This will lead you directly to the creation of god objects.  It turns out that the development of code isn’t principally about modelling.  It’s about flexibility, dependencies and coupling, and only then about modelling.  Ironically, this approach leads to better models in the long term, simply since the model will track changes in requirements faster.

What is “good” code?

I think I’ve probably seen more fruitless arguments about this than any other subject.  Everyone thinks their code is good, and sometimes they’re right.  Good code definitely isn’t the same thing as useful code, which is an error a lot of developers make.  Good code isn’t about aesthetics or huge architectures either.  Here’s the only definition of good code worth having:  good code is code you can easily change.  There are many formerly popular technologies (CORBA particularly springs to mind) that failed that one test.  The SOLID principles are guidelines designed to make your code easy to change.

So, I’ve decided to write a York Notes guide to SOLID.  My principal sources are Uncle Bob’s own website, and the excellent Hanselminutes podcasts on the subject.  One of the things you’ll probably notice as you go through is that your code tends to get longer:

  • There’s more interfaces
  • There’s more small classes
  • There’s more constructor parameters
  • There’s more instance variables

I really, really advise getting ReSharper: it turns all of this from what feels like an extremely bureaucratic endeavour into a case of pressing “Alt-Enter, Enter” a lot.  One day I hope I'll be able to see this stuff baked into the language.


Why I Don’t Regard Algorithm Proof as Useful

Okay, now that I’ve swapped insults with Frans (in fairness, I started it) I think I should explain my position on provability more carefully.  Some context: I’m a mathematician by background, I came to computing relatively late, and am extremely aware of the gap between the two disciplines. 

Let’s be clear, I actually agree with a fair amount of what Frans said in the original article.  I’m certainly not advocating a plough-in “RAD” approach to development, I just think that the concept of proof in computing isn’t particularly useful. 

The Importance of Algorithms

Let me start with sorting.  There are a number of obvious known algorithms for sorting on one thread:

  • Bubble sort
  • Quick sort
  • Merge sort

Of these, one is stupid, one has the wrong complexity but low memory usage and one is fast but uses more memory.  Which one should I implement?

Well, actually, usually the answer is: I shouldn't bother.  I'll just type “orderby person.Name” and get on with my life.  I've got a bunch of algorithmic knowledge in my head and it's about as useful as my encyclopedic knowledge of Babylon 5.  Frameworks like .NET and, frankly, LLBLGen have already implemented high-quality algorithms for most use cases, which means that most of the time I don't need to bother.  Sure, there are very few general frameworks for graph traversal, but that's mostly because it's of specialized utility.
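
In case that sounds glib, this really is all it takes; people here is just some hypothetical collection of objects with a Name property, and the framework picks the algorithm for you.

// Needs System.Linq and a collection of objects with a Name property.
var sorted = from person in people
             orderby person.Name
             select person;

// or, equivalently:
var alsoSorted = people.OrderBy(person => person.Name);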

I used to work at a dot-com with over 50 developers.  Every so often, something somewhat mathematical came across our desks.  When this happened, it got passed to me.  I estimate that I did something like this for about two days in every three months.  The truth is, there are many environments in which this stuff doesn’t come up very often.

The Problem of Proving Anything

Let’s first deal with a philosophical point: what do you mean by proof?  As a former mathmo, I have a pretty strong idea of what that means.  It means 100%.  Sadly, C# isn’t a proven language, and Windows isn’t a proven operating system.  In terms of formal specification and verification, hardware is ahead of the game, but your average processor is chock-full of errata.  In general terms, we’ve no good model for proving imperative style code, even if there have been recent breakthroughs.

This sounds all very academic, but it has some real consequences.  One of the principal ones is that, even if you prove something, proving that your implementation matches your conceptual model isn’t really possible.  There’s a huge gap between your pen and paper and the keyboard.

Now, previously I talked about how algorithms were a relatively small part of the average development workload.  Problem is, algorithms are actually the things that are easiest to deal with using formal proof techniques, because they’re small and they’re abstract.  Your average workflow is harder.

This is all assuming you have formal training in the discipline involved.  Most people don't.  And your proof is only as good as your confidence that you haven't made a mistake.

What Are You Proving?

I referred to proof before as specification and verification.  So far I've talked about verification, the part that most people regard as proof, but specification is just as important.  Let me tell another abstract story: Roger Needham was pretty much the world authority on authentication.  He pioneered third-party authentication protocols and helped develop the world's first authentication logic, a system for verifying authentication protocols.  You might think that such a logic would be the last word on the problem.  The problem is in the specification.  All the logic deals with is verifying that specific types of attack are not possible.  It didn't necessarily protect against guessing attacks on poorly chosen secrets, which is unfortunate since pretty much all passwords are poorly chosen.

The problem here is specification.  Your employer or your client is only interested in “Is it secure?” or even more nebulous concepts.  The only way to formally specify “secure” is to restrict your attention.  Sometimes things slip through the gaps.  Famously, the initial authentication logic failed to spot a password sent in plain text.  Needham always argued that this wasn’t a problem with the logic because it was a violation of the principles, but even that argument should illustrate the practical limitations of the techniques of formal proof.

Again, this isn’t a an academic objection.  About six months ago, I put live an extremely sensitive piece of code.  It dealt with something that was actually appropriate to this kind of analysis.  I and several others spent a lot of time testing it, trying to poke holes in it.  There’s a raft of automated tests for it.  We even ran it in parallel to the live systems for months to check that it behaved.

It failed a month after it went live.  Why?  Because we missed something.  The system worked perfectly, but it was only part of a larger system.  That larger system had complexities we hadn’t captured that caused it to fail.  In short, we didn’t ask the right questions.

Unattainable Goals

So, what’s the alternative?  Well, I’d still favour TDD.  I know Frans wasn’t arguing against this, but I think that talking about provable algorithms is a distraction.  Your average developer, when being told that he needs to prove things, won’t outline the arguments that I’ve just given, but he’ll know in his gut:

  • It doesn’t seem very relevant to what he’s doing.
  • It sounds impossible
  • It doesn’t sound like what he does for a living

On the other hand, approaching things in a story-based fashion, where cussed exceptional cases are built into the test framework, is something he can see working.

Your average developer wouldn’t be able to read the phrase transitive closure without reaching for a dictionary (or wikipedia).  What he found there would be unlikely to enlighten him.  In fact, most developers don’t really ever analyze complexity.  I used to regard that as shocking.  These days, I’ve come to the conclusion that it rarely makes them less productive.  Complexity does occasionally come up and bite them, but it’s rare. 

I’m not arguing against learning, or against continuous improvement.  But you’ve got to have priorities and there are a phenomenal number of things to learn.  For instance, if you don’t know anything about graph algorithms or the SOLID principles, I’d say you were better off studying and practicing the latter.

Summary

Frameworks are different: they're useful precisely because they do things that the average developer would find at best time-consuming and at worst error-prone or impossible.  They're the exception to the build-vs-buy rule: you're building so that someone else can buy.  For these reasons, I don't think the best practices for an ORM developer are necessarily the same as those for a LoB developer.  In short:

  • I don’t think algorithms are that important in LoB development.  Those that are common are already written.
  • Formal verification isn’t a error-free as is first appears.
  • You can only verify what you specify.  Errors in specification can and do happen.
  • Finally, for the reasons above, I think that the concept of proving your algorithms is of niche utility.  TDD is the tool I’d reach for first, not formal proof.

 


Microsoft’s Game Plan for C# 4.0

There’s a reason certain companies are on top.  People always talk about Google’s long term strategy and how collectively bright an organization it is.  Less is said about Microsoft, partly because it occasionally missteps quite badly.  A good example of this is Windows Phone.  Microsoft have had a strategy for a phone platform for going on ten years.  Google, seeing a long-term threat from the iPhone, knock together something better reactively.  However, for every Windows Phone, there’s an X-box Live.

But .NET’s what I’m interested in, and C# 4.0 is focusing on, of all things, compatibility with Microsoft Office.  Now, if you seriously think there are any competitors to Excel, you really don’t understand the product well enough.  You can already write Office automation apps in .NET, it’s not brilliant, but it’s doable.  I’m really not looking forward to the next round of marketing that tells me that I should really be using Office as “A massive DLL” for my app.  Microsoft do this kind of marketing because it’s part of their long term strategy: keep selling Windows and Office.  But, to be honest, I don’t think even they believe that they’re going to get that many people to start writing office workflow apps, especially after SharePoint.

So, if it isn’t really about integration apps, what’s it about?  My apologies if this was obvious, but the light only just went off in my head: it’s about macros.  I’d be watching what Microsoft’s doing to the CodeDom very carefully.  I’ll bet you’ll start to see the sort of enhancements that would make macro recording in C# possible.  Making it easier to link to Word isn’t really that fascinating.  On the other hand, VBA is 15 years old and hasn’t really evolved in all of that time.  But there’s a phenomenal amount of code out there in VBA, most of which is written by non-programmers.  Allowing people to write their sensitivity analyses in a proper programming language, that’s huge.  Stopping having to support VBA, that’s huge.

I seriously doubt they’re going to turn the average end user into a C# programmer.  Python though, sounds just about possible.  I’d keep an eye out for new work on language design coming out of the labs, too.  I’m not saying it’s dead just yet, but Anders is measuring VBA’s coffin as we speak.

Technorati Tags: C# 4.0,Com Interop,Excel

Metrics: Signal to Noise

Okay, I promised I'd stop talking about metrics, but it occurred to me that there's a serious point I hadn't covered: sometimes your measures are broken, and you need to fix the measure before it's going to tell you anything useful.  The classic example of this is error reporting: if you're reporting too many errors, you won't notice when something serious is going on.  I once saw a system that generated 2,400 errors a week, and that's under unexceptional circumstances.  Assuming you spent 30 seconds on each error, that would take 20 hours: around half your week.

It doesn’t take a genius to figure out that no-one was looking at those errors.  So, if there were a couple of really serious problems occurring, no-one would ever know.  Well, not until they started to get the phone calls.

Thing is, the code entanglement statistics from NDepend look like that: I can pretty much guarantee that 95% of them aren't a problem.  Trouble is, you're never going to know which ones unless you go through them all.  Oren's random sampling doesn't really prove anything other than that there's a lot of noise.  It doesn't mean there isn't some signal there.

Personally, I find the idea of code metrics fascinating, but I can’t help feeling that we’re still a long way from having usable ones.  An NDepend-driven refactoring might yield genuine results, or it might be a complete waste of time.  This doesn’t mean that Patrick shouldn’t be trying to improve the state of the art.
