Colour Coding

SEO Snake Oil

I never really felt .NET “got” SEO and ScottGu’s latest post doesn’t really fill me with confidence. Read this:

One simple recommendation to improve the search relevancy of pages is to make sure you always output relevant “keywords” and “description” <meta> tags within the <head> section of your HTML.

Oh yeah? Tell that to Matt Cutts. Meta keywords haven’t had weight in any respectable search engine this millennium. As the article explains, meta description is next to useless as well.

What does matter?

The title tag. It shows in the browser and so Google rates it as important information. This is a welcome addition to ASP.NET 4.
Heading tags. They’re typically in large print and summarize the article.
Your URLs matter, so you should try to make them readable. URL routing has been missing from vanilla ASP.NET since day 1 and has been a pain to implement.
Links do, even if you mark them nofollow (although they don’t count as much).
The text of links matters as well.

So, all in all, there are some good features being covered here. Still, it’s hard to know who this is aimed at. If you’re writing a new solution for SEO, it’s unlikely you’d be choosing webforms anyway. If you’ve got an existing solution, you’ve probably already solved these problems. That probably only leaves existing solutions that are badly implemented and, frankly, these new features aren’t going to fix that.

York Notes for SEO

I don’t talk much about SEO, because it’s really hard to create posts as compelling as seomoz, but here’s pretty much everything you need to know:

You can do a lot worse than writing content people want to read and link to.
Observe web standards.
Google doesn’t use Javascript on pages much. Javascript-only links are particularly stupid.
Age matters. That’s why it’s better to do a permanent redirect rather than just move the page.
Don’t try to cheat.
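To make the permanent redirect point concrete, here is a minimal sketch of a handler that answers with a 301 for pages that have moved. The paths and the `respond` function are made up for illustration; a real site would do this in its web server or routing layer.

```python
# Hypothetical mapping of retired paths to their new homes.
MOVED_PERMANENTLY = {
    "/old-article.aspx": "/articles/seo-snake-oil",
}

def respond(path):
    """Return (status, headers) for a request path."""
    if path in MOVED_PERMANENTLY:
        # 301 signals that the move is permanent, rather than the old
        # page simply vanishing and a new one appearing.
        return 301, {"Location": MOVED_PERMANENTLY[path]}
    return 200, {}
```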

Here’s some things that people do all the time that don’t work:

Writing “advertising copy” won’t help you on the web. Google treats it as the spam that it is.
Slapping everything into a DIV is silly. Use markup like h1 and blockquote for the purpose for which it was intended.

Here’s some ways of cheating that don’t work:

White text on a white background will positively hurt. Search engines will spot it, and will mark you down as a spammer.
Giving radically different content to the google bot from the user is called cloaking. This can get you in the sin bin forever.
Link trading is basically spam. Google will treat it as such.

Black Hatting (cheating google) is possible, but the guys who are good at it aren’t about to tell you how.


Getting Out of the Configuration Game

I once took advantage of the published source code of the Microsoft Enterprise Library Data Access Block to make some firm-specific modifications. One of those changes that only really makes sense in the target environment, nothing particularly general. The Data Access Block, for those of you who aren’t familiar with it, is basically an improved version of the ADO.NET API, with some pretty useful support for stored procedures. It’s not NHibernate, but it’s better than coding the API in the raw.

While I was there, I decided to rip out the stuff we didn’t use. That turned out to be nearly all of it. This was quite shocking to me, because I thought we used a fair chunk of the functionality. I reckon that when it came down to it, two thirds of the library was non-functional. The vast majority of the code, which I had no interest in, was there to support ObjectBuilder.

For those of you who aren’t familiar with it, ObjectBuilder is Enterprise Library’s configuration system. It was created to solve some of the same sort of problems as IoC containers, but it also addresses some interesting concerns such as providing a UI for configuration. Enterprise Library is also the original source of my ideas on environmental deltas.

So, how did I get rid of all of this code? Well, I junked DatabaseFactory and used constructor injection again. It took a bit of time, but I deleted two thirds of the code without an appreciable loss of functionality.
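For anyone who hasn’t seen the shape of that change, here is a rough sketch in Python (the real code was C# against the Data Access Block, so treat every name here as hypothetical). The class takes its connection as a constructor argument instead of asking a static factory to look one up.

```python
import sqlite3

class OrderRepository:
    """Takes its connection as a constructor argument instead of asking a
    static factory (and the configuration system behind it) for one."""

    def __init__(self, connection):
        self._connection = connection

    def count_orders(self):
        row = self._connection.execute("SELECT COUNT(*) FROM orders").fetchone()
        return row[0]

# The application, not the library, decides where the connection comes from.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY)")
connection.execute("INSERT INTO orders DEFAULT VALUES")
repository = OrderRepository(connection)
```

Nothing in `OrderRepository` knows how the connection was configured, which is why the supporting code could be deleted.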

How Much is Too Much?

Before you think this is just an exercise in ragging the Patterns & Practices group, who are a pretty easy target, I want you to ask how much code your internal and open source libraries should be devoting to configuration. Now, I doubt you’ll find many people arguing that two thirds is the right amount. However, I bet you’re thinking that five percent is acceptable. I’m going to argue the correct figure here is zero.

Let’s take a look at a successful open source library: log4net. In some ways, log4net is even worse than the Data Access Block. At least Enterprise Library didn’t require me to use ObjectBuilder. The code was dead and irrelevant in our applications. log4net, on the other hand, isn’t even usable without digging into its configuration system. Because it’s so prescriptive on this point, it then proceeds to provide several alternative approaches to how to use its configuration system.

But I don’t want to use it at all. I’ve got Castle Windsor already, why would I want two configuration systems? If I develop a large application with a lot of libraries, I could end up with four or five configuration files. And pretty much none of them are well thought out or powerful. Want a variable to be consistent across the files? You’d better get coding.

And this is where it gets really ugly. APIs defining their own configuration system doesn’t save time, it makes work for the application developer. Configuration is right on the outside of the Onion model and should be in one place, not dotted around your system. The choice of how to configure the system should be up to the application developer. Look at Retlang, it has no configuration at all. Does it make it harder to get running? Actually no, a small program can hard code everything in the Main method, a large program can use an IoC container.
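As a sketch of what “no configuration at all” looks like from the library side, here is a hypothetical logging class (not log4net’s or Retlang’s API, just an illustration). Everything it needs arrives through the constructor, so the caller decides whether that comes from hard-coded values or a container.

```python
import sys

class Logger:
    """A hypothetical logging class with no configuration system of its own:
    everything it needs arrives through the constructor."""

    def __init__(self, sink, minimum_level=0):
        self._sink = sink
        self._minimum_level = minimum_level

    def log(self, level, message):
        if level >= self._minimum_level:
            self._sink(f"[{level}] {message}")

def main():
    # A small program can simply hard code the wiring here.
    logger = Logger(sink=lambda line: print(line, file=sys.stderr), minimum_level=1)
    logger.log(2, "started")
```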

The whole philosophy of constructor injection and IoC containers is that classes shouldn’t take on externalities they don’t need. The same applies to libraries. Be ruthless in applying the single responsibility principle and get configuration out of your code.


Separating Configuration from Wiring: Nirvana

If you’re writing SOLID code, you’re going to be doing a lot of wiring. Typically, this wiring only changes in response to business change. Some wiring is going to be static for years. If you’re building a car, you won’t suddenly wire the clutch to the CD player, as entertaining as it would be when your brother-in-law tried to drive it.

On the other end of this spectrum, there’s stuff that changes all the time. Every machine you use, it needs to change. This is actual configuration information. The rest of it can, frankly, be hard wired. True configuration information needs to be in external files, wiring is best left in C#. Take a look at your config settings. How much of that has changed in the last year? Two years? Would it really be a big deal to create a new version and deploy it in the unlikely event that you needed to change it in the next two years?
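Here is a minimal sketch of that split, using Python rather than C# and entirely made-up class names. The wiring is ordinary code; the one value that differs per machine is the only thing read from outside.

```python
import json

class Mailer:
    def __init__(self, smtp_host):
        self.smtp_host = smtp_host

class InvoiceService:
    def __init__(self, mailer):
        self.mailer = mailer

def build_application(settings):
    # Wiring: which class talks to which. It changes only when the design
    # changes, so it lives in code.
    return InvoiceService(Mailer(settings["smtp_host"]))

# Configuration: differs on every machine, so it would live in an external
# file. It is inlined as a string here to keep the sketch self-contained.
settings = json.loads('{"smtp_host": "mail.dev.example"}')
app = build_application(settings)
```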

Configuration API Design

It may sound like I’m just stating the obvious here, but it’s hard to think of a major .NET project that gets this. I’m reckoning it’s nServiceBus and… well, no-one actually. Microsoft definitely don’t get this or your average config file would be five lines long, rather than 200.

Castle Windsor is schizophrenic on this. There’s an XML configuration model in which everything is configuration, and a fluent model in which everything is wiring. What you actually need is environmental deltas like Binsor has. Now, I have form on this subject, having been blogging about configuration management and the need for derived containers almost as long as I’ve been writing.

So, does everyone need an XML format as well as a C# API? Well, no, because you can use a braindead DSL these days. You don’t need to wait for Craig and Oren to write a Binsor equivalent for whatever library you’re using. Moreover, although I’ve made a big distinction between wiring and configuration, the difference between the two is going to be application and domain specific. What’s consistent for one company could change on every box at another. It’s best to be using the exact same API as you are in C#. It’s easier to learn and it doesn’t impose an unnecessary barrier between the stuff you change and the stuff you don’t.

Now, after I’ve said all of that, it might be surprising that I use Binsor. Shouldn’t I be using Castle’s fluent registration? Honestly, I’d much rather use it, but Binsor has the Extend macro, and Fluent Windsor doesn’t. The Extend macro enables my separation of configuration and wiring in a way that Windsor’s fluent interface does not. If you dig deeper, you discover that the fundamental problem is you can’t have incremental registrations.

(As an aside, this is because the container violates SRP extremely badly, being responsible both for registration and for resolution. This, in turn, prevents the creation of a registration compilation step and introduces all sorts of problems that are most easily solved by treating registrations as immutable.)
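To show what I mean by incremental, immutable registrations, here is a loose sketch. It is not Binsor’s Extend macro or Windsor’s API; the `extend` function and the setting names are assumptions for illustration. Each call produces a new read-only registration set made of a base plus a delta.

```python
from types import MappingProxyType

def extend(base, **overrides):
    """Return a new read-only registration set: the base plus a delta."""
    merged = dict(base)
    merged.update(overrides)
    return MappingProxyType(merged)

# Shared wiring and defaults.
base = extend({}, connection_string="Server=dev", retries=3)

# An environmental delta: only what differs in production is stated.
production = extend(base, connection_string="Server=prod")
```

Because `base` is never modified, any number of environments can be derived from it without stepping on each other.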

Achieving Configuration Nirvana

So, here’s how I think we should be thinking about configuration:

Configuration is different from wiring
The difference between the two is application specific.
Configuration needs to be in files external to your code. This doesn’t, however, mean you need two mechanisms.
There’s no need to use XML. IronPython, IronRuby or Boo are fine.
Specialized DSLs are not the way to go. There’s nothing wrong with .NET code.
Configuration APIs need to be incremental to support this. If this means having a separate “configuration compilation” step, so be it.
Most libraries shouldn’t concern themselves with configuration. IoC containers are fine.

I haven’t really explained the last point. We tend to behave as if configuration is something we need to build into everything we do and this is not only wasted effort on the producer’s part, it’s often a pain in the neck for the consumer as well. I’ll talk about that some more in my next post. Then I’ll come down to earth and start discussing what we do in the real world where my perfect configuration system doesn’t exist.


The Configuration Class Anti-Pattern

I wanted to write something about the concept of a “configuration class”, which Emanuele touched on in a discussion about the singleton pattern. Let me start by explaining what I mean by a configuration class. It’s a relatively common “pattern”, although you’ll never find it in any books. I’m going to argue that this is because, like Singleton, it’s the wrong solution to the problem.

The Problem

.NET has an incredibly flexible XML-based configuration API. However, for the purposes of this discussion, we can concentrate on the bit that most developers use: AppSettings. This is basically just a string hashtable with delusions of grandeur. This is, frankly, the wrong model. What goes wrong?

Different pieces of code read the same setting for different purposes. e.g. two assemblies both of which use “ConnectionString” as a key.
Different pieces of code read the same setting in slightly different ways. Maybe one guy made his test for “True” case insensitive, but did everyone?
The configuration file is a singleton, with all the problems that implies.
You’ve no idea what bits of code use what configuration settings. Ultimately, this is the one that’s going to kill you.

The Configuration Class

Well, you know that working with the raw hashtable is dangerous bordering on foolhardy. So, you put an abstraction on top of it to fix some of the horror of working with the bare metal.* You’ve now got, in effect, a strongly typed class with properties that correspond to the settings in your config file. Next, you get the rest of your code to use this class. I’m not going to go into more detail about this since Jeffrey Palermo’s article already does a better job than I would.

Now, this is vastly better than what we started with:

Different pieces of code using the same setting are now explicit, so your chances of a conflict are much smaller.
Everyone now reads the setting the same way.
You’ve separated yourself from the physical configuration file, so you can use an alternative data store or stores.
You can see who uses the setting by tracing method calls.
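A bare-bones sketch of such a configuration class, in Python for brevity (the setting names are invented, and a plain dictionary stands in for AppSettings):

```python
class AppConfig:
    """Strongly typed view over a raw string-to-string settings table."""

    def __init__(self, raw):
        self._raw = raw

    @property
    def connection_string(self):
        return self._raw["ConnectionString"]

    @property
    def audit_enabled(self):
        # One place decides how "True" is parsed, so every caller agrees.
        return self._raw.get("AuditEnabled", "false").strip().lower() == "true"

config = AppConfig({"ConnectionString": "Server=dev", "AuditEnabled": "TRUE"})
```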

Why it Doesn’t Really Solve The Problem

Well, let’s start out by pointing out that schema-less XML configuration isn’t really that bright an idea to begin with. I don’t think I’m the only one tired of the insane verbosity and lack of verification in XML. That in itself is relatively solvable by just reading all of the configuration settings on startup rather than when they’re first needed. It’s not elegant, but it works.

However, you’ve still got some fairly ridiculous things. First, your configuration class is a junkyard of settings for your app. Again, you can probably solve that by splitting it, but now you’ve got lots of configuration classes. And then you’re running into the risk of re-used settings again.

Let’s ignore that last problem and think a bit more about the users of these configuration classes. How many of them are there? Well, actually, there should only be one. If you’re observing the single responsibility principle, it’s fairly clear that any given data item should only have one purpose. If you’re remembering that data and code together is a fundamental principle of object oriented development, it’s pretty obvious that the class that is using the data item should be the configuration class itself.

Okay, aren’t we back to configuration chaos? Not quite, we’ve introduced some discipline. It helps to think about the responsibility of this new look configuration class. Basically, it’s now an abstract factory. The classes that originally used your configuration class now just take their configuration as constructor parameters. So, we’ve actually come full circle to the article that started it all.
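Sketched out, the factory version might look something like this. The class names are hypothetical and a dictionary again stands in for the settings store; the point is only that one class reads the settings and everything else receives plain constructor arguments.

```python
class SessionFactory:
    # Knows nothing about configuration; it just receives a value.
    def __init__(self, connection_string):
        self.connection_string = connection_string

class ApplicationFactory:
    """The only reader of the raw settings. It hands each class what it
    needs as constructor parameters."""

    def __init__(self, raw_settings):
        self._raw = raw_settings

    def create_session_factory(self):
        return SessionFactory(self._raw["ConnectionString"])

factory = ApplicationFactory({"ConnectionString": "Server=dev"})
sessions = factory.create_session_factory()
```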

Still Missing The Point

All of this is fine, but all I’ve really told you so far is that there are better patterns for accessing AppSettings. But AppSettings is still a pretty useless bag of strings. Wouldn’t it be better to not have to write all of this wiring code? Wouldn’t it be better to say “set the connectionString parameter on the SessionFactory class to <<DEV>>” in the config file? Wouldn’t it be completely amazing if there was a framework that implemented a completely general version of this configuration class for you?

Well, actually, there is. Actually, there’s lots. AutoFac, Castle Windsor, NInject, Hiro, StructureMap (in no particular order) provide exactly this kind of functionality for .NET. PocoCapsule does it in C++, Spring and NanoContainer kicked off the whole concept in Java. It’s not as if this is even cutting edge anymore. All of them change the model: the configuration is pushed into your code, you don’t pull from it. All of a sudden, you don’t need to worry about writing configuration code at all.

If you ask me, the one thing training materials on the ConfigurationManager API should really tell you is that there’s a better way of handling configuration in your application.

And Merry Christmas. 🙂


*I toyed with the idea of calling this article “What Jeffrey doesn’t teach you about reading from .Net configuration” but rejected it as pointlessly provocative. The article is a good introduction to the problems of dealing with the .NET API and approaches for dealing with it. Jeffrey was blogging about IoC before I even got my head properly around the concepts.

CORRECTION: Changed Guice to PocoCapsule. As Mauricio points out, Guice is a Java container, not a C++ one. It’s harder to write an IoC container in a language without reflection support, however.

Why Do People Behave as if Asynchronous Processing is Easy?

Let’s see if you can spot the difference. Your manager comes up to you and says

We need to get invoices from our order system into our treasury system. What we were thinking was: we’ll get the order system to export its orders onto the file system, and the treasury system can read in the files and delete them when it’s finished. How long should that take?
We need you to write a robust queuing solution. How long should that take?

The answer is, of course, that there’s no difference, except possibly that you’re being asked to do a bad job in the first instance for which you will be blamed later. If you want to get items from one system to another reliably, you need a queue. If you don’t use one, you’ll end up writing one.
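To give a flavour of what the “simple” file drop turns into, here is a sketch of just two of the problems: readers seeing half-written files, and two readers picking up the same file. The function names and the `.order` extension are assumptions, and this is nowhere near a complete solution (there is no recovery for files that were claimed but never finished, for a start).

```python
import os
import tempfile

def export_order(drop_dir, name, payload):
    """Write to a temporary name, then rename, so a reader never sees a
    half-written file."""
    fd, temp_path = tempfile.mkstemp(dir=drop_dir, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        handle.write(payload)
    os.replace(temp_path, os.path.join(drop_dir, name))

def import_orders(drop_dir, process):
    """Claim each file by renaming it before processing, and delete it only
    after processing succeeds."""
    for name in sorted(os.listdir(drop_dir)):
        if not name.endswith(".order"):
            continue  # skips .tmp and .claimed files
        claimed = os.path.join(drop_dir, name + ".claimed")
        try:
            os.replace(os.path.join(drop_dir, name), claimed)
        except FileNotFoundError:
            continue  # another reader claimed it first
        with open(claimed) as handle:
            process(handle.read())
        os.remove(claimed)
```

That is already a small queue implementation, and it still says nothing about ordering, retries or poison messages.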

But it’s simple

The biggest problem I have with convincing people to use things like Retlang and MSMQ is the perception that they are somehow “complicated”. In each case, something is perceived as being “easier” (Messing around with threads and locks in the first instance, hacking around with files in the second.) What we have here is an example of developers doing something they dislike non-technical managers for: treating anything they don’t understand as easy.

Put it this way, a friend of mine works in communications. He writes a lot of complex asynchronous code, often bespoke and under serious time pressure. When I showed him Retlang, he immediately jumped at it, because he understood the problem it was solving. Most of us don’t deal with these things often enough to really understand the problem space.

If you don’t understand all of the things that can go wrong with a file drop copy or naive multi-threading code, you shouldn’t be writing it. If you do, and you still don’t want to use a third party solution, you should really ask yourself what value you’re adding to your business.


You’re Only as Good as Your Worst Component

Here in my office we have great fridges in the kitchen. They’re not cheap, but they are seriously good. Large, easily cleanable shelves, an aluminium body, even refrigeration and actually rather nice to look at. There’s even a handle at the front that opens in a really satisfying way (It’s a lever that pulls apart the door from the rest of the fridge). In short, a cracking fridge.

It’s a pity it’s broken. That flash handle I mentioned: it’s got a component made out of plastic. And it broke, meaning you can’t actually open the fridge. Except that, of course, I’ve long since figured out how to open the fridge door. Of course, since the handle breaks every six months and it takes over a month to replace it, you’d pretty much expect me to have got pretty good at it by now.

Now, clearly, the manufacturers messed up when they designed this. The handle isn’t anything like as sturdy as the other parts of the system. It’s probably not even regarded as very important by the developers. After all, it’s still keeping the food at the right temperature. That’s the principal SLA of a fridge, right?

All our talk of modules and KPIs can sometimes distract from actually seeing things from the user’s perspective. He doesn’t care if 99% of the system is perfect if the 1% that isn’t makes his life hell. So, every so often, it’s worth just taking a look at all of the things you do and seeing which you think is the worst. Someone will probably argue that it’s not very important to fix it, but the chances are, someone will thank you if you do.

So, which is your worst component? I’m betting it’s in the same area as my broken fridge: the front end.


Braindead Boo Embedding

I’ve got to admit, I’ve got my reservations about Boo. There’s the take-up question, which wasn’t a big deal before the DLR came striding into town, but is much more important now that there are other ways to skin this particular cat. Then there’s the extensible syntax, which is one of the most powerful shotguns I’ve ever seen. They’ve even helpfully set it up to point at your foot by default.

Fact remains, I’ve just done some work for the Castle project that uses Boo for its DSL. Again, I’ve gone for the “just get it working” approach, so I thought it’d be useful to see this and compare it against the Python version. Here’s the bootstrap code:

namespace SolutionTransform {
    using System;

    public class Program {
        public static SolutionFile GetSolutionFile(string path)
        {
            return new SolutionFile(path, path.FileContent());
        }

        public static void Main(string[] args)
        {
            if (args.Length < 2) {
                Console.WriteLine("Usage:  SolutionTransform <scriptPath> <solutionPath>");
            } else
            {
                var interpreter = new Boo.Lang.Interpreter.InteractiveInterpreter2();
                interpreter.SetValue("solution", GetSolutionFile(args[1]));
                var script = args[0].FileContent();
                var context = interpreter.Eval(script);
                foreach (var e in context.Errors)
                {
                    Console.WriteLine(e.ToString());
                }
            }
        }
    }
}

Couple of interesting things to note:

Boo doesn’t believe in convenience methods. You want to read a file, you do it yourself. I’d argue this is good design.
There’s no concept of “script scope” separate from an execution engine. In practice, this is a consequence of the DLR supporting multiple languages whilst Boo just has to support itself.
Boo doesn’t report compilation errors to the console by default. This is a bit of a gotcha, but not really an issue in non-braindead scenarios.

Ultimately, there isn’t really a “better” to be found here, although personally I hope that Boo supports the DLR in future, which would make swapping between languages easier. It’d probably also promote takeup of Boo. The question is whether or not those benefits would be worth the extremely large effort required.

The most interesting point, for me, is the script scope differences. It highlights a difficulty with the Single Responsibility Principle: what “one thing” means is dependent on context. I’m sure the developers of Boo think that the interpreter implements SRP, but the fact remains that the DLR has managed to split out a responsibility from it.

Now let’s take a look at the script.

import SolutionTransform
import System.Text.RegularExpressions

solution.Transform(
    {l|Regex.Replace(l, "-vs2008", "-Silverlight", RegexOptions.IgnoreCase)}, # rename rule
    StandardFilters.RegexFilter(["Castle.Core", "Castle.DynamicProxy", "Castle.MicroKernel", "Castle.Windsor"]), 
    StandardTransforms.SilverlightTransform(),
    StandardTransforms.CastleStandardsTransform()
)

# This script is the script for converting a castle solution to the corresponding castle silverlight solution

Now, I think this is inarguably more elegant than the IronPython version around the import statements. This is a consequence of Boo being a .NET language, rather than a replication of CPython’s semantics integrated with .NET. On the other hand, there’s a couple of things I don’t like. One is the lambda syntax. Do we really need another one? Boo is a Python-like language, and there’s nothing wrong with Python’s syntax here. The other is the RegexFilter function: it takes a non-generic IEnumerable because Boo won’t just try tacking a .Cast<string> onto the end. IronPython gets this right. This is probably a consequence of Boo’s optional static typing.

How you feel about Boo probably comes down to how you feel about the syntax extensibility. Personally, I’ve never seen a Boo DSL that was well documented and, as a friend of mine recently said, “A DSL is no substitute for an FAQ.”


DSLs for Dummies: Braindead Python Embedding

Okay, let’s go over the basic DSL argument:

You need to configure something
It can behave in very different ways in different environments
The data model is subject to change and in any event not easily represented by a relational database

These are pretty much exactly the use cases for Composite and Chain of Responsibility patterns. But that’s one of those “raises as many questions as it answers” things. How are you going to configure it? Well, until the advent of embeddable scripting languages, the answer was pretty much always you hacked something together. That, or you hacked together a configuration language, often using XML.

Embeddable scripting languages change this equation. Now you can do anything you could do in code. If you’re lucky, you can make it look DSL-y. Now, you could write a book on this. Oren already has, but I’m not talking about all-conquering DSLs that solve a domain problem, I’m talking about hacking something together. So, what are you going to need for that?

Embed Python in your program
Inject some variables into the python script
Allow the Python script to call back to the original program

With this you can do anything. You can even read an XML configuration file, if you must.
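Stripped of any particular hosting API, the three steps look roughly like this. This sketch is plain Python hosting a Python script with `exec`, not the IronPython hosting API, and the `Host` class and script contents are made up.

```python
class Host:
    """Stands in for the calling program."""

    def __init__(self):
        self.exported = []

    def export(self, path, items):
        self.exported.append((path, list(items)))

SCRIPT = """
curves = ["USD", "GBP", "EUR"]
c.export("out.xml", curves)   # the script calls back into the host
"""

host = Host()
# Run the embedded script with the variable "c" injected into its globals.
exec(compile(SCRIPT, "<config>", "exec"), {"c": host})
```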

Hosting IronPython

I’m aware I’m something like the 100th person to blog this, but most of the information is horribly out of date or misses out an important detail. So, here’s some really trivial code that runs a (hardwired) python file.

static void Main(string[] rawArgs)
{
    var engine = Python.CreateEngine();
    // Set the search path to include the calling program, so that you can import it with the python code
    engine.SetSearchPaths(new[] { Path.GetDirectoryName(Application.ExecutablePath) }.ToList());
    var scriptScope = new ScriptScope(engine, new Scope());
    // Inject in a variable
    engine.SetVariable(scriptScope, "c", new CurveDownload());  
    engine.ExecuteFile("curve.py", scriptScope);
}

Note that we’ve set the search path, to allow the python file to import the original assembly. You might not need that, but in general you will. Equally, we’ve slapped a variable in there to demonstrate the ability to pass in context. We could make all of this a lot prettier, but we’re going for braindead here.

Okay, next you need to actually reference back to the calling program

import clr
clr.AddReferenceByPartialName("Bloomberg.YieldCurves")
from Bloomberg.YieldCurves import *

curves = [
    ShortLongCurveSpecification("USD", "CMTUSD__", "A_A", "US000__", "A_360"),
    ShortLongCurveSpecification("GBP", "COMGBP__", "A_A", "BP000__", "A_365"),
    ShortLongCurveSpecification("EUR", "CMTEUR__", "A_A", "EE000__", "A_360")
]
c.Export("""c:tryme.xml""", curves)

So, the first three lines do that. I’d like to get that whole bit slicker, preferably so that you could move it into the C# code, but for now it’ll do (obviously you could just hack it by modifying the file). For those of you who know practically no Python, like me, the difference between “import Bloomberg.YieldCurves” and “from Bloomberg.YieldCurves import *” is that in the former case you’d have to refer to Bloomberg.YieldCurves.ShortLongCurveSpecification.
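The same distinction can be seen with a standard library module, which makes it easy to try in any Python prompt (this uses `os.path` purely as a stand-in for the Bloomberg assembly):

```python
import os.path
from os.path import *

# With the plain import, names need the full module path.
qualified = os.path.basename("/tmp/curve.py")

# With "from ... import *", the module's public names are available directly.
unqualified = basename("/tmp/curve.py")
```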


SOLID Adoption: The Curse of F12

David Wheeler (1927-2004) was something of a computer science folk hero around Cambridge. You think object-orientation was a neat idea? He invented the subroutine. Academics would tell stories of having conversations in which they’d describe some particularly thorny problem they were trying to solve, only to be met with, “Oh yes, I remember that: I never got around to writing a paper, but I’ll give you my notes on how I dealt with it.”

But he’s most famous for saying “Any problem in computer science can be solved with another layer of indirection.” You’d have thought inventing functions would be more memorable, but actually it turns out that pithy remarks win out where posterity is concerned.

SOLID is all about abstraction. Let me express the principles in those terms:

Open/Closed: Make stable abstractions
Liskov: Don’t break your abstractions
Dependency Inversion: Don’t take dependencies on concretions, including on object creation
Interface Segregation: Make your abstractions as small as possible

Now, abstraction is useful precisely because it introduces indirection. Single Responsibility isn’t explicitly about abstraction, but applying it will introduce more indirection into your code. It’s exactly this that delivers the benefits of SOLID.

But that will usually create another problem

Of course, Prof. Wheeler didn’t end the quote there. Someone misquoted it as “Any problem in computer science can be solved with another layer of abstraction, except too many layers of abstraction.” Multiple layers of indirection can be quite painful to navigate, as anyone who’s stared at Castle Windsor’s code for the first time will attest.* This is a huge barrier to acceptance. I’m proud to say that I’ve managed to convince a number of developers of the utility of SOLID, but the fact remains that many just see it as unnecessary complexity. One complaint that I often encounter basically comes down to “F12 doesn’t work.”

Now, someone smarter and funnier than I am described not using dependency inversion as like soldering a lamp directly into the wall. This sounds ridiculous but let’s talk a bit about the benefits of soldering it. Well, figuring out your wiring is really easy. It never changes** and tracing your circuits is really easy. If you’re dealing with the plug next to the sofa, it’s got a reading lamp in it. It’s never a phone charger. When you switch on the reading lamp, you’re guaranteed it’s going to work and you’re not going to have to mess around unplugging whatever the last guy was using. SOLID code, by contrast, looks like a tangle of wires the first time you see it.

Tracing the circuit in old-school code is pressing F12 in Visual Studio. SOLID breaks F12. You know the old saying that when all you’ve got is a hammer everything looks like a nail? Well, imagine you’ve got a screw and, instead of being able to whack it in, your hammer didn’t work at all. You might well come to the conclusion that there was something seriously wrong with this funny-looking nail. If you don’t have a key press for “go to implementation of method” like in ReSharper, you’re going to find that the only way to trace the execution flow is with a debugger. Equally, if “Find Usages” doesn’t understand that a method may be called because it’s an implementation of an interface, you’re going to find SOLID code harder to navigate. Never mind the cost of writing all of those single responsibility objects when you can’t press “Alt-Enter Enter” for the obvious bits. And I’ve only just scratched the surface.
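Here is the problem in miniature, sketched in Python with invented class names. The caller only knows about the abstraction, so “go to definition” on the call has nowhere to go but the abstract declaration.

```python
from abc import ABC, abstractmethod

class Notifier(ABC):
    @abstractmethod
    def send(self, message):
        ...

class EmailNotifier(Notifier):
    def send(self, message):
        return f"email: {message}"

class OrderProcessor:
    def __init__(self, notifier: Notifier):
        self._notifier = notifier

    def complete(self, order_id):
        # "Go to definition" on send lands on Notifier.send, the abstract
        # declaration, not on whichever implementation actually runs.
        return self._notifier.send(f"order {order_id} complete")

processor = OrderProcessor(EmailNotifier())
```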

Yes, you can do SOLID in Notepad and No, this isn’t the only barrier to adoption, or even the biggest. Still, it’s well worth bearing in mind that tooling matters. I kid you not, I’ve seen code get rewritten from SOLID style to something that works better with F12. If you want to get people into proper object-orientated design, you could do worse than starting with getting ReSharper on their desks.

[Disclaimer: I have no financial or personal interest in JetBrains; they’ve never even offered me free stuff.]

*Or StructureMap, or NHibernate, or pretty much any well written open source project.

**at least, not until requirements change, but that’s the point…


Gotcha: DistinctRootEntityResultTransformer Doesn’t Play Well with Restricted Result Sets

I’ll be honest, the main reason for the last post was to make sense of this one. Consider the following code:

public Person UserByName(string name)
{
    return session.Linq<Person>()
        .Cached()
        .DistinctRoot()
        .Expand("UserRoles")
        .Where(u => u.WindowsUserName == name)
        .FirstOrDefault();
}

It’s exactly what you’d hope for. Nice and explicit: you want the first person whose windows user name matches ‘name’ and to pull back the person’s roles at the same time. Because you understand how eager fetching works, you’ve slapped a distinct root in there too.

Pity the code’s wrong. The gotcha is the way that NHibernate.Linq interprets FirstOrDefault. To be clear, this isn’t a bug, it’s definitely the right behaviour. FirstOrDefault translates to a “top 1” in SQL Server (or a Limit in others). DistinctRootEntityResultTransformer works after the query has run.

So, you will get at most one Person object back, but you’ll also get at most one Role back, which would undoubtedly lead to problems elsewhere in your code. Try writing an example program to demonstrate this and get it to print the SQL you run.

So, how do you deal with it? Well, you need to stop FirstOrDefault getting translated into the SQL. So we use my favourite LINQ defeater: ToList.

public Person UserByName(string name)
{
    return session.Linq<Person>()
        .Cached()
        .DistinctRoot()
        .Expand("UserRoles")
        .Where(u => u.WindowsUserName == name)
        .ToList()
        .FirstOrDefault();
}

Now that code actually does what you wanted. Of course, there’s a catch: if there really were two people with the same windows user name, it would fetch them both. But at least your code is now correct.
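If you want to convince yourself without a database, the effect is easy to simulate in memory. This sketch uses made-up rows and a made-up `distinct_root` function standing in for the result transformer; it is not NHibernate code.

```python
# One row per (person, role) pair, as an eager-fetching join would return.
JOINED_ROWS = [
    ("alice", "Admin"),
    ("alice", "Trader"),
    ("alice", "Auditor"),
]

def distinct_root(rows):
    """Collapse joined rows into one entry per person, collecting roles."""
    people = {}
    for person, role in rows:
        people.setdefault(person, []).append(role)
    return people

# Limit applied in the database, before the transformer runs: roles are lost.
limited_first = distinct_root(JOINED_ROWS[:1])

# All rows fetched, transformer applied, then the first person taken in memory.
fetched_first = distinct_root(JOINED_ROWS)
```

The first version ends up with one person holding a single role; the second has the same person with all three.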
