More Great Things about Retlang

I’ve lost count of the number of times I’ve seen a technology that looked great in the sample, but didn’t hold up when I took it for a proper test drive.  Mike Rettig, on the other hand, has really thought through the use cases.  So when you try to take a sample program and hit the real world, you discover he’s been there before you.  Here are some highlights:

  • Everything implements interfaces.  You can mock or stub pretty much everything.  (You might want to create your own abstract factory, though.)
  • Not only does everything have an interface, the interfaces are finely grained, making stubbing even easier.  You don’t need to simulate channel subscription if you only use channel publishing, for instance.
  • The publish/subscribe model is very robust:
    • You can send a single message to multiple listeners
    • You can handle a message synchronously on the sending thread (providing you’re careful about thread-safety)
    • You can batch the processing of messages, allowing you to ignore duplicates.
  • You can set up timed events, one-shot or repeating.  This is pretty vital to allow long-running services to report their state and perform periodic clean-ups.
  • It’s really fast.  If you’ve got a performance problem, I can assure you it’s your problem, not Retlang’s.
  • You can inject your own behaviours nearly everywhere.  For example, you can set up a common error trap and logging using ICommandExecutor.
  • A single queue can hold lots of different messages.  This is vital for complex interactions.

While I’m here, a couple of things I don’t like:

  • It would be nice if there was an interface that incorporated ICommandQueue and ICommandTimer.
  • Equally, I’d rather like IQueueChannel to implement IChannelPublisher.
  • The default command executor kills the thread if an action throws an exception.  This is a pretty aggressive default behaviour.

You’ll gather that these are pretty minor quibbles about an excellent library.


Using Retlang to implement a simple web spider

The Retlang wiki is a bit short on the sort of messy examples that I find useful when learning a product, so I thought I’d write one of my own.  The following is a 200-line web spider.  I’ll go through it and explain how it works and why you’d build it like this.  I recently used techniques similar to this to get a FIX processor to run 30 times faster.  Seriously.  Retlang’s that good.

Five minute introduction to Retlang

Here’s how Retlang works:

  • A Context is a Thread/Queue pair.  That is to say, a thread with an associated queue.  (In practice, we actually use PoolQueues in the code, but the semantics are the same.)
  • Messages are sent one-way to Contexts across Channels.
  • Contexts subscribe to Channels by specifying a function to be called when the message comes off the queue.
  • Messages are processed in the exact order in which they were transmitted.
  • Typically, all of a given context’s messages are handled by a single object.  This is usually termed the service.

Now, the important thing with Retlang is that it is designed to prevent you from having to put lock statements everywhere.  This results in a couple of restrictions:

  • You shouldn’t use shared state.
  • Messages must be either immutable or serializable.  (Immutable is faster.)

You can actually violate the restrictions if you know what you’re doing.  The problem is, once you violate the restrictions, you need to start worrying about thread safety again.  You’ll also need to worry about maintainability.  Although Retlang doesn’t prevent you from using other techniques and threading models, you lose a lot of the readability when you do so.
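To make the second restriction concrete, here is a minimal sketch of an immutable message type.  The spider below only ever sends strings, so it never needs one; the PageScanned class and its members are hypothetical names invented for illustration, not part of Retlang.

```csharp
using System;

// Hypothetical message type, invented for illustration.
// It is not part of Retlang and is not used by the spider below.
public sealed class PageScanned
{
    private readonly string _url;
    private readonly int _linkCount;

    public PageScanned(string url, int linkCount)
    {
        if (url == null) throw new ArgumentNullException("url");
        _url = url;
        _linkCount = linkCount;
    }

    // Read-only properties over readonly fields: once constructed, the
    // message cannot change, so any thread can read it without a lock.
    public string Url { get { return _url; } }
    public int LinkCount { get { return _linkCount; } }
}
```

Because nothing can modify the object after construction, the sender and every subscriber can safely hold the same instance, which is why the immutable route avoids the copying cost of serialization.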

There is a third restriction:  You shouldn’t wait for another Context to finish doing something.  In fact, you can do this, but you should always try to avoid it, since you can quite quickly kill your performance by doing so.

NB:  Actually, threads and contexts are slightly different, but if you want to understand the differences, you’re better off reading Mike’s Blog.  I’ve just called it a thread for simplicity here.

Shared State

The program works as follows:

  • The Spider class reads a page and works out what URLs are in the page.
  • The SpiderTracker class keeps track of what pages have been found.

In the code, there are five spiders.  However, there can only be one spider tracker, which co-ordinates all of the spiders.  Since I’ve already told you that you can’t have shared state, you might be wondering how this is handled.  The answer is that you associate the SpiderTracker itself with a context.  All modifications to and results from the Tracker come through the same Retlang architecture.  The Spiders each run on their own Context.

We only ever need to transmit strings, which are immutable.  Now, Channels are one way, and are one way by design, so we need to pass the following messages:

  • Please scan this URL  (SpiderTracker to Spider)
  • I’ve found this URL (Spider to SpiderTracker)
  • I’ve finished scanning this URL (Spider to SpiderTracker)

Distributing the work load is handled by a QueueChannel, which automatically sends messages to the next Spider waiting for a message.  An alternative implementation would be to create separate channels for each Spider.

Halting

The last message is, in some senses, not necessary.  Without it, every page would get scanned.  However, the program would never finish.  One of the trickiest problems with asynchronous communication and processing is actually figuring out what is going on and when you’re finished.  With synchronous systems, you can usually determine both just from the call stack; it takes a bit more effort to display that information to the screen, but not a lot.

Therefore, having set up the Retlang contexts, the main thread then needs to wait for the tracker to indicate that it is finished.  The tracker, in turn, counts how many pages are currently being scanned.  When that hits zero, we’re finished.  Retlang doesn’t provide its own facility for doing this, reasoning that using .Net’s WaitHandles is good enough.

The Code

Okay, you’ve waited long enough:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;
using System.Net;
using System.IO;
using System.Threading;

using Retlang;

class Program
{
    static void Main(string[] args)
    {
        string baseUrl = "http://www.yourblogname.net/Blog/";
        int spiderThreadsCount = 5;
        foreach (string url in Search(baseUrl, spiderThreadsCount))
        {
            Console.WriteLine(url);
        } 
        Console.ReadLine();
    }

    private static IEnumerable<string> Search(string baseUrl, int spiderThreadsCount)
    {
        // NB Make sure folders end in a slash: the code fails otherwise since it can't distinguish between
        // a folder and a file
        var queues = new List<IProcessQueue>();
        var spiderChannel = new QueueChannel<string>();
        var spiderTrackerChannel = new Channel<string>();
        var finishedTrackerChannel = new Channel<string>();
        
        var waitHandle = new AutoResetEvent(false);
        var spiderTracker = new SpiderTracker(spiderChannel, waitHandle);

        var spiderTrackerQueue = new PoolQueue();
        spiderTrackerQueue.Start();
        spiderTrackerChannel.Subscribe(spiderTrackerQueue, spiderTracker.FoundUrl);
        finishedTrackerChannel.Subscribe(spiderTrackerQueue, spiderTracker.FinishedWithUrl);
        for (int index = 0; index < spiderThreadsCount; index++)
        {
            var queue = new PoolQueue();
            queues.Add(queue);
            queue.Start();
            var spider = new Spider(spiderTrackerChannel, finishedTrackerChannel, baseUrl);
            // Strictly speaking, we only need one Spider that listens to multiple threads
            // since it has no internal state.
            // However, since this is an example, we'll avoid playing with fire and do
            // it the sensible way.
            spiderChannel.Subscribe(queue, spider.FindReferencedUrls);
        }
        spiderTrackerChannel.Publish(baseUrl);

        waitHandle.WaitOne();
        return spiderTracker.FoundUrls;
    }

    class Spider
    {
        IChannelPublisher<string> _spiderTracker;
        IChannelPublisher<string> _finishedTracker;
        string _baseUrl;

        public Spider(IChannelPublisher<string> spiderTracker,
            IChannelPublisher<string> finishedTracker, string baseUrl)
        {
            _spiderTracker = spiderTracker;
            _finishedTracker = finishedTracker;
            _baseUrl = baseUrl.ToLowerInvariant();
        }

        public void FindReferencedUrls(string pageUrl)
        {
            string content = GetContent(pageUrl);
            // Match href values in single quotes, double quotes and unquoted.
            var urls = from url in Urls(content, "href='(?<Url>[^'<>]+)'")
                           .Union(Urls(content, "href=\"(?<Url>[^\"<>]+)\""))
                           .Union(Urls(content, "href=(?<Url>[^'\" <>]+)"))
                       where url != null && url.Length > 0
                           && IsInternalLink(url) && url[0] != '#'
                           && !url.Contains("&lt")
                           && !url.Contains("[")
                           && !url.Contains("\\")
                           && !url.EndsWith(".css")
                           && !url.Contains("css.axd")
                       select ToAbsoluteUrl(pageUrl, url);
            foreach (var newUrl in urls)
            {
                _spiderTracker.Publish(newUrl);
            }
            // Always report completion, so the tracker's count can reach zero.
            _finishedTracker.Publish(pageUrl);
        }

        static int BaseUrlIndex(string url)
        {
            // This finds the first / after //
            return url.IndexOf('/', url.IndexOf("//") + 2);
        }

        string ToAbsoluteUrl(string url, string relativeUrl)
        {
            if (relativeUrl.Contains("//"))
            {
                return relativeUrl;
            }
            // Strip any fragment: it refers to the same page.
            int hashIndex = relativeUrl.IndexOf('#');
            if (hashIndex >= 0)
            {
                relativeUrl = relativeUrl.Substring(0, hashIndex);
            }
            if (relativeUrl.Length > 0)
            {
                bool isRoot = relativeUrl.StartsWith("/");
                int index = isRoot ? BaseUrlIndex(url) : url.LastIndexOf('/') + 1;
                if (index < 0)
                {
                    throw new ArgumentException(
                        string.Format("The url {0} is not correctly formatted.", url));
                }
                return url.Substring(0, index) + relativeUrl;
            }
            return null;
        }

        bool IsInternalLink(string url)
        {
            url = url.ToLowerInvariant();
            if (url.StartsWith(_baseUrl))
            {
                return true;
            }
            if (url.StartsWith("http") || url.StartsWith("ftp") || url.StartsWith("javascript"))
            {
                return false;
            }
            if (url.Contains("javascript-error"))
            {
                return false;
            }
            return true;
        }

        static IEnumerable<string> Urls(string content, string pattern)
        {
            var regex = new Regex(pattern);
            // Why exactly doesn't MatchCollection implement IEnumerable<Match> ?
            return from match in regex.Matches(content).Cast<Match>()
                   select match.Groups["Url"].Value;
        }

        static string GetContent(string url)
        {
            var request = WebRequest.Create(url);
            request.Proxy = WebRequest.DefaultWebProxy;
            try
            {
                using (var response = request.GetResponse())
                {
                    using (var reader = new StreamReader(response.GetResponseStream()))
                    {
                        return reader.ReadToEnd();
                    }
                }
            }
            catch (WebException ex)
            {
                Console.Error.WriteLine("Problem reading url {0}, message {1}.", url, ex.Message);
                return "";
            }
        }
    }

    class SpiderTracker
    {
        // NB We care about case.
        HashSet<string> _knownUrls = new HashSet<string>(StringComparer.InvariantCulture);
        IQueueChannel<string> _spider;
        int _urlsInProcess = 0;
        AutoResetEvent _waitHandle;

        public SpiderTracker(IQueueChannel<string> spider, AutoResetEvent waitHandle)
        {
            _spider = spider;
            _waitHandle = waitHandle;
        }

        public IEnumerable<string> FoundUrls
        {
            get
            {
                return from url in _knownUrls
                       orderby url
                       select url;
            }
        }

        public void FoundUrl(string url)
        {
            if (!_knownUrls.Contains(url))
            {
                _knownUrls.Add(url);
                // Path.GetExtension includes the leading dot.
                if (Path.GetExtension(url) != ".css")
                {
                    _urlsInProcess++;
                    _spider.Publish(url);
                }
            }
        }

        public void FinishedWithUrl(string url)
        {
            _urlsInProcess--;
            Console.WriteLine(_urlsInProcess);
            // Nothing left in flight: release the main thread.
            if (_urlsInProcess == 0)
            {
                _waitHandle.Set();
            }
        }
    }
}

Caveats

Well, it’s only 200 lines, so it’s hardly going to be feature complete.  Here are some restrictions:

  • You can’t really run 5 WebRequests simultaneously, so the 5 queues are actually kind of pointless.  They do handle 2 threads quite well, though.  The code does nothing to fix this.  If someone can point me in the right direction, I’ll release an updated version.
  • There are undoubtedly links that should be ignored that aren’t.  Subtext’s EditUris are an example.  In general, the HTML parsing is extremely simplistic, but it’s not the point of the exercise.
  • It doesn’t read robots.txt.  Please don’t run this against sites you don’t have permission to spider.
  • It doesn’t respect nofollow.
  • It doesn’t clean up its threads after completion.
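One possible lead on the first caveat, offered as a sketch rather than a tested fix: on the .NET Framework, ServicePointManager.DefaultConnectionLimit defaults to 2 for applications that are not hosted in ASP.NET, which would match the observation that two threads work well and five don’t.  The SpiderNetworkSettings class and its AllowConcurrentRequests method are hypothetical names; only ServicePointManager.DefaultConnectionLimit is a real framework member, and nothing here has been measured against this spider.

```csharp
using System.Net;

// Hypothetical helper, invented for illustration.
public static class SpiderNetworkSettings
{
    // Call once, before the first WebRequest is created, because the limit
    // only applies to connection groups created after it has been set.
    public static void AllowConcurrentRequests(int spiderThreadsCount)
    {
        // Only ever raise the limit; never lower a value set elsewhere.
        if (ServicePointManager.DefaultConnectionLimit < spiderThreadsCount)
        {
            ServicePointManager.DefaultConnectionLimit = spiderThreadsCount;
        }
    }
}
```

In the listing above, the natural place to call it would be the top of Search, passing spiderThreadsCount, so that each spider can have a request in flight against the same host.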

UPDATE: I’ve tidied up the code slightly.  It’s now got a couple more heuristics about dud urls (it turns out that running the code against a blog full of url scanning code is an eye-opener… 😉 ).  I’ve also tidied up the proxy handling.  The IronPython version is here.


Upgrading Skins to Subtext 2.0

The good news is, skins have hardly changed in Subtext 2.0.  The skins were very powerful in 1.9, so there wasn’t a big reason to change them.  The bad news is, they’re not quite the same.  If you’ve got your own skin, here’s all you need to do:

  • Copy across any non-standard skin you were using from your version 1 folder.
  • Copy across the skin settings from the old Admin/Skin.config to the new one.

If the skin is locked, put a space in the web.config first, which will unlock the skin.  Note that if you’re using a modified factory skin, you’ll still need to copy across the Skin.config entries, because SubText 2 has modified the location of the style-sheets on the standard skins.

Now, if you then view your page and all of your style sheet has disappeared, that’s probably because you haven’t updated the skins.config correctly.  Once you’ve done that, you might discover that images have disappeared.  This is because of a change in the way subtext includes stylesheets for skins.  (The new way is better, it’s just a breaking change.)  Your stylesheets are now served by a virtual file called css.axd.  Do a view source to see the url, then view the url in the browser to see the output.  This puts all of your stylesheets in one place, which is in the root of the skins folder.  Since your old skin may have kept the stylesheets in another folder, this breaks the stylesheet.  In my case, a quick search and replace of “..images” to “images” fixed the problem, but it depends on the architecture of your skin.

The only other gotcha I’ve noticed is that Subtext now uses XHTML transitional by default.  Again, this is actually a good thing, but it can affect rendering.  It certainly changes the behaviour of IE’s positioning.  This is usually subtle, but is the reason for the image problem I described previously.


Upgrading to Subtext 2.0

It’s a source of embarrassment that most of the traffic for my web site is uploads of blogging software, but there’s nothing better for embarrassment than publicizing it, or so I was told.  So, if you haven’t done the upgrade yet, here’s my routine for doing it:

  • Export your existing blog to BlogML, back up the database, do whatever you can to avoid borking your site because of the upgrade.
  • Make a copy of your current web.config.  The configs aren’t compatible, but you’ll want to refer to it.
  • I uploaded the new software to a new folder (Blog2), rather than overwriting the old one.
  • Since you’re upgrading, delete the SQL dump files (they’re in App_Data, Subtext2.0.mdf and Subtext2.0_log.ldf).  They’re huge and they’ll slow your upload up.
  • If you’re feeling confident, you can avoid uploading some skins as well.  I’d hang onto the default skin, though.
  • Next, I swapped the new version in.  This actually involved renaming Blog to Blog1, creating a new Blog folder and moving everything from Blog2 into Blog.  If you were more worried about downtime, you’d set up Blog2 as a virtual directory and slap an App_Offline.htm into the Blog folder at this point.
  • Edit the web.config and switch customErrors to “Off” (remember the capital O, I’ve lost hours of life over that…)
  • Fix the HostEmailAddress.  This is the “forgotten admin password” email address.
  • Put the connection string in from the old config file.  (I’ve noticed that they’ve put a clear right at the start of the connectionStrings section.  This prevents a lot of stupid configuration problems.)
  • Now you can go to the blog.  You’ll get the “we’re being upgraded” message.  Now you can click through and log on.
  • If you’ve forgotten your password, the password reset won’t work and you need to run the query at the bottom of this page to sort things out.  If you’ve only forgotten your admin user name, you can find it out just by selecting from the subtext_hosts table.  Either, obviously, is painful in a hosted environment.
  • Hit the button and hopefully you’ll be upgraded!

Next, time to upgrade your skin; it’s a pain but not hard.  I’ve put that into a separate post.

Finally, switch customErrors back to RemoteOnly.


clr20r3: The World’s least helpful runtime error

You’ve got to wonder what people were thinking when they designed .NET’s behaviour when encountering an unhandled exception.  Sensibly, it writes the event log.  Stupidly, it writes garbage you can’t read.  Let’s just appreciate it in all its glory.

clr20r3

Good luck figuring out what the problem was.  (Actually, the server was down, but that’s hardly the point…)  It’s pretty easy to say “Well, you should just catch all exceptions in Main.” but it misses the point.  Any thread failing can cause this to happen, and if you’re using third party libraries they’re not necessarily under your control.  So disciplined coding isn’t going to help you on this one.  Trawling the internet, on the other hand, will.  At least, if you look for long enough.  What you need to do is to put a handler on the domain’s unhandled exception event.  Here’s the code:

private static void LogUnhandledExceptions(string source, int fatalEventId)
{
    AppDomain.CurrentDomain.UnhandledException += delegate(object sender, UnhandledExceptionEventArgs e)
    {
        if (!(e.ExceptionObject is System.Threading.ThreadAbortException))
        {
            Exception exception = e.ExceptionObject as Exception;
            // Objects that are not Exceptions can be thrown by non-C# code,
            // so fall back to the object's own ToString().
            string message = exception == null
                ? (e.ExceptionObject == null ? "Missing exception" : e.ExceptionObject.ToString())
                : string.Format("Fatal Error {0}: {1}\r\n{2}",
                    exception.GetType().FullName, exception.Message, exception.StackTrace);
            try
            {
                System.Diagnostics.EventLog.WriteEntry(source, message,
                    System.Diagnostics.EventLogEntryType.Error, fatalEventId);
            }
            catch
            {
                // The event log was unavailable: try a message box, then the console.
                try
                {
                    System.Windows.Forms.MessageBox.Show(message, "Fatal Error",
                        System.Windows.Forms.MessageBoxButtons.OK,
                        System.Windows.Forms.MessageBoxIcon.Stop);
                }
                catch
                {
                    if (Console.Error != null)
                    {
                        Console.Error.WriteLine(message);
                    }
                    Console.WriteLine(message);
                }
            }
        }
    };
}

I kid you not, cut and paste this routine into every system you ever write, and make it the absolute first thing that gets called.  Some of this code is probably redundant (is Console.Error ever null, for instance?).  It won’t solve your problem, but at least you’ll be able to see the error message.  Note that this code is not configurable in any way, since it needs to run before configuration.  So log4net et al can’t be used.  Incidentally, the clr20r3 error will still appear, so code discipline is still worth practicing.

Now, I can’t count the number of times I’ve thought that Microsoft had made a bone-headed design decision which later turned out to be quite smart but really, why doesn’t the .NET runtime log a readable error by default?

Always use an absolute path when invoking Binsor

If, like me, you’re using Binsor for configuration, and using the include file mechanism to implement environmental deltas, you’re going to need to refer to the initial file using an absolute path.  The reason for this is, if you don’t, a relative path name within the Binsor script uses the current working directory.  Just to be annoying, this means that all of your code works until the moment you deploy it, and then your service fails to start (this bears no resemblance to a real issue, I can assure you…)

Anyway, here’s the code you need:

private static void ReadRelativePath(IWindsorContainer container, string relativePath)
{
    string location = System.Reflection.Assembly.GetEntryAssembly().Location;
    string directory = Path.GetDirectoryName(location);

    Rhino.Commons.Binsor.BooReader.Read(container, Path.Combine(directory, relativePath));
}

You could potentially fix this by changing the current directory in the program, but that’s the kind of externality I really don’t want to deal with.
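A variation worth knowing about, sketched under assumptions rather than offered as a replacement: Assembly.GetEntryAssembly() can return null in some hosts (for example when the process was started from unmanaged code), in which case the application’s base directory is another anchor to resolve against.  The ConfigPaths class and ResolveFromAppBase method are hypothetical names; note also that for a web application the base directory is the application root rather than the bin folder, so the two approaches are not always interchangeable.

```csharp
using System;
using System.IO;

// Hypothetical helper, invented for illustration.
public static class ConfigPaths
{
    // Resolves a path against the application's base directory instead of
    // the current working directory.
    public static string ResolveFromAppBase(string relativePath)
    {
        if (relativePath == null) throw new ArgumentNullException("relativePath");

        // A path that is already rooted is returned untouched.
        if (Path.IsPathRooted(relativePath))
        {
            return relativePath;
        }

        string baseDirectory = AppDomain.CurrentDomain.BaseDirectory;
        return Path.GetFullPath(Path.Combine(baseDirectory, relativePath));
    }
}
```

The result can then be handed to BooReader.Read in place of the Path.Combine call shown above.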


Abstract Factory 61 Revisited

I can’t be the only one glad that the Google Testing Blog has become active again (and I don’t mean “we’re having a conference on another continent” style posts, much as I eagerly await those missives).  But this post has actually crystallized my understanding of one aspect of designing for testability.  There are probably some who will regard this one as old news.  It is, however, pretty fundamental.  To summarize, it says that any given class should either create objects or use them.

So, let’s assume most of your objects are constructed using auto-wiring of dependencies.  We can ignore cases like List<T>, since they’ve got no externalities and have well-known behaviour.  That still leaves us with some cases where we need to use an object whose only purpose is to create other, specific, objects.  Which is, of course, the abstract factory pattern.  Now, anyone who’s skimmed the Gang of Four book (and let’s face it, most of us skimmed it) will know the pattern, although they’ll probably have trouble remembering which way around Abstract Factory and Factory go.  (Factory creates one object, and hence is probably best represented as a delegate in C#.)  However, what the “keep instantiation quarantined” approach means is that the section on Abstract Factory use cases can be rewritten as:

Use:  Any time you want to create an object that wasn’t instantiated by your auto-wiring of dependencies.

You’ll note that I didn’t say “that wasn’t instantiated by your IoC container”.  That’s because calls to the container should be treated exactly the same way as calls to new, which is why Windsor doesn’t give you a static method to access the container.  Jeremy Miller emphasizes this too, even if StructureMap does have a static accessor.  I’ve written some pretty bad code before today using that accessor…
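The “create or use, never both” split can be sketched in a few lines of C#.  Every type name below (IPageReader, PageReaderFactory, Crawler and so on) is hypothetical and invented for illustration; this is one way of arranging it, not a prescription.

```csharp
using System;

// Every name below is hypothetical, invented purely for illustration.
public interface IPageReader
{
    string Url { get; }
}

public sealed class PageReader : IPageReader
{
    private readonly string _url;
    public PageReader(string url) { _url = url; }
    public string Url { get { return _url; } }
}

// Creator: the abstract factory.  Its only job is to build IPageReader objects.
public interface IPageReaderFactory
{
    IPageReader Create(string url);
}

public sealed class PageReaderFactory : IPageReaderFactory
{
    // The only place in this sketch where "new PageReader" appears.
    public IPageReader Create(string url) { return new PageReader(url); }
}

// User: receives the factory through its constructor and never calls "new"
// on a collaborator, so a test can hand it a stub factory instead.
public sealed class Crawler
{
    private readonly IPageReaderFactory _readers;

    public Crawler(IPageReaderFactory readers)
    {
        if (readers == null) throw new ArgumentNullException("readers");
        _readers = readers;
    }

    public string Describe(string url)
    {
        IPageReader reader = _readers.Create(url);
        return "Reading " + reader.Url;
    }
}
```

With auto-wiring, the container would supply PageReaderFactory to Crawler.  When only one kind of object needs creating, a Func<string, IPageReader> constructor parameter would do the same job as the interface, which is the delegate form of Factory mentioned above.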

Quick Tip: Don’t use Atom.aspx with Feedburner

If, like me, you’ve set up Feedburner with a Subtext 1.9.6 blog, point FeedBurner at RSS.aspx, not at Atom.aspx.  Most things work whichever way you do it, but when you view an article in Google Reader, it won’t include a link back to the original article.  As far as I can tell, this is because the Atom feed uses rel=self rather than rel=alternate.  This appears to be fixed in the latest code base, but that means it’s part of the huge 2.0 release.

It will take a while for FeedBurner and Google Reader to catch up.  I mean hours, not minutes, but it will fix the problem.


Reporting considered harmful

Oren has been saying that he completely disagrees with Stephen Forte’s assertion that database models support more than one application.

Well, he’s right and he’s wrong.  He’s right in that data models (and especially database models) can and should be private to the application.  In practice, they never are.  There’s always someone adding a “quick fix” piece of functionality on the side.  Let’s look at the main categories:

  • Importing data into the system.  (Including data entry systems.)
  • Adding a separate program e.g. Invoicing added to a tracking system
  • Exporting data to other systems.
  • Reporting

Now, of these, the first is the only one with a right to exist.  The only real problem is that people don’t treat it as a proper project.  The second is a hugely bad idea, because the database is not a good API integration point.  The data model you’ve used for the tracking system isn’t the model needed for invoicing; you’ve basically started down the route of trying to develop the canonical One True Data Model for your business.  (Udi Dahan makes a remark in his latest post about this being a red herring.)  I’m going to slightly labour this point, but it’s important and counter-intuitive to most developers, including the younger me.  Beyond a certain point, trying to standardize the data that your system uses is an exercise that possibly results in intellectual satisfaction and may look, on paper, very impressive, but in fact makes communication extremely difficult and tedious.  If you try to build the Tower of Babel, Babel is exactly what you’ll get.

The third is functionality that needs to be there, but it shouldn’t be hitting the database directly.  No, not even a “quick export”.

Reporting, on the other hand, is the devil.  Anyone who, like me, worked on VB3 back in the 90s will know that Crystal Reports is the devil.  Unfortunately, some people remember technical limitations as being the problem, not that reporting packages are inherently diabolical.  Business Objects, MS-SQL Reporting Services, the lot.  That’s not to say they’re not powerful and occasionally useful, but I highly recommend using a long spoon.

So, why are they so dangerous?  Well, we need to think about what they’re actually used for.  In practice, they’re usually used as a short-cut to add an extra feature (Management Reporting, Invoicing) or as a data export feature.  (MS-SQL Reporting Services is actually really good at this.)  And how do they achieve it?  By direct data binding.  Yup, the very technique that agilists, alt.netters and anyone with an ounce of self-respect has been trying to eliminate from our arsenal of techniques for years.

Direct Data Access

Now, let’s go back to Oren and Stephen’s disagreement.  Let’s look at why people want to access the database directly.

  • It’s easy to develop.
  • It’s ready right now.
  • It’s stable (if the database wasn’t there, you wouldn’t have an application anyway)

The problem is, you’re running up a huge balance on the credit card of technical debt.  Here are some problems:

  • You’ve just made whatever database structure you have right now a de-facto specification.
  • It’s unlikely you have any real idea of who is accessing your database or why.
  • The external systems don’t synchronize through your business logic (this is not so bad for read-only scenarios, but can be lethal in writeable scenarios.)
  • You’ve no control over exactly what locks are getting put on your DB.

Now, I have to admit, when I need to get data from an external system, I often request direct database access.  Why?  Because I’m not the one who’s going to suffer when these problems come up.  On the other hand, I fight tooth and nail when the shoe is on the other foot.

Getting that spoon out

One of the smartest things my last company did was to develop a database specifically for management reporting.  It was being constantly changed, but the truth is that those costs would have been there anyway, just not as visible.  It also enabled us to highlight that concepts in one part of the business were subtly different from what appeared to be identical concepts in another part.  Of course, it can be quite hard to get people to buy into this for “just a quick report”, which means you’re usually better off arguing that it would be cheaper to add the functionality directly to the application.  After all, the implementation of a proper reporting package can be quite time-consuming.

Arguing for a separate database to contain data “you’ve already got” can be a hard sell, especially to non-technical types.  As with all sales, however, you’re probably best off differentiating the systems.  Selling an OLAP cube is, ironically, easier, even if you don’t think the requirements really justify it.  However, reporting requirements only ever grow.  Seven year old reports are still being used by someone in the organization, even if you don’t know their name.

And if you’ve already got a system that reporting has got its tentacles into?  Then I’m sorry, but it’s going to take a lot of spade work to dig out of that hole.

Developers aren’t designers

I know a fair bit of CSS.  I know about the three pixel bug, I’ve even contributed IE5 fixes to three column layout solutions.  However, I’ve just been reminded extremely forcefully of my limitations.

As is pretty obvious, I’m using SubText as my blogging engine.  It’s a good, fully featured system that supports more use cases than you might imagine if you’re still thinking “How hard can it be to write a blog engine?”.  It comes with a number of default skins and a “Naked” skin for developing your own skins.

The skinning system is really powerful and very easy to understand.  Still, I highly recommend avoiding it.  Six hours later and I’ve re-learned what I already knew: developers aren’t designers.  Don’t get me wrong, it was alright, in a Web 1.0 kind of a way.  I could have spent a couple of days learning Photoshop and getting rounded corners and gradients working.  I could have banged my head against a wall for several days trying to figure out why IE rendered a block one way on my local machine and another once I’d uploaded the skin.  The fact remains, at the end of the day, in that time I’d still have something that looked like a MySpace page, because visual design isn’t my strength.  In that time, I could have done something far more rewarding, like writing another article or watching Wall-E.

You want some evidence?  I’m afraid I’m too embarrassed by my own effort to publish it.  Take a look at Ayende’s site.  Cracking content, amateur design.  Now take a look at this one.  This is the weblog of the guy who wrote SubText.  Looks a lot better, doesn’t it?  That’s because the developer of SubText didn’t write his own skin.  This is David Ricardo’s theory of comparative advantage writ large.  Or, to put it another way, yet another example of why you should buy rather than build.  (Writing your own blog engine is another, although that hasn’t stopped Martin Fowler, another case of amateur design and fabulous content.)

So, if you’re blogging, what are your choices?

  • Be a crack visual designer
  • Buy a skin
  • Use one of the defaults

In practice, only really the third option was open to me.  So, having spent a week publishing using the Origami skin, I’m now publishing using a slightly modified Origami skin slightly more to my taste.

It seems like sooner or later, every blog becomes about blogging.  Let’s hope it’s just a phase.  Scott Hanselman recently said exactly the same thing, although he wasn’t talking about blogging at the time.  However, whereas I’ve just wasted six hours of my own time, it’s amazing how many companies delegate design duties to developers or other unskilled workers.  The results are predictable, and much more costly than one night’s sleep.


P.S. If, after reading all of this, you still want to give it a go, Simon Phils has written an excellent guide to the architecture.