Reactive Extensions for Retlang Developers

Most of the talk about Rx has been about the "Push LINQ" aspects of it, so I thought I'd have a bash at explaining it from a Retlang perspective instead.  Let's start with IObserver:

    public interface IObserver<in T>
    {
        void OnNext(T value);
        void OnError(Exception error);
        void OnCompleted();
    }

Now, in the same place Retlang just uses an Action<T> (or IPublisher<T>, depending on context).  If you want to catch exceptions, you implement IBatchExecutor and catch the exception yourself.  As for OnCompleted, there just isn't a straight parallel, really.

Now, let's see the equivalent of IFiber:

    public interface IScheduler
    {
        DateTimeOffset Now { get; }
        IDisposable Schedule(Action action);
        IDisposable Schedule(Action action, TimeSpan dueTime);
    }

Okay, so what's different?  Well, you can't "EnqueueAll" and schedulers can't keep track of their own subscriptions.  On the other hand, Schedule returns a Disposable: every scheduled action should be cancellable.  The scheduling by time is a little less directly usable, but fine.  It's nice to see that "Now" is abstracted out, but it seems to be a bit of a strange implementation detail.  Also, when you actually examine the code, you discover that CurrentThreadQueue, for instance, is hardwired to Scheduler.Now anyway, which renders it all a bit pointless.

So far, this is all pretty similar to Retlang.  But the equivalent of ISubscriber is rather unexpected:

    public interface IObservable<out T>
    {
        IDisposable Subscribe(IObserver<T> observer);
    }

Ignoring all of the overloads subscriber has, there's one big thing missing: you specify the listener, but not the fiber.  That's because an IObservable can include a thread.  The command for this is "observable.ObserveOn(scheduler)"*, which returns another observable.  (There's also observable.SubscribeOn(observer, scheduler).  As far as I can figure out, this causes the subscriptions to run against that scheduler, which isn't really what you want, but is likely the routine you'd try first if you were a Retlang user.)

Finally, while Channel in Retlang is a combination of IPublisher and ISubscribable, ISubject does the same thing in Rx: it's the combination of IObserver and IObservable and hence is the natural place to put a buffer.

Some Interesting Implementation Details

If you look at the implementation of CurrentThreadScheduler, you'll see it uses a priority queue rather than the Timer model Retlang uses.  This is actually really rather cool.  In particular, it means that you can actually schedule events to occur in the past.  It may seem like that's a stupid thing to do, but in practice it means that you can cause certain messages to be processed out of band, which is actually pretty powerful.  Imagine, for instance, a set of workers responding to HTTP connections.  This would allow you to prioritize messages from existing (keep-alive) connections.  (Let's be honest, this is a bit of a hack, but out of band messages can be useful.)

There's all sorts of simple and useful classes running around here: BooleanDisposable, CompositeDisposable.  It's a pity they aren't just in the .NET framework to begin with.  Of course, Microsoft give with one hand and take with the other.  PriorityQueue is marked internal, continuing Microsoft's long tradition of not letting you near the code.  Equally, AnonymousObserver, a simple class that takes three delegates and constructs the obvious IObservable, is marked internal, making writing your own Subscribe overloads just that tiny bit less convenient.

Different Sorts of Schedulers

As you can see, the basic building blocks for a Retlang-style structure are all there.  The various sorts of schedulers/fibers are also worth looking at:

  • Scheduler.Immediate: Pretty much the same as StubFiber.  Runs on the current thread and blocks if you schedule something in the future.
  • Scheduler.CurrentThread : Okay, I'm just perplexed by this one.  As far as I can tell, it behaves identically to Immediate, but the code is significantly more complex.  It's not thread-safe: you couldn't enqueue something from another thread the way you can with ThreadFiber; it's got this curious EnsureTrampoline function that I frankly don't understand.  I imagine this will become more clear at some point in the future when I understand Rx better.
  • Scheduler.DispatcherScheduler:  WPF support
  • ControlScheduler:  WinForms support
  • EventLoopScheduler:  ThreadFiber with fewer options.  Would it be so hard to put a Func<ThreadStart, Thread> into the constructor?
  • Scheduler.ThreadPoolScheduler: PoolFiber
  • TaskPool scheduler: for integration with the parallel tasks library.

This is all harder to figure out than it should be.  There's no source code download and no documentation worth speaking of.  My personal opinion: if they can open source ASP.NET MVC, why can't they open source this?

Technorati Tags: ,

Contract Killers

Seriously, is this the most succinct way of expressing this?

[InheritedExport]
[ContractClass(typeof(RequestFailureContract))]
public interface IRequestFailure
{
    string FailureMessage { get; }
}

[ContractClassFor(typeof(IRequestFailure))]
public class RequestFailureContract : IRequestFailure
{
    public string FailureMessage
    {
        get { 
            Contract.Ensures(Contract.Result<string>() != null);
            return null;
        }
    }
}

Personally, I would prefer to write this:

public interface IRequestFailure
{
    string! FailureMessage { get; }
}

If you're currently thinking there's something very wrong with code contracts in C# you'd be right.  What's wrong is that code contracts aren't actually a language feature.  They're implemented as a post-compilation rewrite.  Don't get me wrong, it's really impressive that they achieved it this way, and they're still an improvement on what went before, but I can't help feeling they lost sight of the ultimate goal on this one.  Worse, enforcing code contracts is achieved by a DevLabs download that's feature-locked to particular versions of Visual Studio.  I'm pretty sure I missed the part of the Liskov Substitution Principle that said "This doesn't apply if you happen to be running Visual Studio Professional".

Handling Parameters with Caricature

I'm still finding my feet with RSpec, but early experiences are good.  I wish I could say the same for Cucumber.  Sadly, you can't really use RSpec's built-in mocking with the CLR, so we need an alternative.  Luckily Caricature takes the strain.  There's not a great deal to say that isn't covered by Ivan's introductory articles, it's that simple to use.  However, one thing I couldn't figure out was how to return a value that was dependent on the inputs.  I ended up contacting Ivan, who was extremely helpful and even updated the gem to help me out.

Turns out the code I needed was:

@factory = Request::IRequestFactory.isolate 
@factory.when_receiving(:GetHeaders).returns { |x| HttpHeaders.new x }

(Note the new syntax with .isolate rather than Isolation.for)

One thing that I'm still not up to speed with in Ruby is API discovery.  All the documentation and even source code in the world isn't as convenient as hitting auto-complete and seeing the help come up.  On the other hand, Caricature has extremely readable specs.  If I'd found this earlier I wouldn't have had so many problems.  This, of course, one of the points of RSpec: tests you can actually read.

Running RSpec and Cucumber from C#

You might be wondering why on earth you'd want to do this, but it's trickier than it looks.  I've published a gist for it.  There's a couple of comments about how to get things set up as well.

The bad news is, RSpec works, but Cucumber doesn't.  Instead I'm getting this:

can't convert Array into java::util::List (TypeError) 
C:/Program Files/IronRuby 1.0v4/lib/ironruby/gems/1.8/gems/gherkin-2.0.2-universal-dotnet/lib/gherkin/native/ikvm.rb:37:in `new' 
C:/Program Files/IronRuby 1.0v4/lib/ironruby/gems/1.8/gems/cucumber-0.8.3/bin/../lib/cucumber/cli/configuration.rb:29:in `parse!' 
C:/Program Files/IronRuby 1.0v4/lib/ironruby/gems/1.8/gems/cucumber-0.8.3/bin/../lib/cucumber/cli/main.rb:94:in `configuration' 
C:/Program Files/IronRuby 1.0v4/lib/ironruby/gems/1.8/gems/cucumber-0.8.3/bin/../lib/cucumber/cli/main.rb:43:in `execute!' 
C:/Program Files/IronRuby 1.0v4/lib/ironruby/gems/1.8/gems/cucumber-0.8.3/bin/../lib/cucumber/cli/main.rb:25:in `execute' 
C:/Program Files/IronRuby 1.0v4/lib/ironruby/gems/1.8/gems/cucumber-0.8.3/bin/cucumber:8 
C:\Program Files\IronRuby 1.0v4\bin\cucumber:19:in `load' 
C:\Program Files\IronRuby 1.0v4\bin\cucumber:19 

Yes, you're reading the first line correctly.  The downside of IKVM, of course, is Java Exceptions that mean nothing to you.  This is quite frustrating for me, because I really wanted to start using Cucumber.  So if anyone's managed to crack this, drop me a line.  Believe me, I've tried enough different versions of the gherkin and cucumber gems to last a lifetime.

On the other hand, I'm rather enjoying RSpec.

A Future for Inversion of Control Containers

.NET has a lot of IoC containers.  This is undoubtedly a good thing: diversification is a sign of strength in the ecosystem.  MEF is showing how specialization can be used to solve specific issues.  I'm going to try to set out one vision of how IoC containers should evolve.  This is not my opinion of how they all should evolve, it's simply my vision of what I would like from one of the many containers.

I've argued for some time that containers are all about configuration.  If you don't believe that, then you should consider why Yadic isn't your favourite container.  I've also been agitating for some time to get environmental deltas included in Castle Windsor, but actually this doesn't even go far enough.  Let's talk about what's still wrong:

  • There's no standard way to specify external configuration.  (i.e. you can't separate configuration from wiring)
  • There's no standard way to examine the runtime configuration.
  • There's no standard way to report of the validity of the runtime configuration.
  • There's no standard way of having a centralised configuration source.
  • There's no sensible way of changing the configuration other than hacking files around.
  • You can't change the runtime configuration and have that reflected in the state of the container.  Live.

Binsor achieves the first in a general way, but in a fashion that isn't really acceptable.  nServiceBus thinks this through a lot, but only solves it for the case of using nServiceBus.  StructureMap provides the second at an API level, but not at an ops level. 

Live Diagnostics

StructureMap has, at the time of writing, much better diagnostics than Castle Windsor, which still firmly believes in the "try activating the component and see if it throws an exception" model.  However, why can't we do the following?

  • Ability to embed a web server in your application.  Seriously, why do we keep writing remoting+winforms applications just to do simple diagnostics?
  • Ability to see all of the live configuration options.
  • The ability to modify the live configuration options.
  • Ability to see which configuration settings have issues.

You might think I'm talking futurology here, but Sam Newman and his mates have already built it.  (You should watch the entire talk.)  There's no reason that work can't be generalised in an opinionated manner that requires developers to think about these issues up front.

A possible objection is that this functionality is pretty much orthogonal to what containers do right now.  That's correct to a certain extent, but it ignores a fundamental issue: every container I've used (let's ignore MEF for now) works on the basis that, ultimately, the configuration for an application can be expressed as an XML file.  This expresses itself in a number of ways:

  • Things are strings that shouldn't be.
  • Configuration can't change during the lifetime of the container.
  • The container can't recreate objects based on a change in configuration.  There's no model for even expressing how that lifecycle should work.

Viewed from a certain angle, IoC containers at the moment are basically better XML configuration files.  There's no reason they can't be a better WMI.

DR is for Disaster Recovery

I was having a drink with a supplier the other day and he was explaining that during the outage "all of our settings had been replicated to DR, so we couldn't switch to DR either."  The implication was that automatically replicating settings to DR was a bad idea, precisely to avoid a scenario such as this.  Personally, I think it's more about understanding purpose.  What do you want to do with your DR solution?

  • Deal with a complete physical loss of your primary site?
  • Handle hardware failure on one of your machines?
  • Handle a software failure on one of your systems?

Of these, only the first is actually disaster recovery.  The second is known as redundancy and the third rollback.  "Switching to DR" is intended for disaster recovery scenarios only.  The problem is, these are all problems and all too often we try to solve all three at once under the heading of "DR" without actually thinking about what the letters stand for.  Sadly, this tends to lead to solutions that achieve none of the aims.  It's not a question of balancing competing priorities here, it's a question of recognizing that the problems are separate and so should their solutions.

Technorati Tags:

Shard Balancer: Another Crack at QueueChannel

For those of you bored with this problem, I promise this is the last time I post about this.  That's because I've finally come up with a solution I actually like.  Here, the model is that the message order still matters, but the state is in a database, rather than in memory.  This allows us to change our minds about which thread runs which orders.  I got the idea from reading about sharding and resharding strategies.  One of the approaches to resharding is to assign everything a "virtual shard" where there are a lot of virtual shards and only a few physical shards.  Then, when you need to expand, you can move virtual shards between physical shards.

Here, the "virtual shard" is the identification of the order.  The "physical shard" is the Retlang fiber.  Only one virtual shard is assigned to a physical shard at any given time, so the metaphor breaks down a bit.  Nonetheless, the basic idea is that when a worker is ready, it gets assigned all of the messages for a shard.  Then that shard is locked, so that messages just back up until the worker is finished.  After that point, any new messages can be assigned to a worker, and the worker is available for work again.

This is actually pretty similar to the keyed batch concept in Retlang, which led me to support subscriptions using Action<K, IEnumerable<T>> as well as Action<T>.  The former could, for instance, choose to only commit its changes at the end of a batch.  To support this use case properly, I needed to introduce a maximum batch size as well.  The batch size does matter: the larger the batch, the more efficient the code is.  If you don't need batching at all, you can always set it to int.MaxValue.

Performance

If you take a look at the test code, you'll see it calculates a rough "efficiency" number.  That's the time it took compared to the fastest possible.  In my tests, I'm seeing 98% efficiency.  There's a couple of commented out lines that test QueueChannel in the same manner.  That shows a figure of 99%, so the order preservation and batching does impose some overhead.

Going back to the model of the system, we assumed that there was no in memory state.  In practice, that doesn't have to be the case.  The shard itself can be mutable.  In fact, if you're tracking renamed objects (e.g. cancel/replaces in Fix) you're going to be doing this the whole time.  You can modify the shard during processing, but if you modify state during processing and assignment you're going to need a lock.  This is much more lock-free parallelism than I'd hoped to achieve with this approach, and pretty much renders the previous approach irrelevant.

A Walk Around The Code

I've decided to try publishing the code as a gist this time.  It's certainly less effort for me to do than the usual HTML code, but it's not as flexible: the syntax colouring isn't up to much and there's no control over the presentation of the files.  So, the implementation can be found in the following files:

  • ShardBalancer.cs
  • ShardConsumer.cs
  • IConsumer.cs

Arguably, IConsumer doesn't need to exist.  There's extremely tight coupling between the balancer and the consumer so the only purpose it serves is to highlight how the balancer calls to the consumer.  I wrote this version from scratch, which explains the change in coding style, but the basic pattern is still Retlang's hungry consumer implementation.

The test harness consists of:

  • Program.cs
  • Execution.cs
  • SummationService.cs
  • OrderDb.cs

Observations

I think I've finally come up with a solution to the problem with which I'm happy.  This solution actually balances very well whilst still observing necessary message ordering.  If you watch it in action, it actually behaves in a similar to processor assignment of processors at the OS level.  I'm particularly pleased that it can be used to batch messages, which could enable some problems to outperform a QueueChannel implementation under the right circumstances.

  • The balancer has two locks.  It's extremely important that the implementation of Wakeup does no work itself.  Otherwise the solution would deadlock.
  • If KeyValuePair<T, V> was immutable, the batch dequeue could return KeyValuePair<TShard, IEnumerable<TValue>>.  This highlights a problem that's going to be with us for a long time: the APIs were never built with the new type system in mind.
  • The DataToActions code is a bit on the unreadable side, but the alternative is two different implementations of ShardConsumer, which didn't appeal.
  • The mechanism to empty the queue is vastly more elegant than the FlushAndWait implementation in the previous version.  EmptyEvent isn't threadsafe in .NET 4.
  • The code needs a trivial change to compile with Retlang trunk.

You could experiment with switching strategies if the number of queued shards was less than the number of available threads.  Instead of choosing randomly, you could pick the longest queue and throw the entire thing onto a thread.  This is particularly attractive if you're working with a terminating process as this would speed up the process of closing down at the end of the day.  Something that this code does badly, as does Retlang in general, is report exactly what's going on internally.  I'm going to see if I can come up with a solution for that.  Don't hold your breath, though :)

Technorati Tags: ,

Printing the Current ClassPath from the Clojure Repl

I'm still wrestling with getting something acceptable running Clojure under Windows, however you may find the following useful in your own travels:

(->> (java.lang.System/getProperty "java.class.path") (re-seq #"[^;]+") (map println) dorun)
Technorati Tags:

Understand What You Measure

I never really intended to start blogging about my work kitchen, but it keeps providing me with anecdotes.  Today, I was thinking about the milk.  There are six fridges in the office, and typically you'll find that they each have eight semi-skimmed jugs of milk.  There's also a single jug of skimmed milk.  Sometimes there's none.  So, someone somewhere has made a decision as to the appropriate amount of skimmed and semi-skimmed milk to provide.  Now we enter the realms of pure speculation, but bear with me.

superman-got-milk-ad-commercial

How do you determine what the mix should be?  Well, the obvious googley answer is: you measure consumption and use that as feedback.  I'm pretty sure that's what someone at the supplier is doing.  The thing is, there's an absolutely tiny amount of skimmed milk being supplied, far less than you'd expect given a glance at the supermarket.  So what's wrong?

Ask anyone who drinks skimmed milk at the office and they can tell you: the skimmed milk tastes bad.  I mean, really bad, so bad you think it's off.  Typically, the first time you try it you pour it down the sink.  The second time you realize it always tastes like that.  Every so often you see someone new do the exact same thing.

Selling Ice Cream In Winter

Famously, if a government department wants to kill something, they'll run a study under circumstances that make it look incredibly unpopular.  But even when someone isn't deliberately trying to manipulate the figures, unexpected factors occur.  The good news is, for user interfaces, there's an easy answer: learn a bit about usability testing.  With internal systems the problem is harder.  Worst of all is with performance questions.  Just because you've reduced one number, doesn't mean you haven't increased another.  Ultimately, you need to understand your numbers holistically if you've got any hope of doing something other than performance whack-a-mole.

Technorati Tags: ,

What is n, anyway?

I've got about 500 CDs, and they're in a complete mess.  So, I want to sort them.  What sort algorithm would you use?  Ask a computer science graduate and he'd tell you that heapsort is the single threaded algorithm of choice.  If you had people helping you, you could consider mergesort.  Of course, neither of these approaches is actually sensible.  Here's an idea that actually works:

  • First go through the CDs and sort them into buckets according to the first letter.
  • Insertion sort the CDs belonging to a particular letter.
  • Stick the sorted CDs back on the shelves in order.

So why does our complexity analysis fail for this problem?  The reason is our assumptions are incorrect.  All the analysis of sort algorithms assumes that it is effectively free to move the objects (because they can be pointers).  Hence, all analysis of sorting algorithms optimizes for the number of comparisons With a physical CD to move around, this is no longer the case.  Equally, we assume all tests on our keys are equal.  With sorting CDs, it's relatively low cost to determine the first letter of the band name.  A full comparison, including ordering by date of publication, is significantly harder.*

Units Matter

The point is, although we tend to ignore this, the units of n matter.  At different times, the performance of a system can be dominated by different factors

  • Number of simultaneous requests
  • Size of most complex workflows (think saga if you're an nServiceBus guy)
  • Number of workflows
  • Sheer number of messages
  • Memory Consumption

This isn't theoretical, I've got a single system where I've needed to optimize on all of these bases at one time or another.    If you want a system to scale, out or up, you've got to understand what units you're dealing with.  Next, you've got to track them.  Probably the biggest performance secret of all time is good diagnostics, live and historical.

*There is also the problem that doing heapsort by hand is likely to be error prone, but that's not the point I'm making.

Technorati Tags: