Clojure has a Problem with Async

Clojure, like node.js, is a very opinionated platform.  The funny thing is that almost every opinion is different. 

Clojure embraces Java as a platform. 

  • Originally, every declared identifier was overrideable on a per-thread basis. 
  • There are many features (e.g. Futures and Reducers) that allow you to embrace multi-threading at a high level (there’s a quick sketch after this list).
  • Data is immutable.
  • Data is globally shared between threads.
  • It adds STM to Java’s already extensive thread-synchronization primitives.
  • Everything’s a function. 
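
To make the second point above concrete, here’s a minimal sketch (mine, not taken from any particular library) of the sort of high-level multi-threading on offer: future pushes work onto a pool thread, and dereferencing blocks until the result is ready.

(def answer (future (reduce + (range 10000000)))) ; starts summing on another thread immediately

@answer ; blocks the calling thread until the sum is available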

Node, conversely, embraces JavaScript:

  • It’s aggressively single-threaded and asynchronous.
  • If you want another thread, you’ll have to start another process.
  • Everything’s mutable, even class definitions.
  • Share data between processes?  I hope you like memory mapping.
  • Synchronization barriers?  You don’t need them. 
  • Everything’s an event with a callback.

Clojure and Node have completely different sweet spots: Clojure is truly excellent at computation, Node at IO.  Like it or not, multiple threads aren’t really a good solution to the problem of blocking IO.  Which is a pity, because all the main Clojure libraries feature blocking IO (e.g. clojure.java.jdbc, ring).  That’s not to say there isn’t some amazing stuff being done in Clojure, just that it could be even better.

JDBC is an interesting case because it’s a Java problem that works its way through to Clojure; Node.js, by contrast, made a virtue of being the only API on a new platform.  The Clojure jdbc library introduces a couple of oddities of its own, too.  For instance, it can only have one open database connection at a time.  Usually that’s fine, but sometimes it’s undesirable (try performing a reconciliation of a million records between two databases).  To some extent, this is a hangover of Clojure being envisaged as an application language that used libraries written in Java.

There’s nothing stopping you from writing Clojure code in a node-like style, as long as you’re prepared to adopt an async web server (Webbit, Aleph) and write your own async DB libraries (er… no-one else has).  Equally, implementing a feature like co-routines wouldn’t actually be that hard, but you’d lose bindings, which is a problem for any library that assumes they work.  And you’d still need all of your libraries to be async.

For all these reasons, I don’t think we’re going to be seeing a proper Clojure async solution any time soon.  Ironically, I think it’s the complete absence of async DB libraries that is really holding it back.  Without that, solving most of the other things isn’t really that useful.

WebForms In Retrospect

I’m not much for technology recommendations.  Most technology choices should probably be driven by familiarity and cost.  For instance, Ruby on Rails is easily the most mature web stack out there, but if you’re a Python programmer with no familiarity with Ruby, Flask or Django is likely to be a better choice.  It’s not helped by the fact that a number of “technology change” stories turn out to be stories of replacing bad code in one stack with good code in another.  The rest tend to be stories about reducing running costs by moving off Windows or Heroku.  Most technology choices, especially in the web space, tend to be “good enough”.

There’s one huge exception to this rule: Asp.Net WebForms.  It’s poison for productivity and it’s poison for production.  It took me a long time to accept this.  I had been developing uSwitch.com for nearly four years.  We had developed the majority of the site in WebForms (this was before MVC was even an option).  Except, weirdly, for the most profitable parts.  They were in an unholy mix of ASP, XSLT and C# for the business logic.  Although a pain to work with, I’m no longer convinced it was actually worse than WebForms where it counts.

The Choice That Destroyed a Company

Now, rather than the implementation success stories that glut the internet, let me tell you about a failure.  uSwitch is now a successful project run by ForwardTek.  However, it’s shocking what happened to uSwitch in the last years I was there.  The firm was bought for $366 million in March 2006 with big plans for expansion.  In July 2007, these plans were dust and there were huge layoffs (I was already gone).  Within two years, with no expansion in sight, February 2008 saw the purchaser effectively write down the firm to zero.  Now, the reasons for this are always many and varied (buy me a beer sometime), but a fair proportion of this horrible crash can be ascribed to one project, and a fair proportion of that to its technology choice.  I’m not going to claim that there weren’t bad project management decisions, or that all of the code we delivered was perfect, but choosing WebForms was a mistake, and a big one.

The truth is, I thought the writing was on the wall by the start of 2007.  The project known as the redesign, which began at about the same time as the buyout, had been live for a couple of months.  It was a disaster:

  • It had frozen the entire website for over six months, allowing competitors to eat our lunch.
  • Our flagship energy product was actually slower at the end of it than at the beginning.
  • The new structure wasn’t actually flexible enough to handle the changes we were then asked to implement.
  • It took over 50 developers, including quite a few contractors, about seven months to deliver.  We had to pay contractors to work weekends.  When the project was over, so little leave had been taken the firm had to institute buy-back.  The cost of the project was just plain more than the firm could take.
  • And, worst of all, the uptick in conversion the project promised just plain didn’t happen.

(There is an irony associated with that last point.  The original project proposal promised a 10% improvement in conversion (don’t quote me on the exact figures after all this time).  Some quick-win patches we instituted in February garnered about half of that.  This left us with a large project to gather the other 5%.  With that in mind, it’s not clear the project was worth doing even at the start.  That and the decision that we couldn’t have an inconsistent look on the site really, really hurt us.  I did say there were other factors.)

This post is getting extremely long, but hopefully I’ve got your attention.  I distinctly remember one day in July, the fourth month of this project.  I had just finished a two-hour debugging session with another developer on some issue to do with dynamically generated content, one of the many issues that plague WebForms development, when it occurred to me that after three years of working with the technology, I was far from convinced the productivity benefits of learning it were there at all.  It’s pretty tough accepting that you’ve spent the last three years driving in the wrong direction, but the longer I thought about it, the more solid my opinion became.

Wait, What Has This Got To Do With WebForms?

You’ll note that I haven’t said the code was bad.  It wasn’t.  It had a fair number of automated tests against it and some nice automated health checks in Watir.  A little while after release, it even gained an automated deployment system.

WebForms has a number of things that just make it a horribly inappropriate technology for a dotcom website.

  • It wants to generate all of your HTML.  This is a serious problem if you’re trying to produce a skinnable application.  Yes, I’m fully aware of the skinning technology in WebForms; it’s completely inappropriate for the kind of work dotcoms actually want to do.
  • It generates huge hidden state fields.  This means your programmers end up fighting WebForms every time they need a page to be fast.  Sadly, this is all of the time.
  • It generates large numbers of complex munged IDs.  In addition to making your page slow, it obstructs debugging.
  • The standard model is to post back to the same page and then redirect to the next.  This involves reconstructing the state of the previous page in order to process the events.  This is way too slow at a computational level and involves too many round trips.
  • Specifying your own URLs was generally regarded as deep magic until 2010.  Seriously.

Seriously, there’s no other web stack I can think of that makes it so hard to just deliver some HTML in a form and get a post back.  Now, if you’re a WebForms expert, you’ll be taking a look at the previous list and thinking “but there’s ways around all of these issues”.  You’d be right, but that’s missing the point.  There’s no other web stack that requires expertise to get these things right.  But even if you can crack these issues, your problems are just beginning.

For instance, have you heard of the “off by one” issue?  Let’s say you have a form that changes the number of controls on the page under certain circumstances.  It’s quite easy to get into a situation in which it’s showing the wrong stuff on the page.  If you add a dummy button, pressing it will correct the page.  Debugging issues like this is a nightmare.  Things get worse if you have data-driven controls.  While we’re on the subject, I remember watching one developer trying to replicate the functionality of the Repeater control.  Even after decompiling the original code, he couldn’t get it to work with the declarative syntax, leading us to believe that the Repeater is actually magic.  The kind of magic that comes out of Neville Longbottom’s wand.  Oh yes, and you’ve got to understand the ASP.NET event model, a construct more complex than Cloud Atlas and significantly less fun to read.

Oh yes, and I haven’t even mentioned these issues yet:

  • There’s no support for any kind of CSS or JS asset pipeline.  If you want one, you’ll have to roll your own build system.  Rails comes with this stuff baked in.
  • Testing?  You have two options: Selenium or clicking around in a web browser.

WebForms: Just Say No

WebForms is productivity poison; in fact, it’s so bad that it probably hasn’t just hurt little firms like mine: it’s a technology (again, part of a larger picture) that killed Microsoft as a cloud platform.  Someone reading this undoubtedly believes WebForms is better in 4.5.  It probably is, but it remains the wrong idea in the first place.  It’s quite hard to fix that.  I don’t write that many websites these days, but even for internal stuff I never, ever, touch WebForms.

Of course, it’s worth pointing out there are other sites on the internet that used WebForms at the same time.  Not many, but it’s worth considering that the two I can think of are Orkut and MySpace.  They were both technical disasters, unable to match the pace of a firm like Facebook that had delivered its entire functionality in PHP, a technology whose closest relative in the Microsoft world is classic ASP.


New Coding Standard: Don’t Be A Jerk

Unlike many, I do actually believe that code aesthetics matter.  On the other hand, I’ll also admit to being as much of an arrogant elitist about code as I am about music.  This is why I like CoffeeScript and Anton Webern.  2 spaces vs 4 spaces?  Spaces vs Tabs?  Indent case statements within a switch?  I’ve got an opinion.  The fact remains that it’s much more important that everyone uses the same conventions than exactly what that convention is.  So I keep my stylistic tics to myself and listen to Polly Harvey on headphones where the noise doesn’t frighten my wife.

One funny thing is that people don’t seem to extend this logic out of their team or project.  There’s usually a dominant answer to these stylistic questions for whatever programming language you’re using.  Don’t like it?  Write your own programming language while listening to Laurie Anderson (or whatever you’re into).

For the most part, just doing what everyone else does answers most aesthetic coding standards questions that may arise.  It reduces stupid fatiguing stylistic discontinuities and lets you get on with actually reading the code.

Standards Creep

So, do you need a coding standards document?  Well, that’s really going to depend on how many violations you’re seeing.  Bear in mind that beginners aren’t going to perceive the patterns as easily as the competent.  However, here’s where the spectre of best practices rears its head again.  Once you start to document what people’s code looks like, it’s very tempting to move into what their code actually does.  Again, you end up creating a document that, at best, annoys your best programmers.  At worst, it creates a culture in which your best programmers are regarded as loose cannons and mediocrity is regarded as the highest goal.

So here’s my own contribution to “best practice” coding for readability: don’t be a jerk.  At the risk of being prescriptive and contradicting everything I’ve already said:

  • Don’t write bad confusing code and then a long comment explaining it.  Write better code.
  • It’s OK to have one-character identifiers with tight scope.  Especially if you’re going to use them repeatedly.
  • When identifiers have wide scope, write words out in full rather than using a contraction, unless it’s a very well understood one.  Searching for stuff is hard when you have to guess the spelling.
  • And needless to say, spell things correctly.  If you’re not sure, type the word into Google.  I do.
  • If jargon doesn’t make things more concise, don’t use it.  That said, use commonly accepted terminology.
  • Don’t reuse identifiers to mean different things at different times.
  • Don’t be vague.  isActive and shouldBeActive are different concepts.  It’s amazing how many people just call the identifier “active” in both circumstances.*
  • Identify things by what you’re doing with them, not what they are.

Unless, of course, following these rules to the letter would make you act like a jerk.

*To get to a nitty-gritty style point, the standard in C# is that you would identify “is active” with “active”.  Still, don’t ever identify “should be active” with “active”.  Equally, if you’re in Lisp, “should be active” shouldn’t be named “active?”.

Code Fatigue: Performance Edition

It’s a truth universally acknowledged that sometimes you have to compromise your coding practices in order to eke out more performance.  However, I think it’s not appreciated how often just writing code well actually results in better performance.  For instance, I’ve seen more systems in my career that would be faster with a more normalized schema than systems that were crying out for denormalization.*

Let’s see how the two implementations of triangular number calculation from the last post compare in terms of performance.  In my unscientific testing:

  • The naive recursive algorithm calculates (t1 25000) in about 3 milliseconds.
  • The threaded sequence processing algorithm calculates (t2 25000) in about 5 milliseconds.

Well, it’s not looking good for readable code right now, although the fact that t1 can’t even evaluate (t1 26000) without blowing the stack suggests it has bigger problems.
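
For what it’s worth, here’s roughly how to reproduce this sort of unscientific measurement at the REPL; exact numbers will vary with your machine and JIT warm-up, so run each form a few times and ignore the first result.

(time (t1 25000)) ; prints an "Elapsed time" line, about 3 msecs as quoted above
(time (t2 25000)) ; about 5 msecs
(t1 26000)        ; throws StackOverflowError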

Jean-Louis suggests an alternative: the tail-recursive implementation**

(defn t3 [n]
  (loop [n n result 0]
    (if (zero? n)
      result
      (recur (dec n) (+ result n)))))

Let’s compare this to the original t1 in terms of readability:

  • We’re now using the Clojure functions zero? and dec, which requires a bit more Clojure knowledge but reduces noise (and hence code fatigue).
  • We’re using a well-known accumulator pattern.  This doesn’t impose a cost on anyone familiar with how recur is used.
  • We’re still dealing with a branch and a recursive statement that interact.  On the other hand, the recur expression now only goes two deep.
  • We’ve got to keep track of positional loop parameters.  Not really an issue when there are only two of them.

So, in terms of readability, it’s better than t1 but we’d still prefer t2.  However, the figures speak for themselves: t3 can calculate (t3 25000) typically in about 1.5 milliseconds.  As I said at the start, sometimes the most elegant solution isn’t the fastest.  I doubt I could squeeze much more out of the sequence processing approach, although in the real world the application of fork-join reducers may outperform any explicit TCO solution.  However, is it possible to create a much faster implementation if we simply allow ourselves to assume more knowledge?  This time, though, we need more knowledge of our business domain (mathematics) than of our programming environment.

(defn t4 [n] (/ (* n (inc n)) 2))
(defn t4threaded [n] (-> (inc n) (* n) (/ 2)))

That calculates (t4 25000) in 0.1 milliseconds.  Needless to say, it can handle numbers much larger than any of the other implementations without breaking a sweat.  From a readability perspective, again, it’s excellent providing you have the required background knowledge.  The last example may feel like cheating, but the truth is that solutions like that are pretty common in the real world: insight into your problem area can radically simplify your code and improve your performance.
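
As an aside, for the curious, the fork-join reducers option mentioned above would look something like the sketch below.  This assumes Clojure 1.5+ with clojure.core.reducers available, and whether it actually beats the loop/recur version depends on n and on how many cores you have.

(require '[clojure.core.reducers :as r])

(defn t2-fold [n]
  ;; fold chops a vector into chunks and combines the partial sums on a
  ;; fork/join pool; on a lazy seq it quietly falls back to an ordinary
  ;; reduce, hence the vec
  (r/fold + (vec (range 0 (inc n)))))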

SIDEBAR:  I’m afraid I’m having trouble with the comment system on this site.  My apologies if this is affecting you (it isn’t affecting everyone).  It’s certainly affecting my ability to respond in anything other than a full post.

*although there is a special circle of Hell reserved for key-value pair implementations on top of SQL

**I should point out this differs stylistically from Jean-Louis’ original implementation, but the performance characteristics are the same.

Code Fatigue

Let’s say you needed to calculate the triangular number of n.  The triangular number of 4 is (1+2+3+4).  Which of the following implementations would you consider the more readable?

(defn t1 [n]
  (if (= n 0)
    0
    (+ n (t1 (- n 1)))))

(defn t2 [n]
  (->> (inc n)
       (range 0)
       (reduce +)))

If you answered t1, I’m willing to bet that by “readable” you meant “requires less experience to read”.  I’m going to argue that this is a bad way of evaluating readability.  Let’s assume for the moment that your command of Clojure is perfect: what are the challenges to comprehension then?

  • The first has a recursive step
  • The first has a branch
  • The first contains an expression that goes three deep.

Worse, each of these interacts with the others, meaning you have to hold them in your head all at once.  If you’re trying to solve a problem significantly harder than computing triangular numbers, sticking to “basic” code results in significantly more lines of code and significantly more of these things that you have to simultaneously track.  Whilst each individual component is easy enough to parse, the overall effect is fatiguing.  This is bad news for humans, because they’re bad at maintaining mental function whilst processing large numbers of minor items.

Favour High Level Code

Let’s now assume that everyone has excellent command of the language we’re using.  What impedes readability in these circumstances?

  • The longer you need to track the value of a variable, the harder it is to understand.*
  • The more levels of control flow you need to track, the harder it is to understand.
  • The less code you can see on your screen at once, the harder the code is to understand.
  • The more times you see the same code expressions repeated with possible minor variations, the harder it is to understand.

Writing basic code in any language favours the comprehensibility of a single construct over the comprehensibility of the whole.  Not only that, but since each construct contains the possibility of error, using a basic style is much more likely to result in bugs.  A much better set of guidelines for writing readable code would be:

  • Use values close to their definition.  Make it clear that they are out of scope after that point.
  • Favour standardised control flow constructs such as reduce in Clojure and LINQ in C# over writing everything in terms of branches, loops and recursion.
  • Favour concise code over verbose code
  • Aggressively eliminate common sub-expressions
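
To illustrate that last point with a made-up example (the domain and names are invented), hoisting the repeated sub-expression into a let means it’s read, and computed, exactly once:

;; Before: the same lookup is spelled out three times.
(defn discount [order]
  (if (> (get-in order [:customer :loyalty-points]) 1000)
    (* 0.10 (:total order))
    (if (> (get-in order [:customer :loyalty-points]) 500)
      (* 0.05 (:total order))
      0)))

;; After: name it once and the branching collapses.
(defn discount [order]
  (let [points (get-in order [:customer :loyalty-points])
        total  (:total order)]
    (cond
      (> points 1000) (* 0.10 total)
      (> points 500)  (* 0.05 total)
      :else           0)))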

And next time you’re trying to evaluate code readability, take the effects of fatigue more seriously and don’t worry as much about trying to compensate for lack of experience.


*If it’s global and mutable, your chances of tracking it are nil unless you’re extremely disciplined.  Action at a distance is very hard to read and extremely error prone.

Clojure: Stages of Enlightenment

I’ve tentatively identified seven stages of enlightenment in Clojure sequence processing.

  • Uses recursion
  • Uses recur
  • Uses loop/recur
  • Uses Clojure API functions such as filter and map
  • Uses reduce
  • Uses all Clojure API functions and understands implications.  At this point you can consider yourself a 4clojure 1st Dan.
  • Uses clojure.set as well

There may be higher levels, but Jay Fields hasn’t blogged about them yet.
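
To make the ladder a little more concrete, here’s one toy task (which values appear in both of two collections?) written at two of the rungs.  The code is mine, purely for illustration.

;; The loop/recur rung: explicit iteration, explicit accumulator.
(defn common-loop [xs ys]
  (loop [xs xs acc #{}]
    (if (empty? xs)
      acc
      (recur (rest xs)
             (if (some #{(first xs)} ys)
               (conj acc (first xs))
               acc)))))

;; The clojure.set rung: just say what you mean.
(require '[clojure.set :as set])

(defn common-set [xs ys]
  (set/intersection (set xs) (set ys)))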


Sexism in IT

Let’s talk about Larry.  If you’re lucky, Larry isn’t in your team.  But he’s in a team you work with.  You find yourself trying to avoid dealing with that team because there’s a good chance you’re going to end up working with Larry.  Larry is a pain in the neck.  It’s not that he’s incompetent, he just doesn’t seem to care.  Nothing he puts together works, and when it does work it requires settings he forgot to tell you about.  Larry is the bottom tier of men in technology.

And yes, he’s a man.  90% of people in technology are.  What would happen if it was only 50%?  Well, frankly, Larry would be out of a job.  In his place would be a better woman.  Don’t get me wrong, there’s plenty of women out there as useless as Larry, but in a 50/50 world, they wouldn’t have a place in technology either.

We Are The 50%

Let’s talk about how we got to this point.  Women cannot be considered an “educationally disadvantaged minority”*, so we don’t have that excuse.  Computing was 90% male in 1967, when female participation in the workforce was much lower than it is today.  That was after a sexist purge of women programmers in the 1950s.  The gender ratio of computer science graduates in 1984 was 60/40.  So we’ve slid back from the 50/50 dream really quite dramatically.

It’s hard to avoid the conclusion that we (men) are fostering an environment that is subtly hostile to women.  I could spend all day adding to that particular list of links.  We need to stop.  Yes, we, meaning you right there and me right here.  I’ll be honest, it’s hard not to have a sexism bias when everyone you work with is a man.  That means the job is harder, not that it’s meaningless.

Full disclosure: I’ll admit that I’ve always been aware of this issue, but I hadn’t regarded it as my problem until the birth of my daughter.  It shifted my perspective quite dramatically.  I don’t aspire for her to follow in her father’s footsteps, but it offends me that chances could be denied her because of stupid rubbish like implicit sexism.  We can do better.  Moreover, we need to stop thinking of this as a problem that women have and we don’t.  The exclusion of women from the tech workforce affects us all and we’ve all got something to gain: better co-workers than Larry.  They probably won’t be as brilliant as Grace Hopper but, frankly, neither are we.

Unless, of course, you consider yourself to be in the bottom half of male programmers by ability.  Then you probably want to be as sexist and unwelcoming as possible.

*The observant will notice they’re not even a minority.

Post Script:  If you haven’t read Reg Braithwaite’s article about his mother, you really should.


Clojure Macro: defn-curried

So, the winner of the first Clojure Macro Challenge (magic partial) is Rich Hickey.  Check out the macro defcurried on line 132.  I don’t imagine for a moment he read this blog; it’s just that expressing the reducers library without it would have been an exercise in frustration and obfuscating syntax.  However, his implementation is a bit special-case; it only produces the curried version and only curries on the last parameter.  So I decided to rip off his code and see whether or not I could come up with a more general solution:

;; curry builds one arity of the expansion: a vector of the parameters supplied
;; so far, plus either the body (all parameters present) or a closure over the rest.
(defn- curry
  [[params1 params2] body]
  (cons (vec params1)
        (if (empty? params2)
          body
          (list (apply list 'fn (vec params2) body)))))

(defn do-curried [symbol to-fn params]
  (let [result (split-with (complement vector?) params)
        [[name doc meta] [args & body]] result
        [doc meta] (if (string? doc) [doc meta] [nil doc])
        body (if meta (cons meta body) body)
        ;; one arity per count of leading parameters, from one up to all of them
        arity-for-n #(-> % inc (split-at args) (to-fn body))
        arities (->>
                  (range 0 (count args))
                  (map arity-for-n)
                  reverse)
        before (keep identity [symbol name doc])]
    (concat before arities)))

(defmacro defn-curried
  "Builds a multiple arity function that returns closures
  for the missing parameters, similar to ML's behaviour."
  [& params]
  (do-curried 'defn curry params))

(defmacro fn-curried
  "Builds a multiple arity function that returns closures
  for the missing parameters, similar to ML's behaviour."
  [& params]
  (do-curried 'fn curry params))
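
Here’s a quick usage sketch (the function and its arguments are just an example I’ve made up); each generated arity returns a closure over the parameters you haven’t supplied yet:

(defn-curried add3 [a b c]
  (+ a b c))

(add3 1 2 3)   ;=> 6
((add3 1) 2 3) ;=> 6, (add3 1) is a closure over b and c
((add3 1 2) 3) ;=> 6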

Incidentally, this was written in the LightTable InstaRepl, which is excellent for developing this kind of abstract hack.  The one feature it’s missing is proper destructuring support.  If you take a look above, you’ll see an identifier “result” that only really exists to allow the InstaRepl to process it more easily.

Anyway, I hope someone finds this useful.  I know I will.  After using ML/F#, writing functions that return other functions feels like an anti-pattern.

Best Practices and The Dunning-Kruger Effect

I was recently witness to an interesting twitter exchange between Stu Herbert and Dan North on the subject of development practices.  It occurred to me that the disagreement was fundamentally that they were addressing different groups.  Stu felt that the term “best practices” implied “optional” to too many people who should be using automated testing.  Dan, on the other hand, dislikes the term best practices because it’s devoid of context.

As Dan explained in an old Oredev talk (watch it if you haven’t), the problem with best practices is that their purpose is to drag people at the start of the Dreyfus model up to competency.  However, there’s a downside: enforcement of best practices exerts a drag on people beyond the competency stage.  From an expert’s point of view, everything is optional and is simply a matter of cost benefit analysis.  The problem is, good advice for experts is lousy advice for novices and vice versa.

I Don’t Understand It, So It Must Be Easy

The Dunning-Kruger effect is an observation that, the less competence you have at something, the easier you think it is.  I suspect it’s about to be renamed the “Apple Maps” effect.  Here’s where we really hit Stu’s problem: pretty much everyone thinks they’re an expert, except possibly for the experts.  Everyone with a stand up meeting is an authority on lean and agile.  I’ve seen team after team in my career who thought they had the experience to determine whether or not automated testing was appropriate to their project when not one of them had even tried it.  (In fairness to these teams, most uncoached first attempts at TDD are disasters.)

An Idiot Is Someone Who Doesn’t Know What You Learned Last Week

So, how do you push teams like this along the path of skills acquisition?  Well, usually, you proclaim that something is The One True Way and convince them that, by doing these things, they will be superior to other teams that haven’t embraced The Truth.  Despite having an element of the cargo cult about it, this will actually be true.  You’ve managed to circumvent the DK effect for a while, but it’s not gone.  As with all cognitive biases, they’re never, ever, gone.

If you haven’t ever used an IoC container, TDD or CI, I pretty much guarantee that trying them will make you a better developer.  Usually in ways that you’ll find hard to explain to people who haven’t.  If you’re stuck in the tar-pit, Scrum and other prescriptive agile methodologies will get you further along the path of understanding how to deliver value.  But any rules-based understanding is only half the journey.  Once we’ve climbed the hill, we’re still only at the competent stage.  All too often, we’ve set up camp.  If we walk in any direction, we’re going down again.  But anyone who cares to look can see beautiful mountains in the distance.  Let’s go visit them.

Self Diagnosis

The story of how this blog went down is short. (I rushed something and made a mistake).  The story of how I got it back up is epic. (Anyone who thinks Microsoft tools are easy to set up may be interested in some timeshare opportunities I’ve got available for a limited time only.)  However, a small but crucial part was played by this page.  It’s extremely boring: it just checks that Lucene has the permissions it needs to work.  The page above could have been achieved by a log file, but making it visible front and centre had a great advantage: I could use it to show my service provider what the problem was and give them a way to check that they’d fixed it. (Before this point, it was trial and error.)  It’s amazing how many systems fail when they’re misconfigured without telling you what the configuration problem is.  Diagnostics aren’t about data, they’re about communication.
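
The original page lived on a Microsoft stack, but the idea translates to anything.  Here’s a toy sketch (the names are invented) of the same kind of check: try the thing that actually breaks in production and report the result in plain words, as a ring-style response map rather than a stack trace.

(defn index-health [index-dir]
  (let [probe (java.io.File. index-dir "write-probe.tmp")]
    (try
      (spit probe "ok")
      (.delete probe)
      {:status 200 :body (str index-dir " is writable; indexing should work.")}
      (catch Exception e
        {:status 500
         :body (str "Cannot write to " index-dir ": " (.getMessage e)
                    ". Check the permissions of the account the site runs as.")}))))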
