The LINQ vs Generators Shootout, Round #1

I have to admit, Python is growing on me.  I’m still not entirely convinced of the utility of IronPython, especially given that Boo exists (why don’t more scripting languages allow you to meddle with compilation?).  However, CPython and Jython are actually rather interesting beasts, with some very cool stuff being done on them.  (I really like the look of Pylons, for instance; I’ll probably write something up on that in the future.)

I thought I should probably expand on my remark that LINQ was more powerful than list comprehension.  It was pointed out to me that Python supports both greedy and lazy evaluation (it calls the greedy form list comprehensions and the lazy form generator expressions).  LINQ is purely lazy, although adding “ToList” onto the end will typically deal with that if it is a problem (and it would be if you naively enumerated the same query several times, since each pass runs the query again).
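To make the lazy/greedy point concrete, here is a minimal sketch that counts how often a filter predicate actually runs.  The class and method names are made up for illustration, and the code assumes a reasonably recent C# compiler.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class, not part of any library.
public static class DeferredExecutionSketch
{
    // Returns the number of predicate calls observed at four points in time.
    public static int[] CountPredicateCalls()
    {
        var calls = 0;
        var numbers = new[] { 1, 2, 3, 4 };

        // Defining the query runs nothing: Where only stores the lambda.
        IEnumerable<int> lazyEvens = numbers.Where(n => { calls++; return n % 2 == 0; });
        var afterDefinition = calls;

        // ToList enumerates the query once and keeps the results.
        List<int> greedyEvens = lazyEvens.ToList();
        var afterToList = calls;

        // Reading the cached list does not touch the predicate again.
        var sum = greedyEvens.Sum();
        var afterReadingList = calls;

        // Enumerating the lazy query a second time re-runs the predicate.
        var count = lazyEvens.Count();
        var afterReenumeration = calls;

        return new[] { afterDefinition, afterToList, afterReadingList, afterReenumeration };
    }
}
```

The four counts show the predicate running only when the query is enumerated, and running again on every fresh enumeration unless the results were captured with ToList.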

So, how is LINQ a better form of list comprehension?  Four reasons:

  • It’s implemented as a data model, allowing stuff such as LINQ to NHibernate to exist.
  • It supports group by
  • It supports order by
  • It supports intermediate assignments through let and into

The first is probably the most technically impressive, but it’s also the most controversial.  It means that LINQ is much more than just a list comprehension system, but no-one’s got enough experience of it yet to know exactly how these features are best used.
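As a rough illustration of what “implemented as a data model” means, the sketch below assigns a lambda to an Expression<Func<...>> so that the compiler builds an inspectable tree rather than just executable code.  This is the kind of structure a provider such as LINQ to NHibernate can walk and translate.  The class and method names are hypothetical, and a real provider does far more than this.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical demo class, not part of any library.
public static class ExpressionTreeSketch
{
    // Inspects the lambda as data and describes its shape as text.
    public static string Describe()
    {
        // Because the target type is Expression<...>, the lambda becomes a tree.
        Expression<Func<int, bool>> isLarge = n => n > 10;

        var comparison = (BinaryExpression)isLarge.Body;
        var parameter = (ParameterExpression)comparison.Left;
        var constant = (ConstantExpression)comparison.Right;

        return $"{parameter.Name} {comparison.NodeType} {constant.Value}";
    }

    // The same tree can still be compiled into a delegate and executed.
    public static bool Run(int value)
    {
        Expression<Func<int, bool>> isLarge = n => n > 10;
        return isLarge.Compile()(value);
    }
}
```

Describe() returns the text "n GreaterThan 10", which is exactly the sort of information a query translator would use to emit SQL or some other query language instead of running the comparison in memory.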

The grouping is cool, although I have to admit I’ve rarely needed it.  The ordering, on the other hand, is huge.  Python’s model for sorting is the IComparable model in C#.  If you’ve ever tried to sort by three keys, you’ll know the problems with it.  In contrast, you can just specify the keys and let LINQ sort it out for you. 
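For illustration, here is a small sketch of a three-key sort and a group by in query syntax.  The Person type and its properties are invented for the example; only the orderby, group ... by ... into and select clauses are the point.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class and sample type, invented for illustration.
public static class OrderingSketch
{
    public sealed class Person
    {
        public string City { get; set; } = "";
        public string Surname { get; set; } = "";
        public int Age { get; set; }
    }

    // Sort by city, then surname, then oldest first: just list the keys.
    public static List<string> SortByThreeKeys(IEnumerable<Person> people)
    {
        return (from p in people
                orderby p.City, p.Surname, p.Age descending
                select p.City + "/" + p.Surname + "/" + p.Age).ToList();
    }

    // Group by city, then carry on querying the groups via "into".
    public static List<string> CountByCity(IEnumerable<Person> people)
    {
        return (from p in people
                group p by p.City into cityGroup
                orderby cityGroup.Key
                select cityGroup.Key + ":" + cityGroup.Count()).ToList();
    }
}
```

Each key in the orderby clause can carry its own direction, so mixing ascending and descending keys needs no hand-written comparer.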

The final one is probably the most useful of the lot, even if it seems minor.  Take a look at the following code:

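// Needs: using System.Collections.Generic; using System.Linq;
//        using System.Text.RegularExpressions;
public static string Deduce7ZipDirectory(IEnumerable<string> output)
{
    // Whitespace, then a folder name (no whitespace or backslashes,
    // not ending in a colon), followed by a backslash.
    var regex = new Regex(@"\s(?<Folder>[^\s\\]+[^:])[\\]");
    var result = (from value in output
                  let match = regex.Match(value)
                  where match.Success
                  select match.Groups["Folder"].Value)
        .FirstOrDefault();
    // null when no line of the output matches
    return result;
}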

I actually wrote this code to parse the output of 7zip’s command line list function and I think it’s pretty elegantly declarative.  I’m not entirely happy with the debugging story, however.  You can put breakpoints within the LINQ statement, but seeing the local variables doesn’t seem to work for me.  Ironically, this is a bigger problem for C# than it is for JavaScript or Python, simply because it’s possible to write rather complex things in these statements.

Personal Note

I made the mistake of posting shortly before I went on holiday for several weeks.  I’d like to thank everyone who commented; I learnt a lot.  Amongst the things I learnt was that I really need to get around to writing an “About Me” page, mostly because of my aversion to posting noise rather than signal.  For the record, my name is Julian Birch and I live in London.