NPM: There is no Spoon

I’m sure most people coming to node.js immediately go looking for the equivalent of gems.  They’ll soon find npm and discover that it does exactly what they expected (with a couple of tweaks along the way to avoid running absolutely everything as root).  However, the chances are, there’s a simpler alternative they haven’t considered yet.

If you’re a C# developer, you should be familiar with the term “DLL hell”.  It’s important to understand that gems do, in fact, suffer from versioning issues.  They don’t bug out at the first sign of trouble like .NET assemblies do, but any centralized repository can and does suffer from this problem.  Ruby Gems does have a way of addressing this, but it doesn’t work because people don’t use it.  To think of this another way, when was the last time you recommended an assembly be put in the GAC?

What’s worse, you probably haven’t figured out yet how you’re going to get your code deployed to live.  I suggest you do so now rather than waiting for the inevitable surprise.

How Require Works

First, we need to understand how “require” actually works.  Require works through the search path, which includes anything in the environment variable “NODE_PATH”.  It will match the following:

  • anything with that exact path name
  • anything with that path name plus “.js”
  • anything with that path name plus “/index.js”
  • a native extension

99% of the time you’re looking at one of the first three.  Once require has found the correct file, it runs the file and populates the variable “exports” (aka module.exports), which is the return value of “require”.

The last is a bit more complex. 

What NPM does

Now, NPM by definition can do what it likes, up to and including deleting everything on your hard drive.  But what do most packages actually do?  Well, typically they create an index.js file and download the files.  There’s complex packages like expresso that are more work, but 99% of the time they’re doing very little at all.

So, instead of installing an npm module and messing around with a global repository, why not keep it simple and download the code yourself?  You just type “git submodule add https://github.com/brianc/node-postgres.git” and “require ‘./node-postgres/lib’ (assuming you’ve got your NODE_PATH set correctly).  This is just as easy as using NPM in the vast majority of cases (quite a few projects actually include an index.js file in the root of their repository to make this even simpler).  You can now do the following:

  • Keep an exact version of the package that works versioned with your main repository.
  • Switch between released versions easily
  • Maintain a branch with quick fixes to bugs you find (this will happen more often than you’d hope)
  • Deploy easily

This is incredibly easy to do providing the NPM package isn’t complex, which 99% of them aren’t.  Native extensions are more tricky to do this way because you need to automate their build system.  However the benefits are just as great.

Technorati Tags: ,,,,,

A Database Upgrade Script in Node

If you’re starting a database project, one of the first things you need is the ability to recreate and upgrade the database.  Tarantino has features to do this in the .NET world, but I just needed something quick and dirty.  You can also use this code as an introduction to Brian Carlson’s Node Progres library (connection pooling coming soon…).  Note that, in common with most Node projects, it’s almost impossible to get work done without underscore.js and async.js.

# pg = require 'postgres-js/lib/postgres-pure'
pg = require './node-postgres/lib'
eyes = require 'eyes'
_ = require './underscore'
async = require './async'
fs = require 'fs'

process.on 'uncaughtException', (d) ->
  if d.stack?
    console.log "Message #{d.message}"
    console.log d.stack
  else
    eyes.inspect d

getConnection = (d) ->
  conn = new pg.Client d
  conn.connect()

  conn.on 'error', (err) ->
    eyes.inspect err

  _.extend conn,
    execute : (text, c) ->
      q = @query2 text, c
      q.on 'end', c
    singleValue : (text, c) ->
      q = @query2 text, c
      q.once 'row', (r) ->
        for k,v of r
          c null, v
          q.removeAllListeners 'end'
          return
      q.on 'end', c
    query2 : (text, c) ->
      console.log "Executingn#{text}"
      q = @query text
      q.on 'error', c
      q
   conn

conn = getConnection
  database : 'db_name'
  user : 'postgres'
  password : 'password'

getId = (f) ->
  m = /d+/.exec f
  return 0 if m == null
  parseInt m[0]

passErrorsUp = (c) -> (e) ->
  if e != null
    c e

async.waterfall [
    (c) ->
      @foundSchemaVersion = false
      q = conn.query2 "select tablename from pg_tables;", c
      q.on 'row', (r) =>
        @foundSchemaVersion ||= 'schema_version' == r.tablename
      q.on 'end', => c null, @foundSchemaVersion
    (d, c) ->
      console.log "Found table: #{foundSchemaVersion}"
      if d
        c()
      else
        conn.execute "create table schema_version ( version integer );", c
    (c) -> conn.singleValue "select version from schema_version", c
    (d, c) ->
      if d == null
        conn.execute "insert into schema_version values (0);", -> c null, 0
      else
        c null, d
    (v, c) ->
      console.log "Current Version #{v}"
      fs.readdir './sql', (e, d) -> c e, v, d
    (v, files, c) ->
      files = (_ files).chain()
        .select((f) -> v < getId f)
        .sortBy(getId)
        .value()
      if files.length == 0
        console.log "No files to read!"
        c()
      else
        commands = _.flatten _.map files, (f) -> [
          (c) -> fs.readFile "sql/#{f}", 'utf8', c
          (text, c) ->
            console.log "Read file #{f}"
            text = "#{text};update schema_version set version = #{getId f};"
            conn.execute text, c
          (d, c) ->
            console.log "Executed file #{f}"
            c()
        ]
        async.waterfall commands, passErrorsUp c
], (e) ->
  if e == null || !e?
    process.exit 0
  else
    eyes.inspect e
    process.exit 1

Some notes for Node newbies like me:

  • It’s really important to ensure that the callback (c) gets called exactly once for every step in an async waterfall.  This is particularly messy when you kick off a waterfall within the other waterfall.
  • The node_postgres API missed some convenience functions I’m used to.  So I added them.  This is JavaScript, after all.

Postgres and Node

I’ve just been getting started with database access with node, and I’ll save you some time.  There are three competing node Progres libraries at the moment, and the right one is the one you’ll probably try third.  I appreciate that “right” is something of a strong statement, but it’s completely justified in context.

First up, we have postgres-js.  This is by the incredibly talented Tim Caswell aka creationix.  It’s a pure js library, which is a good thing, but it’s missing a couple of things: an example of how to make it work, and a test suite.  The former isn’t as big a problem as the latter, because sadly it doesn’t work.  Relatively simple queries will break as it tries to parse the response stream.  To be honest, I seriously doubt this is much of a focus for him, its last update was in August.

So, how about one by Ryan Dahl himself?  Postgres_Node (note the underscore) tries a different approach and goes to the C-level Postgres API.  Thinking about it, you’d kind of expect this from the guy who wrote node in the first place.  However, it’s in my opinion needlessly complex to do it this way, and although it does have a (short) test suite, I rapidly found myself breaking it badly.  It’s got a great big sign on the front page saying it’s experimental.  He means it.

Finally, Brian Carlson has written Postgres-Node (note the dash).  This is inspired by Tim Caswell’s take, but has an extensive test suite.  It even has (occasionally confusing) documentation.  Even better, it’s the only one that actually allows you to stream results.  To date, I haven’t managed to break it at all.  I highly recommend using this one.

Technorati Tags: ,