Well, he’s right and he’s wrong. He’s right in that data models (and especially database models) can and should be private to the application. In practice, they never are. There’s always someone adding a “quick fix” piece of functionality on the side. Let’s look at the main categories:
- Importing data into the system. (Including data entry systems.)
- Adding a separate program e.g. Invoicing added to a tracking system
- Exporting data to other systems.
Now, of these, the first is the only one with a right to exist. The only real problem is that people don’t treat it as a proper project. The second is a hugely bad idea, because the database is not a good API integration point. The data model you’ve used for the tracking system isn’t the model needed for invoicing, you’ve basically started down the route of trying to develop the canonical One True Data Model for your business. (Udi Dahan makes a remark in his latest post about this being a red herring.) I’m going to slightly labour this point, but it’s important and counter-intuitive to most developers, including the younger me. Beyond a certain point, trying to standardize the data that your system uses is an exercise that possibly results in intellectual satisfaction and may look, on paper, very impressive, but in fact makes communication extremely difficult and tedious. If you try to build the Tower of Babel, Babel is exactly what you’ll get.
The third is functionality that needs to be there, but it shouldn’t be hitting the database directly. No, not even a “quick export”.
Reporting, on the other hand, is the devil. Anyone who, like me, worked on VB3 back in the 90s will know that Crystal Reports is the devil. Unfortunately, some people remember technical limitations as being the problem, not that reporting packages are inherently diabolical. Business Objects, MS-SQL Reporting Services, the lot. That’s not to say they’re not powerful and occasionally useful, but I highly recommend using a long spoon.
So, why are they so dangerous. Well, we need to think about what they’re actually used for. In practice, they’re usually used as a short-cut to add an extra feature (Management Reporting, Invoicing) or as a data export feature. (MS-SQL Reporting Services is actually really good at this.) And how do they achieve it? By direct data binding. Yup, the very technique that agilists, alt.netters and anyone with an ounce of self-respect has been trying to eliminate from our arsenal of techniques for years.
Direct Data Access
Now, lets go back to Oren and Stephen’s disagreement. Let’s look at why people want to access the database directly.
- It’s easy to develop.
- It’s ready right now.
- It’s stable (if the database wasn’t there, you wouldn’t have an application anyway)
The problem is, you’re running up a huge balance on the credit card of technical debt. Here’s some problems:
- You’ve just made whatever database structure you have right now a de-facto specification.
- It’s unlikely you have any real idea of who is accessing your database or why.
- The external systems don’t synchronize through your business logic (this is not so bad for read-only scenarios, but can be lethal in writeable scenarios.)
- You’ve no control over exactly what locks are getting put on your DB.
Now, I have to admit, when I need to get data from an external system, I often request direct database access. Why? Because I’m not the one who’s going to suffer when these problems come up. On the other hand, I fight tooth and nail when the shoe is on the other foot.
Getting that spoon out
One of the smartest things my last company did was to develop a database specifically for management reporting. It was being constantly changed, but the truth is that those costs would have been there anyway, just not as visible. It also enabled us to highlight that concepts in one part of the business were subtly different from what appeared to be identical concepts in another part. Of course, it can be quite hard to get people to buy into this for “just a quick report”, which means you’re usually better arguing that it would be cheaper to add the functionality directly to the application. After all, the implementation of a proper reporting package can be quite time-consuming.
Arguing for a separate database to contain data “you’ve already got” can be a hard sell, especially to non-technical types. As with all sales, however, you’re probably best off differentiating the systems. Selling an OLAP cube is, ironically, easier, even if you don’t think the requirements really justify it. However, reporting requirements only ever grow. Seven year old reports are still being used by someone in the organization, even if you don’t know their name.
And if you’ve already got a system that reporting has got its tentacles into? Then I’m sorry, but it’s going to take a lot of spade work to dig out of that hole.