Let’s Fire The DBAs!

I am being deliberately provocative in my title. I love databases, and I think SQL is fun. I love data modeling, and I have often expressed the opinion that many code developers do not have proper respect for the database. They tend to build tables and throw them under their code in a haphazard and ill conceived manner. This may not be your opinion, and that’s fair, but it is mine.

I have also expressed another opinion. Database administrators are a pain in the hindquarters for developers. In a developer role, I have had a couple of DBA refuse to understand or discuss my requirements. For example, I needed to store application specific information in the database, and the DBA told me that that is not what a database is for. Please understand that he has long since retired, and he was a dinosaur even then, but I would have been much happier not to have to deal with him.

Another DBA seemed uncomfortable with the idea of letting me write my own stored procedures. After all, the database was his domain. He also expressed a desire not to have to maintain my stored procedures – “databases live a long time in this company, and I do not want to tie myself to the needs of the application you are building.” He was also disgusted by my team’s use of “Microsoft” products.

Another time, a DBA brought the database down to move it to another server, and then refused to help me connect to the new instance unless he received a formal request from my manager. I explained that I had had access before, but he explained that we had to comply with the policy. I was two days away from delivering new functionality, and I missed my deadline because of the time I lost. The most inflexible, antiquated and unhelpful people I have ever had to deal with have database administrators – not so much now as five to ten years ago.

Personally, I love databases. I love designing them, and using them, but now I can see that the relational database is not the only solution. Recently, I played with pytables, and I can see using it for storing and analyzing large datasets. I have been watching Microsoft’s LINQ initiative, and I have a serious lust to give it a try – I am excited about doing in code what I could do with stored procedures before.

As I contemplate all of the tools out there, including Oracle’s Berkeley DB, I notice that everyone is bragging about unattended execution, or zero administration. Nobody loves the poor old DBA. My choice for an embedded database would be DB4O. DB40 provides the means to query the database. The product is much faster than a version played with almost 2 years ago, but the selling feature for me is that the product provides the means to query the database using objects. I am looking forward to looking at LINQ when it becomes available, but DB4O is not tied to the .NET framework – it works in Java as well as .NET, and it can run on Linux.

There are some new things to get used to with these frameworks. Constraints are expressed differently, and, modeling these applications could be a challenge. Obviously, you can use UML to model objects, and you can create new stereotypes to express new concepts, but an existing library of best practices for this new approach does not exist. It’s still the wild, wild west folks. Developers continue to be in demand as much for their ability to learn as for what they know. In fact, mere knowledge, or the ability to use certain tools, has a very short shelf life.

If write applications that persist data, if you have never felt comfortable with SQL, or if you want to fire your DBA, have a look at DB4O. The day of the relational database is not over, but perhaps it is time to realize that an RDBMS is not the answer to every problem. DB40 provides the means to manage transactions, and I can see few objections to using it for at least some projects, or at least part of any project.

If I ever get to use DB4O, my biggest challenge would be that I have spent almost 15 years becoming proficient with old-style relational databases. I was doing fine until these new approaches came along. However, change is good – at the very least, not all change is bad. Give it a chance. Stay curious, or perish. Given the choice between DB4O and SQL Lite, an other open source embedded database, I would choose DB4O just for the novelty.

However, SQL Lite is a wonderful alternative for anybody who wants to leverage years of doing things using SQL. It handles transactions, and supports triggers. I suppose, if you need to support concurrent access, you can handle that in the code that accesses the database. I mention this product because it is blazingly fast and I like it. There are plenty of cases where SQL Lite will do, and I do better writing SQL than I do learning new ways to query my data.

By the way, DB4O is available under the GPL, but it also possible to buy a commercial license if you want to close the code. Is that open-minded enough for you? SQL Lite is available even if you want to use it for a commercial product with closed source. As readers of this blog have noticed, I am interested in many technologies, and I try to play with them when I can just for fun. I believe it keeps my brain nimble and able to keep learning new things. Let a thousand flowers bloom!

SQLAlchemy: Continuing to Learn

Ok – my work with SQLAlchemy is continuing. So far, the experience has been one of advantages and disadvantages.

The Advantages/The Plus Side:

  • My framework can use several different database engines without my having to change my code. I simply change a connection string.
  • I have a way of thinking that is peculiar to me, and this experience with SQLAlchemy is helping me to understand how others think. I frequently need to work with code that was written by others, and SQLAlchemy represents an approach that is not going to go away. I might as well learn about it. (The last project I worked on used a framework that had many of the same features as SQLAlchemy.)

The Disadvantages:

  • If I had developed my code using the approach I am used to using, I would be done now. (Of course, I would also have tied myself to a specific database product.)
  • I am having to flip through manuals to write the simplest snippets of code. I already know how to write the SQL, and I already know how to write stored procedures to do what I want to do. I am frustrated at times, but I will keep with it.
  • I can already see places where performance degrades because of SQLALchemy. I pride myself on databases that perform very well – and I like to tune databases.

But, in the end, having the freedom to use different database engines means a lot to me. I like the fact that I have postponed the decision of commiting to an underlying database. I cannot say that I regret using SQLAlchemy. I may want to share my code at some point, and I prefer to let users of the code choose whatever relational database they like.

On the other hand, I like using special features of a database – SQLAlchemy ties my hands a little.

Many people develop a database along with an application. This means that the database and the application are of a piece. I often think of the database as separate – it may be used by multiple applications, and it should be designed with this in mind.
In any event, the work continues.

Persistence Vs Analysis

I finally got it! I have been looking at persistence frameworks and various other toolkits, and I have been mentally comparing them to relational databases. Even if the framework worked with a relational database, I have been making the comparison, concluding that I would rather write stored procedures than use the framework.

In fact, I created a quick and dirty little database with the Berkeley DB, and the first thing I did was moan about my limited ability to query records out of the database. However, now I get it. Use a relational database when you need to analyze data; but, use a persistence engine when all you need to do is persist data.

Many applications are based on data that is saved and retrieved by objects, but it is never really analyzed. The only SQL statements that are used are used to retrieve, save, delete and update data. There is no continuing need to write queries against the database.

In that case, using DB40, Berkeley Db, or some other tool makes sense. Even if the tool interacts with an underlying relational database. Perhaps it makes sense not to require a skill that is not necessary – in the long run, that can save money.

I have been working with SQLAlchemy, and I an fairly impressed. This toolkit can be used with two popular Python based web frameworks: Turbogears, and Django. The old dog is learning new tricks. I may even play with Hibernate or NHibernate at some point, but the learning curve seems steep – or, that’s what people say.