?

Log in

No account? Create an account

The Case for the Relational Database

« previous entry | next entry »
Jul. 18th, 2007 | 04:25 pm

My framework of choice at the moment is called "everything". Everything has very few users in the world, but it holds to a certain number of
design characteristics that I like.

  • It deals with entities as objects
  • Objects are inheritable
  • Easy to hack
  • Revision control on objects

    If I were a Java developer, which I have not been in more then a
    decade, I would use Hibernate. Hibernate has many of the design characteristics that I list. Similar frameworks exist for PHP and Python.

    Why do I like the approach of using objects? Objects are entities that I can serialize and store in caches as a single discrete item. Object caches are common in web architectures now(aka Memcached and other similar creatures). Object store works well, and it is constantly improving.

    Many web infrastructures can be run out of object store systems, but not all.

    Many, and I would go as far as to say most, web applications require search. Some search like capabilities can be done using Lucene or other fulltext search engines. Searches from these systems are fuzzy. Google, which uses a full text search, ranks and filters based on the loose structure of objects. It does this well.

    Systems like Google and Lucene can frequently be "good enough" for simple problems.

    However full text approaches cannot give you the exact number of objects that you have with a specific attribute with assurance.

    Needing this assurance is a requirement that requires an accurate search. Frequently this requirement reflects the need for a more analytic approach to finding results, or a need to find ranges of results.

    A search for an exact result in complex objects requires a relational database. This is an accurate search, not a fuzzy search.

    The requirement for a relational database is grounded in the need for an accurate results.

    Knowing that you need an accurate result leaves with the question of how you will make use of a relational database.

    Fundamentally there are two approaches to object store and search:

  • Serialize all aspects of the object into a relational database.
  • Store objects serialized into a container and pluck out components for search.

    I go with the first approach. Relational databases have a long history of handling durable store. Adding a new quick search means adding an index (in object stores you lack indices, so you can conceptually consider a non-indexed relational query and an object store search as being equivalent). Once you have an index, you have a speedy vector to search on.

    The relational nature of storing objects in a database means that you can look at partial stores of objects. You can even de-normalize tables for further refinement.

    For instance, if I have a "user", I can look at a subset of the attributes of "user" to find, for instance, males who like climbing, and who live in the city of Seattle. By keeping all elements de-serialized in a relational database, I never have to worry about whether an attribute is in the database or not.

    If I use an object store, then I am required to map out my attribute storage ahead of time. There is no flexibility in this approach, since I need to know going into the design which attributes will need to be searchable.

    There are advantages to leading with object stores.

    Object stores have none of the overhead of having to rebuild themselves from many tables (or even attributes from a single table). This is a big win, and it’s required if your web site/application in usage (everyone wants to be Google right?). Object stores excel at speedy retrieval of single objects (or non-grouped objects).

    The approach I lead with makes good use of the nature of object stores without harming the flexibility you gain by using a relational database. In this approach you cache objects in object stores, but maintain durability and accurate search in a relational database. Having durability be maintained in your database means that you can boost the object store's performance design by not having to be concerned about this issue in your object storage.

    The reverse of this approach would mean that you could decide to not make your databases durable. Choosing to not have your database be durable can be expensive. Rebuilding your database is considerably more expensive, then generating new objects from it.

    The big win for me is flexibility. Having access to all attributes in the relational store at my finger tips is why I stick with relational databases. That flexibility is something I require because it saves me time. Also, as tools go, SQL has been around for a while. I am not in love with SQL, but it is very flexible, and it is well understood. Tools for SQL are plentiful and easy to build.

    So when I am asked "is there a case for getting rid of relational
    databases?", I answer no.

    Web architecture cannot be done in a way that leaves you with a flexible environment without using a relational database.
  • Link | Leave a comment |

    Comments {0}