TAMINO vs relational database

Hello,

I would like to start a discussion about a question I ran into while using TAMINO!

When do you think it is useful to use TAMINO (an XML database), and when is it more reasonable to use a relational database?

For example: is it reasonable to use TAMINO for storing client information, or any other information that would usually be stored in 3-4 tables in an SQL database?

best regards,
Mark

This is an excellent question but not an easy one to answer quickly!

There are a number of factors: for example, you will typically have a wide variety of different kinds of data in your organisation, some of which are more suited to an XML database and others more suited to a relational database. So do you use the XML database for everything, or the relational database for everything, or have two databases?

The kind of data that is best suited to Tamino is, in my view, data that you can naturally think of as “documents”: things like patient records, health and safety reports, invoices, job applications, product data sheets. For the simple data (what I like to call the “ledger-books”, the row and column data), a relational database will do the job better; but that doesn’t mean you shouldn’t use Tamino for this kind of data if you need it there to do the more complex stuff.

And of course Tamino isn’t just a data store, it’s an integration platform that you can use to bring different kinds of data together.

I agree with Mike Kay’s points. Remember that the whole point of Tamino is to store XML data easily, reliably, and so that it can be queried efficiently. If you simply have an application and need to persist its data, as a practical matter the RDBMS world offers a better-understood model, more powerful methodologies, a much wider range of tools, etc. If you can normalize your data into 4 tables, there’s no compelling reason to use Tamino … it will work, of course, but it would be hard to present a business case showing any tangible advantage of Tamino over some other DBMS.

XML and Tamino really start to “show their stuff” when the data model becomes too complex to fit easily into the relational paradigm. I believe that E. F. Codd proved a theorem that any data can be normalized into relational form, but as a practical matter relational normalizations can explode into hundreds of tables, with correspondingly complex queries, where the corresponding XML representation is much more straightforward. Recursive data is one particular area that SQL handles very awkwardly but XML handles cleanly and gracefully. For example, a “bill of materials” application is a classic “hard problem” for SQL; RDBMS programming books typically devote whole chapters to normalizing and querying data that is natively represented as a tree. Queries such as “find all the components that contain a given subcomponent somewhere inside them” or “find all the subordinates of Vice President Schmidt in the organization chart” are very challenging in SQL but simple in XPath.
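
To make the bill-of-materials query concrete, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree. The element name `component`, the `id` attribute and the sample data are all invented for illustration. ElementTree only supports a subset of XPath, so the "contains somewhere inside" test is written with `iter()`; in a full XPath engine the same question would be roughly `//component[.//component/@id='bolt']`.

```python
import xml.etree.ElementTree as ET

# Hypothetical bill of materials: nesting expresses "is part of".
BOM = """<component id="bike">
  <component id="wheel">
    <component id="hub">
      <component id="bolt"/>
    </component>
    <component id="rim"/>
  </component>
  <component id="frame">
    <component id="bolt"/>
  </component>
  <component id="saddle"/>
</component>"""


def containers_of(root, target_id):
    """Return ids of components that contain target_id at any depth below them."""
    result = []
    for comp in root.iter("component"):  # document order, root included
        # iter() also yields the element itself, so exclude it explicitly.
        descendants = (d for d in comp.iter("component") if d is not comp)
        if any(d.get("id") == target_id for d in descendants):
            result.append(comp.get("id"))
    return result


root = ET.fromstring(BOM)
print(containers_of(root, "bolt"))  # ['bike', 'wheel', 'hub', 'frame']
```

The tree walk needs no knowledge of how deep the nesting goes, which is the part that takes extra machinery (self-joins or recursive queries) when the same data is stored as parent/child rows.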

There are deeper mismatches between the relational model and the XML approach that I haven’t seen discussed much and haven’t thought through myself. For example, the relational model has a well-defined discipline for designing databases to minimize redundancy. In XML, you have various options, none of them without problems. For example, you can generally push potentially redundant data common to a number of elements to a position in the XML tree above all those elements, but this doesn’t always work, and might lead to designs that are semantically misleading. Alternatively, XML provides a number of mechanisms – external entities, XInclude, XLink embedding – for referencing a single master source of data in multiple contexts… but we haven’t worked out how to do the equivalent of a JOIN so as to make queries work easily, efficiently, and properly with this kind of construct.

Likewise, XML has no methodology for enforcing referential integrity … in fact, one might argue that XML’s implicit definition of “integrity” is in terms of documents as a whole rather than abstract data relationships. This is very murky to me, but it may suggest that XML/Tamino is more useful when you have “documents” that are defined, stored, digitally signed, etc. AS A UNIT … where RDBMS models/systems are more appropriate when you have a bunch of data that is used in different ways by different applications, and you need a way of enforcing a more complex model of “integrity”.
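
The last two points (no ready-made JOIN across references, and nothing enforcing referential integrity) can be illustrated with a small Python sketch. The element names `customer` and `order`, the `customerRef` attribute and the data are hypothetical, and this is plain application code, not anything Tamino-specific. It only shows that when a master record is referenced rather than nested, the application has to resolve the references and detect dangling ones itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical master data, kept in one place to avoid redundancy.
CUSTOMERS = """<customers>
  <customer id="c1"><name>Acme</name></customer>
  <customer id="c2"><name>Globex</name></customer>
</customers>"""

# Hypothetical documents that point at the master data by id.
ORDERS = """<orders>
  <order id="o1" customerRef="c1"/>
  <order id="o2" customerRef="c2"/>
  <order id="o3" customerRef="c9"/>
</orders>"""


def join_orders(customers_xml, orders_xml):
    """Hand-written 'join': resolve each order's customerRef to a customer name."""
    customers = ET.fromstring(customers_xml)
    orders = ET.fromstring(orders_xml)
    names = {c.get("id"): c.findtext("name") for c in customers.iter("customer")}
    joined, dangling = [], []
    for order in orders.iter("order"):
        ref = order.get("customerRef")
        if ref in names:
            joined.append((order.get("id"), names[ref]))
        else:
            # Nothing in the XML itself prevented this broken reference.
            dangling.append(order.get("id"))
    return joined, dangling


joined, dangling = join_orders(CUSTOMERS, ORDERS)
print(joined)    # [('o1', 'Acme'), ('o2', 'Globex')]
print(dangling)  # ['o3']
```

In a relational schema the lookup would be a single JOIN, and a foreign-key constraint would have rejected order `o3` at insert time.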

As for performance, the more complex and hierarchical the data, the more Tamino will outperform an RDBMS (even an XML-enabled RDBMS). Conversely, the more easily data fit into neat rows and columns without extensive normalization and optimization, the more likely an RDBMS is to outperform Tamino.

You mentioned the prerequisite for choosing an XML server over an RDBMS:
- document-like data (recursion, sequential order, textual markup (paragraphs, bold, italic, URLs, tables, …))

Given that prerequisite, a solution provider (as opposed to a solution consumer) has a very good reason to go for an XML server: he is going to use
- XML for modelling/analysis
- XML for information exchange
- XML for XSLT rendition techniques
- XML for structured authoring
…so he is going to implement an XML serialization of his objects anyway.

Going for Tamino means that implementing searchable persistence is “no more work than serializing the data and deciding what to index”.
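
As a rough illustration of what “serializing the data and deciding what to index” amounts to, here is a toy sketch in Python. The `Client` class, its fields and the `ToyXmlStore` class are all hypothetical; the store is an in-memory stand-in written for this example and is not Tamino's API. The only design decision the application makes is the list of indexed paths.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Client:
    name: str
    city: str
    notes: str


def to_xml(client):
    """Serialize a Client to XML (work the application does anyway)."""
    root = ET.Element("client")
    for field in ("name", "city", "notes"):
        ET.SubElement(root, field).text = getattr(client, field)
    return ET.tostring(root, encoding="unicode")


class ToyXmlStore:
    """Hypothetical in-memory stand-in for an XML store; NOT Tamino's API."""

    def __init__(self, indexed_paths):
        self.docs = []
        self.indexed_paths = indexed_paths  # the "what to index" decision
        self.index = defaultdict(list)      # (path, value) -> document numbers

    def insert(self, xml_text):
        doc_no = len(self.docs)
        self.docs.append(xml_text)
        root = ET.fromstring(xml_text)
        for path in self.indexed_paths:
            for node in root.findall(path):
                self.index[(path, node.text)].append(doc_no)
        return doc_no

    def find(self, path, value):
        """Look up documents through the index only."""
        return [self.docs[n] for n in self.index.get((path, value), [])]


store = ToyXmlStore(indexed_paths=["city"])
store.insert(to_xml(Client("Alice", "Brussels", "prefers e-mail")))
store.insert(to_xml(Client("Bob", "Ghent", "")))
print(store.find("city", "Brussels"))
```

There is no table design and no object-to-column mapping in this sketch; that mapping step is the cost the post argues an XML server removes.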

→ an XML database server is attractive for a solution provider because it removes the cost of working out an optimal mapping of the data onto persistent stores and indexed fields.

Moreover, the database-level structure in an XML server is directly understandable by the customer's document analyst.


Software AG Belgium, Professional Services Division