Problem loading a large XML document

Hello,

Unfortunately, I am not able to load a large XML document (approx. 125 MB, 3,500,000 lines) into my Tamino DB.

I am using Tamino 3.1.2 on Windows XP Professional, with a 2.4 GHz CPU and 512 MB of RAM.

After starting the loading process in the Interactive Interface, it takes about 15 minutes until an error message such as “unknown error” or “Transaction aborted because it has taken too long” appears.

When I cut the large XML file into 16 smaller files (each 7-8 MB, 220,000 lines), I can successfully load them one after the other within seconds. So there is no problem with the XML file itself or with the schema of my DB.
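In case it is useful, here is roughly how the splitting can be done. This is only a minimal sketch, assuming the document is a flat list of repeated elements under a single root; the names "record", "records", and "large.xml" are placeholders, not my real schema:

```python
import xml.etree.ElementTree as ET

# Placeholder names: "record" and "records" stand in for whatever
# the real schema calls the repeated element and the root.
INPUT = "large.xml"
RECORDS_PER_FILE = 20000

def write_part(part_no, records):
    # Each part gets its own root element so it stays a well-formed document.
    with open(f"part_{part_no:02d}.xml", "w", encoding="utf-8") as f:
        f.write("<records>\n")
        f.writelines(records)
        f.write("</records>\n")

def split(path, per_file):
    records, part_no = [], 0
    # iterparse streams the file, so the whole 125 MB never has to
    # sit in RAM at once.
    for _event, elem in ET.iterparse(path, events=("end",)):
        if elem.tag == "record":
            records.append(ET.tostring(elem, encoding="unicode"))
            elem.clear()  # release the children of records already serialized
            if len(records) >= per_file:
                write_part(part_no, records)
                records, part_no = [], part_no + 1
    if records:  # flush the last, possibly smaller, part
        write_part(part_no, records)

if __name__ == "__main__":
    split(INPUT, RECORDS_PER_FILE)
```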

I have also tried the Mass Loader utility, but that did not help either.

Is the performance of my PC insufficient? Do I need more RAM? Or is there another reason for my problem?

Best regards

Thorsten (NSDB)

Hi,

On a technical level, you are most likely hitting a transaction timeout. The timeout is a server parameter you can adjust (I suggest setting it to a very high value).

On a design level, my experience is that such a big document model usually points to a design flaw. Think about SQL for a minute: SQL stores atomic values in table rows of restricted length. The flexibility comes from storing all the data of one entity together, no less and no more.

Back to XML: when you have a 100 MB document, you have probably lumped together things that do not belong together.

Try to redesign your data model with the aim of creating entities/documents that are less than one megabyte in size. That way you gain the flexibility and speed of the database: Tamino operates much faster on a big set of small documents than on one huge document. With your current design, you would probably end up with ONE document being the answer to all your queries. The gain comes from letting the database sift through stacks of documents and find the selected few that match.
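To illustrate, here is a minimal sketch of what a query against such a redesigned collection could look like from Python, assuming one <order> document per order and Tamino's HTTP interface with its _xql query command. The host, database, collection, doctype, and id value are all placeholders you would replace with your own:

```python
import urllib.parse
import urllib.request

# Placeholder URL: adjust host, database and collection to your setup.
BASE = "http://localhost/tamino/mydb/orders"

# With one document per order, the server only has to locate and
# return the few documents that match, instead of digging through
# one 125 MB document for every query.
query = '/order[@id="4711"]'
url = BASE + "?_xql=" + urllib.parse.quote(query)

with urllib.request.urlopen(url) as response:
    print(response.read().decode("utf-8"))
```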

Best regards,

Andreas