The massload utility inoxmld is for loading masses of XML documents, not for loading one (or a few) massive XML documents:
The massload utility inoxmld can be used to load masses of XML documents into Tamino, e.g. 7000 documents with a total size of 230 MB in one file.
I set up a 10 GB Tamino DB containing 155000 documents with inoxmld and it worked fine. It’s usually the document size that causes problems:
In general IT IS NOT A GOOD IDEA to use large XML documents (e.g. 10 MB). Using one large XML document (instead of many small documents) is bad XML / database design and has many disadvantages:
- the time to retrieve the whole document is high
- the time to update the document is high
- the amount of memory needed is high (try to view a 10 MB XML document in Internet Explorer and look at the amount of memory this operation consumes…)
- locking prevents other users from working: usually a user changes only a part of a very large document, but other users cannot look at or change the other parts while the document is locked.
I strongly recommend changing the XML design, because one runs into serious trouble when working with very large documents. Here is an example of how using several schemas instead of one big schema can reduce problems:
OLD XML design:
1 schema, 1 document stored in the whole database, size of the document: 30 MB
This is really bad because the whole database contains just one document!
NEW XML design:
4 schemas, 15000 documents
This means that by using 4 schemas instead of just one, the number of documents increased from 1 to 15000! This is a much better database design. Who needs a database / XML server if only a few documents are stored in the whole database / XML server?
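To illustrate the redesign, here is a minimal Python sketch of splitting one big document into many small ones before loading. It is not part of Tamino or inoxmld; the element names "catalog" and "book" and the helper split_records are made up for the example, and it assumes the records are direct children of the root element.

```python
import xml.etree.ElementTree as ET


def split_records(xml_text, record_tag):
    """Return every direct child <record_tag> of the root as its own XML string."""
    root = ET.fromstring(xml_text)
    documents = []
    for child in root.findall(record_tag):
        child.tail = None  # drop the whitespace that followed the record in the big document
        documents.append(ET.tostring(child, encoding="unicode"))
    return documents


# Hypothetical data: a single "catalog" document that holds every "book" record.
big_document = """<catalog>
  <book id="1"><title>First</title></book>
  <book id="2"><title>Second</title></book>
  <book id="3"><title>Third</title></book>
</catalog>"""

small_documents = split_records(big_document, "book")
for doc in small_documents:
    print(doc)
```

Each resulting string is a well-formed document of its own, so it can be stored, retrieved, updated and locked independently of the others. For an input that is too large to parse comfortably in one go, ET.iterparse could be used in place of ET.fromstring to process the records one at a time.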
Of course one might argue that one doesn’t need an XML server (but can use a relational DB instead) if one has to use several schemas anyway. Well, there are BIG differences between the number of tables you need for a relational DB and the number of schemas needed for Tamino, as Tamino needs far fewer schemas than an RDB needs tables. Thus Tamino’s performance is much better (especially when the XML structure is complex) than that of any RDB (and Tamino is much easier to handle, of course).
I hope this prevents others from running into trouble in the future,