I was wondering what the limit is on the number of files you can have in a database.
I have 50G of data. Should I create ten million 5K files, one million 50K files, 100,000 500K files, or 10,000 5M files?
I did some tests, and it seems the bigger each file is, the longer it takes to load the data. So if ten million is a number Tamino can handle, I’d take that option.
I have enough disk space and memory (4G).
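For reference, each of those options covers the same 50G total; a quick sketch of the arithmetic (decimal units assumed, i.e. 1K = 1,000 bytes):

```python
# Quick check that each file-size option covers the full 50G of data.
# Decimal units (1K = 10**3 bytes) are assumed here.
TOTAL_BYTES = 50 * 10**9  # 50G of data

options = [("5K", 5 * 10**3), ("50K", 50 * 10**3),
           ("500K", 500 * 10**3), ("5M", 5 * 10**6)]

for label, size in options:
    print(f"{label} files: {TOTAL_BYTES // size:,} documents")
# 5K files: 10,000,000 documents
# 50K files: 1,000,000 documents
# 500K files: 100,000 documents
# 5M files: 10,000 documents
```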
The maximum number of documents Tamino can handle is about 4 billion, so you are well below the limit with 10 million.
I can actually split the data however I want, but the question is which one is better:
10,000,000 5K files or 100,000 500K files?
Which one will have a shorter loading process?
Which one will have a better search performance?
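Since the data can be split at any granularity, the choice really is just how many records go into each document. A minimal sketch of such a repartitioning, assuming the data is one XML document with a flat list of children (the `data`/`record` element names are hypothetical, not from Tamino):

```python
# Hedged sketch: repartition the same XML data into documents of a chosen
# granularity before loading. Element names (data, record) are hypothetical.
import xml.etree.ElementTree as ET

def split_records(xml_text, records_per_doc):
    """Split one document with many child records into smaller documents,
    each holding at most records_per_doc records."""
    root = ET.fromstring(xml_text)
    records = list(root)
    docs = []
    for i in range(0, len(records), records_per_doc):
        chunk = ET.Element(root.tag)
        chunk.extend(records[i:i + records_per_doc])
        docs.append(ET.tostring(chunk, encoding="unicode"))
    return docs

# The same data can become many small documents or a few large ones:
sample = "<data>" + "".join(f"<record id='{i}'/>" for i in range(100)) + "</data>"
print(len(split_records(sample, 1)))   # 100 small documents
print(len(split_records(sample, 50)))  # 2 large documents
```

Either output set contains the same records, so the decision can be made purely on load and search performance.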
From what I’ve read (and my own limited experience tends to support this), Tamino handles many small documents better than a few big documents.
I must admit that I would expect the 5K documents to be faster for storage/retrieval purposes.
If indexes are used, then the size of the documents should not really affect search speed.
The general rule of thumb I would apply is that many small documents are generally preferable to one very large document. Remember to apply indexes to the fields you search on.