I want to load a large number of XML documents (250,000) using javaloader. As I understand that javaloader is not good at handling that much data, I have written a shell script which recursively calls javaloader. Each call to javaloader loads a single instance. The loading went quite well until about 1000 documents had been loaded. After that it seems to hang. There is no error message. When I try to stop the execution by pressing Ctrl+C, there is no response either.
Kindly help me with suggestions.
The number of documents you are trying to load does in fact suggest using something other than the Java loader.
The point is that the Java loader loads the documents one by one: each individual document is inserted into the database, and the index is updated per document as well. This is not the most efficient way to do it.
Please consider the mass loader inoxmld instead. The basic idea behind it is that it loads all the documents in one go, builds an intermediate index, and merges it with the existing index.
If your input data is not in the right format, I would assume it is worthwhile to write a small SAX-based program that reformats your data into the mass loader format.
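To give an idea of what such a SAX reformatter could look like, here is a minimal sketch in Java. It parses each input document with SAX and re-emits its content, without the XML declaration, into one combined output. Please treat everything specific as an assumption: the class name `BulkFileBuilder`, the `combine` method, and above all the wrapper element name are hypothetical. The real input format expected by inoxmld has to be taken from the mass loader documentation, and the output of this sketch adapted to it. Note also that comments and processing instructions are dropped by this simple handler.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: not part of any loader tool, just an illustration.
class BulkFileBuilder {

    /** Re-emits the SAX events of one document as text (no XML declaration). */
    static class CopyHandler extends DefaultHandler {
        private final Writer out;

        CopyHandler(Writer out) { this.out = out; }

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts)
                throws SAXException {
            StringBuilder sb = new StringBuilder("<").append(qName);
            for (int i = 0; i < atts.getLength(); i++) {
                sb.append(' ').append(atts.getQName(i))
                  .append("=\"").append(escape(atts.getValue(i))).append('"');
            }
            write(sb.append('>').toString());
        }

        @Override
        public void endElement(String uri, String local, String qName) throws SAXException {
            write("</" + qName + ">");
        }

        @Override
        public void characters(char[] ch, int start, int length) throws SAXException {
            write(escape(new String(ch, start, length)));
        }

        private void write(String s) throws SAXException {
            try {
                out.write(s);
            } catch (IOException e) {
                throw new SAXException(e);
            }
        }
    }

    /** Escapes the characters that must not appear literally in text or attributes. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    /**
     * Writes all documents, one per line, inside a single wrapper element.
     * The wrapper name is an assumption; use whatever the mass loader expects.
     */
    static String combine(List<String> documents, String wrapper) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        StringWriter out = new StringWriter();
        out.write("<" + wrapper + ">\n");
        for (String doc : documents) {
            // A fresh parser per document keeps the sketch simple.
            factory.newSAXParser().parse(new InputSource(new StringReader(doc)),
                                         new CopyHandler(out));
            out.write("\n");
        }
        out.write("</" + wrapper + ">\n");
        return out.toString();
    }
}
```

For 250,000 files you would of course read from disk and write to a `FileWriter` instead of building one big string in memory; the handler stays the same.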
Please check the mass loader documentation for further information on this topic.
If you have further questions, fire away.