universal receive service

I need to develop a universal receive service that accepts multiple content types (xml/plain, text/xml, application/x-wmflatfile, etc.) submitted via multiple protocols: HTTP, FTP, file polling, or email.

Sound familiar? Trading Networks (TN) would be nice here, but long story short, we don’t have it and can’t get it.

I have been working through several POCs to prove out a solution and am running into some issues. I would like to avoid writing a custom content handler for every content type we need to receive. I have restricted the number of content types we will allow, but because of shortcomings at our trading partners, setting content types properly, or setting them at all, seems to be a difficult if not impossible task for them.

So in general, my requirement is to implement a single service that receives XML or flat files with several content types (currently restricted to text/xml, xml/plain, text/plain, and application/x-wmflatfile). Upon receipt of a file, it should check the size against a threshold and terminate when the threshold is reached. Otherwise, based on the content type, sender, and filename received via email, HTTP, or FTP, it should convert the data into a wM document (i.e., map contentStream, ffdata, or node to the wM services that convert it to a document, then proceed with business logic processing).

The current POC involves a Java service that checks whether contentStream, ffdata, or node is populated and then checks its size against a threshold. I fully realize that trying to restrict inbound files to a particular threshold at that point is almost pointless, since the content handler has already loaded the data into memory. So I would like to find a way to restrict large documents.
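For the cases where the payload arrives as a stream rather than as an already-built object, the size check can happen while reading instead of afterwards. The following is only a minimal sketch in plain Java; `BoundedReader` and `readUpTo` are hypothetical names, not webMethods APIs, and it does nothing for input that a content handler has already loaded in full.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical helper, not part of any webMethods API. */
final class BoundedReader {
    private BoundedReader() {}

    /**
     * Reads the stream into memory but stops as soon as more than
     * maxBytes have been seen, so an oversized payload is never
     * buffered in full.
     */
    static byte[] readUpTo(InputStream in, long maxBytes) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            total += n;
            if (total > maxBytes) {
                // Abort before appending the chunk that crosses the limit.
                throw new IOException("Payload exceeds limit of " + maxBytes + " bytes");
            }
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }
}
```

With this approach the service holds at most `maxBytes` plus one read buffer in memory before rejecting the file.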

Any suggestions would be greatly appreciated.

One suggestion would be to not go down this path.

Rather than have one service try to figure out everything, it does not seem unreasonable to have a handful of entry services that “normalize” the input to some common form or variable and then call a common service that does what you outline: convert to an IS document (there is no such thing as a “wM document”) and invoke the appropriate service for the logic processing.
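To make the idea concrete, here is a rough sketch in plain Java of what the normalized form might look like. Everything in it is an assumption for illustration: the `Inbound` class, its fields, and the fallback of peeking at the payload when the content type is missing or generic are not webMethods features, and in IS the same shape would more likely be a document type passed between services.

```java
import java.util.Locale;

/** Hypothetical normalized envelope that every entry service would produce. */
final class Inbound {
    enum Format { XML, FLAT_FILE }

    final String protocol;  // e.g. "http", "ftp", "file", "email"
    final String sender;
    final String fileName;  // may be null for HTTP posts
    final Format format;
    final byte[] payload;

    Inbound(String protocol, String sender, String fileName,
            String contentType, byte[] payload) {
        this.protocol = protocol;
        this.sender = sender;
        this.fileName = fileName;
        this.payload = payload;
        this.format = classify(contentType, payload);
    }

    /** Maps the declared content type to a format; guesses when it is absent or generic. */
    static Format classify(String contentType, byte[] payload) {
        String ct = contentType == null ? "" : contentType.toLowerCase(Locale.ROOT);
        int semi = ct.indexOf(';');          // drop parameters such as "; charset=utf-8"
        if (semi >= 0) ct = ct.substring(0, semi);
        switch (ct.trim()) {
            case "text/xml":
            case "xml/plain":
                return Format.XML;
            case "application/x-wmflatfile":
                return Format.FLAT_FILE;
            default:
                // text/plain, missing, or unknown: fall back to a crude content check.
                return looksLikeXml(payload) ? Format.XML : Format.FLAT_FILE;
        }
    }

    /** Crude check: first non-whitespace byte is '<'. Does not handle a byte-order mark. */
    static boolean looksLikeXml(byte[] payload) {
        for (byte b : payload) {
            if (!Character.isWhitespace((char) b)) {
                return b == '<';
            }
        }
        return false;
    }
}
```

Each entry service (HTTP, FTP, file polling, email) would only be responsible for filling in this envelope; the common service then switches on `format`, `sender`, and `fileName` to pick the conversion and the business logic.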

Then you can be selective about which content handlers, if any, you override to handle large docs. The built-in large doc handling should take care of most of the needs, and you can augment the places where it does not. For example, you could create your own XML content handler that writes the data to disk instead of creating a node object.
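The write-to-disk part of such a handler could look roughly like the sketch below. It is plain Java only: `DiskSpooler` is a hypothetical name, and how a custom content handler is implemented and registered in Integration Server is deliberately not shown here.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper: streams an inbound payload to a temp file instead of memory. */
final class DiskSpooler {
    private DiskSpooler() {}

    /** Copies the stream to a new temp file and returns its path. */
    static Path spool(InputStream in) throws IOException {
        Path tmp = Files.createTempFile("inbound-", ".xml");
        try {
            // createTempFile already created the file, so REPLACE_EXISTING is required.
            Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
            return tmp;
        } catch (IOException e) {
            // Do not leave a partial file behind if the copy fails.
            Files.deleteIfExists(tmp);
            throw e;
        }
    }
}
```

The handler would then hand the file path (or a stream opened on it) to the receiving service, which can check the file size before deciding whether to parse it at all.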

Just a thought.