Is it possible to stream data via a remote invoke?
I ask because, according to the docs, ‘All current pipeline inputs are passed to the remote service.’
Yet a remote invoke with a stream always fails in testing, while the same services work fine with a byte object!
The reason for asking is that we need to transfer large files between different wm servers, around 800MB+ per file, so the intention was to simply stream the data from one server to the other.
If streaming doesn’t work via remote invoke, is there an alternative method to get large files from one wm server to another?
To test this, I wrote a simple flow service that accepts a string (only a few bytes long for testing), does a pub.stringToBytes, then runs a pub.remote.invoke to a flow service on another system. The remote service does a pub.bytesToString and a pub.flow.debugLog, mapping the string to message. I figured I’d keep it as simple as possible for testing. This works fine, so I know the connection, remote server alias, etc. are all OK.
I then added a pub.bytesToStream to the end of the local flow service, and a pub.streamToBytes to the start of the remote flow service, expecting it to carry on working, just using the stream rather than the byte object to transfer the data. But I get this error:
Could not run ‘dataFeedTesting’.
(‘dataFeedTesting’ being the name of the local flow service that runs the remote invoke)
If I disable the two stream steps, it all starts working again!
Streams are not serializable. A stream is only valid in the context of the JVM in which it was created. A stream, particularly a stream associated with a file, cannot be passed from one JVM to another.
If you want to use pub.remote:invoke, the data you pass between the servers must be completely in memory. IS is essentially serializing the pipeline from A to B and B reconstructs the pipeline. An 800MB+ byte array is not a good candidate for this sort of operation.
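To illustrate the point about serialization, here is a minimal sketch in plain Java. It uses standard Java object serialization as an analogy only; I am not claiming this is the wire format IS uses for pub.remote:invoke. The class name, the `canSerialize` helper and the map standing in for the pipeline are all made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

public class PipelineSerializationDemo {

    // Attempts to serialize the given object graph; returns false if any
    // object inside it does not implement java.io.Serializable.
    static boolean canSerialize(Object value) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(value);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = "hello".getBytes(StandardCharsets.UTF_8);

        // Hypothetical stand-in for a pipeline carrying a byte array.
        Map<String, Object> withBytes = new HashMap<>();
        withBytes.put("bytes", data);

        // Same data, but wrapped in a stream object.
        Map<String, Object> withStream = new HashMap<>();
        InputStream stream = new ByteArrayInputStream(data);
        withStream.put("stream", stream);

        System.out.println("byte[] serializable: " + canSerialize(withBytes));   // true
        System.out.println("stream serializable: " + canSerialize(withStream)); // false
    }
}
```

The byte array goes through because its full content can be written out; the stream is just a handle to something living in the local JVM, so there is nothing meaningful to send.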
To be memory efficient, you can use the http or ftp services that accept streams as input (on IS A). On IS B, you can control how the incoming request handles the data (refer to the wM docs).
Another approach is to have IS A tell IS B where the file is and let B go get it. For this to work, you’ll either need a shared network device that both have access to, or B can ask for the file from A using FTP services (with that approach the data is a bit easier to manage than when A pushes to B).
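As a rough idea of what "memory efficient" means here, the sketch below is plain Java outside of IS, not the built-in wM http services. It copies a file to an HTTP endpoint through a fixed-size buffer, so memory use stays around the buffer size regardless of file size. The class name, the 64 KB buffer and the target URL passed on the command line are assumptions for illustration; how IS B exposes the request body is something to check in the wM docs.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class StreamingUpload {

    // Copies in to out through a fixed-size buffer; returns the bytes copied.
    static long copy(InputStream in, OutputStream out, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        return total;
    }

    // POSTs the file using chunked transfer encoding so the JDK does not
    // buffer the whole body in memory before sending. Returns the HTTP status.
    static int upload(Path file, URL target) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) target.openConnection();
        conn.setDoOutput(true);
        conn.setRequestMethod("POST");
        conn.setChunkedStreamingMode(64 * 1024);
        conn.setRequestProperty("Content-Type", "application/octet-stream");
        try (InputStream in = Files.newInputStream(file);
             OutputStream out = conn.getOutputStream()) {
            copy(in, out, 64 * 1024);
        }
        return conn.getResponseCode();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.out.println("usage: StreamingUpload <file> <url>");
            return;
        }
        int status = upload(Paths.get(args[0]), URI.create(args[1]).toURL());
        System.out.println("HTTP status: " + status);
    }
}
```

The important part is the loop in `copy`: only one buffer's worth of the file is in memory at any moment, which is the opposite of loading an 800MB+ byte array into the pipeline.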
Hope this helps.
Thanks for that, I think I’ll go down the route of A telling B there’s a file, then B pulling the file.
I’ve already written a token-based routing process for handling large files (it creates a token, then passes the token through wM rather than the data), so I can adapt this process to work via remote invokes.
We can’t use shared network devices or FTP due to security constraints, so I’ll probably use sftp (ssh) instead, as I’ve already written transfer processes for sftp.
So the plan is: create a token on A and pass it to B as a byte object via a remote invoke. B reads the token, which tells it where the file is on A, then initiates an sftp pull of the file from A to B and continues processing from that point onwards.
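For anyone following along, the token part of that plan could look something like the sketch below. This is not my actual routing process, just a minimal illustration: the `FileToken` class, its fields (id, host, path, size) and the `java.util.Properties` encoding are all assumptions picked for the example, and the sftp pull itself is left out.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Properties;
import java.util.UUID;

public class FileToken {
    final String id;    // unique token id
    final String host;  // server holding the file (A)
    final String path;  // location of the file on A
    final long size;    // file size in bytes, useful for a sanity check after the pull

    FileToken(String id, String host, String path, long size) {
        this.id = id;
        this.host = host;
        this.path = path;
        this.size = size;
    }

    // Created on A once the file is in place.
    static FileToken create(String host, String path, long size) {
        return new FileToken(UUID.randomUUID().toString(), host, path, size);
    }

    // Small byte array that is safe to pass through a remote invoke.
    byte[] toBytes() {
        Properties p = new Properties();
        p.setProperty("id", id);
        p.setProperty("host", host);
        p.setProperty("path", path);
        p.setProperty("size", Long.toString(size));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            p.store(out, null);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Rebuilt on B; B then knows which host and path to pull from.
    static FileToken fromBytes(byte[] bytes) {
        Properties p = new Properties();
        try {
            p.load(new ByteArrayInputStream(bytes));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new FileToken(p.getProperty("id"), p.getProperty("host"),
                p.getProperty("path"), Long.parseLong(p.getProperty("size")));
    }

    public static void main(String[] args) {
        // Host and path below are placeholders, not real systems.
        FileToken onA = FileToken.create("serverA.example", "/data/out/big.dat", 800L * 1024 * 1024);
        FileToken onB = FileToken.fromBytes(onA.toBytes());
        System.out.println("B would pull " + onB.path + " from " + onB.host);
    }
}
```

The token is only a few hundred bytes however big the file is, so the remote invoke stays cheap and the heavy transfer happens over sftp.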