The pipeline is passed between services?!?

Hi,

I’m new to webMethods, so please forgive me if I ask questions with obvious answers…

I have previously developed in C, C++ and, more recently, Java and C#. The Flow language presents something new: the pipeline.

The 1st thing I discovered is that the pipeline just accumulates all the existing variables unless they are “dropped”. Fine; this shows what variables exist, and it actually forces you to clean them up as soon as you don’t need them, otherwise it’s a mess. Let’s say this is good.

The 2nd thing is much more “surprising”: the pipeline is passed from service to service… What this means is that when service A calls service B, A not only passes B’s input variables to B, it also passes all the other variables in the pipeline.

Concrete example: B takes input x and returns y
1. Pipeline before A calls B: a = 3, b = 5
2. A calls B and sets x to 7
3. B has in the pipeline: a = 3, b = 5, x = 7

So the normal processing could be:
4. Some flow stuff in B
5. Pipeline is cleaned of the new variables created in B
6. B sets y to 8 and returns to A
7. Pipeline in A: a = 3, b = 5, x = 7, y = 8

A few problems occur then:
i) What happens if B deletes or overwrites a variable with the same name as one that was in A’s pipeline before the call (b, for instance)? Then A will not find the expected value when B returns!!!
ii) If B doesn’t clean up its pipeline, then some unexpected variables will be returned to A
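The leakage in ii) can be sketched with a plain Python dict standing in for the pipeline. This is only a simulation of the behavior described above; none of these names are webMethods APIs:

```python
# Simulating the IS pipeline as one shared dict: a called service sees
# everything in it, and anything it fails to drop flows back to the caller.

def service_b(pipeline):
    # B's signature declares only x, but the whole pipeline is visible.
    assert "a" in pipeline and "b" in pipeline
    pipeline["y"] = 8            # declared output
    pipeline["temp"] = "oops"    # undeclared leftover: B forgot to drop it

pipeline = {"a": 3, "b": 5}      # pipeline before A calls B
pipeline["x"] = 7                # A sets B's input
service_b(pipeline)              # A calls B: the whole pipeline goes along

print(sorted(pipeline))          # ['a', 'b', 'temp', 'x', 'y'] -- temp leaked
```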

----------------------------

So that’s for setting the context; now my questions:
1. How can we solve i) and ii)?
2. Why is the pipeline passed from service to service??? Shouldn’t only the inputs be passed?


Thanks for the help


Everything you’ve observed is true, and it serves as a reminder why ‘proper’ languages have the features they do.

One mitigating fact you’ve missed is that of scope. A service works with a COPY of the pipeline. Its outputs are then merged with the caller’s pipeline. So as long as services don’t have any outputs they don’t declare, it’s mostly alright. Make sure you have a step to clean the pipeline as the last step of any flow service!

So if I understand your point, the pipeline is copied over from service to service. This should solve question i) above: if the pipeline of a sub-service is properly cleaned, then nothing is copied back except the input and output, and all is good.
However, after playing around a bit with Developer, I noticed that this doesn’t work for Documents. If service B, called from A, overwrites a doc that exists in A, then cleaning the pipeline at the end of B won’t help: A will still see the overwritten value…

Regarding ii), it cannot be guaranteed that all services will correctly clean the pipeline, and it shouldn’t have to be: services are black boxes, and the caller doesn’t know the implementation of the service being called. It should only have to care about the service’s signature: input / output / functionality.
It seems to me that this pipeline thing strips the concept of a service of any meaning.

webMethods seems to have a lot of good stuff in it, but this just confuses me… Help…

Interesting spot there about documents. I’ll look out for that.

What you’ve discovered is what I’m sure most of us know - Flow is messy and it’s easy to introduce subtle bugs to Flow code.

My recommendation is to use Flow when you have simple tasks that lend themselves to its strengths (mapping from one document structure to another). Everywhere else, write Java services. It’ll keep you sane.

Edit: I note that I’ve contradicted the main thrust of the sticky thread “Integration Server and Java”. Best keep that discussion there I suppose.

Here is a view that may help: the pipeline is a global variable pool that is visible to every service within a thread of execution. Keeping that in mind, along with understanding how you can limit pipeline scope when appropriate, may be helpful in understanding the behavior.

Ok, I have now accepted that we have to trust that whatever service we call cleans the pipeline properly. I don’t like it, but so be it.

I have now investigated the document issue mentioned above, and I have the answer: strings and literal values are passed by value, but documents are passed by reference. What this means is that dropping a document at the end of a service is not enough to make sure you won’t interfere with the service above.

I’m not very good at explaining, so I’m pretty sure I have lost everyone. Let’s go for an example instead:
1. A has document “doc” in the pipeline with a string element “test” having a value “valueA”
2. A calls B, B gets A’s pipeline
3. B creates a document “doc” with a string element “test” and sets it to “valueB”
4. B drops doc
5. B returns; B’s pipeline is empty, so nothing is copied back to A’s pipeline
6. The value of “doc\test” in A is now valueB!!!

How come the value of “doc\test” changed even though nothing was copied back? The reason is that A actually passes “doc” to B by reference, so whatever changes B makes to it directly affect A’s pipeline.
This is not what was expected: A still expects to find “valueA” in “doc\test”.
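The aliasing can be reproduced with nested dicts standing in for documents (again a simulation, not IS code): the child’s view of the pipeline is a shallow copy, so “doc” in both views points at the same underlying object.

```python
# Strings behave like values, documents like references: a shallow copy
# duplicates the top-level names, but both still point at the same doc.

pipeline_a = {"doc": {"test": "valueA"}}      # step 1: A's pipeline

pipeline_b = dict(pipeline_a)                 # step 2: B gets A's pipeline
pipeline_b["doc"]["test"] = "valueB"          # step 3: writes through the shared reference
del pipeline_b["doc"]                         # step 4: B drops doc

# Steps 5-6: nothing is copied back, yet A's value changed in place.
print(pipeline_a["doc"]["test"])              # valueB
```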

Question: when manipulating documents, how can we make sure that our service is not changing the pipelines of the services above it?

Aggressive pipeline management is indeed a good practice. Services should drop variables as soon as they are finished with them. Some people like to use clearPipeline a lot, but I much prefer dropping variables explicitly as early as possible to keep the pipeline minimized.

You’ve described the behavior of variable handling perfectly. Documents are copied by reference; primitive variable types are copied by value.

  1. It is not “A’s pipeline”. It is “the pipeline” that all services within the thread will use. All vars in a pipeline are visible to all services (there are exceptions: transformers and services called with a scope do not have full visibility).

  2. Service B just replaced “doc” for everyone within the thread.

There is no “copied back”. There is only the pipeline.

Calls to other services do not use a stack like you’re used to in C/C++, C#, Java, etc. It “appears” that services accept “parameters” and have explicit return variables, but they do not. The input and output declarations are design-time helpers only. They identify which variables a service expects to find in the pipeline at run time and which variables it will add to the pipeline. But at run time there is no enforcement of this, and parameters don’t need to be declared on the Input/Output tab for a service to function correctly.
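The “design-time helpers only” point can be illustrated with the same dict model (a simulation, not IS code): at run time a service simply looks names up in the pipeline, declared or not.

```python
# A service whose declared signature says "inputs: x", but whose
# implementation also reads an undeclared variable off the pipeline.

def add_service(pipeline):
    # "a" was never declared as an input, yet the lookup works fine.
    pipeline["sum"] = pipeline["a"] + pipeline["x"]

pipeline = {"a": 3, "b": 5, "x": 7}
add_service(pipeline)
print(pipeline["sum"])   # 10 -- nothing enforced the declared signature
```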

You’ve discovered one of the fundamental “gotchas” of IS. When everything is a global variable, it can be challenging to keep services from stomping on each other. But this isn’t really as big of a deal as it seems.

As you get into designing services for your integrations, you’ll find that global and local variable names (conceptually) naturally emerge. Use descriptive names (e.g. purchaseOrder) rather than generic names (e.g. doc). As you get familiar with the built-in services you’ll get a good feel for which variable names to avoid as globals. For example, “doc” won’t last very long in any sequence of calls. Using doc locally for a few steps is okay, but not as a long-lived variable spanning multiple services; it will eventually get clobbered by a call to some built-in service.

HTH

You’re absolutely right - Strings are copied by value and Documents are copied by reference. (There is an exception to this rule: when a String is mapped to an Object.) And the amazing thing is that you didn’t even have to map your “doc” to Flow B - it just went along as part of the call to the service!

There is no easy way to prevent this behavior. Awareness of the problem is your best defense.

Not creating or modifying doc/test in both parent and child services is one option :).

Another option is to use pub.flow:clearPipeline as the first step in Flow B. That way it will dump all copies of, and references to, variables from Flow A, except the ones you tell it to preserve. As a result, when B creates doc, it is a brand-new instantiation and does not affect the existing doc in Flow A.

Oh yeah, and what Rob wrote while I was typing my reply so slowly. :slight_smile:

Hmm. Are you sure about that? I need to refamiliarize myself with that behavior.

Positive - just tested it to make sure.

Flow A creates doc/test and sets to valueA
Flow B starts with clearPipeline and then creates doc/test and sets to valueB. It then drops doc/test.
Flow A regains control and doc/test is still set to valueA.

If Flow B does not drop doc, then of course the valueB flows back and overwrites valueA. But if it is dropped, then it does not flow back, and since the copy by reference was removed via clearPipeline, the new doc is just that - a new doc.
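Sketched with the same dict model (names are illustrative, not the actual pub.flow:clearPipeline API): clearing B’s view first removes the shared reference, so when B creates “doc” it gets a fresh object.

```python
# Why clearPipeline-first protects the caller: clearing B's view removes
# the reference to A's doc, so creating doc again makes a brand-new object.

pipeline_a = {"doc": {"test": "valueA"}}  # Flow A creates doc/test

pipeline_b = dict(pipeline_a)             # B receives the pipeline
pipeline_b.clear()                        # clearPipeline as B's first step
pipeline_b["doc"] = {"test": "valueB"}    # brand-new doc, not A's
del pipeline_b["doc"]                     # B drops doc before returning

print(pipeline_a["doc"]["test"])          # valueA -- A is unaffected
```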

Great to see some posts on this thread - thank you guys for your thoughts.
--------

After testing it, clearPipeline seems to be the workaround I was looking for. Thanks for pointing this out.

But now, back to my original question: why aren’t only the input fields passed to a Flow service, with only the output fields expected back?
Reamon, you wrote that the service signature in Flow is only useful at design time, but why was the Flow language designed like this? A “service” in general should be fully defined by its signature…

Is this an intended “feature”?

As to why it works this way, who knows for sure. I suspect it has to do with how the product was originally created and behaved (back in '98 or so). At this point, I imagine changing the behavior is all but impossible; too many installations would break.

We just need to understand the behavior and handle its quirks.

You’re right that a service should be fully defined by its signature. Good rules of thumb:

  • Always define inputs and outputs. Never assume the presence of a variable that is not explicitly declared as an input. (There are some exceptions to this, e.g. TN interactions in some cases.)
  • Be aggressive with keeping the pipeline clean. Don’t be a pipeline litterbug.
  • Avoid generic variable names to the degree possible.

I imagine other folks have some additional rules of thumb that they can share.

As mentioned by Rob, be aggressive about keeping the pipeline clean by dropping variables as soon as they are no longer needed, and don’t lean on clearPipeline alone, since that just postpones the cleanup of all the unnecessary variables to the end of the flow.

Also, webMethods Flow steps can leave behind “hidden variables”: variables that are output by a service but are not listed in its declared outputs. The technique for getting rid of these hidden variables is to add them to the pipeline and explicitly drop them. At the end of the service, the only variables left in the pipeline should be those declared on the output tab. If needed, add a MAP (Clean up) step at the end of each service and drop any extra pipeline variables there.

On smaller flows, or with variables containing large amounts of data, I favor dropping variables aggressively.

On longer flow services, however, I don’t drop variables at each step. If you do, troubleshooting later can become very tedious. I favor a modified version of Talha’s suggestion: put a MAP step every five steps or so and drop the unneeded variables there. You can comment the MAP step with “Dropping variables here” to make them easy to find.

I started doing this after having to look line by line through a 400+ line EDI handling flow several years ago…

On the other hand, if a Flow service has 400+ lines, it’s probably time for a Re-factoring Party! :slight_smile:

  • Percio