Impact of calling structure on performance

I heard there is a difference in performance between the several ways of calling a subprogram.

Which one performs best, and why?

There is only one way to invoke a subprogram, namely CALLNAT. Are you interested in things like the cost of transferring data (PDA, direct or BY VALUE; the stack; etc.)?

Yes, I would like that. Also, if possible, I would like to know whether it would be better to include the code within the program as a subroutine instead of a subprogram.