SPoD Crashes

Hi to everyone seeing SPoD crashes in the Terminal Emulation.
On a Unix/Linux server, check the environment variables TERM and NATTERM (if set). Set NATTERM to vt220 before starting the NDV daemon. If NATTERM is not set, the TERM variable is used. If TERM is set to XTERM or any value other than vt220, SPoD’s Terminal Emulation will sometimes crash!
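
The lookup order described here (NATTERM first, TERM as the fallback) can be checked before the daemon is started. The following is only a minimal Python sketch under that assumption; the function names are made up for illustration and are not part of any Software AG tooling, and treating an empty NATTERM as "not set" is my own assumption.

```python
import os

SAFE_TERMINAL = "vt220"  # value recommended in this post


def effective_terminal(env):
    """Return NATTERM if it is set (assumed: non-empty), otherwise TERM."""
    return env.get("NATTERM") or env.get("TERM")


def terminal_warning(env):
    """Return a warning string if the effective terminal is not vt220, else None."""
    term = effective_terminal(env)
    if term == SAFE_TERMINAL:
        return None
    return (
        f"effective terminal is {term!r}; "
        f"set NATTERM={SAFE_TERMINAL} before starting the NDV daemon"
    )


if __name__ == "__main__":
    # Check the environment of the shell that will start the daemon.
    print(terminal_warning(dict(os.environ)) or "terminal setting looks fine")
```

Run it in the same shell and under the same user that starts the NDV daemon, since that is the environment the daemon inherits.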

Mathias 8)

And what’s the right solution if the NDV runs on BS2?

Hello Mathias!

Normally I get about 3-5 crashes of SPoD per day. Today I changed my natdvsrv settings as you described, but I can see no improvement. :frowning:

Natural still crashes sometimes …

Regards

Matthias

Our developers continue to report frequent crashes and freeze-ups in their SPoD client sessions (against NDV214 on z/OS). As noted in another thread, we cannot reproduce the issues at will but they occur repeatedly with regular use by working developers. This issue is threatening the success of our adoption of SPoD, as developers get frustrated, lose confidence in the tool, and revert to 3270.

Are other installations seeing similar issues?

Is this something SAG recognizes and is addressing?

Maybe if we could post information about environments and versions, we could start narrowing things down a little? It would also help to post what exactly is crashing: is it the terminal emulator, or Natural Studio itself?

It’s a tough thing trying to narrow down what the problem is when there are so many pieces involved.

With one of the earlier patch levels of Studio we had trouble with the terminal emulator not responding, mostly the first time it was started after a PC startup. It got to the point where I could reproduce it on a FAIRLY regular basis, and it looked as if NDV was trying to send information back before the terminal emulation server (NATPccServer2.exe) had fully initialized. After I ran probably half a dozen traces on both sides, the developers still couldn’t find anything wrong. In the last few patch levels of Studio, however, the “Natural Host Server Emulation” runs as a Windows service, and I haven’t had the problem since. Having said that, a co-worker just had his terminal emulation session puke out, initially indicating that the session could not start. The NDV log shows errors like:
0000006D NDV Error: PAL Error, RC:-10, ErrClass:0, Reason:1121
0BABD4B0 Sys Error, errno:1121 errno2:0x74500442 EDC8121I Connection reset.
0000006D NDV Error: PAL Error, RC:-10, ErrClass:0, Reason:0
0BABD4B0 Sys Error, errno:140 errno2:0x76697242 EDC5140I Broken pipe.
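
If you want to collect these errors from a larger NDV log, something like the following Python sketch could pull out the fields. The patterns are derived only from the four sample lines above, so the real log may contain variants they do not match; the function name and dictionary keys are my own.

```python
import re

# Patterns based solely on the sample lines quoted in this post.
NDV_ERROR = re.compile(
    r"NDV Error: PAL Error, RC:(-?\d+), ErrClass:(\d+), Reason:(\d+)"
)
SYS_ERROR = re.compile(
    r"Sys Error, errno:(\d+) errno2:(0x[0-9A-Fa-f]+) (\S+) (.+)"
)


def parse_ndv_line(line):
    """Return a dict for a recognized error line, or None for anything else."""
    m = NDV_ERROR.search(line)
    if m:
        return {
            "kind": "ndv",
            "rc": int(m.group(1)),
            "err_class": int(m.group(2)),
            "reason": int(m.group(3)),
        }
    m = SYS_ERROR.search(line)
    if m:
        return {
            "kind": "sys",
            "errno": int(m.group(1)),
            "errno2": m.group(2),
            "msg_id": m.group(3),
            "text": m.group(4).strip(),
        }
    return None


if __name__ == "__main__":
    sample = "0BABD4B0 Sys Error, errno:1121 errno2:0x74500442 EDC8121I Connection reset."
    print(parse_ndv_line(sample))
```

Counting the parsed `msg_id` values per day would at least show whether "Connection reset" and "Broken pipe" line up with the times the clients report crashes.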

I am using the server installation option in our shop for our developers, and it looks like the “Natural Host Server Emulation” service does NOT get installed on the client. I guess the next thing to try is to see whether it can be installed manually (like the bufferpool service) and watch whether the problems persist.

The point of all this is just to say that the more info we can provide, the better the chances that we can get things changed/fixed. I know it can be very difficult to get much in the way of details, especially when it’s someone else who seems to be experiencing the problem!

Currently we are running:
Natural Studio v6.1.1 PL19 (WinXP Pro)
NDV 2.1.4 PL03 (z/OS)

Sorry for the long posting…

That’s what our developers and administrators have done in collaboration with Software AG. We opened several service requests over the past 5 months, without any success.

Our environment:
Server: SunOS 5.8 with Natural 6.1.1.17 and Natural Development Server
Clients: WinXP SP2, WinNT SP4, etc. with Natural 6.1.1.17

I kept an Excel file to get an overview of the crashes and took some screenshots. I recorded over 90 crashes from 2006-02-07 to 2006-04-10, and I didn’t use Natural every day. I found out the following:

  • All crashes happened while a SPoD connection was active.
  • In 32% of the crashes, no error message or anything else is displayed. The Natural session simply disappears. In all other cases an error message is displayed (like “Terminal Emulation: Failed to create empty document.” or “drwtsn32.exe …” or “Natural 6.1.1 Patchlevel 17 … send problem report”). In both cases, the Natural session is lost and the developer has to unlock all edited sources.
  • Obviously, the Terminal Emulation causes a huge number of the crashes. But on the other hand, it’s not clear to me why a SPoD session can crash during use of the Natural for Windows online-Help (e.g. during double-clicking in the index of the help).

Here is the short version of my “crash statistics”:

with error   w/o error   last action
----------   ---------   -------------------------------------------------------------
        30          19   click on RUN (Terminal Emulation needed)
         8           6   Natural for Windows Help
         4           2   STOW/LIST/EXECUTE/EDIT/SAVE
         3           0   Crash in the background (during use of a non-SAG application)
         3           0   double click on upper left corner to close session
         3           0   click on the upper right cross to close session
         3           0   connect/disconnect to SPoD-Server
         2           1   click on RUN (Terminal Emulation not needed)
         2           1   "find objects..." by content
         2           0   jump to Natural session via Alt+Tab
         1           0   double click on map for editing
         1           0   double click on Library
         1           0   copy LDA per drag & drop
         1           0   click on the upper right cross to close source
         0           1   Rename of a source
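
For anyone who wants to check the figures, the table can be tallied with a few lines of Python. This is just a sketch over the numbers listed above; the variable names are my own.

```python
# (with_error, without_error, last_action), copied from the table above
CRASHES = [
    (30, 19, "click on RUN (Terminal Emulation needed)"),
    (8, 6, "Natural for Windows Help"),
    (4, 2, "STOW/LIST/EXECUTE/EDIT/SAVE"),
    (3, 0, "crash in the background (non-SAG application in use)"),
    (3, 0, "double click on upper left corner to close session"),
    (3, 0, "click on the upper right cross to close session"),
    (3, 0, "connect/disconnect to SPoD-Server"),
    (2, 1, "click on RUN (Terminal Emulation not needed)"),
    (2, 1, '"find objects..." by content'),
    (2, 0, "jump to Natural session via Alt+Tab"),
    (1, 0, "double click on map for editing"),
    (1, 0, "double click on Library"),
    (1, 0, "copy LDA per drag & drop"),
    (1, 0, "click on the upper right cross to close source"),
    (0, 1, "Rename of a source"),
]

with_error = sum(w for w, _, _ in CRASHES)
without_error = sum(n for _, n, _ in CRASHES)
total = with_error + without_error

# Share of crashes where the session vanished without any message.
silent_share = round(100 * without_error / total)

# Share of crashes triggered by RUN with the Terminal Emulation involved.
te_run = sum(CRASHES[0][:2])
te_share = round(100 * te_run / total)

print(f"total={total} with_error={with_error} without_error={without_error}")
print(f"silent={silent_share}% run_with_TE={te_share}%")
```

The totals come to 94 crashes, 30 of them without any message (about 32%), and roughly half of all crashes follow a click on RUN where the Terminal Emulation is needed.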

Matthias,

Did Software AG ever get you to run using the debug version for a while to try and trace what’s happening?
Do you do the full install on each client, or use the file server option? Do you use NSC? What version of NDV?

I also just finished upgrading our file server install to NAT6.1.1 PL19, and all hell broke loose when people tried using NDV 2.1.3: they couldn’t stow “larger” programs, couldn’t unlock source, and couldn’t copy/move source between environments. We switched to NDV2.1.4 (which is what I did most of my PL19 testing against - whoops!) and things are good again.

We never did use PL17 with NDV2.1.3; we went from PL14 to PL19.

How many of you who are seeing the crashes use the auto-save and/or the auto-refresh options? You may have tried it already, but try disabling them for a while.

Back with PL05 (?) there was a problem where if you had the autosave on and had a map open and ??? it would cause Studio to crash…

First of all: Thanks for helping. We appreciate any good tip!

Yes, our system administrator did something like this! IIRC the result was: The trace files were quite big, but Software AG said that everything is OK.

Yes, we did a full install of Natural 6.1.1 PL17 on each Client just using the original CD.

Natural Security? No! We’re using only Predict as an additional product.

Do you know a way to get the version of natdvsrv? I found nothing on the help page. But the last installation log file of PL17 says:

Found IDL 6.1.1.0
Found NAT 6.1.1.14
Found NDV 2.1.4.14
Found PRD 4.3.2.6

So I guess we’re using NDV 2.1.4

Autosave is OFF. Auto-refresh was ON but I turned it OFF for a test. Let’s see …

After a few hours I have to say: Nothing changed. :frowning:

Brick, have you found any difference in the stability of the various patch levels?

Maybe one of the forum moderators could check the support request logs to see how widespread these “spontaneous” crashes are… it would make more sense for them to try and keep track of the reports to see if the problem could be narrowed down at all.

Like Brick mentioned before, this really can threaten the success of the product… it’s already difficult enough to get some people to start using the product; how many times do you think they are willing to lose changes & time because of a crash?

Even if people haven’t opened a support request, how about adding to this topic with info about your environment (products & versions) if you have been experiencing the same problem(s)?

It’s sort of weird, but you will get used to the SPoD crashes…
You will press CTRL+S automatically after coding a few lines, you will click on SAVE before you click on RUN. You will open up a new Natural session to call the Natural online help. Or you will simply use another Terminal Emulation (like PuTTY) and do your stuff directly on the host… And you will make jokes about Software AG products. :wink:

But you’re right: It doesn’t make a good impression on new Natural developers. The SPoD philosophy would be a really good thing …

Yes, Matthias, you are right. People will get used to these crashes and accept them as a “feature”. That reminds me of my past work with PowerBuilder. From time to time the development environment (and only this, not the generated executable started alone!) crashed with a general protection fault (GPF). The developers got used to calling it a “Great Powerbuilder Feature” :lol:

Our server environment is z/OS 1.4, NDV214, NAT414, NSC414. Our clients are on a range of PLs of 6.1.1. We have the client installed on 80-100 machines, and currently average about 40 developers connecting to the server each day.

I don’t yet have detail on patch-level differences in levels of stability. I am about to begin an effort to get more specifics and detail on the extent of the problem with the client; currently I’m getting limited anecdotal feedback, and the number of regular users has not risen as quickly as we might have hoped.

I’m interested in the degree to which other installations are seeing similar issues, in order to clarify whether our issues are due to the product or to idiosyncrasies of our installation and use of it.

The kinds of issues we see include:

  1. when running/executing a program, a blank terminal emulation screen appears without any output, and the client application is frozen. The only way to recover is to kill the app, re-connect, unlock your objects, recover any lost changes, etc.

  2. occasionally the app will just disappear without a trace.

  3. We’ve had numerous instances of client failure where it puts up a dialog with the text: The instruction at “0x5ad71531” referenced memory at “0x00000014”. The memory could not be “read”. The same memory addresses are referenced on multiple machines, so they seem significant. However, since we can’t reproduce the failure on demand, SAG Support was unable to proceed with it; the detailed error message was apparently of no use to them. We also see this kind of error referencing other addresses.
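
Since the same addresses turn up on several machines, it may be worth tallying them across reports. A rough Python sketch, assuming the dialog text is copied verbatim into a list; the function names are hypothetical, and the pattern only covers the wording quoted in item 3.

```python
import re
from collections import Counter

# Accept straight or typographic quotes around the addresses.
Q = "[\"“”]"
FAULT = re.compile(
    rf"The instruction at {Q}(0x[0-9a-fA-F]+){Q} "
    rf"referenced memory at {Q}(0x[0-9a-fA-F]+){Q}"
)


def fault_addresses(message):
    """Return (instruction_address, memory_address), or None if no match."""
    m = FAULT.search(message)
    return (m.group(1).lower(), m.group(2).lower()) if m else None


def tally_faults(messages):
    """Count how often each address pair appears across the reported dialogs."""
    pairs = (fault_addresses(msg) for msg in messages)
    return Counter(pair for pair in pairs if pair)


if __name__ == "__main__":
    reports = [
        "The instruction at “0x5ad71531” referenced memory at “0x00000014”. "
        "The memory could not be “read”.",
    ]
    for pair, count in tally_faults(reports).most_common():
        print(pair, count)
```

A pair that keeps recurring across different machines would be a more concrete thing to hand to support than a single screenshot.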

As I said, we intend to investigate this entire issue more thoroughly with our developers and hopefully will be able to get a more specific analysis of the extent and nature of the problems. (Maybe we can get some of them to maintain a spreadsheet like Matthias!)

Thanks for all the comments.

I know 1) and 2) very well! But 3) is completely new to me. Maybe this error only occurs in connection with mainframes …

My spreadsheet has the following columns:

  • Date/Time
  • last action (e.g. “click on RUN”)
  • effect (crash with error, disappear, only message)
    Here it would have been better to distinguish between “crash” and “hangup”.
  • locked sources
  • link to screenshots
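
If someone prefers a plain CSV file over a spreadsheet, the same columns could be logged with a small Python helper. This is only a sketch: the field names are my own rendering of the columns above, and the example values (time, source name) are made up.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields


@dataclass
class CrashRecord:
    timestamp: str        # Date/Time
    last_action: str      # e.g. "click on RUN"
    effect: str           # "crash with error", "disappear", "only message"
    locked_sources: str   # sources that had to be unlocked afterwards
    screenshot: str       # link or path to a screenshot, may be empty


def append_record(stream, record, write_header=False):
    """Write one crash record as a CSV row to an open text stream."""
    names = [f.name for f in fields(CrashRecord)]
    writer = csv.DictWriter(stream, fieldnames=names)
    if write_header:
        writer.writeheader()
    writer.writerow(asdict(record))


if __name__ == "__main__":
    # Example values are hypothetical.
    buf = io.StringIO()
    rec = CrashRecord("2006-04-10 09:15", "click on RUN", "disappear", "PGM1", "")
    append_record(buf, rec, write_header=True)
    print(buf.getvalue())
```

To separate “crash” from “hangup” as suggested above, an extra value for the effect column is enough; no new column is needed.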

Is something missing? Maybe Software AG can tell us …

We have had the
“The instruction at “0x5ad71531” referenced memory at “0x00000014”. The memory could not be “read”.”
error, but it was after a
“The instruction at “0x7c911e58” referenced memory at “0x00000000”. The memory could not be “read”.”
error.

It happened while working in the LDA editor.

Brick, your 1) sounds like what was happening before the “Natural Host Server Emulation” service was being used. Have you had a chance to check whether that service is installed on the machines where this happens?

Chad,
Is this a 6.2 thing? I don’t see it and can’t find any reference to it. I have no service by that name, nor see a task running during emulation which might be this. Can you tell me more?

Thanks.

Brick,

I am thinking you are right, but it’s weird… I checked a few other of our installs of Nat6.1.1.19 and they don’t have the service running either.

I do have 6.2 installed on my workstation in parallel with 6.1, but the service points to the 6.1 version of the NATPccServer2.exe. The 6.2 version also uses a different name for the executable, NATPccServer3.exe. I could see the path to the executable being dynamic, but the name itself? Maybe there’s more info in the registry about the service…

I wonder if I could set the NATPccServer2.exe up to run as a service on one of my other machines to see if it makes a difference…

I think NATPccServer2.exe should be running during emulation. That was the process I was playing with that seemed to cause the terminal “hang” when I had the support request open before.

OK, did a little playing and found that the “Natural Host Server Emulation” service can be installed manually using the NATPccServer2.exe with the -install switch. After that I just went to the list of services and set it up to start automatically. Also found that the -remove switch works to get rid of the service.

I am pretty sure I did not do this on my workstation, so this is still a mystery…

I’m going to try setting this up on a few of our machines that use the client install, with any luck it won’t be a bad thing :wink:

I’d recommend opening a support request to narrow down these instabilities, or several requests if you can identify several distinct situations.

The materials required to analyze such a sporadic abend are:

  • an NDV server trace (with TRACE_LEVEL=31+27)
  • a Natural Studio trace
  • for problems with the TE, also a TE trace

The traces must correspond to each other. The best approach is to identify a client that frequently has stability problems.
Start an exclusive server for that client with the corresponding TRACE_LEVEL and configure Studio to produce the corresponding traces. Then let the client work until a problem appears. Package the corresponding trace files together with a description of how the problem appeared on the client.
If the problem only happens on a server serving multiple clients, you can limit the server trace to a particular client using the TRACE_FILTER definition.