I have a Natural program that reads a DB file and, after some processing, writes the details to an output work file and updates the DB file. However, the job fails after processing some records, so when I restart I want the program to write to the work file in such a way that it overwrites the already written records (only the selected ones) from the desired record position. Consider the scenario below:
READ
  Process
  Write    <-- the job fails after this step
  Update
END-READ
The failure occurs after the data is written to the flat file and before the update has happened. So the question is how to overwrite the data already written to the file when we restart the job.
Suppose the job failed after writing 20 records to the output file; when I restart, the program should write the data starting from the 18th record.
Using DISP=MOD appends the records at the end of the file, not from the desired position.
This is not directly possible because the work file (which is a sequential file) is not part of the database transaction logic.
If you know which records must stay and which must be skipped, then at restart you can copy the first 18 records from the previous file to another file, and your program can continue writing to that new file.
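As a rough illustration of that copy-then-continue idea, here is a minimal sketch in Python rather than Natural. The function name `copy_first_records`, the file paths, and the line-per-record layout are all assumptions made for the example, not part of the original job.

```python
from itertools import islice


def copy_first_records(old_path, new_path, keep):
    """Copy the first `keep` records (lines) of old_path into new_path.

    Returns the number of records actually copied, which is less than
    `keep` if the old file is shorter than expected.
    """
    copied = 0
    with open(old_path, "r", encoding="utf-8") as src, \
         open(new_path, "w", encoding="utf-8") as dst:
        # islice stops after `keep` lines without reading the rest of the file
        for line in islice(src, keep):
            dst.write(line)
            copied += 1
    return copied
```

After the copy, the restarted program would open the new file in append mode and carry on writing from the next record.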
You can also bring your sequential file into the database (define a database file for your output). It will then be covered by the transaction logic.
The solution is to make the WORK records uniquely identifiable, adding a sequence number if necessary. Each time you restart, “duplicate” records will be written - in your example, records 18, 19, and 20. In the next job step, sort the records by the unique key, and use parameters
SUM FIELDS=NONE,EQUALS
to remove the duplicates.
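To make the effect of that sort step concrete, here is a small Python sketch of the same idea: sort by the unique key with a stable sort and keep one record per key. It is an illustration only, not the sort utility itself; the function `dedupe_by_key` and the 8-byte leading sequence number are assumptions chosen for the example.

```python
def dedupe_by_key(records, key_of):
    """Sort records by key and keep only the first record seen for each key.

    Python's sort is stable, so records with equal keys stay in the order
    in which they were written.
    """
    ordered = sorted(records, key=key_of)
    result = []
    last_key = object()  # sentinel that never equals a real key
    for rec in ordered:
        key = key_of(rec)
        if key != last_key:
            result.append(rec)
            last_key = key
    return result


# Records 18 to 20 were written twice: once before the failure, once after restart.
work_file = [
    "00000017 A", "00000018 B", "00000019 C", "00000020 D",
    "00000018 B", "00000019 C", "00000020 D", "00000021 E",
]
clean = dedupe_by_key(work_file, key_of=lambda rec: rec[:8])
```

Here `clean` holds one record each for sequence numbers 17 through 21.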
If you needed to add a sequence number to the end of the record (let’s say after byte 80), it can be removed with
Another option: instead of writing directly, move each record to an internal buffer (an array of records). When the transaction finishes successfully, write those records to the sequential file; otherwise, clear the buffer.
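The buffering idea could look roughly like the sketch below. It is written in Python for illustration; the class `BufferedWorkFile` and its method names are hypothetical, and a plain list stands in for the sequential work file.

```python
class BufferedWorkFile:
    """Hold records in memory until the surrounding transaction succeeds."""

    def __init__(self, work_file):
        self.work_file = work_file  # a list standing in for the sequential file
        self.buffer = []

    def write(self, record):
        # Nothing reaches the work file yet; the record is only staged.
        self.buffer.append(record)

    def flush_after_commit(self):
        # Call this once the database transaction has been committed.
        self.work_file.extend(self.buffer)
        self.buffer.clear()

    def discard(self):
        # Call this when the transaction is backed out.
        self.buffer.clear()
```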
That won’t work, Alexbj, because if you issue the ET and the program fails before the WRITE, you’ve lost a buffer full of records and you won’t recover them during the restart.
You need to execute the WRITE within the committed transaction, which puts you back to the original problem of executing the WRITE and having the program fail before the ET.
Shre, in my opinion, if a failure occurs in the middle of a transaction, all the detail data from that incomplete transaction that was written to the file is useless. By the way, all the updated DB data was backed out.
Using a buffer, you guarantee that all data written is the result of a successful transaction. If the failure occurs between the update and the write of the buffer, you can use an ON ERROR clause to handle it.
As for the restart, it will resume at the record immediately following the last successful transaction, so your buffer will be refilled with the detail you lost in the last failure.
This is not guaranteed. Your job step is just as likely to fail from an x37 (space allocation) or S322 (out of CPU time) ABEND as from a logic bug, and these system ABENDs are not trapped by ON ERROR.
For restartable steps that write WORK records, you must expect duplicates, and you must deal with them.
Mogens offered the only real alternative to avoid duplicate records - STORE the records in Adabas as part of the transaction logic, and add a subsequent job step to extract them to a WORK file. Then add another step to empty the Adabas file. But I presume that Shre wants to adjust his logic rather than overhaul it.
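The shape of that alternative can be sketched as follows. This is only an analogy: SQLite stands in for Adabas, and the tables `source` and `output_rec`, the functions, and the `fail_on_id` switch are all hypothetical names invented to show the output record and the update committing or rolling back together.

```python
import sqlite3


def make_db():
    """Create an in-memory database with four unprocessed source rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE source (id INTEGER PRIMARY KEY, payload TEXT, done INTEGER)")
    conn.execute("CREATE TABLE output_rec (source_id INTEGER PRIMARY KEY, line TEXT)")
    conn.executemany("INSERT INTO source VALUES (?, ?, 0)",
                     [(1, "a"), (2, "b"), (3, "c"), (4, "d")])
    conn.commit()
    return conn


def process_all(conn, fail_on_id=None):
    """Store each output record and mark its source row in ONE transaction."""
    rows = conn.execute(
        "SELECT id, payload FROM source WHERE done = 0 ORDER BY id").fetchall()
    for row_id, payload in rows:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("INSERT INTO output_rec VALUES (?, ?)",
                         (row_id, payload.upper()))
            if row_id == fail_on_id:
                # Simulated failure between the "write" and the "update"
                raise RuntimeError("simulated failure")
            conn.execute("UPDATE source SET done = 1 WHERE id = ?", (row_id,))


def extract_and_clear(conn, path):
    """Later step: unload the stored records to a flat file, then empty the table."""
    lines = [row[0] for row in
             conn.execute("SELECT line FROM output_rec ORDER BY source_id")]
    with open(path, "w", encoding="utf-8") as out:
        out.writelines(line + "\n" for line in lines)
    with conn:
        conn.execute("DELETE FROM output_rec")
    return len(lines)
```

If the run fails partway through a record, the stored output for that record is backed out along with the update, so a plain rerun of `process_all` picks up at that record without producing duplicates.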
Thanks to all for the responses. I was trying to use a temporary work file or arrays as suggested by Alexbj, but as Ralph points out, that will be useless if the job fails due to system errors such as space allocation, resource unavailable, or timeout errors. In those cases this restart mechanism won’t work, and manual intervention is obviously needed.