Checksum for replicated data?

Very soon we will replicate a 140-million-record file from Adabas on the mainframe to Adabas on Windows. Tests with smaller files went very well, and comparisons between the sending and the receiving files didn’t show differences (except for floating point).

Now, our management asked whether something like a checksum or a control program for our initial replication is available to make sure that the sending and receiving files are identical.

Any ideas? How do other companies compare? Just trust the software? SAG supports a checksum for ZAPs, for example.

Thanks,
Dieter Storr

Hi Dieter,

I’m not sure exactly which level of comparison you are looking for.

After the initial state, you can compare the number of records on Windows with the number of records on the mainframe. But this requires that no inserts/deletions are being replicated at that moment (on the way from the mainframe to Windows).

Comparing record contents with some sort of checksum would, I think, be impossible because of the different representation of data on the two platforms.
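For what it’s worth, the representation problem can sometimes be worked around by normalizing each field to a platform-neutral form before hashing. A minimal Python sketch of the idea (the field names and values are made up purely for illustration, not taken from any real Adabas file):

```python
import hashlib

# Hypothetical example: the same logical record as it might look after
# arriving from each platform (values differ only in representation).
mainframe_record = {"NAME": "SMITH   ", "AMOUNT": "100.50"}  # trailing blanks
windows_record   = {"NAME": "SMITH",    "AMOUNT": "100.5"}   # trimmed, short

def canonical(record):
    """Normalize each field to a platform-neutral text form before hashing:
    fixed field order, trailing blanks stripped, numbers in one format."""
    parts = []
    for name in sorted(record):
        value = record[name]
        try:
            value = format(float(value), ".2f")  # fixed number of decimals
        except ValueError:
            value = value.rstrip()               # padding differs per platform
        parts.append(f"{name}={value}")
    return "|".join(parts).encode("utf-8")

h1 = hashlib.sha1(canonical(mainframe_record)).hexdigest()
h2 = hashlib.sha1(canonical(windows_record)).hexdigest()
print(h1 == h2)  # -> True: identical after normalization
```

The point is only that the hash must be computed over a canonical form, not over the raw bytes, since EBCDIC/ASCII and packed-decimal/float representations will never match byte for byte.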

We use Adabas Event Replicator to replicate.

We wrote a Natural program, which reads an Adabas file on the mainframe
and compares it with the same replicated file on Windows. Entire-Network
makes the data conversion possible. To compare 5 million records took
approx. 10 minutes.

We don’t want to run this program every time files are replicated. It was
just part of a POC.

And yes, during a test phase companies compare FTP’ed datasets or files.

“Better safe than sorry” or “Trust is good, control is better”

Dieter Storr
Storr Consulting, Inc.

Last year, I started a discussion about why to compare the data on the source (subscription) with the replicated data on the target (destination) database. You said that you should trust the software.

Now, we are glad that we wrote a Natural program and compared fields and number of records on both sides.

The number of subscription records is higher than the number of destination records for one file. Also, the dollar amounts don’t match.

Question:
The file on the source is defined with REUSEISN=YES and on the target with REUSEISN=NO. Could this impact replication in the following scenario?

  • Delete ISN 10 on the source database
  • Replicate this delete on the target
  • Store a record with ISN 10 on the source
  • Would ISN 10 be replicated on the target?
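The scenario above can be played through with a toy model. To be clear, this is not real Adabas semantics, just a sketch of what goes wrong if the target never accepts a deleted ISN again while the source reuses it; the response code 113 here is a placeholder standing in for the real error:

```python
# Toy model (NOT real Adabas internals): a "file" that optionally refuses
# to accept an ISN again once it has been deleted, mimicking REUSEISN=NO.
class ToyFile:
    def __init__(self, reuse_isn):
        self.reuse_isn = reuse_isn
        self.records = {}          # ISN -> record data
        self.retired = set()       # deleted ISNs, retired if reuse_isn=False

    def delete(self, isn):
        del self.records[isn]
        if not self.reuse_isn:
            self.retired.add(isn)

    def store_with_isn(self, isn, data):
        if isn in self.retired:
            return 113             # placeholder for "ISN not usable"
        self.records[isn] = data
        return 0

source = ToyFile(reuse_isn=True)   # REUSEISN=YES
target = ToyFile(reuse_isn=False)  # REUSEISN=NO
for f in (source, target):
    f.store_with_isn(10, "old record")

# Replay the four steps from the post:
source.delete(10); target.delete(10)            # delete replicated
source.store_with_isn(10, "new record")         # source reuses ISN 10
rsp = target.store_with_isn(10, "new record")   # replicated store-with-ISN
print(rsp)  # -> 113: in this model the target never accepts ISN 10 again
```

Under this (assumed) behavior, the record would exist on the source under ISN 10 but never arrive on the target, which is exactly the kind of count mismatch described above.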

Does the Replicator replicate with PLOG logic, or does it use Adabas commands?
Does someone use Insight to view the commands on the Replicator Engine? (Yes, Insight works, according to CA.)

Any ideas?

Thanks,
Dieter Storr

We replicate our mainframe Adabas files to Windows by using the ISN.

So it makes sense to me to define the same parameters on both source and target DB.

Regarding the log file of the replicator engine, we received a lot of RC 113, which indicates that it couldn’t find the source ISN on the target.

By the way, we have now written a Natural program that compares all records of a file on the source and the target database by calling USR4011N to create an A20 hash value for variable input.
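The idea behind such a comparison can be sketched in Python. Here SHA-1 stands in for the A20 hash that USR4011N returns (both are 20 bytes); the record sets are made up, and a real program would of course read the records by ISN from each database:

```python
import hashlib

def record_hashes(records):
    """Map ISN -> 20-byte hash of the record contents, analogous to the
    A20 value USR4011N returns (SHA-1 also yields 20 bytes)."""
    return {isn: hashlib.sha1(data.encode()).digest()
            for isn, data in records.items()}

# Made-up record sets standing in for the source and target files.
source = {1: "a", 2: "b", 3: "c"}
target = {1: "a", 2: "B", 4: "d"}

src_h, tgt_h = record_hashes(source), record_hashes(target)
only_source = src_h.keys() - tgt_h.keys()          # missing on the target
only_target = tgt_h.keys() - src_h.keys()          # extra on the target
mismatched  = {isn for isn in src_h.keys() & tgt_h.keys()
               if src_h[isn] != tgt_h[isn]}        # contents differ
print(sorted(only_source), sorted(only_target), sorted(mismatched))
# -> [3] [4] [2]
```

Keying the hashes by ISN is what makes it possible to report *which* records differ, not just that the files differ.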

Dieter Storr

Jim asked on SAG-L why both files must use REUSEISN=YES.

If there are Adabas files on both ends (source and target), and especially
if, as in my case, the replication is done by ISN, then the files should be
defined in the same way.

Adabas on the mainframe with REUSEISN=YES and Adabas on Windows with
REUSEISN=NO can lead to errors like this (from the replicator engine’s
log):

ADAF18 N2 cmd to DBID 187 FNR 14 RSP 113 subcode ISN 237213
ADAFCV The record to be inserted already exists on the target DBID/file
ADAFCY The record will be updated.
ADAF54 2009-02-10 16:19:49 Replication error: Adabas destination D187017

ADAF18 A1 cmd to DBID 187 FNR 18 RSP 113 subcode ISN 38
ADAFCU The record to be updated does not exist on the target DBID/file
ADAFCX The record will be inserted.
ADAF54 2009-02-10 16:07:10 Replication error: Adabas destination D187018

In both cases the replicator’s logic is wrong and leads to corrupted files.

Dieter Storr

The question from Jim ‘bewildered’ Wisdom on SAG-L, ‘Why does REUSEISN need to be on if they are?’, is very valid and made me think again.

A STORE (N1) on the source DB is changed to a STORE with ISN (N2) on the target DB to make sure the same ISN is used. The question then is: why does the requested ISN already exist?

A STORE with ISN (N2) also received a response code 98, “unique descriptor value already present in index during update.” This doesn’t point to a problem with the missing ISN reuse on the target database.

Maybe SAG can shed some light on it.

Which file parameters should be the same on both source and target databases?

Thanks,
Dieter Storr

Now, it seems that other factors played a role in our RC113 and RC98.

The Replicator Engine ran out of NAB space and we lost data.

Larry started a thread at the Community Discussion Forum under
‘Nabs needed for Replicator.’ You should read it.

A very interesting situation, and I think that the SAG Replicator developers
should read this. The documentation points out that you only lose data if
the work dataset is too small. Now it turns out that NAB space is another factor.

It is not fun to restart replicating files with 140 million records.
We would have to FTP large files and load them into Adabas for Windows.
The ‘Initial State’ would take too long.

I hope that I am wrong and we missed some parameters to prevent this
situation.

Dieter Storr

Regarding the original question about the checksum, we used Replicator’s initial state and replicated a small file to Windows.

We also FTP’ed the same file and loaded it into Adabas on Windows under a
different database number.

We closed the destination for this file and wrote a program to create a
checksum:

READ MULTI-FETCH ON DDM BY ISN  /* reading BY ISN is important
  CALLNAT "USR4011N" #4011      /* feed this record into the hash
END-READ

Output:

DBID  Records  Checksum                                  CPU Time

187   323      F2E6CB086F0E38EB71B39D9D0076EE2B41B1C49E  00:00:00.0
250   323      F2E6CB086F0E38EB71B39D9D0076EE2B41B1C49E  00:00:00.0

Counters match
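The file-level checksum in the output above can be sketched as a running hash fed with each record in ascending ISN order, which is also why reading BY ISN matters: the same data read in a different order would hash differently. A minimal Python sketch with made-up records (SHA-1 stands in for the A20 hash from USR4011N):

```python
import hashlib

def file_checksum(records):
    """One checksum for the whole file: feed each record into a running
    SHA-1 in ascending ISN order, and return (record count, hex digest)."""
    h = hashlib.sha1()
    for isn in sorted(records):
        h.update(records[isn].encode())
    return len(records), h.hexdigest().upper()

# Made-up data: 323 identical records under two "database" copies,
# mirroring the replicated file and the FTP'ed/loaded copy above.
db187 = {i: f"record {i}" for i in range(1, 324)}
db250 = dict(db187)

n1, c1 = file_checksum(db187)
n2, c2 = file_checksum(db250)
print(n1, n2, c1 == c2)  # -> 323 323 True
```

If a single record differed, or the counts differed, the two checksums would no longer match, which is exactly the control the original question asked for.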

Examples are on SYSEXT.

Dieter Storr