Hi all,
how do you sort your workfiles? Do you use Natural's internal SORT, or an external sort program? And if the latter, how do you handle negative numeric fields, binary fields, floating point and so on?
Greetings Sascha
A few years ago I had to sort 1.7 million records by an ASCII string. I solved this with the Unix sort, because it was significantly faster.
Generally, negative numbers and decimal points are no problem for the Unix command sort -n. But numeric fields in Natural's own format could be a problem. If necessary, it would be best to write your own sort routine (e.g. in Perl) for that.
We are using SyncSort on HP-UX. We use it both as an external sort and as the internal sort with Natural.
No problems with “negative numeric fields, binary, floating point and so on”.
I asked because I don't know much about Natural for Unix workfiles. But back to the sort question: AFAIK sort, like grep and many other Unix command-line tools, works line-based. What if I use the following program:
define data
local
01 #workfilestructure
  02 #binary-first (b4)
  02 #text (a80)
  02 #nummeric (n14.7)
  02 #floating (f8)
  02 #anothertext (a8)
end-define
#binary-first = H'0A'
#text = 'Hi Community'
#nummeric = -12345.789
#floating = 0.000001
#anothertext = '01234567'
write work 01 #workfilestructure
end
If I look at the workfile with hexer, it looks like this:
00000000: 0a 20 20 20 48 69 20 43 6f 6d 6d 75 6e 69 74 79 . Hi Community
00000010: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
00000020: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
00000030: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
00000040: 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
00000050: 20 20 20 20 30 30 30 30 30 30 30 30 30 31 32 33 000000000123
00000060: 34 35 37 38 39 30 30 30 70 8d ed b5 a0 f7 c6 b0 45789000p.......
00000070: 3e 30 31 32 33 34 35 36 37 0a -- -- -- -- -- -- >01234567.------
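To make the dump easier to read, here is a minimal Python sketch that splits one such record into its fields. It is not an official description of the workfile format; it only assumes what the dump itself shows: the numeric field is written as ASCII digits with the sign carried in the high nibble of the last byte (0x70 instead of 0x30 for this negative value), and the F8 field is an 8-byte IEEE double with the low byte first. The names `LAYOUT`, `decode_n` and `decode_record` are made up for this example.

```python
import struct
from decimal import Decimal

# Field widths taken from the DEFINE DATA above: B4, A80, N14.7, F8, A8.
# "<" disables padding, so the struct is exactly 121 bytes long.
LAYOUT = struct.Struct("<4s80s21s8s8s")


def decode_n(raw: bytes, decimals: int) -> Decimal:
    """Decode a numeric field as it appears in the dump (an assumption, not a spec).

    All bytes are ASCII digits, except that the last byte has high nibble
    0x7 when the value is negative (seen above as 'p' = 0x70 for a final 0).
    """
    last = raw[-1]
    negative = (last & 0xF0) == 0x70
    digits = raw[:-1].decode("ascii") + str(last & 0x0F)
    value = Decimal(int(digits)).scaleb(-decimals)
    return -value if negative else value


def decode_record(record: bytes):
    """Split one workfile record into (binary, text, numeric, float, text)."""
    binary, text, numeric, floating, another = LAYOUT.unpack(record[:LAYOUT.size])
    return (
        binary,                            # left as raw bytes
        text.decode("ascii").rstrip(),     # drop the blank padding
        decode_n(numeric, 7),              # N14.7: 7 decimal places
        struct.unpack("<d", floating)[0],  # little-endian double, per the dump
        another.decode("ascii"),
    )
```

Note that the very first byte of the record is 0x0a, so any line-based tool would already split the record there, before it ever gets to the numeric or floating-point fields.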
hexdump -vC is:
[code]
00000000 0a 20 20 20 48 69 20 43 6f 6d 6d 75 6e 69 74 79 |. Hi Community|
00000010 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 | |
00000020 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 | |
00000030 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 | |
00000040 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 | |
00000050 20 20 20 20 30 30 30 30 30 30 30 30 30 31 32 33 | 000000000123|
00000060 34 35 37 38 39 30 30 30 70 8d ed b5 a0 f7 c6 b0 |45789000p.
I almost forgot about your sorting problem. Here is an example of a sort routine in Perl. To keep it simple, I use only 15 bytes per record, containing three fields.
#!/usr/bin/perl
use strict;
binmode STDIN;                          # records may contain arbitrary bytes
binmode STDOUT;
$/ = \15;                               # treat 15 bytes as one record
my @lines = (<STDIN>);                  # read whole standard input into an array
for (sort mysort (@lines)) {            # run through the sorted array
    print;                              # write it to standard output
}
sub mysort {                            # comparison routine (compares $a with $b)
    my @fields_a = unpack "A5A5A5", $a; # split record into single fields
    my @fields_b = unpack "A5A5A5", $b; # split record into single fields
    $fields_a[0] cmp $fields_b[0]       # compare field 0 as ASCII
        ||                              # if field 0 is equal
    $fields_a[1] <=> $fields_b[1]       # compare field 1 numerically
        ||                              # if field 1 is also equal
    $fields_b[2] cmp $fields_a[2];      # compare field 2 descending
}
advantages:
disadvantages:
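For comparison, the same idea as a rough Python sketch: the input is read in fixed 15-byte chunks instead of lines, so a 0x0A byte inside a record is treated as ordinary data, which is exactly where a line-based sort would go wrong. The layout (three 5-byte fields, the second holding a signed integer as text) is the hypothetical one from the Perl example, and the function names are invented for illustration.

```python
import io
from functools import cmp_to_key

RECORD_LEN = 15  # hypothetical layout: three 5-byte fields


def read_records(stream, length=RECORD_LEN):
    """Yield fixed-length records; newline bytes inside a record are just data."""
    while True:
        chunk = stream.read(length)
        if len(chunk) < length:  # end of input (a trailing partial record is dropped)
            break
        yield chunk


def split_fields(record):
    """Return (field 0 as bytes, field 1 as int, field 2 as bytes)."""
    return record[0:5], int(record[5:10].decode("ascii")), record[10:15]


def _cmp(x, y):
    """Three-way comparison: -1, 0 or 1."""
    return (x > y) - (x < y)


def compare(a, b):
    fa, fb = split_fields(a), split_fields(b)
    return (
        _cmp(fa[0], fb[0])     # field 0 ascending, bytewise
        or _cmp(fa[1], fb[1])  # then field 1 ascending, numeric
        or _cmp(fb[2], fa[2])  # then field 2 descending
    )


def sort_bytes(data: bytes) -> bytes:
    """Sort a buffer of fixed-length records and return the sorted buffer."""
    records = sorted(read_records(io.BytesIO(data)), key=cmp_to_key(compare))
    return b"".join(records)
```

To use it as a filter like the Perl script, `read_records` could be fed `sys.stdin.buffer` and the sorted records written to `sys.stdout.buffer`. Like the Perl version, it holds all records in memory, so it is only a sketch for files that fit there.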
Sorry to resurrect an old topic, but has anyone else had experience of the third-party SYNCSORT product on Unix? If so, how does it compare to the native Unix sort, in terms of performance and features?
This is for SORTs within Natural itself AND for sorting workfiles outside Natural, and we need the equivalent of typical mainframe SYNCSORT statements, e.g. INCLUDE, OMIT, SUM etc.
We are using SyncSort on HP-UX (PA-RISC and Itanium) without problems. It is more or less like the one on the IBM mainframe, and we did not want to migrate all our sort JCLs. That is one of the reasons we decided to use SyncSort instead of the native Unix sort. The other reason, AFAIR, was that the Unix sort was not able to handle binary and floating-point fields in a file.
KlaBue