Windows NT's file sort command is slow and inflexible, like most operating systems' sorters, but NT users needn't wait and suffer. Like their mainframe and Unix counterparts--who have long employed third-party programs--NT users have a number of enhanced sorting alternatives from which to choose.
PC Week Labs tested four NT sorting applications for performance and features. Innovative Routines International Inc.'s Cosort Version 4.5 led the pack in performance and flexibility, but Opnek Research's NitroSort 1.0A, Opt-Tech Data Processing's Opt-Tech Sort Version 1.7z and Robert Ramey Software Development's Postman's Sort 3.25 also improve greatly on the NT sort command.
Other products may also improve on NT's sorting capabilities, but the four we tested are specialized for this task.
All four packages provide an essential ingredient for establishing NT as an enterprise platform: Their higher performance allows NT systems to scale to files in the gigabyte range. Perhaps more important, all four have the file-handling capabilities to bridge legacy systems and client/server systems.
Enterprise developers can use any of these packages as components in a number of applications, such as preparing data loads for data warehouses or preparing data sets for batch updates, reports or archiving.
The packages vary widely in the features they offer, so buyers should weigh the needs of their sites against the capabilities of each sorting program.
We tested each sort package on a 200MHz Pentium Pro machine with 64MB of RAM running Windows NT 4.0. Each of the four clocked sorting times that rival minicomputer and mainframe performance.
For example, the NT sort command took 2.73 seconds to sort 10,000 180-byte records and 54.66 seconds to sort 100,000 records in our tests. In contrast, each of the four third-party offerings took less than 1 second for the 10,000-record sort, and times for the 100,000-record file ranged from 4 to 9 seconds.
NT's sort routine proved incapable of completing the 1 million-record tests. It consistently ran out of memory in midprocess and generated a 0-byte output file. By comparison, the four packages we tested easily completed the million-record tests, with times ranging from 201 seconds to 412 seconds.
Cosort's times were at or near the top in all tests. Postman's Sort trailed the pack in the million-record tests but turned in the best time on the 100,000-record test.
Knowledge of data improves sorting results. All four packages offer a variety of tuning options to tweak performance. But option settings that improve performance for one data set might be counterproductive with a different data set. Factors such as the data, type of sort keys, distribution of key values in the file, record type, record size and file size interplay with option settings to affect performance.
For example, we found that allocating 32MB of memory for the 1 million-record sort caused thrashing between memory and disk files. Reducing the memory allocation to 16MB dramatically improved performance. Memory allocation had no effect on sorting the 10,000-record file, which fit completely in the test system's RAM.
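The article does not describe how any of the tested packages work internally, so the following is only a generic sketch of an external merge sort, the usual technique for sorting files larger than available memory. The function name `external_sort` and the parameter `max_records_in_memory` are hypothetical; a record count stands in for the megabyte allocation discussed above. It shows the trade-off such a setting controls: a larger budget means fewer, longer sorted runs on disk, and a smaller one means more runs to merge. It does not reproduce the thrashing we observed.

```python
import heapq
import os
import tempfile


def external_sort(lines, key, max_records_in_memory):
    """Yield newline-terminated text lines in sorted order with bounded memory.

    Hypothetical helper, not any vendor's API. Sorted runs of at most
    max_records_in_memory lines are written to temporary files, then
    merged lazily with heapq.merge.
    """
    run_paths = []
    buffer = []

    def spill():
        # Sort the in-memory buffer and write it out as one sorted run.
        buffer.sort(key=key)
        fd, path = tempfile.mkstemp(suffix=".run")
        with os.fdopen(fd, "w") as run_file:
            run_file.writelines(buffer)
        run_paths.append(path)
        buffer.clear()

    open_runs = []
    try:
        for line in lines:
            buffer.append(line)
            if len(buffer) >= max_records_in_memory:
                spill()
        if buffer:
            spill()
        # Each open file is an iterator over its already-sorted lines.
        open_runs = [open(path) for path in run_paths]
        yield from heapq.merge(*open_runs, key=key)
    finally:
        for run_file in open_runs:
            run_file.close()
        for path in run_paths:
            os.remove(path)


# Sort on a 10-character key at the start of each record.
records = ["MILLER    rec 1\n", "ADAMS     rec 2\n", "ZIMMER    rec 3\n"]
result = list(external_sort(records, key=lambda r: r[:10], max_records_in_memory=2))
```

With a budget of two records, the three sample records produce two runs on disk before the merge; raising the budget to three or more would sort them in a single run.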
An application that repeatedly processes similar files could benefit greatly from experimenting with tuning options. Unfortunately, all the packages we tested leave tuning to trial and error: We had to set and reset options to find the best settings. Users who mostly run ad hoc sorts will probably find the default settings more productive.
Fixed-record formats outperformed delimited formats by as much as 20 percent in tests. We attribute this to the sort routine not having to parse the record for delimiters to find the key field in the fixed format.
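The difference between the two formats can be illustrated with a small sketch. The function names, the comma delimiter and the sample records below are assumptions made for illustration; none of this is taken from the tested products. With a fixed layout the key sits at a known offset, while a delimited record has to be scanned before the key field can be located.

```python
def fixed_key(record, start=0, length=10):
    """Key at a known offset: one slice, no scanning of the record."""
    return record[start:start + length]


def delimited_key(record, field_index=0, delimiter=","):
    """Key in the Nth delimited field: the record is scanned for delimiters."""
    return record.split(delimiter)[field_index]


# Hypothetical sample data: a 10-character alpha key followed by other fields.
fixed_records = ["MILLER    Ann  ", "ADAMS     Bob  "]
delimited_records = ["MILLER,Ann", "ADAMS,Bob"]

sorted_fixed = sorted(fixed_records, key=fixed_key)
sorted_delimited = sorted(delimited_records, key=delimited_key)
```

Both calls produce the same ordering; only the work done per record to find the key differs, which is the cost we believe explains the gap.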
Beyond the basics
The four tested packages don't stop at outperforming NT's sort and handling larger files. They also offer interface and integration options, among other features.
NitroSort provides an interactive GUI with syntax checking to ease ad hoc sorting sessions. Cosort and Opt-Tech Sort provide a character-based interactive interface. Postman's Sort uses a command line interface. Custom applications can be developed with all packages through a 32-bit Windows dynamic link library that can be called as a subroutine from most Windows languages, including PowerBuilder, Visual Basic and C++.
While NT's sort routine works only on fixed-text files, the four sort packages we tested offer variable-length text-file support, and all except NitroSort handle delimited text files. Opt-Tech Sort is the only package to explicitly handle dBASE files. It also handles Micro Focus Inc.'s Micro Focus and Computer Associates International Inc.'s Realia COBOL files.
Cosort has the strongest file and data type support, including a rich set of supported COBOL data types.
Cosort's ability to handle EBCDIC and packed-decimal data types makes it ideal for integrating mainframe files with NT systems. Cosort also provides strong date-handling features that can be used to convert legacy files for year 2000 compliance.
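For readers unfamiliar with these mainframe formats, here is a minimal sketch of what handling them involves. It is not Cosort's interface; the function names are hypothetical, and `cp037` is assumed as the EBCDIC code page (real files may use a different one). Packed decimal stores two digits per byte, with the final half-byte holding the sign.

```python
from decimal import Decimal


def ebcdic_to_text(raw: bytes) -> str:
    """Decode EBCDIC bytes, assuming code page 037 (US/Canada)."""
    return raw.decode("cp037")


def unpack_packed_decimal(raw: bytes, scale: int = 0) -> Decimal:
    """Decode a packed-decimal field with `scale` implied decimal places."""
    if not raw:
        raise ValueError("empty packed-decimal field")
    digits = []
    for byte in raw[:-1]:
        digits.append(byte >> 4)      # high nibble is one digit
        digits.append(byte & 0x0F)    # low nibble is the next digit
    digits.append(raw[-1] >> 4)       # last byte: one digit, then the sign
    sign_nibble = raw[-1] & 0x0F
    value = 0
    for digit in digits:
        if digit > 9:
            raise ValueError("invalid digit nibble in packed-decimal field")
        value = value * 10 + digit
    if sign_nibble in (0x0B, 0x0D):   # B and D mark negative values
        value = -value
    return Decimal(value).scaleb(-scale)


name = ebcdic_to_text(bytes.fromhex("C8C5D3D3D6"))
amount = unpack_packed_decimal(bytes.fromhex("12345C"), scale=2)
```

A sort that compares such fields as raw bytes would order them incorrectly, which is why native support for these types matters when mainframe files move to NT.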
John Shumate manages enterprise configurations for a major government agency and can be reached at firstname.lastname@example.org.
NitroSort's GUI (top) makes the product very easy to use, whereas Cosort uses a character-based interface (bottom), an unwelcome throwback to its Unix roots.
Programs for a different sort
| ||Cosort v4.5 ||NitroSort v1.0A ||Opt-Tech Sort v1.7z ||Postman's Sort v3.25|
|VENDOR ||Innovative Routines International Inc., Melbourne, Fla. ||Opnek Research, Hackettstown, N.J. ||Opt-Tech Data Processing, Zephyr Cove, Nev. ||Robert Ramey Software Development, Santa Barbara, Calif.|
|PHONE ||(800) 333-7678 ||(908) 852-3277 ||(702) 588-3737 ||(805) 569-3793|
|PRICE ||$990, plus run-time royalties ||$299, no run-time royalties ||$249, plus run-time royalties ||$149 (includes 50 run-time licenses)|
|MERGE MULTIPLE INPUT FILES ||Yes ||Yes ||Yes ||Yes|
|ELIMINATE DUPLICATE KEYS ||Yes ||No ||Yes ||Yes|
|RECORD FILTERING ||Yes ||Yes ||Yes ||No|
|OUTPUT REFORMATTING ||Yes ||Yes ||Yes ||No|
|FILE/DATA TYPES ||Yes ||Yes ||Yes ||Yes|
|FIXED TEXT ||Yes ||Yes ||Yes ||Yes|
|VARIABLE TEXT ||Yes ||Yes ||Yes ||Yes|
|DELIMITED TEXT ||Yes ||No ||Yes ||Yes|
|DBASE ||No ||No ||Yes ||No|
|COBOL ||No ||Yes ||No ||Yes|
The million-record difference
All four programs keep sorting long after Windows NT's routine has given up
Benchmark results--all times in seconds
| ||10,000 records, 10-character alpha key ||100,000 records, 10-character alpha key ||1 million unique 180-byte records, 10-character alpha key ||1 million unique 180-byte records, 7-character integer key ||1 million unique 180-byte records, full 180-byte alphanumeric key|
|Windows NT ||2.73 ||54.66 ||NA ||NA ||NA|
|Cosort ||0.31 ||7.89 ||300.66 ||297.33 ||201.34|
|NitroSort ||0.28 ||6.94 ||296.1 ||294.71 ||270.67|
|Opt-Tech ||0.54 ||9.27 ||313.33 ||295.31 ||291.52|
|Postman's Sort ||0.36 ||4.19 ||398.21 ||410.14 ||412.08|
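To put the two tests that NT completed in perspective, the short sketch below divides NT's time by each package's time. The numbers are copied from the benchmark table; the dictionary and function names are just illustrative. On the 100,000-record file the ratios work out to roughly 6 to 13 times faster than NT's sort.

```python
# Times in seconds, taken from the benchmark table above.
NT_SECONDS = {"10,000": 2.73, "100,000": 54.66}
PACKAGE_SECONDS = {
    "Cosort": {"10,000": 0.31, "100,000": 7.89},
    "NitroSort": {"10,000": 0.28, "100,000": 6.94},
    "Opt-Tech": {"10,000": 0.54, "100,000": 9.27},
    "Postman's Sort": {"10,000": 0.36, "100,000": 4.19},
}


def speedups(baseline, packages):
    """Return baseline time divided by package time for each test size."""
    return {
        name: {size: baseline[size] / seconds for size, seconds in times.items()}
        for name, times in packages.items()
    }


ratios = speedups(NT_SECONDS, PACKAGE_SECONDS)
for name, by_size in ratios.items():
    summary = ", ".join(f"{size} records: {r:.1f}x" for size, r in by_size.items())
    print(f"{name}: {summary}")
```

No ratio can be computed for the million-record tests, because NT's sort never finished them.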