collperf

collperf is a test program that compares the collation performance and sort key lengths of ICU, Windows native collation, and Unix/POSIX collation. It reads a file of lines (names, for example) and performs one of three tests:

  1. Sort Key generation. Report on key lengths and key generation times.
  2. Binary search. Report the average time required to look up each line of the file (each name) in a sorted list of all of the lines.
  3. Quick Sort. Report the time required to sort the file in memory, using the C library qsort function. The file order is randomized prior to the sort.
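
The three operations above can be sketched with the Unix/POSIX services alone (strxfrm, strcoll, qsort, bsearch). This is a minimal illustration, not collperf's own code: the function names make_sort_key, sort_names and find_name are hypothetical, the strings are plain char strings rather than UTF-16, and the collation rules come from whatever LC_COLLATE locale is active (the "C" locale unless setlocale has been called).

```c
#include <stdlib.h>
#include <string.h>

/* 1. Sort key generation (hypothetical helper).
 * Returns a malloc'd key for s under the current LC_COLLATE locale and
 * stores its length (excluding the terminating NUL) in *key_len.
 * The caller frees the result. Returns NULL if allocation fails. */
char *make_sort_key(const char *s, size_t *key_len)
{
    size_t n = strxfrm(NULL, s, 0);   /* ask only for the required length */
    char *key = malloc(n + 1);
    if (key == NULL)
        return NULL;
    strxfrm(key, s, n + 1);           /* now build the key itself */
    if (key_len != NULL)
        *key_len = n;
    return key;
}

/* Comparator shared by qsort and bsearch: each element is a char pointer. */
static int compare_names(const void *a, const void *b)
{
    const char *const *pa = a;
    const char *const *pb = b;
    return strcoll(*pa, *pb);
}

/* 3. Quick Sort: sort the array of names in place with the C library qsort. */
void sort_names(const char **names, size_t count)
{
    qsort(names, count, sizeof names[0], compare_names);
}

/* 2. Binary search: look up one name in an array already sorted by
 * sort_names. Returns its index, or -1 if it is not present. */
long find_name(const char **names, size_t count, const char *name)
{
    const char **hit = bsearch(&name, names, count, sizeof names[0],
                               compare_names);
    return hit != NULL ? (long)(hit - names) : -1L;
}
```

Comparing two keys from make_sort_key with strcmp gives the same ordering as calling strcoll on the original strings, which is the trade-off the -usekeys option exercises: pay for key generation once, then compare cheaply.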

Usage Summary

collperf -help

Usage: strperf options...
-help                  Display this message.
-file file_name        utf-16 format file of names
-locale name           ICU locale to use. Default is en_US
-langid 0x1234         Windows Language ID number. Default 0x409 (en_US)
                       see http://msdn.microsoft.com/library/psdk/winbase/nls_8xo3.htm
-win                   Run test using Windows native services. (ICU is default)
-unix                  Run test using Unix strxfrm, strcoll services.
-uselen                Use API with string lengths. Default is null-terminated strings
-usekeys               Run tests using sortkeys rather than strcoll
-loop nnnn             Loopcount for test. Adjust for reasonable total running time.
-terse                 Terse numbers-only output. Intended for use by scripts.
-french                French accent ordering
-norm                  Normalizing mode on
-shifted               Shifted mode
-lower                 Lower case first
-upper                 Upper case first
-case                  Enable separate case level
-level n               Sort level, 1 to 5, for Primary, Secondary, Tertiary, Quaternary, Identical
-binsearch             Binary Search timing test
-keygen                Sort Key Generation timing test
-qsort                 Quicksort timing test

Example

C:\>collperf -loop 200 -file latin.txt -keygen -shifted -level 4
file "latin.txt", 7604 lines.
Sort Key Generation: total # of keys = 197704
Sort Key Generation: time per key = 4253 ns
Key Length / character = 1.730054
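
For scale, 197704 keys at 4253 ns each is roughly 0.84 seconds of measured time. Figures of this kind could be derived along the following lines. This is only a sketch under stated assumptions: the struct keygen_stats and the function time_keygen are hypothetical names, the timer is the C library clock() (CPU time) rather than whatever collperf uses, and key length is counted in bytes per char of input rather than per UTF-16 code unit.

```c
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical result record; the field names are illustrative only. */
struct keygen_stats {
    unsigned long total_keys;   /* lines * loops */
    double ns_per_key;          /* elapsed time divided by total_keys */
    double key_len_per_char;    /* key bytes divided by input chars */
};

/* Generate a sort key for every line, `loops` times over, using strxfrm
 * under the current LC_COLLATE locale, and report the averages. */
struct keygen_stats time_keygen(const char **lines, size_t count,
                                unsigned loops)
{
    struct keygen_stats st = {0, 0.0, 0.0};
    unsigned long key_bytes = 0, chars = 0;
    char key[256];  /* assumed large enough for typical names; a longer key
                       is not stored usefully but is still computed */

    /* Untimed pass: measure key lengths so that the bookkeeping does not
     * distort the timing loop below. */
    for (size_t i = 0; i < count; i++) {
        key_bytes += (unsigned long)strxfrm(NULL, lines[i], 0);
        chars += (unsigned long)strlen(lines[i]);
    }

    /* Timed passes: key generation only. */
    clock_t start = clock();
    for (unsigned pass = 0; pass < loops; pass++) {
        for (size_t i = 0; i < count; i++) {
            strxfrm(key, lines[i], sizeof key);
            st.total_keys++;
        }
    }
    clock_t stop = clock();

    if (st.total_keys > 0) {
        double elapsed_ns = (double)(stop - start) * 1e9 / CLOCKS_PER_SEC;
        st.ns_per_key = elapsed_ns / (double)st.total_keys;
    }
    if (chars > 0)
        st.key_len_per_char = (double)key_bytes / (double)chars;
    return st;
}
```

With a coarse timer such as clock(), the loop count has to be high enough that the elapsed time is well above the timer resolution, which is why the -loop option says to adjust for a reasonable total running time.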