Notes on compilers for fine tuning Zoo

Robert Harper harper at NIC.FUNET.FI
Wed Jul 24 02:25:09 EST 1991


>Xref: funic comp.compression:1066 comp.binaries.ibm.pc.d:7375
>Path: funic!fuug!mcsun!uunet!zaphod.mps.ohio-state.edu!uakari.primate.wisc.edu!crdgw1!crdos1!davidsen
>From: davidsen at crdos1.crd.ge.COM (Wm E Davidsen Jr)
>Newsgroups: comp.compression,comp.binaries.ibm.pc.d
>Subject: zoo performance vs. compiler tests
>Keywords: zoo, xenix, gcc, benchmark
>Message-ID: <3545 at crdos1.crd.ge.COM>
>Date: 22 Jul 91 15:09:40 GMT
>Reply-To: davidsen at crdos1.crd.ge.com (bill davidsen)
>Followup-To: comp.compression
>Organization: GE Corp R&D Center
>Lines: 135


                      zoo 2.10 Performance Testing
                             Bill Davidsen

  I was curious about the performance improvement possible from better
code generation, and was going to see whether I could easily hand
optimize the assembler output of the C compiler and thereby improve
performance. While I haven't done that yet, I got some interesting
results along the way.

  I first compiled with the Xenix compiler, using the -M2l option to
generate 286 code and the -M3 option to generate 386 code. I also used
gcc for part of the test, as noted in the following. All Xenix compiles
were run with the -O option; results labeled gcc are with -O; results
labeled gcci are with -O and -finline-functions.

	Table 1: mix of C and object, zoo*.[co]
	  Results in user CPU-sec, std and high compression

	prog	std	high
	========================
	x286	15.72	31.80
	x386	 5.10	15.16
	gcc	 5.02	13.64

  I then recompiled with the -finline-functions option enabled to allow
more optimization. In all cases I had to compile using -traditional with
gcc, due to the nature of the Xenix include files. At this point I also
increased the size of the dataset to compress, and tried the Info-ZIP
ZIP v0.9 beta, compiled with the Xenix compiler with -O.

	Table 2: mix of C and object, zoo*.[co]
	  Results in CPU-sec, high compression
	  Zip tested using the -9 option for highest compression
	  Note that object files have changed since table 1.

	prog	high
	================
	x286	31.98
	x386	15.20
	gcc	13.60
	gcci	12.42
	zip	12.48

	Table 3: larger mix of C, header, and object files
	  Results in CPU-sec, high compression and -9 options
	  This is a superset of table 2, about 480k

	prog	high
	================
	x286	93.38
	x386	45.22
	gcc	40.34
	gcci	36.46
	zip	38.06

  After this I decided to test the extract speed. I tested only the
cases where the entire source and object set had been compressed,
removed the files before starting, and did a sync before starting the
extract. Note that all results have been checked and represent the
smallest of three runs. In general the results were within a few percent
of one another, with no particularly high or low cases noted for any
tests run.
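The methodology above (run each test several times and keep the smallest figure) can be sketched as follows. This is a minimal illustration, not the harness actually used: the command is a placeholder, and os.times() stands in for the Korn shell time builtin from the real tests.

```python
# Sketch of the best-of-three timing methodology: run a command
# repeatedly and keep the smallest user-CPU figure. The command
# timed here is a placeholder for the zoo/zip invocations tested.
import os
import subprocess

def user_cpu_seconds(argv):
    """Run argv once and return the user CPU time consumed by children."""
    before = os.times().children_user
    subprocess.run(argv, check=True,
                   stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return os.times().children_user - before

def best_of(argv, runs=3):
    """Smallest user-CPU time over several runs, as in the tables above."""
    return min(user_cpu_seconds(argv) for _ in range(runs))

if __name__ == "__main__":
    print("user CPU-sec:", round(best_of(["true"]), 2))
```

In practice one would also remove the extracted files and sync before each run, as described above, so the runs start from the same state.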

	Table 4: extraction results
	  CPU-sec for extracting default and high compression
	  Zip extract of archive made using -9 option

	prog	std	high
	========================
	x286	10.50	17.04
	x386	 5.10	 6.98
	gcc	 7.66	 6.12
	gcci	 7.42	 5.86
	zip	  -	10.34

Conclusions
===========

  First caveat: this represents results using UNIX compilers and UNIX
sources, one of which is not yet available to the public. It is not
claimed to apply to programs running under MS-DOS, and does not include
any tests against programs partially hand coded in assembler, such as
PKZIP. I don't believe that any conclusions about ZOO vs ZIP should be
drawn from this; it is a set of tests to show what can be done by
playing with the compiler.

  All versions of zoo, however compiled, produced the same output byte
for byte. This is not surprising, but it is reassuring. In all cases the
high compression option of zoo produced a smaller file than any that
zip could produce.
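A byte-for-byte check like the one above can be done by hashing each archive and comparing digests. This is a minimal sketch; the file paths are hypothetical.

```python
# Sketch of a byte-for-byte comparison of archives: hash each file
# and check that all digests match. Paths are hypothetical.
import hashlib

def file_digest(path, chunk=1 << 16):
    """SHA-256 hex digest of a file, read in fixed-size chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while block := f.read(chunk):
            h.update(block)
    return h.hexdigest()

def identical(paths):
    """True when every file in paths has the same contents."""
    return len({file_digest(p) for p in paths}) == 1
```

A direct cmp of the archive files would of course do the same job; hashing just avoids comparing every pair.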

  Use of compiler options can result in a measurable improvement in
performance. After looking at the code generated by gcc, I suspect that
hand tuning it could gain a bit more; I didn't bother. The improvement
for the 32-bit versions was about 30% for high compression and 15% for
extraction. These options hurt extraction speed for files compressed
with the default option.

  Zoo's compression methods use 32-bit intensive algorithms. Just by
using the 32-bit code, performance can be improved by a factor of two.
This may be true of zip as well, but I have not generated a 286
version.

  Zip compresses faster than zoo; comparing the same compilers, I see
about 15% less CPU in zip. The zip compressor has had months of work:
while it is virtually all the work of one person, it has been examined
and critiqued by a team. The compressor in zoo has had less time and
exposure, and the major effort has been on making it produce smaller
output rather than on making it faster. Based on my understanding of
the code, I believe that zoo has more room for improvement in both
compression and speed than zip.

  Zoo is clearly faster than zip at extraction; I see about a 30%
difference between comparable zoo and zip versions. Zoo has had a bit
of team work on the extractor, and there may well be room for
improvement in the zip extractor.

Notes and Details
=================

The Xenix compiler identifies itself as 5.10.010 (beta), while the gcc
is 1.37.1. The system is a Dell 325, 386-25, 16MB memory, ~700MB disk,
387. All timings were from the Korn shell time builtin rather than
/bin/time.

The zip was the 0.9 beta with no patches; the zoo was the 2.10
distribution version; unzip was the 4.07 beta rather than the 4.10
public release, because I couldn't find the source when I looked for it
here.


-- 
bill davidsen	(davidsen at crdos1.crd.GE.COM -or- uunet!crdgw1!crdos1!davidsen)
  GE Corp R&D Center, Information Systems Operation, tech support group
  Moderator comp.binaries.ibm.pc and 386-users digest.



