open-source software for bioinformatics (was Re: Unix vs Linux - the movie.)

Ted Byers rtbyers at bconnex.net
Tue Aug 1 09:16:33 EST 2000


I have a couple of observations that may prove useful or interesting.  I am a
theoretical ecologist who has had to develop programs for environmental
modelling projects for close to twenty years.  I am currently working on an
application involving long term, short range distribution of contaminants
throughout the environment, and another major application involved a custom
GIS integrated with automated data collection and processing and hardware
control (for specialized agricultural equipment).

First, almost a decade ago, I submitted a paper to an international, peer
reviewed journal.  During the review process, one of the reviewers reported
back to the editor that he could not reproduce the results, and because the
editor found the result interesting and important (if verified), I was
contacted and asked for the source code required to produce the result.  I
supplied it, stripped of all user interface code, so that all that
remained was the code required to represent the example model and the
algorithm required to produce the result.  I used fairly standard numerical
methods, so it would be easy to verify that I had implemented the methods I
used correctly.  Not long afterwards, I received word that the reviewer in
question, after having studied my code, had corrected his own, verified that
the code I used was correct, and reproduced the results I reported.

Second, if a paper is to be published primarily on the basis of a
given program developed in the lab reporting the result, then I would regard
it as unprofessional if that program is not thoroughly documented.  While
there are many ways to document software, UML provides a useful model for
documenting software design.  If an application is properly documented, it
is not such a big problem to examine even a medium sized application of 500
kLOC.  Without proper documentation, however, examining even a small program
of only a few thousand lines of code can be a nightmare.  And note, on such
a program, you would normally have a team developing it, the members of
which will be reviewing each other's code.  It should also be noted that in
most applications, most of the code involves the user interface and ought
not to influence results produced by the algorithm that is actually
responsible for the interesting scientific results.  I do this sort of thing
routinely in developing new applications and debugging applications written
by others.  And remember, I am an ecologist by training, not a software
engineer.  The bottom line is that if the correctness of a paper depends on a
program developed by the authors, then it is irresponsible for either the
editor or the reviewers to approve it without proper examination of the code,
and if that code is not properly documented, then I would be inclined to
reject the paper, with the advice that I would reconsider if they supplied
properly documented code.

As for reviewers being too busy to properly review such a paper, I'd suggest
that if they are prominent enough to be asked to review a given paper that
requires examination of the code, then they probably have a decent sized lab
and can assign a member of their lab to do a preliminary review, which the
reviewer can then check, if they are really too busy to do it all themselves.
But they have a responsibility to see that the results are reproducible, and
that requires evaluating and testing the code.  If they don't have the
expertise themselves, and do not have it in their lab, then they are
probably not the right people to review the paper since they have no access
to the expertise required to do the job right.  Both those producing code
and those evaluating code ought to have the expertise required to do a code
walk through and construct regression tests.  There is no excuse for either
to neglect the quality assessment side of software engineering.
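
To make the idea of a regression test concrete, here is a minimal sketch in
Python.  It is only an illustration under stated assumptions, not code from
any project mentioned above: the logistic growth model, the fourth-order
Runge-Kutta integrator, and every function name and parameter value are
hypothetical stand-ins for "the algorithm responsible for the result",
separated from any user interface.  The test compares the numerical output
against the known closed-form solution and checks that repeated runs agree.

```python
import math


def logistic_rhs(n, r, k):
    """Right-hand side of the logistic model dN/dt = r*N*(1 - N/K)."""
    return r * n * (1.0 - n / k)


def integrate_rk4(n0, r, k, t_end, steps):
    """Integrate the logistic model from t=0 to t_end with classical RK4."""
    dt = t_end / steps
    n = n0
    for _ in range(steps):
        k1 = logistic_rhs(n, r, k)
        k2 = logistic_rhs(n + 0.5 * dt * k1, r, k)
        k3 = logistic_rhs(n + 0.5 * dt * k2, r, k)
        k4 = logistic_rhs(n + dt * k3, r, k)
        n += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return n


def logistic_exact(n0, r, k, t):
    """Closed-form solution of the logistic model, used as the reference."""
    return k / (1.0 + ((k - n0) / n0) * math.exp(-r * t))


def test_matches_closed_form():
    # Hypothetical parameter values chosen only for illustration.
    n0, r, k, t_end = 10.0, 0.5, 100.0, 10.0
    numeric = integrate_rk4(n0, r, k, t_end, steps=1000)
    exact = logistic_exact(n0, r, k, t_end)
    assert abs(numeric - exact) < 1e-6


def test_is_deterministic():
    # The same inputs must give bit-identical output on repeated runs.
    first = integrate_rk4(10.0, 0.5, 100.0, 10.0, steps=1000)
    second = integrate_rk4(10.0, 0.5, 100.0, 10.0, steps=1000)
    assert first == second


if __name__ == "__main__":
    test_matches_closed_form()
    test_is_deterministic()
    print("all regression checks passed")
```

A reviewer given such a stripped-down kernel and its tests can rerun them
after any change, or against an independent implementation, without touching
the rest of the application.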

This is just my opinion.  You can take it for what it is worth.

Regards,

Ted

R.E. Byers, Ph.D., Ed.D.
rtbyers at bconnex.net








More information about the Bio-soft mailing list