Best programming language for biology students?

Dean Pentcheff dean at tbone.biol.scarolina.edu
Sun Apr 16 22:19:22 EST 1995


Time to add my two cents to the ongoing fray.  I'm a biologist who's
spent a lot of time doing computer or computer-related work in
conjunction with my biology.  My programming background is strongly
procedurally-oriented, so I can't speak at all to the desirability of
OO or functional approaches.

My first assumption is that we're talking about teaching programming to
biologists who won't do it as a major or primary part of their work.
I assume that those who do will explore and become expert in a broader
range of tools on their own.  What we're addressing is people who will do little
bits of programming to address immediate data processing needs, or who
will make customizations to existing larger systems.

In my experience, most quickie programming by biologists tends to
revolve around manipulating datafiles to get them into appropriate
formats, or to get appropriate summary results out of them.  Examples
are data from a real-time system that one wants to feed into a
statistics program, or thousands of field observations in semi-textual
format from which one wants to generate summaries.  I can't think of a better
tool for these sorts of manipulations than Perl (particularly now that
Perl5 has things like the "English" module to get it a bit further away
from its Unix Classic roots!).  Its use can be highly structured, and
should be taught that way.  For OO aficionados, Perl5 has OO
extensions.
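
To make the "summaries from semi-textual field observations" case a bit
more concrete, here is a minimal sketch of that sort of job.  It is
written in Python rather than Perl purely for illustration; the same
thing would be a handful of lines of Perl.  The record layout
("site species count", whitespace-separated, with "#" comments) and the
name summarize_counts are assumptions made up for this example, not
taken from any real dataset or package.

```python
from collections import defaultdict


def summarize_counts(lines):
    """Total the counts per species from whitespace-separated records.

    Each record is assumed to look like "<site> <species> <count>".
    Blank lines, '#' comments, and malformed records are skipped.
    """
    totals = defaultdict(int)
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        fields = line.split()
        if len(fields) != 3:
            continue  # not a record in the assumed layout
        _site, species, count = fields
        try:
            n = int(count)
        except ValueError:
            continue  # count was not an integer (e.g. "n/a")
        totals[species] += n
    return dict(totals)


if __name__ == "__main__":
    # Hypothetical field notes, invented for the example.
    sample = [
        "# site species count",
        "A1 barnacle 12",
        "A1 limpet 3",
        "",
        "B2 barnacle 7   # recount",
        "B2 limpet n/a",
    ]
    for species, total in sorted(summarize_counts(sample).items()):
        print(f"{species}\t{total}")
```

The tab-separated output is the kind of thing a statistics program will
usually read directly.  Skipping malformed records silently is a choice
made for brevity; for real data one would probably want to report them.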

But, there's a flip-side.  Modification of existing systems, in my
experience, is likely to mean having to be conversant in C.  Yes, I
know that there's a huge mass of legacy FORTRAN numerical code out
there, but most of that is in pretty hermetically sealed libraries.
Most newer packages that have source code available (i.e. are
downloadable from the Internet) are in C.  What I'd advocate is
teaching enough C to understand and make minor modifications to such
packages.  That also should include learning about the standard Unix
superstructure of compiling program packages (a bit about make, patch,
xmkmf and building [not programming!] X applications, basic shell
scripting for compilations, etc.).  The biologist may not use Unix as a
primary operating system, but it's usually easy to make the transition
to another platform once the Unix tools are understood.  The point here
is to allow the biologist to have a fair chance at downloading,
tweaking, and compiling the many fine program systems out there.

One way to achieve that might be to use Perl5 as the primary language,
then segue into a bit of C.  Initial exercises could include
downloading and compiling Perl itself, and then maybe the gcc C
compiler and some associated tools.

One element missing from this recipe is some form of graphical output.
I don't know much about Tcl/Tk, but might that be a way to add it?

