What is a Bioinformatician
John M. Greene, Ph.D.
John_greene at HGSI.com
Sun Apr 6 11:25:54 EST 1997
As a follow-up to all the posts on "What is a Bioinformatician...", here
is a little elucidation of Keith Elliston's point of view, from an earlier
Next Wave on Bioinformatics Role Models. My own experience is that many
of these companies seem very unsure of what they want! Pfizer felt
that I lacked sufficient computer experience for their small
group, and Monsanto felt I lacked enough biological knowledge (despite
17 years of laboratory experience). It's going to be an individual fit,
John Greene writes:
> What computer skills do you
>specifically see as being the best to learn first in a
>switch to bioinformatics from molecular biology?
>HTML/CGI/perl? Formal programming in C or C++? Unix skills?
Keith Elliston replies:
This may be contrary to the conventional wisdom, but I would add
a note of caution here, in terms of learning new computing
skills. The term Bioinformatics has been widely used to
describe everything from computer science to molecular biology.
There are a number of very clear skill sets needed to run an
effective Bioinformatics group.
First, the core resource of a Bioinformatics group is databases. The
public sequence databases are a major part of this, but it is the
skilled analysis of these databases, and the integration of data across
the various types of genome databases that is most interesting. Often,
it is important to reorganize the data into another form, in particular,
into relational or structured databases. This requires expertise
in database design and development, as well as in sequence
analysis if one is to add value to the database. I have found
few "retrained biologists" that have a formal background in
database design. Sure, many can throw a database together... but
comparing that type of effort to a database designed by a database
professional is apples to PCs (oops... oranges).
Second, architecting a powerful yet flexible computing environment
requires a formal background in systems design and management.
Again, a retrained biologist can put something together, but there
is little comparison between that and a system designed and
maintained by a formally trained systems person.
Third, if you are serious about real software development, you
need to find an experienced professional software developer.
These people do not grow on trees. They have typically been
trained over a large number of years, and have experience in
developing scientific software for numerous applications. Again,
a comparison with retrained biologists is difficult.
Finally... you need people to make use of the systems, software
and databases you have now developed [- a bioinformatics research
analyst]. Here, you need people with
an intimate knowledge of the biology / biochemistry / genomics.
They also need to be very facile in the use of the tools and
databases you have designed, and they need to be able to
communicate requirements to the core computer scientists of
the group, so that you can have systems that meet the needs of the
science. Do these people need to be programmers? NO... but they
need to understand programming. Do they need to be database
developers? NO, but they must understand the structure, utility
and contents of the databases. In short, these people need a
very formal scientific background, but more of a survey and
working knowledge of computer science.
I have found this last group of people to be hard to find, as have
many others... however, they can be made, by identifying people
with a strong computational bent, and then building their
experience and knowledge base through on-the-job training and directed
coursework. People like Jeff (Aaronson, formerly at Merck, now at SB)
are also hard to find. Jeff's formal background
in computer science has been applied to biological problems for
a long period of time. This experience has come from working
with biologists on biological problems, where he has applied
his considerable computational skills to these problems. Scientists
with a formal computer science background are so valuable
because they develop generalized and extensible solutions
to problems that a biologist would solve in a one-off fashion.
When these generalized solutions are combined, the synergy that
develops is far more powerful than the sum of the parts. It is for
this reason that multidisciplinary teams in bioinformatics are