
Alignments, gaps and phylogeny

Robert J. Forster ac562 at FreeNet.Carleton.CA
Fri Jul 5 09:39:36 EST 1996

Unfortunately I am in a dangerous position: I have some knowledge of
phylogenetic analysis, a fair bit of data and some powerful programs
which I could use to abuse my sequences. I am working with 16S rRNA
sequences, most of them around 1450 bp, plus others pulled from GenBank
which are anywhere from 1200 to 1500 bp.  I've dutifully aligned the
sequences, taking care to insert gaps where appropriate and to respect
secondary structure.

After reading the PHYLIP manual I decided to use DNAML to derive the
distances and to reconstruct phylogenetic trees.  I trimmed the 5' and 3'
ends so that most of the sequences start and end in nearly the same
places.  I would like to include in the analysis the GenBank sequence
which has about 250 bp missing near its 3' end.  Do I need to trim all the
sequences back to this point?  For comparison with the DNAML tree I
decided to use SeqBoot to generate bootstrap data sets, then DNADIST with
the Jukes-Cantor model, then Neighbor and finally Consense.

I've consulted a number of manuscripts which have analysed 16S sequences
and it seems that the bootstrap method is the most popular.  In one
manuscript approx. 100 bp at the 5' end were omitted to eliminate errors
from the extremely variable V1 region.  I've taken great care to get good
sequence from the V1-V3 variable regions because I thought that was where
the important information was held.  Some researchers eliminate regions
due to length variations in the sequences, others restrict their analysis
to sequences which have 90% of the sequence available, whilst others
eliminate all sequence which cannot be aligned without gaps (my choice of
an outgroup species for a root would certainly eliminate a fair bit of
sequence using this option!).  
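To make two of the options above concrete, here is a minimal Python
sketch. It is only an illustration, not how any of the cited manuscripts
or PHYLIP actually do it. The function names, the gap characters and the
reading of "90% available" as "non-gap in at least 90% of the alignment
columns" are all assumptions made for the example.

```python
# Characters treated as "no data" in an aligned sequence (an assumption).
GAP_CHARS = set("-.?")


def drop_low_coverage_sequences(alignment, min_fraction=0.9):
    """Keep sequences with data in at least min_fraction of the columns.

    alignment maps a sequence name to its aligned string; all strings
    are expected to have the same length.
    """
    kept = {}
    for name, seq in alignment.items():
        present = sum(1 for c in seq if c not in GAP_CHARS)
        if seq and present / len(seq) >= min_fraction:
            kept[name] = seq
    return kept


def drop_gapped_columns(alignment):
    """Remove every alignment column in which any sequence has a gap."""
    names = list(alignment)
    columns = zip(*(alignment[n] for n in names))
    kept_cols = [col for col in columns
                 if not any(c in GAP_CHARS for c in col)]
    return {n: "".join(col[i] for col in kept_cols)
            for i, n in enumerate(names)}


if __name__ == "__main__":
    toy = {"a": "ACGT-A", "b": "AC-TTA", "c": "ACGTTA"}
    print(drop_gapped_columns(toy))
    print(drop_low_coverage_sequences(toy))
```

Run on a real alignment, the second function shows how much sequence is
lost when a distant outgroup is included, since every column where the
outgroup needs a gap is removed for all sequences.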

What to do, what to do?  I've tried a few different methods and
yes, I get slightly different trees with each method.  Does the
Jukes-Cantor (J-C) distance method require gaps to be removed?  My
general feeling is that I could use any of the above options and not be
too heavily criticized.  Is this correct or dangerous?
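For reference, the Jukes-Cantor distance itself is d = -(3/4) ln(1 - 4p/3),
where p is the proportion of sites that differ between two sequences.
The sketch below is a hypothetical illustration, not DNADIST's code, and
makes no claim about how DNADIST treats gaps. It assumes one possible
treatment: skip any site where either sequence has a gap or an unknown
base, so that p is computed only over the sites both sequences share.

```python
import math

# Characters skipped as "no data" (an assumption for this sketch).
SKIP_CHARS = set("-.?N")


def jukes_cantor(seq_a, seq_b):
    """Jukes-Cantor distance over sites present in both aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    diffs = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in SKIP_CHARS or b in SKIP_CHARS:
            continue  # pairwise deletion: ignore this site for this pair
        compared += 1
        if a != b:
            diffs += 1
    if compared == 0:
        raise ValueError("no sites shared by both sequences")
    p = diffs / compared
    if p >= 0.75:
        return math.inf  # the correction is undefined at saturation
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


if __name__ == "__main__":
    print(jukes_cantor("ACGTACGT", "ACGTACGA"))
    print(jukes_cantor("ACGT--GT", "ACGTACGA"))
```

Under this treatment a sequence missing 250 bp at its 3' end simply
contributes fewer sites to each of its pairwise distances, so the other
sequences do not have to be trimmed. Whether that is acceptable for a
given data set is exactly the judgement being asked about here.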

Your opinions would be appreciated.


R.J. Forster					Bob Forster
Centre for Food and Animal Research		ac562 at FreeNet.Carleton.ca
Agriculture Canada				tel (613) 759-1725
Ottawa, On K1A 0C6				fax (613) 759-1765

More information about the Mol-evol mailing list
