Benchmark protein-sequence data sets

Fernando Gonzalez fernando.gonzalez at uv.es
Wed Feb 14 09:54:47 EST 2001



Mark Ragan wrote:

> Colleagues,
> 
> Can anyone point me to sets (matrices) of aligned protein sequences 
> for use in benchmarking software for phylogenetic inference?
> 
> I'm hoping to find something similar to the Green Plant phylogeny 
> group's 232-sequence rRNA "challenge" data set -- only with protein 
> sequences.
> 
> Something on the order of 12 to 40 protein sequences, of 100-400 aa's 
> in length each and without too many alignment gaps, would be ideal. 
> The data might be either real, or generated according to a specified 
> model.
> 
> As I'm not a subscriber to this list, please email me directly at:
> 
> m.ragan at imb.uq.edu.au
> 
> Many thanks,
> 
> Mark Ragan
> Institute for Molecular Bioscience
> The University of Queensland
> Brisbane, Qld 4072 Australia
> 
> http://www.imb.uq.edu.au/Ragan.html
> 
> 
> ---
You can find a reference database with 142 protein alignments (Thompson 
et al., NAR 27:2682, 1999) at http://www-igbmc.u-strasbg.fr/BioInfo/BAliBASE

Good luck!

-- 
Fernando Gonzalez
--
**************************************************************
Dr. Fernando Gonzalez Candelas
Instituto Cavanilles de Biodiversidad y Biologia Evolutiva
Dept. de Genetica / Serv. Bioinformatica
Universitat de Valencia       Phone: (+34) 963 983 653
Apartado de Correos 22085     FAX (+34) 963 983 670
E-46071 Valencia SPAIN        e-mail: Fernando.Gonzalez at uv.es
**************************************************************







