Benchmark protein-sequence data sets
Fernando Gonzalez
fernando.gonzalez at uv.es
Wed Feb 14 09:54:47 EST 2001
Mark Ragan wrote:
> Colleagues,
>
> Can anyone point me to sets (matrices) of aligned protein sequences
> for use in benchmarking software for phylogenetic inference?
>
> I'm hoping to find something similar to the Green Plant phylogeny
> group's 232-sequence rRNA "challenge" data set -- only with protein
> sequences.
>
> Something on the order of 12 to 40 protein sequences, of 100-400 aa's
> in length each and without too many alignment gaps, would be ideal.
> The data might be either real, or generated according to a specified
> model.
>
> As I'm not a subscriber to this list, please email me directly at:
>
> m.ragan at imb.uq.edu.au
>
> Many thanks,
>
> Mark Ragan
> Institute for Molecular Bioscience
> The University of Queensland
> Brisbane, Qld 4072 Australia
>
> http://www.imb.uq.edu.au/Ragan.html
>
>
> ---
You can find a reference database with 142 protein alignments (Thompson
et al., NAR 27:2682, 1999) at http://www-igbmc.u-strasbg.fr/BioInfo/BAliBASE
Good luck!
--
Fernando Gonzalez
--
**************************************************************
Dr. Fernando Gonzalez Candelas
Instituto Cavanilles de Biodiversidad y Biologia Evolutiva
Dept. de Genetica / Serv. Bioinformatica
Universitat de Valencia          Phone:  (+34) 963 983 653
Apartado de Correos 22085        FAX:    (+34) 963 983 670
E-46071 Valencia SPAIN           e-mail: Fernando.Gonzalez at uv.es
**************************************************************