Colleagues,
Can anyone point me to sets (matrices) of aligned protein sequences
for use in benchmarking software for phylogenetic inference?
I'm hoping to find something similar to the Green Plant phylogeny
group's 232-sequence rRNA "challenge" data set -- only with protein
sequences.
Something on the order of 12 to 40 protein sequences, each 100-400 amino
acids long and without too many alignment gaps, would be ideal.
The data may be either real or simulated under a specified model.
As I'm not a subscriber to this list, please email me directly at:
m.ragan at imb.uq.edu.au
Many thanks,
Mark Ragan
Institute for Molecular Bioscience
The University of Queensland
Brisbane, Qld 4072 Australia
http://www.imb.uq.edu.au/Ragan.html