ClustalW: number of sequences

Kevin Karplus karplus at cse.ucsc.edu
Sat Mar 4 03:19:21 EST 2000


Julie Thompson-Maaloum <julie at titus.u-strasbg.fr> wondered what my
alignments of 10,000 sequences looked like.  Well, they are generally
fairly redundant.  Although each sequence is unique (from NCBI's NR
protein database), they often have a lot in common.  I have large
alignments of immunoglobulins and of zinc fingers.  Perhaps the least
diverse alignment started with 1ce4A, which is a domain of an HIV coat
protein.  There are over 16,000 similar sequences in NR, with only tiny
differences; I think that filtering the set so that no two sequences
are more than 90% similar would reduce it to a single sequence.
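
To make the 90% filtering idea concrete, here is a minimal sketch of a
greedy redundancy filter.  It is only an illustration, not the procedure
actually used on these alignments: the function names (identity, thin),
the toy rows, and the use of percent identity over ungapped columns as a
stand-in for "similarity" are all assumptions.

```python
def identity(a, b):
    """Fraction of identical residues over columns where neither row has a gap.

    Assumes both rows come from the same alignment (equal length, '-' = gap).
    """
    if len(a) != len(b):
        raise ValueError("rows must come from the same alignment")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return sum(x == y for x, y in pairs) / len(pairs)


def thin(rows, max_identity=0.90):
    """Greedily keep a row only if it is at most max_identity to every kept row."""
    kept = []
    for row in rows:
        if all(identity(row, k) <= max_identity for k in kept):
            kept.append(row)
    return kept


# Toy aligned rows (hypothetical data): two near-copies of a 20-residue
# sequence, each differing from it at one position, plus one unrelated row.
base = "ACDEFGHIKLMNPQRSTVWY"
near1 = "GCDEFGHIKLMNPQRSTVWY"   # 19/20 identical to base
near2 = "ACDEFGHIKLMNPQRSTVWF"   # 19/20 identical to base
other = base[::-1]               # shares no column with base

print(thin([base, near1, near2]))         # only base survives
print(thin([base, near1, near2, other]))  # base and other survive
```

With a set like the 1ce4A one, where every sequence is within a few
residues of the first, a filter of this kind would keep only the first
sequence it sees.  The greedy result depends on input order, and the
all-against-kept comparison is quadratic in the worst case.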

-- 
Kevin Karplus 	karplus at cse.ucsc.edu	http://www.cse.ucsc.edu/~karplus
life member (LAB, Adventure Cycling, American Youth Hostels)
Effective Cycling Instructor #218-ck
Anything below this line is junk added by others without my approval.




