Prediction the function of Novel gene(s)!

Malay curiouser at ccmb.ap.nic.in
Fri Jun 15 06:42:24 EST 2001


Functional prediction is an art. I'd suggest you avoid Prosite; it will
bias your opinion. The most important tools are PSI-BLAST and homology
modelling, and the most important database is BLOCKS. There is no strict
protocol, and you need to be very lucky to get any meaningful information.
Briefly, this is what the successful people did:

#1. Search GenBank with PSI-BLAST, using a cutoff of 0.01, until the search
converges.
#2. Align all the sequences from the converged search with CLUSTAL.
#3. Take the most conserved sequence region and search it against the BLOCKS
database to see whether you can pick up any conserved block in your
sequence.
#4. You can even remove coiled-coil regions from your sequence and try
homology modelling with the sequence hits from BLOCKS.
#5. Try different combinations of the previous steps.
#6. Cross your fingers whenever you submit your sequence to PSI-BLAST :-)
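
As a rough illustration of step #3, here is a minimal sketch of how one
might pick the most conserved region out of an alignment before taking it
to BLOCKS. It assumes the CLUSTAL output has already been read into a list
of equal-length strings with "-" for gaps. The function names and the
scoring rule (fraction of sequences sharing the most common residue in each
column) are my own choices for illustration, not part of any of the tools
mentioned above.

```python
from collections import Counter


def column_conservation(column):
    """Fraction of sequences that share the most common residue.

    Gaps never count as the shared residue, but they do count in the
    denominator, so gappy columns score low.
    """
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0
    top_count = Counter(residues).most_common(1)[0][1]
    return top_count / len(column)


def most_conserved_window(aligned, width):
    """Return (start, end, mean_score) of the best window of given width.

    `aligned` is a list of aligned sequences of identical length.
    `end` is exclusive, so aligned[i][start:end] is the region.
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")
    if not 0 < width <= length:
        raise ValueError("width must be between 1 and the alignment length")

    scores = [
        column_conservation([seq[i] for seq in aligned])
        for i in range(length)
    ]
    # Slide the window and keep the start with the highest summed score.
    best_start = max(
        range(length - width + 1),
        key=lambda i: sum(scores[i:i + width]),
    )
    best_end = best_start + width
    mean_score = sum(scores[best_start:best_end]) / width
    return best_start, best_end, mean_score


if __name__ == "__main__":
    # Toy alignment, made up for this example.
    alignment = ["MKTAYIAKQR", "MRSAYIAKLL", "-GTAYIAKPW"]
    start, end, score = most_conserved_window(alignment, 5)
    print(start, end, round(score, 2), alignment[0][start:end])
```

The window width is something you would have to tune by hand; a real
analysis would probably also want a substitution-matrix-based score rather
than plain identity.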

All the best.

-Malay

Malay Kumar Basu
Centre for Cellular and Molecular Biology
Hyderabad 500007
I N D I A

Fax: (00-91)40-7171195
Phone: (00-91)40-7172241
-----
Peace through superior firepower.
-----
curiouser at ccmb.ap.nic.in




----- Original Message -----
From: "Frank O. Fackelmayer" <Frank at Fackelmayer.de>
To: <methods at hgmp.mrc.ac.uk>
Sent: Friday, June 15, 2001 4:03 PM
Subject: Re: Prediction the function of Novel gene(s)!


>
>
> "R. Jayakumar" wrote:
> >
> > hi..
> >    partial sequences will do fine, to find what kind of gene they are.
> > What you should do is this: first of all, make sure there are no
> > sequencing errors. Then check the ORFs in all possible frames (6
> > possibilities). If the sequence is an internal part of a gene, then you
> > should get an open ORF with no start or end. But take care to see
> > whether there are any sequencing errors, like insertion or deletion of
> > a bp, which can cause a frameshift. Take the open frame, translate the
> > protein for that frame and use a normal blastp, or a blastx (if you
> > want to use the DNA sequence), and that should pick out the gene from
> > the database. Sometimes I submit the sequence as such and do a FASTA
> > (at www.ebi.ac.uk) with it for identifying the sequence. But this is
> > not advisable because of codon degeneracy and the ATGC bias problem.
> > So it is always advisable to translate it into a protein sequence and
> > then do the BLASTing.
> >     I normally use the FRAMES tool in GCG for the ORF checking. But
> > there is other software at the www.ebi.ac.uk website, called GENEMARK
> > (but not very satisfactory), for doing this. You can also search for
> > ESTs within your sequences to check whether they are significant. You
> > can also do a motif search in the protein sequence.. you should try
> > PREDICTPROTEIN or PROTPARAM with the translated protein.. that should
> > help a lot.
> >    best of luck
> > jayakumar
>
>
>
> That will, of course, only work for known genes, and I guess the
> original poster already did it. Otherwise he wouldn't be able to say it
> is a novel gene...
> As to defining the function of a really novel gene, the first approach
> would be to do a prosite search for functional domains. Note that not all
> hits you'll get are meaningful, and for defining the function only the
> hits to functional sequences may be considered (not those for
> modifications, even though these MIGHT be helpful at a later step)! With
> luck you find reasonable homology to a known domain, e.g. a catalytic
> domain of an enzyme. When you don't - and that is not uncommon for a
> really novel gene from an organism with no or limited genomic
> information - you will have to resort to benchwork.
>
> A good approach is to use any of the methods to identify interaction
> partners of your new protein, e.g. immunoprecipitation (and
> identification of co-precipitated proteins by e.g. mass spectrometry) or
> two hybrid experiments. With luck, your protein interacts with a known
> protein, and you'll have a first hint as to the cellular pathway your
> protein might be involved in. Without luck, you still don't know
> anything after a year of hard work...
>
> Frank
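
For anyone without access to GCG, the six-frame ORF check described in
Jayakumar's quoted message can be sketched in a few lines of Python. This
is only an illustration under stated assumptions: it uses the standard
genetic code, treats any codon containing a non-ACGT character as "X", and
all function names are made up for this example. It does not detect
frameshifts from sequencing errors; it only shows which frame has the
longest stretch without a stop codon.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")
        for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Map frame names '+1'..'+3' and '-1'..'-3' to protein strings."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


def longest_open_stretch(protein):
    """Longest run of residues not interrupted by a stop ('*')."""
    return max(protein.split("*"), key=len)


def best_frame(dna):
    """Frame whose translation has the longest stop-free stretch."""
    frames = six_frame_translations(dna)
    name = max(frames, key=lambda k: len(longest_open_stretch(frames[k])))
    return name, longest_open_stretch(frames[name])


if __name__ == "__main__":
    fragment = "ATGGCCAAGTAA"  # toy sequence, made up for this example
    for name, protein in six_frame_translations(fragment).items():
        print(name, protein)
    print("best:", best_frame(fragment))
```

The stop-free stretch from the best frame is what you would then paste
into blastp; for an internal gene fragment it may have neither a start
codon nor a stop, as noted in the quoted message.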
