Predicting the function of novel gene(s)!

Frank O. Fackelmayer Frank at
Fri Jun 15 05:33:58 EST 2001

"R. Jayakumar" wrote:
> hi..
>    Partial sequences will do fine to find out what kind of gene they are.  What
> you should do is this: first of all, make sure there are no sequencing errors
> there.  Then check out the ORFs in all possible frames (6 possibilities).  If
> the sequence is an internal part of a gene, then you should get an open ORF
> with no start or end.  But take care to see whether there are any sequencing
> errors, like the insertion or deletion of a bp, which can cause a frameshift.
> Take the open frame, translate the protein for that frame, and use a normal
> blastp (or a blastx, if you want to use the DNA sequence), and that should
> pick out the gene from the database.  Sometimes, I submit the sequence as
> such and do a FASTA search with it to identify the sequence.
> But this is not advisable because of codon degeneracy and ATGC bias problems.
> So it is always advisable to translate it into a protein sequence and then
> do the BLASTing.
>     I normally use the FRAMES tool in GCG for the ORF checking.  But there
> is other software for doing this, at a website called GENEMARK (but not very
> satisfactory).  You can also search for ESTs within your
> sequences to check whether they are significant.  You can also do a
> motif search in the protein sequence.  You should try PREDICTPROTEIN or
> PROTPARAM with the translated protein; that should help a lot.
>    best of luck
> jayakumar
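
For reference, the six-frame ORF check described in the quoted message can be sketched in a few lines of Python. This is only a minimal illustration using the standard genetic code; it is not the GCG FRAMES tool, and all function names here are made up for the example. It also does nothing about sequencing errors or frameshifts, which still have to be checked by eye.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (NCBI table 1).
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate codon by codon; '*' is a stop, 'X' an unrecognised codon."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


def six_frame_translations(dna):
    """Translate all six frames; keys are '+1'..'+3' and '-1'..'-3'."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


def longest_open_stretch(protein):
    """Longest run of residues not interrupted by a stop codon."""
    return max(protein.split("*"), key=len)


def longest_open_stretches(dna):
    """Longest stop-free stretch in each of the six frames."""
    return {name: longest_open_stretch(prot)
            for name, prot in six_frame_translations(dna).items()}


if __name__ == "__main__":
    # Toy sequence, purely for illustration.
    for frame, stretch in longest_open_stretches("ATGGCCAAGTAA").items():
        print(frame, stretch)
```

For an internal gene fragment, one frame would be expected to stay open over the whole length; that translation is the one to feed into blastp.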

That will, of course, only work for known genes, and I guess the
original poster already did it. Otherwise he wouldn't be able to say it
is a novel gene...
As to defining the function of a really novel gene, the first approach
would be to do a PROSITE search for functional domains. Note that not all
hits you'll get are meaningful, and for defining the function only the
hits to functional sequences should be considered (not those for
modification sites, even though these MIGHT be helpful at a later step)! With
luck you find reasonable homology to a known domain, e.g. a catalytic
domain of an enzyme. If you don't - and that is not uncommon for a
really novel gene from an organism with no or limited genomic
information - you will have to resort to benchwork.
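
To give an idea of what a PROSITE pattern search does under the hood, here is a minimal Python sketch that converts a pattern in PROSITE syntax into a regular expression and scans a protein sequence with it. The function names and the test sequence are made up for the example, and the converter covers only the common syntax elements (x, [..], {..}, repeat counts, and the < and > anchors outside brackets). The pattern shown is the ATP/GTP-binding site motif A (P-loop). This is an illustration, not a replacement for the real PROSITE tools, which also use profiles.

```python
import re


def prosite_to_regex(pattern):
    """Convert a PROSITE-style pattern into a Python regular expression.

    Handles x (any residue), [ABC] (one of), {ABC} (none of),
    (n) and (n,m) repeat counts, and the < / > terminus anchors.
    Anchors written inside square brackets are not handled.
    """
    parts = []
    for element in pattern.rstrip(".").split("-"):
        element = element.replace("{", "[^").replace("}", "]")  # exclusions
        element = element.replace("(", "{").replace(")", "}")   # repeats
        element = element.replace("x", ".")                     # any residue
        element = element.replace("<", "^").replace(">", "$")   # termini
        parts.append(element)
    return "".join(parts)


def find_motifs(protein, pattern):
    """Return (0-based start, matched text) for non-overlapping hits."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start(), m.group()) for m in regex.finditer(protein.upper())]


if __name__ == "__main__":
    p_loop = "[AG]-x(4)-G-K-[ST]"      # ATP/GTP-binding site motif A
    toy_protein = "AAAGPPPPGKSTT"      # invented sequence for illustration
    print(prosite_to_regex(p_loop))
    print(find_motifs(toy_protein, p_loop))
```

Short patterns like this one match by chance quite often, which is one reason why not every hit is meaningful.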

A good approach is to use any of the methods to identify interaction
partners of your new protein, e.g. immunoprecipitation (and
identification of co-precipitated proteins by e.g. mass spectrometry) or
two-hybrid experiments. With luck, your protein interacts with a known
protein, and you'll have a first hint as to the cellular pathway your
protein might be involved in. Without luck, you still don't know
anything after a year of hard work...

