homologue vs orthologue?

Deitiker, Philip R via methods@net.bio.net (by pdeitik from bcm.edu)
Wed Feb 9 17:18:49 EST 2011

The first step is to BLAST the peptide sequence. This can be done through NCBI (the PubMed site) as follows:
First search the Gene or Nucleotide database for Oct1; this should get you to this page:


Next, the red bar under the diagram of the gene is the protein sequence icon.
Right-click on the red bar and select 'Properties'.
A new screen appears; scroll down to the BLAST protein sequence link. BLAST Protein: NP_012788.1
In the species box type Homo sapiens, then click BLAST.
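
The same search can also be set up without the clicking. Here is a minimal sketch, assuming the NCBI BLAST URL API (Blast.cgi with CMD=Put) and its ENTREZ_QUERY parameter to restrict hits to human. The nr database and the function name are my assumptions, and the snippet only builds the request URL; it does not submit anything.

```python
from urllib.parse import urlencode

# Address of the NCBI BLAST URL API (assumed; check the current NCBI docs).
BLAST_URL = "https://blast.ncbi.nlm.nih.gov/Blast.cgi"


def build_blastp_url(accession, organism, database="nr"):
    """Build a blastp submission URL restricted to one organism.

    build_blastp_url is a hypothetical helper name, not part of any library.
    """
    params = {
        "CMD": "Put",                             # submit a new search
        "PROGRAM": "blastp",                      # protein vs protein
        "DATABASE": database,                     # assumption: nr
        "QUERY": accession,                       # e.g. the yeast Oct1 protein
        "ENTREZ_QUERY": f"{organism}[Organism]",  # species limit
    }
    return f"{BLAST_URL}?{urlencode(params)}"


url = build_blastp_url("NP_012788.1", "Homo sapiens")
print(url)
```

Pasting the printed URL into a browser should behave like typing the species into the box by hand, if the assumptions above hold.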

That gives a BLAST result; the closest homolog is the MIP CRA_b isoform.

You can also use the MIP CRA_b sequence to search the human database for other similar proteins. 

To do a domain search you need to know where the domain boundaries are or have some 

This page shows the known components:

Use the same method: click on Properties (say on the active site), select BLAST proteins, then select Homo sapiens and search. (The other way is to wait until a pop-up menu appears; there is a small 'i' in the top right corner. Carefully move the mouse over it and click, and it will take you to the 'links' screen.)

This gives you other proteins like
neurolysin (metallopeptidase M3 family)
choline O-acetyltransferase


For areas not listed in the graphic, for example the region in Dcc but not in M3-MIP, you simply BLAST the protein but only include residues 58-285 in the run. This gives you proteins in humans that have homology to the leader sequence but not the metalloproteinase.
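
If you would rather cut the region out yourself and paste it in as the query, here is a minimal sketch. It assumes the 58-285 limits are counted the way BLAST counts them (1-based, both ends included); the sequence is a made-up placeholder, not the real protein, and subrange is a hypothetical helper name.

```python
def subrange(sequence, start, end):
    """Return residues start..end, counted 1-based and inclusive."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError(f"range {start}-{end} is outside 1-{len(sequence)}")
    # Python slices are 0-based and end-exclusive, hence start - 1.
    return sequence[start - 1:end]


# Placeholder 300-residue sequence; substitute the real protein sequence.
protein = "ACDEFGHIKLMNPQRSTVWY" * 15

leader = subrange(protein, 58, 285)
print(len(leader), leader[:10])
```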

This gives hits on human MIPs, HEAT motif proteins and interferon-induced protein.

It helps to have a good 3D model of the domains, and the domain limits, to refine the search. For better searches use the more conserved regions of the domain; this gets rid of random hits. To limit a search to domains, remove the linker sequence that joins them, generally 5 to 20 amino acids. In addition one can use BLAST to search for short peptide homologies. This frequently finds similarities that are not otherwise apparent and works well with short peptide sequences of 6 to 20 amino acids, particularly well-conserved regions.
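
Both tricks (dropping the linkers, and cutting a conserved region into short peptides) are easy to script. This is a rough sketch only: the domain boundaries and the sequence are invented for illustration, and split_domains and short_peptides are hypothetical helper names.

```python
def split_domains(sequence, boundaries):
    """Return one query per domain, leaving out the linkers in between.

    boundaries is a list of (start, end) pairs, 1-based and inclusive.
    """
    return [sequence[start - 1:end] for start, end in boundaries]


def short_peptides(region, length, step=1):
    """Slide a window of `length` residues along a region."""
    last_start = len(region) - length
    return [region[i:i + length] for i in range(0, last_start + 1, step)]


# Placeholder 100-residue sequence; substitute the real protein sequence.
protein = "ACDEFGHIKLMNPQRSTVWY" * 5

# Hypothetical limits: residues 41-50 are treated as a 10-residue linker.
boundaries = [(1, 40), (51, 100)]

domains = split_domains(protein, boundaries)
peptides = short_peptides(domains[0], length=12, step=6)

for peptide in peptides:
    print(peptide)
```

Each domain, or each short peptide, then goes into its own BLAST run.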

Here is an example. Within the active site there is a zinc-binding region of 20 amino acids. When blasting this, go to the bottom, select the algorithm parameters and make sure the option to optimize for short peptide sequences is checked. That's it; then click BLAST. A large number of metalloproteinases show up, like septin-9, MAPK, metastasis suppressor 1 isoform CRA_a.
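
To pull such a 20-residue query out of a longer sequence, here is a minimal sketch. It assumes the zinc-binding region contains the HEXXH signature (His-Glu-X-X-His) found in many zinc metallopeptidases; the toy sequence is invented, and motif_windows is a hypothetical helper name.

```python
import re

# HEXXH: His, Glu, any two residues, His (assumed zinc-binding signature).
ZINC_MOTIF = re.compile(r"HE..H")


def motif_windows(sequence, width=20):
    """Return (1-based start, peptide) windows centred on each HEXXH match."""
    windows = []
    for match in ZINC_MOTIF.finditer(sequence):
        centre = (match.start() + match.end()) // 2
        # Keep the window inside the sequence at both ends.
        start = max(0, min(centre - width // 2, len(sequence) - width))
        windows.append((start + 1, sequence[start:start + width]))
    return windows


# Invented toy sequence with a single HEFGH motif; not a real protein.
toy = "MSTLAKQGVDAFYTLFHEFGHALHSMLTRTKYQHVSG"

windows = motif_windows(toy)
for start, peptide in windows:
    print(start, peptide)
```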

In this way you can search the entire protein, both within and between domains, if that's what you want. 

The better choice, however, is to look at the evolution of the human homologues in mammals, vertebrates, and animals to see how the protein evolved. Aspects of the protein may have sources other than the Oct1/hMIP last common ancestor; pieces may appear homologous when they are only piggybacking on the general homology between the two proteins. To rule this out, reconstruct the evolution of the homologues so that one can better surmise which pieces have been added to either branch during evolution.

This is a start:

There are many other resources available, too. One can go to the SNP map and look for human variants. In addition there is a HapMap available that can detail linkage in the gene region. There is currently the 1000 Genomes Project, and there is growing detail about chromosomal structure around genes and variants.


In terms of paralogues one has to remember everything is relative, and of interest is what is relatively closest to the yeast protein. Consequently, it helps to know what other homologous proteins exist in yeast. Once a set of orthologues in human is uncovered, one then needs to compare these (or their most similar domains) to the similar instances in yeast; using any of several cladistic programs you should be able to describe a gene tree.
Since yeast are not ancestors of humans, one then has to create a branched tree diagram going back to a common ancestor. If it is done correctly there should be parallel branches from that ancestor to both yeast and human. One of the branches should be more related to yeast and may split into several subbranches on either side, so the tree will need to be pruned to the probable and possible proteins related to each other through this last common ancestor.
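
To give a feel for what such a gene tree looks like, here is a toy sketch. It is not a cladistic program: it uses UPGMA, a simple distance-based clustering, as a stand-in, and it expects sequences that are already aligned. The four names and sequences are invented for illustration.

```python
from itertools import combinations


def p_distance(a, b):
    """Fraction of differing residues between two aligned sequences."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    return sum(x != y for x, y in pairs) / len(pairs)


def upgma(aligned):
    """Cluster aligned sequences by UPGMA; return the topology as Newick."""
    sizes = {name: 1 for name in aligned}
    dist = {
        frozenset(pair): p_distance(aligned[pair[0]], aligned[pair[1]])
        for pair in combinations(aligned, 2)
    }
    while len(sizes) > 1:
        closest = min(dist, key=dist.get)  # the two nearest clusters
        a, b = sorted(closest)
        merged = f"({a},{b})"
        size_a, size_b = sizes.pop(a), sizes.pop(b)
        del dist[closest]
        for other in sizes:
            d_a = dist.pop(frozenset((a, other)))
            d_b = dist.pop(frozenset((b, other)))
            # Size-weighted average distance to the merged cluster.
            weighted = (size_a * d_a + size_b * d_b) / (size_a + size_b)
            dist[frozenset((merged, other))] = weighted
        sizes[merged] = size_a + size_b
    return next(iter(sizes)) + ";"


# Invented, pre-aligned toy sequences: two yeast and two human proteins.
aligned = {
    "yeast_A": "MKTAYIAKQR",
    "human_A": "MKTAYIAKQL",
    "yeast_B": "MRSGWVPKDR",
    "human_B": "MRSGWVPRDL",
}

tree = upgma(aligned)
print(tree)
```

In this toy case each yeast protein pairs with one human protein (the orthologue-like pattern), while the A and B groups split deeper in the tree (the paralogue-like pattern). Real data needs a proper alignment and a proper tree-building program.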

Oct1p, YKL134Cp-like, Prd1p, proteinase yscD, YCL057W, Ylr224wp, YL224_YEAST, and TPA:F-box protein are proteins found in Yeast that exhibit similarities to human M3-MIP. 

There is no possible way to go through all the online resources that link to individual PubMed entries. Just be aware that there are many: some I know of, and a whole bunch that I am not savvy enough to use, am unaware of, or simply don't have the time to tinker with. One useful feature, however, is that BLAST can present a cladogram for a given search. Design the search sequence and your search limits (species to include) carefully and this can be very useful (albeit with imperfections). Enough said, go play.

-----Original Message-----
From: methods-bounces from oat.bio.indiana.edu [mailto:methods-bounces from oat.bio.indiana.edu] On Behalf Of Saima Muhammad
Sent: Wednesday, February 09, 2011 1:45 PM
To: methods from magpie.bio.indiana.edu
Subject: homologue vs orthologue?

how can we find the orthologue and paralogue of oct1 protein in human and 
compare them at amino acid level and domain level
