[Protein-analysis] deglycosylated protein amino acid sequence
(by MConnelly from lab901.com)
Mon May 17 16:03:36 EST 2010
Welcome to the Dark Arts of Proteins!
(And secondly, sorry if any of the information below is stuff you either already know, or is just plain unhelpful!)
Firstly, a good place to go to work with sequence data is Expasy.ch
Secondly, the Uniprot ID for your protein (Q0IKU9) will be more useful on Expasy than the GenBank accession number AY423545.
Thirdly, at first glance it seems the amino acid sequence you are using might not be representative of the protein sequence in vivo. Looking at the Uniprot entry page at http://www.uniprot.org/uniprot/Q0IKU9.html there are two things which look suspicious. Firstly, there is no region annotated as CHAIN (this would mark the actual chain found in vivo, and is not something annotated by computer). Secondly, the first amino acid in the sequence is a methionine. The methionine encoded by the start codon often doesn't make it into the final polypeptide, due to post-translational cleavage.
Also, the sequence only codes for a single polypeptide chain, which is not consistent with producing multiple bands on SDS-PAGE when you treat with a reducing agent.
Some explanations for the multiple bands could be:
1) the disulphide bonds are all intramolecular, and post translational cleavage is turning one chain into several (see this page on insulin secretion to see an example of what I mean http://www.vivo.colostate.edu/hbooks/pathphys/endocrine/pancreas/insulin.html)
2) the disulphide bonds are inter-molecular, and there is another coding sequence at work
3) there are some impurities, but that would likely give multiple bands in a non-reducing gel.
Post-translational cleavage doesn't seem likely at first glance: the total amino acid sequence for your protein only comes to about 170 kDa in weight, but the total mass of your bands is around 260 kDa. (Try using the ProtParam tool on Expasy to calculate the weight of your protein Q0IKU9; it is a good example of one of the tools available there.)
However, some other things could be going on to explain the mass difference:
1) Options 2 or 3 above.
2) Glycosylation weirdness: Your protein (thanks to cleavages) could be a lot smaller than 170 kDa, but the subunits will give inaccurate molecular weights due to the presence of glycans. Glycans do not bind SDS, so the mass/charge ratio of a glycosylated subunit goes out of whack and it looks heavier on the gel than it really is. Try treating everything with PNGase F first (NEB sell a nice kit). On deglycosylation, some of the bands should move to a lighter position. This will not only help identify glycosylated bands, but also help with getting more accurate weights.
N.B. Also, knowing the non reducing band weight will help rule out a lot of options in both lists.
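If it helps to guess in advance which subunits might shift after PNGase F, here is a rough Python sketch. It relies on something not stated above: PNGase F removes N-linked glycans, and those are normally attached at an Asn in an N-X-S/T motif (X being anything except proline). The function name and the toy sequence are made up for illustration, and a motif only marks a possible site, not a confirmed one.

```python
import re

# N-X-S/T motif where X is any residue except proline.
# The lookahead lets overlapping motifs (e.g. "NNTS") both be reported.
SEQUON = re.compile(r"N(?=[^P][ST])")

def find_sequons(sequence):
    """Return 1-based positions of Asn residues that sit in an N-X-S/T motif."""
    return [m.start() + 1 for m in SEQUON.finditer(sequence.upper())]

# Toy sequence for illustration only (not the real Q0IKU9 sequence)
toy = "MNGSANPTQNNTS"
print(find_sequons(toy))
```

Paste in the real sequence from the Uniprot entry in place of the toy one. Bands from chains with no candidate sites at all would not be expected to move on PNGase F treatment.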
The Interpro annotations (such as the Thioether region) are computer annotations, and might not be correct. You can get a better idea of what you should expect in your protein by looking at the annotations for some similar ones which have been annotated by hand.
If you want to look at similar proteins, try using the BLAST tool on Expasy. This gave me lots of possible proteins which match:
B8R3M2 _IXORI Alpha-2-macroglobiln splicing variant 1 precurso... 642 0.0
Q8IT76 _ORNMO Alpha-2-macroglobulin splice variant 1 precursor... 617 e-174
O01717 _9CHEL Alpha-2-macroglobulin [Limulus sp] 607 e-171
A3QX15 _LITVA Alpha 2 macroglobulin [Litopenaeus vannamei (Whi... 557 e-156
Q60486 _CAVPO Alpha-macroglobulin precursor [Cavia porcellus (... 513 e-143
A0T1M1 _MACRS Alpha-2-macroglobulin [Macrobrachium rosenbergii... 498 e-138
D3YW52 _MOUSE Putative uncharacterized protein Pzp [Pzp] [Mus ... 489 e-135
Q61838 A2M_MOUSE Alpha-2-macroglobulin precursor (Alpha-2-M) ... 488 e-135
Q641C5 _XENLA MGC82112 protein [MGC82112] [Xenopus laevis (Afr... 475 e-131
Q76DK1 _PENJP Alpha2-macroglobulin homolog [Penaeus japonicus ... 468 e-129
A0T1M0 _LITVA Alpha-2-macroglobulin [Litopenaeus vannamei (Whi... 448 e-123
Q6TL26 _9BIVA Alpha macroglobulin [Chlamys farreri] 443 e-122
B5ACH4 _9BIVA Alpha2-macroglobulin [Cristaria plicata] 439 e-120
A0A1G5 _9BIVA Alpha-2-macroglobulin [A2M] [Hyriopsis cumingii] 434 e-119
You can then perform another analysis immediately after the BLAST search using an algorithm called CLUSTALW, which will show you exactly how these other A2 macroglobulins compare to yours by lining the similar regions up together. If these have coding regions, glycosylation sites etc. annotated in their Uniprot entries, it can help give you a good idea of what is consistent within the family of proteins (an estimate of molecular weight, for example).
However, nothing is gospel, but it can help you get a feel for the kind of protein you are working on.
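To put a number on how similar two sequences are once they have been lined up, a percent identity is the usual quick measure. Below is a minimal Python sketch; it assumes you already have two aligned rows of equal length with "-" for gaps (as an alignment tool would output), and the function name and toy rows are hypothetical. It is not how CLUSTALW itself scores anything, just a simple count.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of identical residues over columns where neither row has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0

# Toy aligned rows for illustration only
print(percent_identity("MKT-AY", "MRTLAY"))
```

Note that different tools define the denominator differently (some count gap columns), so treat the figure as a rough guide only.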
Try looking at this one http://www.uniprot.org/uniprot/Q8IT76 - it shows that there are 2 chains resulting from a single initial gene (just like insulin). Try using ProtParam on each chain region and see whether the molecular weights are a reasonable match for your bands.
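If you would rather do the per-chain weight calculation yourself, here is a small Python sketch of the same kind of sum ProtParam does: add up average residue masses and add one water. The mass table holds standard approximate values, the function name is my own, and the toy sequence and chain boundaries are invented for illustration. Take the real boundaries from the CHAIN annotations in the Uniprot entry. It ignores glycans and any other modifications.

```python
# Approximate average residue masses in daltons (amino acid minus one water)
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # one water per polypeptide chain (free N- and C-termini)

def chain_mass(sequence, start=1, end=None):
    """Average mass in Da of residues start..end (1-based, inclusive)."""
    if end is None:
        end = len(sequence)
    segment = sequence[start - 1:end].upper()
    # A KeyError here means the sequence contains a non-standard letter
    return sum(RESIDUE_MASS[aa] for aa in segment) + WATER

# Toy sequence and hypothetical chain boundaries, for illustration only
toy = "MKTAYIAKQRQISFVKSHFSRQ"
chains = {"chain 1": (2, 10), "chain 2": (11, 22)}
for name, (first, last) in chains.items():
    print(f"{name}: {chain_mass(toy, first, last) / 1000:.2f} kDa")
```

Calling chain_mass on the whole sequence with no boundaries gives the uncleaved weight, which is the figure to compare against the roughly 170 kDa mentioned above.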
I hope some of this helps. Email me if you have any other questions or queries and I would be glad to help further if possible.