LINEST function in MSEXCEL, explained

Deitiker, Philip R via methods%40net.bio.net (by pdeitik from bcm.edu)
Mon Mar 21 15:29:32 EST 2011


Most assays have a linear range, I have seen instances where an individual could not derive a linear protein versus color response, and it is true that at very high levels of protein such as with coomassie blue that you get saturation effects. I've seen this at very low concentrations of thick filaments from C.elegans because the molecular weight is at 10E9 to 10E10. However for globulins and globins, and most non-aggregated proteins the response will be linear over a two to three magnitude range. 

The individual who could not derive a linear range failed because he used three pipettors to develop his assay. The first, the p20 failed because he was recycling tips when pipetting proteinaceous solution. The second pipette he loaned to a colleague, who we learned after the fact was trying to use a p200 to pipette 250 ul, this supershot the preset calibration (can only be done with models made after 1990) and as a consequence was delivering 40 uls less that desired. The third pipette, a p1000 failed because it was leaking. Once the two  pipetters were repaired he got a standard curve that was almost perfectly linear. P20 and P200 can be used repeated pipetting of protein solutions (for example if one is making triplicates) when the tip is equilibrated by plunging back and forth in the reverse mode. This best example I can give is pippeting mixtures of I-125 protein A in BSA. In the forward mode the counts are often 10 to 15% below expected and those found in the reverse mode, drying the tip reveals the same amount of radioactivity remaining on the sides of the tip. The reverse mode is found on the packaging equipment supplied with a pippeter. For protein assays I recommend triplicates and using the reverse mode since the surface area to volume for each tip changes with volume.  

Certain assays often do not give linear responses, and one needs multiple points, the ELISA development assays are frequently not linear. 

In terms of the assay. The protein concentration is on the X-axis and the OD is on the Y axis, however you will determine a X unknown using Y so it is convenient to invert the linear regression. The slope of the line can be derived using the LINEST function in MSEXCEL and this also provides one with regression statistics. The correlation coefficient should be close to 1. Zar (Biostatistical analysis, 1997) recommends not averaging values of equal X component, but instead enter each data point unchanged. Protein determination of unknowns should not be extrapolated beyond the assayed points, and the estimates of line  Y = mx + b is most accurate at the medium point of the assay (close to the average X). This means that a unknown should be diluted several ways, and the Y closest to the Y value derived from the medium X. 

LINEST([Y-array], [X-array], TRUE, TRUE) after entering the mouse down on it and select 2 columns and 5 rows (including the formula), [F2], [control-shift-enter] and the statistics will show up. If you are really good at pipetting replace 2nd TRUE with FALSE, since Y_sub_X=0 = 0).Note the Y array and X-array _should_ be in adjacent columns with each X matched to each Y value correctly. To do otherwise one may get a result but that result will have no meaning.(I know they do not have to be exactly in adjacent columns, but if you are unfamiliar with Linest best to have parallel columns)

To determine Y based on X use LINEST([protein concentration],[OD values],TRUE,TRUE). To derive X then X = X intercept (second column top) + slope(fist column top)*OD. Protein concentration is therefore = X * fold dilution. 

Here are the values of the linest function (That you care for a protein standard):
- Columns - -  1  - - - 2
Rows 
 1 - - - - Slope - - - Y-Int (X-int if inverting the data arrays)
 2 -SD of: Slope - - - Y-Int ("")
 3 - - - - Cor.Coef. - Residual mean variance       
 4 - - - - F stat - -  Residual degress of freedom
 5 Variance:Regression - Residual

The probability of regression = FDIST(Row4Col1, 1, Row4,Col2)

The values of interest are R1C1, R1C2 and R3C1 relative to the equation cell (R1C1)

If you have an equation that requires 2nd order fit, you can use also excel, it's a bit trickier, but the rule is to minimize the sum of deviation square (DEVSQ function in MSEXCEL) to determine the best fit. Again, I don't recommend this, because the deviancy is generally at the high end and is a consequence of saturation effects. I do use this 2nd order solving with my amino acid composition analysis because some amino acids are often a magnitude more abundant that others and so it is hard to center them. But to get a good formula one needs at least 4 points, preferably 5 clean data points. If your only analyzing doing a single sample, dilute your protein to best approximate the mean Y value. 



-----Original Message-----
From: methods-bounces from oat.bio.indiana.edu [mailto:methods-bounces from oat.bio.indiana.edu] On Behalf Of Jayakumar, R
Sent: Monday, March 21, 2011 10:42 AM
To: WS; methods from magpie.bio.indiana.edu
Subject: RE: Can anyone please tell about slope function

Hmmmm!! the protein concentrations are on the X-axis and OD on the Yaxis.  Always dependent variables on the Y-axis. 
Jay


________________________________________
From: methods-bounces from oat.bio.indiana.edu [methods-bounces from oat.bio.indiana.edu] On Behalf Of WS [novalidaddress from nurfuerspam.de]
Sent: Monday, March 21, 2011 8:19 AM
To: methods from magpie.bio.indiana.edu
Subject: Re: Can anyone please tell about slope function

Dear Sudheer,

the slope is the "m" in your equation. In real life, colorimetric
protein measurement standard curves sometimes get a bit far from
linear, so in times of computerized curve fits by eg excel,
sigma{plot, stat} or openoffice.org, you might be better off with a
2nd order fit (y=ax2 + bx + c); simply plot the OD readings on the x-
axis and the protein standard concentrations on the y-axis, then
you'll get directly the formula to calculate actual concentrations
from the OD values when performing a regression analysis.

By having a look at the R value you may compare various regression
methods (like linear vs 2nd order) and determine which one fits the
situation better.

BTW, the linear approximation is just an approximation which fits most
needs and is easily done when you have just a ruler and a sheet of
paper (when one wouldn't want to apply for mainframe computing time
just for a protein determination :). The chemistry that converts
protein into color is not really linear in respect of dose/response.

Have fun!

Wo
_______________________________________________
Methods mailing list
Methods from net.bio.net
http://www.bio.net/biomail/listinfo/methods

This email message may contain legally privileged and/or confidential information.  If you are not the intended recipient(s), or the employee or agent responsible for the delivery of this message to the intended recipient(s), you are hereby notified that any disclosure, copying, distribution, or use of this email message is prohibited.  If you have received this message in error, please notify the sender immediately by e-mail and delete this email message from your computer. Thank you.

_______________________________________________
Methods mailing list
Methods from net.bio.net
http://www.bio.net/biomail/listinfo/methods



More information about the Methods mailing list