Understanding statistical analysis in PET paper

NMF nm_fournier at ns.sympatico.ca
Sun Feb 29 16:23:32 EST 2004


Dear TonyJeffs,

> Here's some
> left Fusiform  x -34 y -72 z -12  zScore 3.69
>
> I get that the first three are spatial coordinates, but what's a
> zScore?

You are right.  The first three values are Talairach stereotaxic spatial
coordinates.

You need a basic understanding of what z-scores are before we proceed.  I
suggest you read an introductory statistics textbook.

Z-scores are transformations of raw scores that specify each score's amount
and direction of deviation from the mean, measured in standard deviation
units: z = (score - mean) / standard deviation.  In other words, the z-score
is a descriptive statistic that represents the distance of an observed score
above or below the mean.  (So you need some basic understanding of what the
standard deviation and the mean are.)  We can use the z-score for a variety
of different things.  The most common textbook examples involve grades in a
class or on a test.  If the grades are approximately normally distributed,
converting the raw scores to z-scores allows you to determine, using the
area under the normal curve, how many students had scores higher than yours
(whatever that may be) and how many had scores below yours.  This is the
preamble for using z-scores to determine percentile ranks, etc.
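
To make the grades example concrete, here is a minimal Python sketch.  The
class scores are made up for illustration, and the percentile step assumes
the scores are roughly normally distributed.

```python
from statistics import NormalDist, mean, pstdev

# Hypothetical test scores for a small class (made up for illustration).
scores = [52, 61, 64, 70, 73, 75, 78, 82, 88, 97]

mu = mean(scores)       # class mean
sigma = pstdev(scores)  # standard deviation, treating the class as the whole population


def z_score(x):
    """Distance of x from the class mean, in standard deviation units."""
    return (x - mu) / sigma


my_score = 82
z = z_score(my_score)

# Area under the standard normal curve to the left of z, as a percentage.
# This is only a fair estimate of the percentile rank if the scores are
# approximately normal.
percentile = NormalDist().cdf(z) * 100

print(f"z = {z:.2f}, approximate percentile rank = {percentile:.0f}")
```

A positive z means the score is above the class mean, a negative z means it
is below.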

Many people believe, quite incorrectly, that transforming any distribution
to a distribution of z-scores will make the shape of the distribution normal
(i.e. bell shaped).  This is just plain wrong.  If the shape of the
distribution is not normal to begin with (e.g. highly skewed) then the
distribution will not be normal after z-score transformation.  So using the
z-score is a way of expressing the relative distance of an observed score
from the mean, and it allows for relative ease of interpretation when
evaluating the data.  For example, converting a sample into z-scores can
make it easier to detect scores that are influential points (i.e. outliers)
in the distribution.  Generally, when we work with data for parametric
statistical analyses we want the data to be approximately normally
distributed (not skewed by influential points) before we proceed.
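
You can check this for yourself.  The sketch below uses made-up, strongly
right-skewed numbers and a simple skewness measure of my own choosing (the
mean cubed z-score).  After the z-transformation the mean becomes 0 and the
standard deviation becomes 1, but the skewness does not change.

```python
from statistics import mean, pstdev


def skewness(values):
    """Population skewness: the mean cubed z-score (0 for a symmetric distribution)."""
    m, s = mean(values), pstdev(values)
    return mean([((v - m) / s) ** 3 for v in values])


# Hypothetical, strongly right-skewed raw scores (made up for illustration).
raw = [1, 1, 2, 2, 2, 3, 3, 4, 6, 15, 40]

m, s = mean(raw), pstdev(raw)
z = [(v - m) / s for v in raw]  # z-transformed scores

print(f"raw: mean = {m:.2f}, sd = {s:.2f}, skewness = {skewness(raw):.3f}")
print(f"z:   mean = {mean(z):.2f}, sd = {pstdev(z):.2f}, skewness = {skewness(z):.3f}")
```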

When you transform the scores to z-scores you are converting a distribution
with any mean and standard deviation to a distribution that has a mean of 0
and a standard deviation of 1.  In the paper that you cited, the authors
normalized each individual's metabolic score based upon a global or
population mean (50 ml/min/100 g) value.  So they are able to compare how
far each individual raw score under the various conditions lies from what
the mean of the sampling distribution was calculated to be.  (There are of
course other methods to normalize data than this approach.)  At the
beginning of the paper they made some assumptions regarding the nature of
their data.  First was that, "Any z scores greater than 3.29 (p<0.001,
uncorrected, two-tailed) were considered to meet statistical significance,
except in the case of the lingual or fusiform gyri..... In this case, we set
the threshold at 3.09 (p<0.001, uncorrected, one-tailed)."  What does this
mean?  Well, in the first situation they did not know in advance what the
direction of the change in the measured or calculated values for each brain
area would be following the treatment.  For example, oxygen utilization, say
in the amygdala, could be either elevated or decreased in certain
situations, hence the hypothesis test is a two-tailed test.  Setting the
z-score threshold to 3.29 is a statement about the critical z value: it is
the z-score that cuts off a total of 0.1% of the normal curve, split between
the two tails.  The values they obtain must exceed this value in absolute
magnitude to be called significant.  When they do, you can conclude that
there is a difference in oxygen utilization under the treatment compared to
the normalized mean.  They also set the threshold at 3.09 (in a one-tailed
direction) for the lingual or fusiform gyrus, based upon previous literature
showing an involvement of these structures in hypnosis.  (One-tailed means
that the authors expect the difference to lie in one direction on the bell
shaped curve, i.e. that the mean metabolic activity of this structure would
be significantly elevated.  With the whole 0.1% placed in a single tail, the
cutoff drops from 3.29 to 3.09.)
Generally in metabolic imaging studies one sets the alpha level (the
threshold that p must fall below) to a rather low number.  This is basically
because the authors want to ensure that the activation of a brain region is
tied to the experimental manipulation and not due to, say, local changes in
blood flow or in oxygen and glucose utilization that are independent of, and
not influenced by, the experimental parameters.  Having alpha set to, say,
.05 may lead to spurious results in this case.  Having said that, the
problem with using such low alpha values is that it is harder to obtain a
statistically significant difference.  Typically, when one assesses
statistical differences in parametric analyses an alpha of 0.05 or 0.01 is
set.  In terms of z-scores, an alpha of 0.05 means that any z-score greater
than 1.96 (or less than -1.96, for a two-tailed test) would be considered to
meet statistical significance; for an alpha of 0.01 the cutoff is 2.58.  In
the example you cited, they found a z-score of 3.69 for the left fusiform
gyrus, so it is likely that regional blood flow within this structure is
statistically different from the normalized global mean.  In other words, it
is most likely that there is greater metabolic activity and activation in
the left fusiform gyrus under the hypnotizable condition than in the control
or non-hypnotizable condition.
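
Going the other way, you can turn the observed z-score of 3.69 into a p
value and compare it against several alpha levels.  This is only a sketch
based on the standard normal curve; p_from_z is a name I made up.

```python
from statistics import NormalDist

std_normal = NormalDist()


def p_from_z(z, two_tailed=True):
    """Probability of a z at least this extreme if the null hypothesis is true."""
    upper_tail = 1 - std_normal.cdf(abs(z))
    return 2 * upper_tail if two_tailed else upper_tail


z_observed = 3.69  # left fusiform gyrus, from the quoted result
p_one = p_from_z(z_observed, two_tailed=False)
p_two = p_from_z(z_observed, two_tailed=True)

print(f"one-tailed p = {p_one:.5f}, two-tailed p = {p_two:.5f}")
for alpha in (0.05, 0.01, 0.001):
    print(f"alpha = {alpha}: significant (two-tailed) = {p_two < alpha}")
```

Either way the p value falls below 0.001, which is why 3.69 passes both
thresholds the authors set.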

> Then there's
> F=8.69 dF=1,7 p=.02
>
> I understand that p= in >= 2% of cases the result is likely to be
> true,
> but what's F & dF


No.  First, p and alpha are not the same thing: alpha is the cutoff chosen
before the analysis, and p is the value calculated from the data.  The p
value states that, if the null hypothesis (no differences) were true, there
would be only a 2% chance of observing a difference at least as large as the
one that was found.  It does not mean that the result is true in 98% of
cases, nor that repeating the measurement 100 times would give the same
result 98 times.  This is tied to type I and type II errors in parametric
statistics.  A type I error is when you reject the null hypothesis when in
fact it is true.  A type II error is when you fail to reject the null
hypothesis when it should be rejected (i.e. a real difference exists but is
not detected).  When you alter the value of the alpha level, you change the
risk of committing each type of error.  Thus, if you protect against a type
II error (by raising alpha) the probability of making a type I error
increases, and vice versa.  (Note:  Some have argued, and I tend to agree
with this statement, that it is actually worse to make a type II error than
a type I error.  Type I errors are easily caught and dismissed when
additional studies fail to replicate the results.  However, if you fail to
reject a null hypothesis when it is in fact wrong (i.e. there is a real
difference between groups) many years may pass before such findings are
"rediscovered".)

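The trade-off between the two types of error can be seen in a small
simulation.  This is a sketch under assumptions of my own: normally
distributed scores with a known standard deviation of 1, 8 scores per
experiment, and a true effect of 1 standard deviation when the null
hypothesis is false.  None of these numbers come from the paper.

```python
import random
from statistics import NormalDist


def rejection_rate(true_mean, alpha, n=8, trials=5000, seed=0):
    """Fraction of simulated experiments where a two-tailed z-test rejects H0: mean = 0."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, 1) for _ in range(n)]
        z = (sum(sample) / n) * n ** 0.5  # (sample mean - 0) / (1 / sqrt(n))
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials


results = {}
for alpha in (0.05, 0.001):
    type_1 = rejection_rate(true_mean=0.0, alpha=alpha)      # H0 true, rejected anyway
    type_2 = 1 - rejection_rate(true_mean=1.0, alpha=alpha)  # H0 false, not rejected
    results[alpha] = (type_1, type_2)
    print(f"alpha = {alpha}: type I rate = {type_1:.3f}, type II rate = {type_2:.3f}")
```

Lowering alpha makes type I errors rarer but type II errors more common.
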
The F value and df (degrees of freedom) are basically the statistical report
of the results from the analysis of variance (ANOVA).  Many people believe
they understand what degrees of freedom means in statistical research.
Unfortunately, more often than not, these individuals really have no idea
what they are talking about.

The degrees of freedom are the number of values that are free to vary after
specific restrictions have been placed on the data.  In other words, the
degrees of freedom represent the number of independent pieces of data that
are allowed to vary.  That is, we have a certain number of independent
pieces of data that can be freely changed if the mean is to remain constant.
For example, say you have three numbers 6, 8, and 10.  The mean is 8.  You
can change the numbers, but the mean must remain 8.  If you change the 6 to
7 and the 10 to 13, the remaining number must be 4 in order for the mean to
equal 8.  If you had 50 numbers and were given the same restriction then you
would be free to vary only 49 of them; the 50th would already be
predetermined.  Since it is predetermined you lose a degree of freedom for
statistical calculation.  This really has to do with having unbiased vs.
biased estimates of statistical parameters, which is important when
calculating estimates of population parameters (i.e. population variance and
mean).
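
The 6, 8, 10 example can be written out in a few lines of Python.  The
function name forced_last_value is my own; it just shows that once the mean
is fixed, the last number is no longer free.

```python
def forced_last_value(free_values, required_mean):
    """With the mean fixed, the one remaining value is fully determined."""
    n = len(free_values) + 1  # total count, including the value that is not free
    return required_mean * n - sum(free_values)


# Change the 6 to 7 and the 10 to 13; the mean must stay at 8.
last = forced_last_value([7, 13], required_mean=8)
print(last)  # 4

# Three numbers, one restriction (the mean): two degrees of freedom.
degrees_of_freedom = 3 - 1
print(degrees_of_freedom)  # 2
```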

In this case, there are two values reported, which is the basic method in
APA style.  The first value is the degrees of freedom for the effect being
tested (between groups or conditions, the numerator of the F ratio) and the
second is the degrees of freedom for the error term (within groups, the
denominator of the F ratio).  When we make calculations on a data set there
are certain parameters that must be estimated.  For example, in the case of
calculating the variance of your sample, s^2, the population mean is not
known and must be estimated from the sample mean.  Once you have estimated
the population mean you have fixed it for purposes of estimating
variability.  Thus you lose 1 degree of freedom, i.e. you divide by N-1
rather than N.  (Refer to the formula for estimating variances if you are
confused.)  All the df's tell us is the number of independent pieces of
data.
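
Here is the N versus N-1 distinction as a short Python sketch, using the
same three numbers as before.  The variable names are mine.

```python
from statistics import mean, pvariance, variance

sample = [6, 8, 10]
n = len(sample)
m = mean(sample)  # sample mean, used as the estimate of the population mean

# Sum of squared deviations from the sample mean.
ss = sum((x - m) ** 2 for x in sample)

biased = ss / n          # divides by N: tends to underestimate the population variance
unbiased = ss / (n - 1)  # divides by N - 1: one df was spent estimating the mean

print(biased, unbiased)

# The standard library exposes both versions.
print(pvariance(sample), variance(sample))
```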

The F statistic is basically the value you calculate to decide whether the
null hypothesis (that there are no differences between the means, i.e. all
treatment means are equal) can be rejected.  The calculation is irrelevant
here.  What is important is that when you perform the analysis this value
is compared to an F(critical) value, which can be obtained from statistical
tables based upon the degrees of freedom and the alpha level chosen at the
beginning of the analysis (i.e. alpha set at 0.05 or 0.01, etc.).  If your
calculated value is larger than the critical value, the treatment means are
not all equal and you can conclude there is a significant difference.  You
do not need to report the actual F(critical) value; only the p value is
important.  In this case, p equal to 0.02 indicates that there is a
significant difference if alpha was preset to 0.05.  Luckily, statistical
packages can do all of these calculations with relative ease, so you do not
need to consult statistical tables when doing analyses.  Remember, the F
statistic does not tell you which groups are different; the ANOVA simply
tells you that the means are not all equal.  One group might differ or all
groups might differ.  To determine this, appropriate planned comparisons or
post hoc comparisons (e.g. Tukey, Scheffe) must be employed in order to
discern the nature and direction of the group differences.
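
As a rough check on the reported numbers, the sketch below recovers a p
value from F = 8.69 with df = 1,7 using only the Python standard library.
It relies on the fact that an F with 1 numerator degree of freedom is the
square of a t with the same error df, and it integrates the t density
numerically with Simpson's rule.  The function names are my own, and this
shortcut only works when the first df is 1; a statistics package would
normally do this for you.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def t_two_tailed_p(t, df, steps=2000):
    """Two-tailed p value: 1 minus the area between -|t| and +|t| (Simpson's rule)."""
    t = abs(t)
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * (h / 3) * total
    return 1 - central_area


def f_p_value_one_numerator_df(f_value, df_error):
    """p value for F(1, df_error), using F(1, df) = t(df) squared."""
    return t_two_tailed_p(math.sqrt(f_value), df_error)


p = f_p_value_one_numerator_df(8.69, 7)
print(f"F(1,7) = 8.69 gives p = {p:.3f}")
print(f"significant at alpha = 0.05: {p < 0.05}")
```

The result should land close to the reported p = .02, which is below an
alpha of 0.05.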

>>And
> T=1.6
> OK that's a T test.
> So what does a T test tell me?

OK, there is a problem with what you are reporting here.  This is not a T
value (capital).  It is a t value (lower case), which is actually
different.  T-scores are standard scores that are always positive and are
usually reported as whole numbers rather than decimals.  You can convert
z-scores to T-scores for relative ease in the interpretation of values from
paper and pencil tests.  For example, many psychological measurements use T
scores with a mean set at 50 and a standard deviation set at 10.  They are
useful because they give you a common frame of reference.  The t statistic
is different, and you should be aware of the difference.  (The t statistic
is based upon Gosset's original calculations.)
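
For what it is worth, the z-score to T-score conversion is a one-liner.
This sketch assumes the usual convention of a mean of 50 and a standard
deviation of 10, rounded to a whole number; the function name is mine.

```python
def z_to_t_score(z, t_mean=50, t_sd=10):
    """Convert a z-score to a T-score (capital T): rescale, shift, round."""
    return round(t_mean + t_sd * z)


for z in (-1.5, 0.0, 1.0):
    print(z, "->", z_to_t_score(z))
```

So a z-score of 0 (exactly at the mean) is a T-score of 50, and a z-score
of 1.0 (one standard deviation above the mean) is a T-score of 60.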

The value you are reporting, which is actually (t(7)=1.6; p>0.14), comes
from the a priori contrasts that the authors used in order to determine the
nature of the group differences found by the ANOVA.  They used a
Bonferroni t-test with an adjusted alpha based upon the number of
comparisons that were made.  The statistic they reported tells us that there
is no significant difference between the groups on this measure.
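
The Bonferroni adjustment itself is simple: divide the overall alpha by the
number of comparisons.  In the sketch below the number of comparisons (3)
and the overall alpha (0.05) are hypothetical, since I have not stated how
many contrasts the authors ran; only the p > 0.14 comes from the result you
quoted.

```python
def bonferroni_alpha(family_alpha, n_comparisons):
    """Per-comparison alpha that keeps the overall type I error rate at family_alpha."""
    return family_alpha / n_comparisons


family_alpha = 0.05   # assumed overall alpha
n_comparisons = 3     # hypothetical number of planned contrasts
p_lower_bound = 0.14  # reported as p > 0.14 for t(7) = 1.6

adjusted_alpha = bonferroni_alpha(family_alpha, n_comparisons)
significant = p_lower_bound < adjusted_alpha

print(f"adjusted alpha = {adjusted_alpha:.4f}, significant = {significant}")
```

Since p is above 0.14, it exceeds even the unadjusted 0.05, so the contrast
is not significant whatever the number of comparisons was.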




