IUBio

Heritability estimates from a designed experiment

/S=G.WALTON/OU1=S24L07A at mhs-fswa.attmail.com
Mon Oct 23 20:38:30 EST 1995


I would like advice on how to obtain the most (and no more) 
genetic information from a designed experiment.  A description of the 
experiment and proposed formulation are attached.  
TIA,
     Gerald (Jerry) Walton
     /S=G.WALTON/OU1=S24L07A at MHS-FSWA.ATTMAIL.COM


 Dear list readers:
 
 The following diagram represents a mating program that was designed and
 executed by an entomologist.  I have been asked to provide an analysis that can
 be used to estimate genetic contributions and heritability parameters.  I am a
 statistician, and am comfortable with random and fixed effects models, but
 pretty much at sea regarding the use and interpretation of genetics parameters
 based on variance component estimates.  The experimental design is not my
 doing, so please, no flames about what can't be changed.  I am hoping someone
 will have the time and patience to help me with advice or references so I can
 make the most effective use of this data, and will be terribly grateful for
 their assistance.
 
 In the following, A(i), i=1,2, is a fixed effects factor (diet) applied to male
 parents, and B(j), j=1,2 is a fixed effects factor (diet) applied to female
 parents.  The diagram below shows matings (diagonal lines) and experimental
 factors for both parents (blocks in the diagram) for a single "group", G(k),
 k=1,14.  S(k,l), l=1,2 for any group represents the family (egg mass) from
 which the male came, where the first subscript, k, denotes nesting within
 group.  The notation and diagram may be cumbersome, but at least it is done in
 simple ASCII.
 
                              Layout for group k, k=1,14
 Diet for males           A=1         A=2         A=1         A=2
 Male Family (E mass)     S=1         S=1         S=2         S=2 
                          ------------------------------------------
 Within female fam. (EM)  Q=1         Q=2         Q=1         Q=2
                          ---         ---         ---         --- 
                          / \         / \         / \         / \ 
                         /   \       /   \       /   \       /   \
                        /     \     /     \     /     \     /     \
                       ---   ---   ---   ---   ---   ---   ---   ---
 Female diet           B=1   B=2   B=1   B=2   B=1   B=2   B=1   B=2
                       ----------------------------------------------
 Female family         R=2   R=2   R=2   R=2   R=1   R=1   R=1   R=1
 Betw fem. same sire   T=1   T=2   T=1   T=2   T=1   T=2   T=1   T=2
 Betw fem. diff.sire   U=1   U=1   U=2   U=2   U=1   U=1   U=2   U=2
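
As a cross-check on the layout, here is a minimal Python sketch that enumerates the matings exactly as drawn in the diagram. The function name and the dictionary keys are hypothetical labels chosen for illustration; the index rules (R is the other egg mass in the group, Q and U follow the male diet column, T follows the female diet column) are simply read off the diagram and are not an independent statement about the genetics.

```python
from itertools import product


def enumerate_matings(n_groups=14):
    """Return one dict per mating, following the layout diagram for each group."""
    matings = []
    for k, l, i, j in product(range(1, n_groups + 1), (1, 2), (1, 2), (1, 2)):
        matings.append({
            "G": k,      # group, k = 1..14
            "S": l,      # male family (egg mass) within the group
            "A": i,      # diet applied to the male parent
            "B": j,      # diet applied to the female parent
            "R": 3 - l,  # female family: the other egg mass, so no sib matings
            "Q": i,      # male within male family (one male per diet column)
            "T": j,      # female within female family, same sire
            "U": i,      # female within female family, different sire
        })
    return matings


matings = enumerate_matings()
print(len(matings))        # total matings: 14 groups x 2 x 2 x 2
print(len(matings) // 14)  # matings per group
```

With 14 groups this gives 112 matings, 8 per group, which is the count behind the 112(M-1) error degrees of freedom when M offspring are measured per mating.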
 
 
 The factors mentioned so far are sufficient to describe the experiment in a
 linear model.  Measurements were made on progeny of the matings, but no
 corresponding measurements are available for the parents.  Using C for the
 population mean, the model for the measurement taken on one of M offspring
 from each mating can be written
 
      y(i,j,k,l,m)=C+A(i)+B(j)+G(k)+S(k,l)+e(i,j,k,l,m)
                    +AB(ij)+AG(ik)+BG(jk)+ABG(ijk)+AS(ikl)+BS(jkl)+ABS(ijkl)
 
 where the first line contains the main effects and error term, the second all
 possible interactions, and only A and B are fixed effects.  The ANOVA for this
 model is (and I certainly hope this is correct):
 
 Source   EMS                            df
   A      56V(A)+4V(AG)+2V(AS)+V(e)       1
  AG             4V(AG)+2V(AS)+V(e)      13
  AS                    2V(AS)+V(e)      14
  B       56V(B)+4V(BG)+2V(BS)+V(e)       1
  BG             4V(BG)+2V(BS)+V(e)      13
  BS                    2V(BS)+V(e)      14
  AB      28V(AB)+2V(ABG)+V(ABS)+V(e)     1
 ABG              2V(ABG)+V(ABS)+V(e)    13
 ABS                      V(ABS)+V(e)    14
   G                 8V(G)+4V(S)+V(e)    13
   S                       4V(S)+V(e)    14
 ERR                             V(e)    112(M-1)
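
The df column can be checked mechanically. The following is a minimal sketch, assuming the usual rules for crossed and nested factors (a crossed factor with n levels contributes n-1, and S nested in G contributes 14*(2-1)); the function name is hypothetical. It only verifies that the degrees of freedom add up to 112M-1 and says nothing about the EMS coefficients.

```python
def full_model_df(M, a=2, b=2, g=14, s=2):
    """Degrees of freedom for the full model, M offspring per mating."""
    s_in_g = g * (s - 1)  # S is nested within G
    return {
        "A": a - 1,
        "AG": (a - 1) * (g - 1),
        "AS": (a - 1) * s_in_g,
        "B": b - 1,
        "BG": (b - 1) * (g - 1),
        "BS": (b - 1) * s_in_g,
        "AB": (a - 1) * (b - 1),
        "ABG": (a - 1) * (b - 1) * (g - 1),
        "ABS": (a - 1) * (b - 1) * s_in_g,
        "G": g - 1,
        "S": s_in_g,
        "ERR": a * b * g * s * (M - 1),
    }


for M in (2, 5, 10):
    df = full_model_df(M)
    # Total df must equal the number of observations minus one.
    assert sum(df.values()) == 112 * M - 1
```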
 
 
 The model as shown is not at all satisfactory for an analysis of genetic
 properties, and it is in this area that I seek help.  I am about to jump into
 the deep end of the pool by proposing what seems reasonable to me for
 incorporating the genetic information contained in the experiment into the
 model.  I don't think all is lost.  First, the "groups" were formed simply to
 select mates, and do not correspond to a blocking factor as though, say, a
 group of insects was reared together.  If groups were the only consideration,
 any V containing a (G) could be considered zero.  However, there are other
 considerations.  One is that the 2 families from which the males were selected
 were the same 2 families from which the females were selected, within each
 group.  (Of course, matings were not incestuous.)  I don't believe this needs
 to be accounted for in the analysis, at least I hope not, since I see no way to
 incorporate this constraint into the model.
 
 Now, to expose my ignorance, provoke your pity and hope that you will have time
 to help me make the best of this, but not necessarily time for tact :).  I
 refer to the diagram as justification for a number of guesses.  First, R, the
 "between female family when mating with males in the same family"
 component, is entirely confounded with S, and I believe that the term V(S)
 should be replaced by a linear combination of variance components with
 subscripts involving S and R.  However, it is not clear to me that either R or
 S would "interact" with the fixed effects, so I would guess that V(AS) and
 V(ABS) could be considered zero.
 
 The diet, A, is also confounded with the "within male family" term, Q, but if I
 assume no group effect, then V(AG) would, it seems to me, be the only member of
 the interaction term, so that one could simply replace V(AG) with V(Q).  By a
 similar argument, V(BG) is confounded with U, the "between females of the same
 family when mated to different males (of the same family)" component, and I feel
 V(BG) may be replaced by V(U).  Finally, V(ABG) I think can be replaced with
 V(T), a "within female family within (the same) male" variance component term.
 Invoking these various suppositions leads to the reduced ANOVA table
 
 Source  df     EMS
   A      1     56V(A)+4V(Q)+V(e)
   Q     13            4V(Q)+V(e)
   B      1     56V(B)+4V(U)+V(e)
   U     13            4V(U)+V(e)
  AB      1     28V(AB)+2V(T)+V(e)
   T     13             2V(T)+V(e)
   ?     14             4V(?)+V(e)
 error  112M-57               V(e)
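
Taking the reduced table at face value, method-of-moments estimates follow by equating each observed mean square to its EMS and solving. The sketch below does only that arithmetic; the function name is hypothetical and the mean squares in the example are made-up placeholders, not results from this experiment.

```python
def moment_estimates(ms):
    """Solve the reduced-table EMS equations for the variance components.

    `ms` maps the source labels of the reduced table to observed mean squares.
    Estimates may come out negative; they are returned untruncated.
    """
    v_e = ms["error"]
    return {
        "V(e)": v_e,
        "V(Q)": (ms["Q"] - v_e) / 4,  # E[MS_Q] = 4V(Q) + V(e)
        "V(U)": (ms["U"] - v_e) / 4,  # E[MS_U] = 4V(U) + V(e)
        "V(T)": (ms["T"] - v_e) / 2,  # E[MS_T] = 2V(T) + V(e)
        "V(?)": (ms["?"] - v_e) / 4,  # E[MS_?] = 4V(?) + V(e)
    }


# Placeholder mean squares, for illustration only.
example_ms = {"Q": 9.0, "U": 5.0, "T": 3.0, "?": 7.0, "error": 1.0}
estimates = moment_estimates(example_ms)
print(estimates)
```

Whether these components can be given a genetic interpretation is exactly the open question; the code only shows how the numbers would be obtained once the table is accepted.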
 
 Throwing all self-respect to the wind, I humbly ask you, what kind of
 "heritability" terms might be of interest to a geneticist?
 
 Since I will be asked to estimate heritability, and since the error terms are
 not likely to be "normally" distributed, there is the question of standard
 errors or confidence intervals for the variance and heritability estimates.  I
 have a couple of references which I have not pursued (Foulley, J.L, 1994 and
 Burdick and Graybill, 1992), but would appreciate more leads that might be
 particularly apt for this problem (however it finally comes out).  The error
 estimates (residuals) can be calculated for Q, U, T, ? and, of course, the
 "usual" residual error.  (I have not decided on ML, REML or OLS, but assume OLS
 and method of moments, for simplicity, for the moment.)  These are independent
 between rows in the ANOVA table, if the experiment is balanced and normality
 holds (actually, under less stringent conditions) under ordinary least-squares
 estimation.  I would be willing to make the independence assumption (I have a
 trick in mind), but don't want to rely on normality.  This means that any
 estimates that are functions of ms could be based on a resampling scheme that
 involves separate sampling (with replacement) of the residuals on each line in
 the ANOVA table, computing the estimate, and using the properties of the
 re-sampling to come up with standard errors or confidence intervals for the
 parameters.  Some might call this bootstrapping, but it is more of a resampling
 permutation test.  This seems to me a reasonable procedure, but if the
 geneticists have better methods, or something similar, I would appreciate
 references to the literature. 
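
The resampling idea described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not a worked-out procedure: it treats the residuals on each ANOVA line as exchangeable (which ignores the linear constraints among them), forms each mean square as a sum of squares over the line's df, and reports a standard error and a percentile interval. All names, the residual values (random placeholders), and the choice M=2 for the error df are hypothetical.

```python
import random


def mean_square(values, df):
    """Sum of squared residuals divided by the line's degrees of freedom."""
    return sum(v * v for v in values) / df


def resample_estimate(residuals, dfs, estimator, n_rep=2000, seed=0):
    """Resample each line's residuals separately, with replacement.

    Returns (standard error, (lower, upper)) for `estimator`, a function of
    the dict of resampled mean squares; the interval is a 95% percentile one.
    """
    rng = random.Random(seed)
    reps = []
    for _ in range(n_rep):
        ms = {}
        for line, vals in residuals.items():
            draw = [rng.choice(vals) for _ in vals]  # same size as original
            ms[line] = mean_square(draw, dfs[line])
        reps.append(estimator(ms))
    reps.sort()
    mean = sum(reps) / n_rep
    se = (sum((r - mean) ** 2 for r in reps) / (n_rep - 1)) ** 0.5
    lower = reps[int(0.025 * n_rep)]
    upper = reps[int(0.975 * n_rep) - 1]
    return se, (lower, upper)


# Placeholder residuals: random numbers standing in for real ones.
_gen = random.Random(42)
residuals = {
    "Q": [_gen.gauss(0, 2) for _ in range(28)],
    "error": [_gen.gauss(0, 1) for _ in range(224)],
}
dfs = {"Q": 13, "error": 112 * 2 - 57}  # error df assumes M = 2


def v_q(ms):
    # Moment estimate of V(Q) from the reduced table: (MS_Q - MS_error) / 4
    return (ms["Q"] - ms["error"]) / 4


se, (lower, upper) = resample_estimate(residuals, dfs, v_q, n_rep=500)
print(se, lower, upper)
```

Any function of the mean squares, such as a ratio of variance components, could be passed in place of `v_q`.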
 
 There are a number of other statistical issues, but I feel that I can deal with
 these.  My primary concern is with the genetics.  Once I am clear on how the
 genetic parameters relate to the experimental design model, and what is and is
 not possible and/or interesting to a geneticist, I believe I can satisfactorily
 resolve the remaining issues.  
 
 And thank you all who have taken the time to read this, and I do hope some of
 you will have something to contribute to my edification.  Replies to this list
 or to me by e-mail at "/S=G.WALTON/OU1=S24L07A at MHS-FSWA.ATTMAIL.COM" will be
 welcome.  I will summarize to the list if there is interest and may initiate a
 dialogue in case of seemingly conflicting advice, unless asked not to do so in
 a reply.
 
 Gerald S. (Jerry) Walton






More information about the Comp-bio mailing list
