Etienne Bucher via arab-gen@net.bio.net (by e.bucher from unibas.ch)
Wed Oct 9 07:05:14 EST 2013

Dear Li,

For Q1, this sounds perfectly reasonable to me.
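
The background subtraction described in Q1 (keep only variants called in mutant1 that are absent from Col-7) can be sketched roughly as below. This is a minimal sketch, not the pipeline actually used: it assumes the calls are in tab-separated VCF text with the standard leading columns (CHROM, POS, ID, REF, ALT, QUAL), and the function names `variant_keys` and `mutant_specific` are hypothetical. It applies the QUAL >= 30 cutoff only to the mutant and subtracts every background call regardless of quality, which is an assumption chosen to be conservative.

```python
def variant_keys(vcf_lines, min_qual=0.0):
    """Collect (chrom, pos, ref, alt) keys for calls with QUAL >= min_qual."""
    keys = set()
    for line in vcf_lines:
        # Skip blank lines and VCF header/meta lines.
        if not line.strip() or line.startswith("#"):
            continue
        chrom, pos, _id, ref, alt, qual = line.rstrip("\n").split("\t")[:6]
        # A missing QUAL (".") is treated as 0 here (assumption).
        qual_value = 0.0 if qual == "." else float(qual)
        if qual_value >= min_qual:
            # Multi-allelic sites list several ALT alleles separated by commas.
            for allele in alt.split(","):
                keys.add((chrom, int(pos), ref, allele))
    return keys


def mutant_specific(mutant_lines, background_lines, min_qual=30.0):
    """Return mutant calls passing min_qual that are absent from the background."""
    mutant = variant_keys(mutant_lines, min_qual=min_qual)
    # Subtract every background call, even low-quality ones (assumption).
    background = variant_keys(background_lines, min_qual=0.0)
    return mutant - background
```

Subtracting low-quality background calls as well avoids reporting a site as mutant-specific merely because the same variant fell just under the cutoff in Col-7.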

Q2: Such a high number of SNPs/indels is not so surprising to me. I have
been re-sequencing Col-0 and found many SNPs, often in repetitive
sequences; these are usually just the result of mismapping of reads in
repetitive sequences. Are all the SNPs you found homozygous (frequency
above 80% of the reads)? The number of SNPs in the comparison of Col-7 to
mutant1 does indeed seem excessive; here too, are all the mutations
homozygous? Do they mostly lie in repetitive sequences (centromere)? I have
also seen strange things in T-DNA mutagenesis, such as a 4 kb deletion
without a T-DNA insertion.
My suggestion: backcross your mutant to the parental line, go to the F2,
and see which mutation is linked to your phenotype (use your sequencing
data to develop mapping markers). This will narrow down your mapping region.
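
The homozygosity check mentioned above (variant allele in more than 80% of the reads) could be applied along these lines. This is only an illustrative sketch: it assumes the reference-supporting and variant-supporting read counts at each site have already been extracted from the alignments, and the function names `alt_fraction` and `looks_homozygous` are hypothetical.

```python
def alt_fraction(ref_reads, alt_reads):
    """Fraction of reads at a site that support the variant allele."""
    total = ref_reads + alt_reads
    # Sites with no coverage get a fraction of 0 rather than raising an error.
    return alt_reads / total if total else 0.0


def looks_homozygous(ref_reads, alt_reads, min_fraction=0.8):
    """True if the variant allele is seen in more than min_fraction of reads."""
    return alt_fraction(ref_reads, alt_reads) > min_fraction
```

Calls where the variant allele sits near 50% or lower, especially in repetitive regions, are the ones most likely to reflect mismapped reads rather than real mutations.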

Best greetings,

On 8.10.13 20:12 , "Li Zhang" <zhangl25 from msu.edu> wrote:

>I have some questions about the genome sequencing in Arabidopsis T-DNA
>insertion lines.
>We have sequenced some T-DNA (pSKI015) insertion lines of Arabidopsis
>using Illumina paired-end sequencing. Using Bowtie2, we aligned all the
>reads to a reference genome that combines the TAIR10 genome with the T-DNA
>insertion sequence of pSKI015. Reads aligning partly to the T-DNA and partly
>to the Arabidopsis genome were found, and the insertion sites were identified.
>Because we already knew that the phenotype of some mutant lines was not
>related to the T-DNA insertion, we also used the bowtie2 results to
>identify the SNPs and INDELs in the mutants.
>For example, we compared the genome of col-7 to col-0, as the background of
>these mutants is col-7, and found the SNPs and INDELs in col-7. Then we
>compared the genome of mutant1 to col-0 and also found the SNPs and INDELs
>in mutant1. Finally, SNPs or INDELs found only in mutant1 and not in col-7
>were identified as the true SNPs and INDELs.
>Q1: is this method normally used? Or is there a better method to find SNPs
>and INDELs in these mutants?
>Q2: To our surprise, we have found over 6000 SNPs and INDELs in Chr5
>between col-0 and col-7, and over 7000 SNPs and INDELs in Chr5 between
>col-0 and mutant1 with a QUAL value >=30. Is it normal to find so many
>SNPs and INDELs? We did not expect this high a number of SNPs and INDELs,
>as to our knowledge col-0 and col-7 are very close. The number of SNPs and
>INDELs found only in mutant1 but not in col-7 is over 1000. T-DNA
>insertion mutagenesis of col-7 should not induce lots of mutations. Hence,
>is it possible that we did something wrong?
>Thank you.
>Li Zhang
>Department of Plant Biology
>DOE Plant Research Laboratory
>Michigan State University
