[Arthropod] Re: Arthropod Digest, Vol 10, Issue 5
(by alpapan from gmail.com)
Mon Jan 24 19:52:10 EST 2011
Our values vary from 18 % to 50 %, but my guess is that this relates more to
the total amount of RNA. The species with 18 % stayed within 15-20 % across 8
different libraries/tissues, including examples like head. For another
species we consistently had ca. 50 % rRNA across 4 libraries.
Until a couple of months ago, I was removing redundancy from all the
data (using cd-hit-est), and that would remove almost all rDNA. The
velvet assemblies looked OK (kmer coverage was even but low), but Brian
Haas suggested that for inchworm it is OK not to remove redundancy
(inchworm does not use a kmer histogram to filter nodes). For what it
is worth, inchworm was better at assembling, so an rDNA-filtering step
had to be included.
My method (which may or may not be optimal) is as follows.
For every new species:
Get 1 M sequences. Align them against a non-redundant insect rDNA
nucleotide sequence set (I use SSAHA, despite it being slow, because it
allows % identity cutoffs)
ssaha2 -best 1 -memory 4000 -identity 80 -tags 1 -solexa -score 45
-cmatch 45 -save insecta_rDNA.nr80
^ALIGNMENT|grep -o -E 'USI-EA[^ ]+' >
Get the sequences that match and assemble them separately.
BLAST the assembly to verify it is all rDNA.
Then you have an rDNA library for that species.
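
A rough sketch of the "get the sequences that match" step, in case it is
useful. It assumes the aligner output was saved as text and the reads are
in plain 4-line FASTQ. The 'USI-EA' read-name prefix is taken from the grep
above and will differ for other runs; the function names are made up for
illustration and are not part of SSAHA or any toolkit.

```python
import re

# Read-name pattern copied from the grep in the post; change the prefix
# to match your own instrument/run.
READ_ID = re.compile(r"USI-EA[^ ]+")


def ids_from_alignment_lines(lines):
    """Collect read names from lines that start with 'ALIGNMENT'."""
    ids = set()
    for line in lines:
        if line.startswith("ALIGNMENT"):
            match = READ_ID.search(line.rstrip())
            if match:
                ids.add(match.group(0))
    return ids


def fastq_records(handle):
    """Yield (name, seq, qual) from a well-formed 4-line FASTQ stream."""
    while True:
        header = handle.readline()
        if not header:
            return
        seq = handle.readline().rstrip("\n")
        handle.readline()  # the '+' separator line
        qual = handle.readline().rstrip("\n")
        # Name = first whitespace-delimited token after the leading '@'.
        yield header.rstrip("\n")[1:].split()[0], seq, qual


def select_reads(records, wanted):
    """Keep only the records whose name is in the set of matched reads."""
    return [(n, s, q) for n, s, q in records if n in wanted]
```

The selected reads can then be written out and assembled separately, as
described above.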
For every dataset:
Use your favourite aligner to pre-process the data and identify rDNA
using the species-specific rDNA library (a fast aligner such as bwa is good).
Optionally assemble those reads and verify they are all rDNA.
Optionally, pick not only the reads that match an rDNA but also
their PE mates (if the data are PE).
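
For the PE-mate option, something along these lines should work. This is
only a sketch: it assumes the two mates share a read name that differs
only by a trailing /1 or /2 (common for older Illumina output, but check
your own files), and that reads have already been parsed into
(name, seq, qual) tuples. pair_key and drop_rdna_pairs are hypothetical
helper names.

```python
def pair_key(name):
    """Strip a trailing /1 or /2 so both mates share one key (assumed naming)."""
    if name.endswith(("/1", "/2")):
        return name[:-2]
    return name


def drop_rdna_pairs(pairs, rdna_names):
    """Drop a whole pair if either mate was flagged as rDNA by the aligner.

    pairs      : iterable of (read1, read2), each read a (name, seq, qual) tuple
    rdna_names : names of reads that aligned to the rDNA library
    """
    flagged = {pair_key(n) for n in rdna_names}
    return [(r1, r2) for r1, r2 in pairs if pair_key(r1[0]) not in flagged]
```

Because the lookup is done on the shared key, a pair is removed even when
only the second mate hit the rDNA library.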
Before all this, I use the fastx_toolkit to quality-trim the data
(fastq_quality_trimmer -t 19 or -t 24, depending on the difference in
file size between the two outputs)
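
To illustrate what the two thresholds do, here is a small sketch of
3'-end quality trimming and a count of the bases that survive at a given
cutoff. It is not the fastx_toolkit code, just the general idea; the
Phred offset of 33 is an assumption (older Illumina files often use 64),
and both function names are made up.

```python
def trim_3prime(seq, qual, threshold, offset=33):
    """Remove bases from the 3' end while their quality is below threshold.

    offset is the ASCII offset of the quality encoding (assumed 33 here).
    """
    end = len(seq)
    while end > 0 and ord(qual[end - 1]) - offset < threshold:
        end -= 1
    return seq[:end], qual[:end]


def bases_kept(records, threshold, offset=33):
    """Total bases remaining after trimming every (seq, qual) record."""
    return sum(len(trim_3prime(s, q, threshold, offset)[0]) for s, q in records)
```

Comparing bases_kept(records, 19) with bases_kept(records, 24) on a
sample of reads gives roughly the same information as comparing the two
output file sizes.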
On Tue, Jan 25, 2011 at 4:03 AM, <arthropod-request from oat.bio.indiana.edu> wrote:
> Today's Topics:
> 1. RNA-seq transcriptome studies: remove ribosomal rna tip
> (Don Gilbert)
> Message: 1
> Date: Sun, 23 Jan 2011 23:44:48 -0500 (EST)
> From: Don Gilbert <gilbertd from net.bio.net>
> Subject: [Arthropod] RNA-seq transcriptome studies: remove ribosomal
> rna tip
> To: arthropod from magpie.bio.indiana.edu
> Message-ID: <201101240444.p0O4imH10683 from net.bio.net>
> For those of you contemplating RNA-seq experiments, here is a tip:
> consider methods to deplete ribosomal RNA before sequencing.
> In aphid RNA-seq data now at NCBI/SRA, the rRNA genes (spread over a
> mere 300 Kb of poorly assembled scaffolds) account for 50% to over
> 90% of RNA-seq reads, depending on experiment and body tissue. These
> clog up analyses as well as waste useful experimental effort
> (i.e. of the 1.1 billion aphid reads available, 900 million are rRNA reads,
> leaving 200 million for the rest of the transcriptome).
> I have no experience in the methods involved, but here are examples
> -- Don Gilbert
> RNA-Seq quantitative measurement of expression through massively parallel RNA-sequencing
> "2.2. rRNA removal
> One of the principal technical hurdles to overcome with RNA-seq is the fact that
> the vast majority of RNA (>90%) present in cells consists of ribosomal RNA (rRNA)."
> "Ribosomal RNA depletion ..
> Ae. aegypti .. subjected to eukaryotic ribosomal RNA depletion using the RiboMinus Kit (Invitrogen) "