Program to count Phred 20 bases

Phillip San Miguel pmiguel at purdue.edu
Tue Oct 19 16:13:29 EST 1999


"Stephen R. Lasky" wrote:

> Phillip San Miguel wrote:
> >
> > Is there a publicly available program or script that will
> > count the number of Phred 20 bases (that is, the number of
> > bases with quality scores of 20 or higher) for each sequence
> > in a quality file generated by Phred?
>
> you might want to contact Brent Ewing at UW about a program called
> qrep.  I think this gives the kind of output you want.

    Thanks, I'll look into this.
    My wife bought me a book on Perl this weekend, and I managed to kludge
together a script that counts the number of quality scores > 20 in a quality
file. I'd include it here, but I expect to keep improving it as time goes on.
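
For anyone who wants a starting point in the meantime, here is a minimal
sketch of the counting step, written in Python rather than Perl (it is not
the script mentioned above). It assumes the quality file is FASTA-like: one
header line beginning with ">" per read, followed by lines of
whitespace-separated integer quality values. The function name count_q20,
the threshold parameter, and the sample data are made up for illustration.

```python
import io


def count_q20(handle, threshold=20):
    """Return {read_name: number of scores >= threshold}.

    Assumes a FASTA-style quality file: '>' header lines, each followed
    by one or more lines of whitespace-separated integer scores.
    """
    counts = {}
    name = None
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Header line: use the first token after '>' as the read name.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            counts[name] = 0
        elif name is not None:
            # Score line: count the values that meet the threshold.
            counts[name] += sum(1 for q in line.split() if int(q) >= threshold)
    return counts


# Made-up sample data, for illustration only.
sample = ">read1 example header\n4 6 20 25 30\n19 21 40\n>read2\n10 12 8\n"

for read, n in count_q20(io.StringIO(sample)).items():
    print(read, n)
```

The default counts scores of 20 or higher; pass threshold=21 to count only
scores strictly greater than 20. To run it on a real file, open the file and
pass the file object in place of the io.StringIO sample.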

> > I have a couple of other questions about Phred: what are the
> > scores generated by the -qr and how is a "High Quality Base"
> > defined?
>
> high quality bases are those with a phred q value of more than 20.  I
> think the numbers in the -qr report are the number of lanes that have
> q>20 scores for the number of bases shown in the first column.

I've checked this a number of ways, and the q>20 counts differ from the
"high quality bases" counts in the -qr report (the q>20 counts are usually,
maybe always, higher).

> > I've noticed that Phred will generate a histogram
> > of scores of some sort using the -qr qualifier.
>
> I don't get a histogram with the version of phred ..... -qr <filename> that
> I am using, but I get three when I run qrep.  The first is the percent
> of the dataset that have x number of total bases, the second is the
> percent of dataset that have q values listed in the first column, and
> the third is the percent of the reads in the dataset that have x quality
> values.  You also get the average total length of the read, the total
> bases read (sum of total bases per lane), and the average number of high
> quality (q>20) bases in the dataset.
>
> You can find most of this at
> http://bozeman.mbt.washington.edu/phrap.docs/phred.html

Sure, but that is the documentation I'm using. I'm running version 0.961028.m
of phred, and the documentation appears to be for a much newer version,
0.980904.a.

> hope that helps.  [...]

Yes, thanks. I'd never heard of qrep. (Actually, after I posted this message
I checked the bionet archives and found your last mention of qrep. But you
go into more detail here.)

Phillip San Miguel





More information about the Autoseq mailing list
