Program to count Phred 20 bases

Phillip San Miguel pmiguel at purdue.edu
Tue Oct 19 16:13:29 EST 1999

"Stephen R. Lasky" wrote:

> Phillip San Miguel wrote:
> >
> > Is there a publicly available program or script that will
> > count the number of Phred 20 bases (that is, the number of
> > bases with quality scores of 20 or higher) for each sequence
> > in a quality file generated by Phred?
> you might want to contact Brent Ewing at UW about a program called
> qrep.  I think this gives the kind of output you want.

    Thanks, I'll look into this.
    My wife bought me a book on Perl this weekend and I managed to kludge
together a script that counts the number of quality scores > 20 in a quality
file. I'd include it here, but I guess I'll improve it as time goes on.
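
    For anyone after the same thing, here is a minimal sketch of the idea. It
is in Python rather than Perl and is not the script mentioned above. It assumes
the quality file is FASTA-like: a header line starting with ">" followed by the
read name, then lines of whitespace-separated integer quality scores. The
function name count_phred20 and the example data are made up for illustration.
The threshold is inclusive (20 or higher), following the definition in the
original question; pass threshold=21 to count strictly q>20.

```python
import io


def count_phred20(handle, threshold=20):
    """Return {read name: number of bases with quality >= threshold}."""
    counts = {}
    name = None
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Header line: the first token after '>' is taken as the read name.
            fields = line[1:].split()
            name = fields[0] if fields else ""
            counts[name] = 0
        elif name is not None:
            # Quality line: count the scores at or above the threshold.
            counts[name] += sum(1 for q in line.split() if int(q) >= threshold)
    return counts


# Made-up example data standing in for a real quality file.
example = io.StringIO(
    ">read1 some description\n"
    "10 20 21 35\n"
    "8 40\n"
    ">read2\n"
    "5 6 19\n"
)

for read, n in count_phred20(example).items():
    print(read, n)
```

To run it on a real file, open the file and pass the handle in place of the
StringIO object, e.g. count_phred20(open("reads.qual")), where "reads.qual" is
a placeholder name.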

> > I have a couple of other questions about Phred: what are the
> > scores generated by the -qr and how is a "High Quality Base"
> > defined?
> high quality bases are those with a phred q value of more than 20.  I
> think the numbers in the -qr report are the number of lanes that have
> q>20 scores for the number of bases shown in the first column.

I've checked this a number of ways and the q>20 scores are different
(usually, maybe always higher) than the "high quality bases" scores in the
-qr report.

> > I've noticed that Phred will generate a histogram
> > of scores of some sort using the -qr qualifier.
> I don't get a histogram with the version of phred ..... -qr <filename> that
> I am using, but I get three when I run qrep.  The first is the percent
> of the dataset that have x number of total bases, the second is the
> percent of dataset that have q values listed in the first column, and
> the third is the percent of the reads in the dataset that have x quality
> values.  You also get the average total length of the read, the total
> bases read (sum of total bases per lane), and the average number of high
> quality (q>20) bases in the dataset.
> You can find most of this at
> http://bozeman.mbt.washington.edu/phrap.docs/phred.html

Sure, but that is the documentation I'm using. I'm using version 0.961028.m
of phred. I see the documentation is for what appears to be a much newer
version, 0.980904.a

> hope that helps.  [...]

Yes, thanks. I'd never heard of qrep. (Actually, after I posted this message
I checked the bionet archives and found your last mention of qrep. But you
go into more detail here.)

Phillip San Miguel

