"Stephen R. Lasky" wrote:
> Phillip San Miguel wrote:
> > Is there a publicly available program or script that will
> > count the number of Phred 20 bases (that is, the number of
> > bases with quality scores of 20 or higher) for each sequence
> > in a quality file generated by Phred?
> You might want to contact Brent Ewing at UW about a program called
> qrep. I think this gives the kind of output you want.
Thanks, I'll look into this.
My wife bought me a book on Perl this weekend and I managed to kludge
together a script that counts the number of quality scores > 20 in a quality
file. I'd include it here, but I guess I'll improve it as time goes on.
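A minimal sketch of this kind of count, written in Python rather than Perl, might look like the following. It is only an illustration, not the script mentioned above. It assumes the quality file is FASTA-like: a header line starting with ">" whose first word is the sequence name, followed by lines of whitespace-separated integer quality scores. The function name count_high_quality and its threshold parameter are hypothetical names chosen for this example.

```python
import io


def count_high_quality(handle, threshold=20):
    """Return a list of (sequence name, count) pairs.

    Assumes a FASTA-like quality file: '>' header lines followed by
    lines of whitespace-separated integer quality scores.
    A score is counted when it is greater than or equal to threshold.
    """
    counts = []
    name = None
    count = 0
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header: store the previous sequence's count, if any.
            if name is not None:
                counts.append((name, count))
            fields = line[1:].split()
            name = fields[0] if fields else ""
            count = 0
        elif name is not None:
            # Score line: count the scores at or above the threshold.
            count += sum(1 for score in line.split() if int(score) >= threshold)
    if name is not None:
        counts.append((name, count))
    return counts


# Small in-memory example standing in for a real quality file.
example = io.StringIO(
    ">read1 some header text\n"
    "10 20 25 30\n"
    "19 40\n"
    ">read2\n"
    "5 6 7\n"
)
for seq_name, n in count_high_quality(example):
    print(seq_name, n)
```

With the default threshold this counts scores of 20 or higher, as in the original question. Passing threshold=21 counts only scores strictly greater than 20, which is the comparison described for the script above.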
> > I have a couple of other questions about Phred: what are the
> > scores generated by the -qr and how is a "High Quality Base"
> > defined?
> High quality bases are those with a phred q value of more than 20. I
> think the numbers in the -qr report are the number of lanes that have
> q>20 scores for the number of bases shown in the first column.
I've checked this a number of ways and the q>20 scores are different
(usually, maybe always, higher) than the "high quality bases" scores in the
> > I've noticed that Phred will generate a histogram
> > of scores of some sort using the -qr qualifier.
> I don't get a histogram with the version of phred ..... -qr <filename> that
> I am using, but I get three when I run qrep. The first is the percent
> of the dataset that have x number of total bases, the second is the
> percent of dataset that have q values listed in the first column, and
> the third is the percent of the reads in the dataset that have x quality
> values. You also get the average total length of the read, the total
> bases read (sum of total bases per lane), and the average number of high
> quality (q>20) bases in the dataset.
> You can find most of this at
Sure, but that is the documentation I'm using. I'm using version 0.961028.m
of phred. I see the documentation is for what appears to be a much newer
> hope that helps. [...]
Yes, thanks. I'd never heard of qrep. (Actually, after I posted this message
I checked the bionet archives and found your last mention of qrep. But you
go into more detail here.)
Phillip San Miguel