Why does the sequence_vector_clip module add a PR line?

Marcus Claesson m.claesson at student.ucc.ie
Tue Dec 23 15:30:41 EST 2003


Hi all,

I keep getting 'Inconsistent' templates when entering the assembly
into Gap4, and I would be extremely grateful if someone could help me
figure out why. I'm running Pregap4 1.3 and Gap4 4.7 on Red Hat
Linux 9.

Here's an example of what it can look like in the Template Display:

Template name        ls012_016d_h12
number               3
strands              1
vector               unknown (1)
clone                unknown (1)
min insert length    3000
max insert length    5000
estimated length     4976
Status               Inconsistent
Contains reading ls012_016d_h12.q1ca (3) from contig ls012_016d_h12.q1ca (3)
Contains reading ls012_016d_h12.p1ca (29) from contig ls002_001_g20.p1cb (6)

My guess is that it's because I have multiple PR lines in the
experiment files, like in this ls012_016d_h12.q1ca.exp file:

ID   ls012_016d_h12.q1ca
EN   ls012_016d_h12.q1ca
LN   ls012_016d_h12.q1ca..scf
LT   SCF
AQ   54.920000
AV   0 3 7 8 8 7 7 8 8 10 11 13 7 6 8 8 8 8 6 6 6 6 6 6 9 11 10 8 
...
TN   ls012_016d_h12
PR   2
SI   3000..5000
CH   7
SL   65
SR   791
PR   1
SF   	$STADTABL/vectors/pwsk29.seq

PR really should be 2, but somehow 'PR 1' is added. Turning off modules
one by one reveals that it is the sequence_vector_clip module that
adds it (together with SL, SR and SF, which are all fine). I think it is
this faulty PR value, once entered into Gap4, that causes the
inconsistency. Why is this line added when the naming scheme has
already set PR?
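
To check a whole batch of experiment files for this, a small script can
flag any file where PR appears more than once. This is only a sketch in
Python, not part of Pregap4 or Gap4: the function name repeated_records
is made up for illustration, and it assumes each record is a line type
followed by whitespace and a value, as in the file shown above. It
reports the repeated values and makes no claim about which one Gap4
actually uses.

```python
from collections import defaultdict


def repeated_records(exp_text, record_types=("PR",)):
    """Return {type: [values...]} for each listed record type seen more than once.

    Hypothetical helper: only the record types named in record_types are
    checked, because some experiment-file records may legitimately repeat.
    """
    seen = defaultdict(list)
    for line in exp_text.splitlines():
        parts = line.split(None, 1)  # line type, then the rest of the line
        if not parts or parts[0] not in record_types:
            continue
        value = parts[1].strip() if len(parts) > 1 else ""
        seen[parts[0]].append(value)
    # Keep only the record types that occur more than once, in file order.
    return {rtype: values for rtype, values in seen.items() if len(values) > 1}


# Excerpt of the experiment file quoted above.
sample = """\
ID   ls012_016d_h12.q1ca
TN   ls012_016d_h12
PR   2
SI   3000..5000
CH   7
SL   65
SR   791
PR   1
"""

print(repeated_records(sample))  # {'PR': ['2', '1']}
```

Running it over every .exp file in the output directory would show
whether all reads pick up the extra PR line or only some of them.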

My sequence_vector_clip settings are:
Use Vector-primer file - Yes
Vector-primer filename - /usr/local/staden/tables/vector_primer
Select vector-primer subset - pwsk29/EcoRV
Max primer to cut-site - 40
Percentage min 5'match - 60
Percentage max 3'match - 80
Default 5' position - -1

The naming scheme I'm using is a modified version of
Sanger_new_naming_scheme:

set ns_name "New style Sanger Centre naming scheme, Local Version"
set ns_regexp {([^_]+)([^.]+)\.([pqrsf][0-9])([a-z])}
set ns_lt(SI) {subst {$1 {ls01* 3000..5000} 1500..3000}}
set ns_lt(TN) {$1$2}
set ns_lt(PR) {subst {$3 {[spf]1 1} {[qr]1 2} {[spf]* 3} {[qr]* 4} 0}}
set ns_lt(CH) {subst {$4 {p 2} {b 6} {e 8} {t 3} {d 5} {c 7} {f 9} {l 11} {m 12} {n 13} {j 16} {k 17} 0}}
set_name_scheme
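
As a cross-check of what the scheme above should yield for a given read
name, the lookups can be imitated outside Pregap4. The sketch below is
Python, not Tcl, and rests on two assumptions that are mine rather than
taken from the Pregap4 sources: that each {pattern value} pair is tried
in order as a glob pattern with the first match winning, and that the
trailing element is the default. The names lookup and line_types are
hypothetical.

```python
import re
from fnmatch import fnmatchcase

# Same regular expression as ns_regexp in the naming scheme.
NS_REGEXP = re.compile(r"([^_]+)([^.]+)\.([pqrsf][0-9])([a-z])")


def lookup(value, table, default):
    """Return the result paired with the first glob pattern matching value."""
    for pattern, result in table:
        if fnmatchcase(value, pattern):
            return result
    return default


def line_types(read_name):
    """Imitate the TN, SI, PR and CH rules of the scheme for one read name."""
    match = NS_REGEXP.search(read_name)
    if match is None:
        return None
    g1, g2, g3, g4 = match.groups()
    return {
        "TN": g1 + g2,
        "SI": lookup(g1, [("ls01*", "3000..5000")], "1500..3000"),
        "PR": lookup(
            g3,
            [("[spf]1", "1"), ("[qr]1", "2"), ("[spf]*", "3"), ("[qr]*", "4")],
            "0",
        ),
        "CH": lookup(
            g4,
            [("p", "2"), ("b", "6"), ("e", "8"), ("t", "3"), ("d", "5"),
             ("c", "7"), ("f", "9"), ("l", "11"), ("m", "12"), ("n", "13"),
             ("j", "16"), ("k", "17")],
            "0",
        ),
    }


print(line_types("ls012_016d_h12.q1ca"))
```

Under those assumptions ls012_016d_h12.q1ca comes out with TN
ls012_016d_h12, SI 3000..5000, PR 2 and CH 7, which agrees with the first
PR line in the experiment file, so the naming scheme itself does not
appear to be the source of 'PR 1'.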


Have you seen anything like this?

Best regards,
Marcus



