Searching databases with GCG and others

James Tisdall tisdall at amalthea.humgen.upenn.edu
Thu Jan 13 17:16:09 EST 1994


In article <pagilbert-130194100516 at 132.203.140.7> pagilbert at cti.ulaval.ca (Philippe-Alexandre Gilbert) writes:
>
>Is it possible to search PIR or other databases for sequence from a
>specific size (or a size range) from GCG? I also tried with gopher and
>keywords like #Length 300 but it doesn't work (and how to specify a range
>with gopher ?)
>
>Thank you for your help.
>
>-- 
>Philippe-Alexandre Gilbert             tel: (418)-656-2964
>Centre de Traitement de l'Information  e-mail: pagilbert at cti.ulaval.ca
>Departement de Biochimie
>Quebec, Canada


I'm not sure about GCG, but since you asked about "GCG or others": in
DNA WorkBench (free software available by anonymous ftp from
cbil.humgen.upenn.edu, in pub/dnaworkbench), the following works:

  #for length exactly 300, in PIR-
database pir
sequence ^.{300}$ pirall

  #for length 300 or greater-
sequence ^.{300,}$ pirall

  #for length between 300 and 400-
sequence ^.{300,400}$ pirall

  #for length less than or equal to 300, in GenBank-
database genbank
sequence ^.{1,300}$ gball

Explanation:
The SEQUENCE command searches for a sequence pattern, which may be a literal
string such as ACCTGGGCT, or may incorporate "regular expressions", a form of
"wild card" notation widely used in computer science.
^         means start matching at the beginning of the sequence
.         means match any single nucleotide or amino acid
{300,400} means match 300 to 400 of them
$         means match the end of the sequence.
So, all together, ^.{300,400}$ means match any sequence that has 300 to 400
nucleotides or amino acids from beginning to end.
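
For anyone without DNA WorkBench at hand, the same anchored-pattern idea can
be tried in Python. This is only a rough sketch, not how DNA WorkBench itself
works: it assumes the sequences are already in memory as plain strings with no
line breaks, and the names length_pattern, filter_by_length and the sample
records are made up for illustration.

```python
import re

def length_pattern(min_len, max_len=None):
    """Build an anchored regex matching a whole sequence of a given length range.

    With max_len omitted the pattern is open-ended, e.g. ^.{300,}$
    """
    if max_len is None:
        return re.compile(r"^.{%d,}$" % min_len)
    return re.compile(r"^.{%d,%d}$" % (min_len, max_len))

def filter_by_length(records, min_len, max_len=None):
    """Return the names of the records whose sequence length is in range."""
    pattern = length_pattern(min_len, max_len)
    return [name for name, seq in records.items() if pattern.search(seq)]

# Hypothetical sample data: name -> sequence (no newlines inside sequences).
records = {
    "short": "ACGT" * 50,    # 200 characters
    "exact": "ACGT" * 75,    # 300 characters
    "medium": "ACGT" * 90,   # 360 characters
    "long": "ACGT" * 125,    # 500 characters
}

print(filter_by_length(records, 300, 300))  # length exactly 300
print(filter_by_length(records, 300))       # length 300 or greater
print(filter_by_length(records, 300, 400))  # length between 300 and 400
print(filter_by_length(records, 1, 300))    # length 300 or less
```

Outside a regular-expression search tool, comparing len(seq) directly would
be simpler; the pattern form is kept here only to mirror the commands above.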
======================================================================
James Tisdall
Departments of Genetics and Computer and Information Science
Computational Biology and Informatics Laboratory, Human Genome Project
University of Pennsylvania

tisdall at cbil.humgen.upenn.edu
215-573-3113
fax 215-573-3111
======================================================================



