Functional equivalent of GCG Findpatterns?

Gary Williams gwilliam at hgmp.mrc.ac.uk
Wed Jan 19 04:26:56 EST 2000


In article <sc4vh4rd7vv.fsf at fes1.sanger.ac.uk>,
Keith James  <kdj at fes1.sanger.ac.uk> wrote:
>>>>>> "Greg" == Greg Quinn <greg at franklin.burnham-inst.org> writes:
>
>    Greg> I appreciate the direction of the EMBOSS project, but I'm not
>    Greg> sure that FuzzNuc, or I guess the one that I would be
>    Greg> interested in, FuzzPro is the functional equivalent of
>    Greg> FindPatterns; unless I missed something on the description
>    Greg> page for FuzzPro, I didn't see a specific query sequence
>    Greg> syntax of the kind found in FindPatterns that precisely
>    Greg> allows me to specify particular residues at each position in
>    Greg> the query. It's this kind of ability which I'm looking
>    Greg> for....
>
>Yes, you are right. Having only used it for searching using IUB
>ambiguity codes I assumed that when it was prompting for 'search
>pattern' it would accept some sort of regular expression whose syntax
>was described elsewhere in the docs.
>
>I've just tried it with GCG and Unix regexp type patterns and it
>doesn't accept them. Now I know!


Try using the PROSITE pattern specification:


   -  The standard IUPAC one-letter codes for the amino acids are used.
   -  The symbol `x' is used for a position where any amino acid is accepted.
   -  Ambiguities are indicated by listing the acceptable amino acids for a
      given position between square brackets `[ ]'. For example: [ALT]
      stands for Ala or Leu or Thr.
   -  Ambiguities are also indicated by listing between a pair of curly
      brackets `{ }' the amino acids that are not accepted at a given
      position. For example: {AM} stands for any amino acid except Ala and
      Met.
   -  Each element in a pattern is separated from its neighbor by a `-'
      (optional).
   -  Repetition of an element of the pattern can be indicated by following
      that element with a numerical value or a numerical range in
      parentheses. Examples: x(3) corresponds to x-x-x, x(2,4) corresponds to
      x-x or x-x-x or x-x-x-x.
   -  When a pattern is restricted to either the N- or C-terminus of a
      sequence, that pattern either starts with a `<' symbol or, respectively,
      ends with a `>' symbol.
   -  A period ends the pattern (optional).


   Example:

   [AC]-x-V-x(4)-{ED}

   This pattern is translated as: [Ala or Cys]-any-Val-any-any-any-any-{any
   but Glu or Asp}
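
   For anyone more used to Unix-style regular expressions, the rules above
   map onto them fairly directly. The following is only a rough sketch in
   Python, not how fuzzpro itself works: the function name prosite_to_regex
   and the test sequence are made up for illustration, only lowercase `x'
   is treated as the wildcard, and no validation is done beyond rejecting
   unknown characters.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regular expression.

    Hypothetical helper for illustration; it covers only the rules
    listed in this message.
    """
    pattern = pattern.strip()
    if pattern.endswith("."):          # the trailing period is optional
        pattern = pattern[:-1]

    out = []
    i = 0
    while i < len(pattern):
        c = pattern[i]
        if c == "-":                   # separator between elements: ignore
            pass
        elif c == "<":                 # anchored at the N-terminus
            out.append("^")
        elif c == ">":                 # anchored at the C-terminus
            out.append("$")
        elif c == "x":                 # any amino acid
            out.append(".")
        elif c == "[":                 # accepted residues, e.g. [ALT]
            j = pattern.index("]", i)
            out.append("[" + pattern[i + 1:j] + "]")
            i = j
        elif c == "{":                 # excluded residues, e.g. {AM}
            j = pattern.index("}", i)
            out.append("[^" + pattern[i + 1:j] + "]")
            i = j
        elif c == "(":                 # repetition: (3) or (2,4)
            j = pattern.index(")", i)
            out.append("{" + pattern[i + 1:j] + "}")
            i = j
        elif c.isalpha():              # a single literal residue
            out.append(c.upper())
        else:
            raise ValueError(f"unexpected character {c!r} in pattern")
        i += 1
    return "".join(out)


regex = prosite_to_regex("[AC]-x-V-x(4)-{ED}")
match = re.search(regex, "MKAGVLLLLK")   # made-up test sequence
```

   Here regex comes out as [AC].V.{4}[^ED], and the match covers AGVLLLLK
   in the test sequence.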


Gary Williams               Tel: +44 1223 494522  Fax: +44 1223 494512
mailto:G.Williams at hgmp.mrc.ac.uk            http://www.hgmp.mrc.ac.uk/
Bioinformatics,MRC HGMP Resource Centre,Hinxton,Cambridge, CB10 1SB,UK






