[Bio-software] comp linguistics - pseudoknots

jlynn from acsalaska.net via bio-soft@net.bio.net (by jlynn from acsalaska.net)
Wed Apr 22 03:03:49 EST 2009


My approach to finding RNA structure, including pknots and triple helices,
is a bit different:

I search strictly for structure, ignoring the specific nt sequence, by
writing formal grammars for whatever I'm looking for.

Simple examples are finding a sequence that matches loops ((((::::)))),
pknots ((((::::[[[:::))))::)))) or even something
like a triple helix pknot:

mt plasmid DNA
(((::[[[:::{{{:::))):::]]]:::}}}
ATCAGAAAGCGTGGTCTGATGTTTTTTCACCA

platypus genome
(((::[[[::{{{:::))):::]]]:::}}}
CAAGTTCTATTGAGTTTTGGGGAGAAGGTCA

platypus genome
(((:::[[[:::{{{:::))):::]]]:::}}}
CACAGCTGATGATAAATGGTGGAGTCAGGATTA
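
For illustration only, here is a minimal Python sketch of how a
structure line like those above can be checked against the sequence
beneath it. It is not the grammar-based parser described in this post:
the function name fits_structure is hypothetical, and it assumes that
only Watson-Crick pairs (A-T, A-U, G-C) count, that ':' marks an
unpaired position, and that each bracket type nests only with itself.

```python
# Hypothetical helper, not taken from rnaparse.
# Assumption: only Watson-Crick pairs are allowed (no G-U wobble).
PAIRS = {("A", "T"), ("T", "A"), ("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
OPEN_TO_CLOSE = {"(": ")", "[": "]", "{": "}"}
CLOSE_TO_OPEN = {close: open_ for open_, close in OPEN_TO_CLOSE.items()}


def fits_structure(structure, sequence):
    """Return True if every bracket pair in `structure` joins complementary bases."""
    if len(structure) != len(sequence):
        return False
    # One stack per bracket type, so '(' , '[' and '{' may cross each other,
    # which is what makes the pattern a pseudoknot.
    stacks = {open_: [] for open_ in OPEN_TO_CLOSE}
    for i, sym in enumerate(structure):
        if sym in OPEN_TO_CLOSE:
            stacks[sym].append(i)
        elif sym in CLOSE_TO_OPEN:
            stack = stacks[CLOSE_TO_OPEN[sym]]
            if not stack:
                return False  # closing bracket with no partner
            j = stack.pop()
            if (sequence[j].upper(), sequence[i].upper()) not in PAIRS:
                return False  # partner bases are not complementary
        elif sym != ":":
            return False  # unknown symbol in the structure line
    # Every opened bracket must have been closed.
    return all(not stack for stack in stacks.values())


if __name__ == "__main__":
    print(fits_structure("(((::[[[:::{{{:::))):::]]]:::}}}",
                         "ATCAGAAAGCGTGGTCTGATGTTTTTTCACCA"))  # True
```

Under these assumptions the mt plasmid example passes: positions 1-3
(ATC) pair with 18-20 (GAT), 6-8 (AAA) with 24-26 (TTT), and 12-14
(TGG) with 30-32 (CCA).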

It's obvious we get many "impossible" or ambiguous structures as
matches, but as the complexity of the structure increases, the
probability that the matched structure is something "real" increases.

It's essentially a way around the NP problem, and for most H-type pknots
I can scan a complete viral genome in about 1-3 minutes - all grammars
seem to parse in O(n) time.
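
A rough sketch of why a fixed-shape pattern can be scanned in time
linear in the genome length: slide a window of the pattern's length
along the genome and test only the paired positions in each window.
This is a simplification, not the grammars mentioned above. It assumes
fixed stem and loop lengths, Watson-Crick pairs only, and a balanced
structure string; the names pair_indices and scan are hypothetical.

```python
# Assumption: only Watson-Crick pairs count (no G-U wobble).
WC = {"AT", "TA", "AU", "UA", "GC", "CG"}


def pair_indices(structure):
    """List (open, close) index pairs; assumes each bracket type is balanced."""
    closers = {")": "(", "]": "[", "}": "{"}
    stacks = {"(": [], "[": [], "{": []}
    pairs = []
    for i, sym in enumerate(structure):
        if sym in stacks:
            stacks[sym].append(i)
        elif sym in closers:
            pairs.append((stacks[closers[sym]].pop(), i))
    return pairs


def scan(genome, structure):
    """Return 0-based start offsets where the window fits the structure."""
    pairs = pair_indices(structure)  # computed once, reused for every window
    width = len(structure)
    genome = genome.upper()
    hits = []
    # Each window costs O(len(pairs)), so the scan is O(n) for a fixed pattern.
    for start in range(len(genome) - width + 1):
        if all(genome[start + i] + genome[start + j] in WC for i, j in pairs):
            hits.append(start)
    return hits


if __name__ == "__main__":
    print(scan("AAAACCCTTTT", "((((:::))))"))  # [0]
```

The work per window depends only on the pattern, so doubling the genome
length doubles the running time. Patterns with variable stem or loop
lengths would need more than this fixed-window check.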

I would like some feedback on the worth of such a method.

James Lynn
www.rnaparse.com


