[Bio-software] Re: How to find scattered (non-tandem) repeats in DNA sequences?

Mikhail Fursov via bio-soft%40net.bio.net (by mike.fursov from gmail.com)
Wed Apr 7 01:41:21 EST 2010


Hi Sunny,

Your results are not related to the nested repeats filter. In UGENE, a
nested repeat is a repeat whose parts all lie inside the parts of some
other repeat. For example, the repeat join(20..30, 120..130) can be
considered nested in the repeat join(10..40, 110..140).
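
The definition above can be sketched in a few lines of Python. This is
only an illustration of the idea, not UGENE's actual code: the names
`contains` and `is_nested` are hypothetical, regions are assumed to be
1-based inclusive (start, end) tuples, and each part is assumed to be
compared with the part at the same position in the other repeat.

```python
from typing import List, Tuple

# Assumption: a region is a 1-based inclusive (start, end) pair,
# mirroring the join(20..30, 120..130) notation.
Region = Tuple[int, int]


def contains(outer: Region, inner: Region) -> bool:
    """True if `inner` lies completely inside `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def is_nested(inner: List[Region], outer: List[Region]) -> bool:
    """True if every part of `inner` lies inside the matching part of `outer`."""
    return len(inner) == len(outer) and all(
        contains(o, i) for i, o in zip(inner, outer)
    )


# The example from the text: nested.
print(is_nested([(20, 30), (120, 130)], [(10, 40), (110, 140)]))
# Two of the reported pairs: the first parts overlap, but the second
# parts are far apart, so neither repeat is nested in the other.
print(is_nested([(3, 9), (1806, 1812)], [(2, 9), (1625, 1632)]))
```

Under these assumptions the first call gives True and the second gives
False, which is why a nested repeats filter would leave such pairs in
the results.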

In your example, the region around position 3 looks very similar to 4
other regions, around positions 1625, 1995, 1806 and 1488.
This can actually be considered a single 5-part repeat, but UGENE
does not have a repeat grouping function yet, so it reports 5 pairs.
Grouping is not a simple task for repeats with <100% identity, but it
should be easy to implement for the 100% option. We will add this
feature to our TODO list for future releases.
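
As a rough sketch of what such grouping could look like for the 100%
identity case, the Python below merges pairs that share an overlapping
region into one multi-part group. It is not a UGENE feature: the names
`overlaps`, `merge_spans` and `group_pairs` are hypothetical, and
overlapping regions are simply merged into one span instead of being
trimmed to a common core.

```python
from typing import List, Tuple

# Assumption: 1-based inclusive (start, end) regions, as in the join(...) output.
Region = Tuple[int, int]
Pair = Tuple[Region, Region]


def overlaps(a: Region, b: Region) -> bool:
    """True if the two regions share at least one position."""
    return a[0] <= b[1] and b[0] <= a[1]


def merge_spans(regions: List[Region]) -> List[Region]:
    """Collapse overlapping regions into sorted, non-overlapping spans."""
    merged: List[Region] = []
    for start, end in sorted(regions):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def group_pairs(pairs: List[Pair]) -> List[List[Region]]:
    """Join repeat pairs that touch a common region into multi-part groups."""
    groups: List[List[Region]] = []
    for pair in pairs:
        hits, rest = [], []
        for group in groups:
            touching = any(overlaps(r, p) for r in group for p in pair)
            (hits if touching else rest).append(group)
        combined = list(pair)
        for group in hits:
            combined.extend(group)
        groups = rest + [combined]
    return [merge_spans(group) for group in groups]


# The four pairs quoted in the message below.
pairs = [
    ((2, 9), (1625, 1632)),
    ((3, 10), (1995, 2002)),
    ((3, 9), (1806, 1812)),
    ((3, 8), (1488, 1493)),
]
groups = group_pairs(pairs)
print(groups)
```

For the four pairs from the quoted message this yields one group with
five parts: 2..10, 1488..1493, 1625..1632, 1806..1812 and 1995..2002.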

For now, you can also combine the repeats finder with the other types
of analysis available in UGENE. For example, you can find ORFs, run
HMMER, BLAST, CDD or TFBS searches on the region between your repeats,
or just align them in the alignment editor. This works like a
construction kit for complex elements. Note that the markup results for
a given sequence are always stored as annotations in a GenBank file, so
there is no need to repeat the computations every time: just load the
annotations you found previously for the sequence.

Best regards,
Mikhail.

On Apr 7, 4:19 am, RY Bao <ry_... from yahoo.com> wrote:
> Hi Mike,
>
> Thank you very much! It's working great. I obtained the list of scattered repeats with their positions, lengths, and the distances between the repeats (minimum repeat length = 6bp, repeats identity = 100%, minimum distance between repeats = 1bp). I have one quick question. For example, in the case of
>
> 1. repeat unit, join(2..9, 1625..1632 )
> 2. repeat unit, join(3..10, 1995..2002)
> 3. repeat unit, join(3..9, 1806..1812)
> 4. repeat unit, join(3..8, 1488..1493)
>
> Those 4 query repeats overlap, however, their target repeats do not overlap at all. Is this because in case of overlapping/nested query repeats, only non-overlapping target repeats are counted? I read the repeat finder plugin section on the instruction page but was not sure. Thanks a lot!
>
> Greetings,
> Sunny
>
> ________________________________
> From: Mikhail Fursov <mike.fur... from gmail.com>
> To: bionet-softw... from moderators.isc.org
> Sent: Tue, April 6, 2010 8:38:06 AM
> Subject: [Bio-software] Re: How to find scattered (non-tandem) repeats in DNA sequences?
>
> On Apr 6, 2:54 am, RY Bao <ry_... from yahoo.com> wrote:
>
> > Hello Everybody,
>
> > I have a 2kb DNA sequence and would like to analyze the scattered repeats in it. The repeats may contain some mismatches and may be separated by 1 to hundreds of base pairs. Is there a computer program or Web service available to perform this task?
>
> > Greetings,
> > Sunny Bao (Wayne State University, US)
>
> Hi Sunny Bao,
>
> Here is a description of one possible solution for your task:
>
> 1) Open your sequence in UGENE (http://ugene.unipro.ru). Just run
> UGENE and use the "Open" button or menu.
>
> 2) Select "Actions->Analyze->Find repeats" from the main menu
>
> 3) Select the repeats parameters: size/identity/distance etc...
>
> 4) Push the start button and see the repeated regions as annotations
> (highlighted regions) on the sequence.
>
> Hope it helps,
> Mike.
>
> _______________________________________________
> Bio-soft mailing list
> Bio-s... from net.bio.net
> http://www.bio.net/biomail/listinfo/bio-soft



