[Bio-software] 454 EST assembly

Amit Dattatreya Parhar via bio-soft@net.bio.net (by amit.p from ocimumbio.com)
Fri Aug 10 06:28:17 EST 2007


Hi,

We are going to run an EST assembly project based on 454 data. 454 provides
its own assembly software, but I feel it is not good enough for repetitive
or eukaryotic genomes. Would CAP3 or PCAP do a decent job if I feed them
the contigs already assembled by the 454 software, say of approx. 200 bp in
length? Would this two-stage strategy work?
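
To make the second stage concrete, here is a minimal sketch of what I have
in mind, not a tested pipeline. It assumes the 454 contigs are available as
plain FASTA, keeps only those at or above a length cutoff (the 200 bp
default simply echoes the figure above), and builds the CAP3 command line.
The helper names (read_fasta, filter_by_length, write_fasta, cap3_command)
are hypothetical, and CAP3 is only named here, not actually run.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def filter_by_length(records, min_len=200):
    """Keep contigs of at least min_len bases (cutoff is an assumption)."""
    return [(h, s) for h, s in records if len(s) >= min_len]


def write_fasta(records, width=60):
    """Format records back into FASTA text, wrapping sequence lines."""
    lines = []
    for header, seq in records:
        lines.append(">" + header)
        for i in range(0, len(seq), width):
            lines.append(seq[i:i + width])
    return "\n".join(lines) + "\n"


def cap3_command(fasta_path):
    """Build the argument list for a default CAP3 run on fasta_path.

    CAP3 takes the FASTA file as its first argument and writes
    <file>.cap.contigs and <file>.cap.singlets next to the input.
    """
    return ["cap3", fasta_path]
```

The filtered FASTA would be written to disk and the command handed to
something like subprocess.run; whether first-stage contigs of this length
overlap enough for CAP3 to merge them is exactly the open question.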

Is there any other open-source software available that does the job
efficiently?

Thanks in advance.

regards,
Amit

----- Original Message ----- 
From: <bio-soft-request from oat.bio.indiana.edu>
To: <bio-soft from magpie.bio.indiana.edu>
Sent: Saturday, August 04, 2007 10:31 PM
Subject: Bio-soft Digest, Vol 27, Issue 3


> Today's Topics:
>
>    1. Re: GCG non-support (Steve Thompson)
>    2. Re: GCG non-support (Peter Rice)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Fri, 3 Aug 2007 14:55:28 -0400 (EDT)
> From: Steve Thompson <stevet from bio.fsu.edu>
> Subject: [Bio-software] Re: GCG non-support
> To: Peter Rice <pmr from ebi.ac.uk>
> Cc: mol-evol from magpie.bio.indiana.edu, bio-soft from magpie.bio.indiana.edu,
> info-gcg from magpie.bio.indiana.edu, comp-bio from magpie.bio.indiana.edu
> Message-ID: <20070803131050.H7747 from epsilon.bio.fsu.edu>
> Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
>
> On Fri, 3 Aug 2007, Peter Rice wrote (in part):
>
> > [Re. SeqLab custom extensions availability] In this case, sharing may be
> > possible. I can ask anyway.
>
> Yes, that would be great, especially if we get Accelrys to release
> SeqLab's code.  I was quite pessimistic a couple of weeks ago, but have
> some [small] hope now.  But I do realize not to 'hold my breath.'
>
> > Can you make a quick list of the additional functionality you would like
> > to see [in GDE]?  I seem to recall "rich sequence format" was one of
> > GCG's major extensions.
>
> Yes, some have already been mentioned in this thread; I think these two
> are the most important:
>
> 1) The ability to directly load sequence data from output sequence lists
> from other programs such as BLAST, FastA, and a reference searching
> program (in GCG that is the SRS derivative LookUp) with the option of
> trimming that data to the length id'ed by a similarity search.  As Nick
> mentioned this ability to handle "ad-hoc" databases can be very powerful.
>
> 2) The ability to display FEATURE information from database entries in
> colored and graphical representations.  This is especially helpful for
> homology inference of active sites and secondary structure.
>
> > [Re. Accelrys' "perpetual license" plan] Interesting. I see it mentions
> > Pipeline Pilot. EMBOSS is an ISV partner and committed to interfacing
> > EMBOSS applications as Pipeline Pilot components.
>
> Pipeline Pilot is exciting and incredibly powerful from all I can tell,
> but also incredibly expensive - well beyond the budget of most university
> departments and/or research institutes, especially these days.
>
> > I hope there is some source code release. We had the source code at
> > Sanger not just for EGCG - but also because the sequencers needed more
> > than the 350kb limit on sequence length.
> >
> > But I see no quick way to decide on possible source code licensing. Too
> > many authors, too many companies/institutes. I am not surprised that
> > nothing is promised at this stage.
> >
> > Hang on to your licences though ... one (legally speaking) relatively
> > easy possibility would be a cheap source license for existing licensees.
>
> Oh yes, I don't plan on letting anything that we already own get lost.
>
> > [Re. getting SRS5 code under SeqLab or GDE] That would be difficult ...
> > and I am not too confident of the status of SRS5 code anyway. But there
> > are some alternatives around.
>
> Too bad, but I'm glad there are alternatives.  Perhaps some variation of
> NCBI's stand alone Entrez, but it is designed for ASN.1 data . . . . .
>
> > [Re. my old days at WSU's VADMS Center] Hah! Not the one I meant (VADMS
> > had an alpha binary release). The beta test was at HGMP/RFCGR in Hinxton
> > - where the EMBOSS development team was and it was their closure that
> > threw EMBOSS into crisis mode.
>
> Yes, that was another terrible shame of funding drying up.
>
> > An early version of MSE was used by GCG for several of their editing
> > applications, so at least that and GDE 2.3 are currently available with
> > source code. EMBOSS's MSE is under GPL. I will try to figure out what
> > GDE's licence really means.
>
> A clarification of GDE's license would help a lot.  Thanks!
>
>                               Cheers - Steve
>                                                  Steven M. Thompson
>                                  A C T G         stevet from bio.fsu.edu
>                                    \-/           http://bio.fsu.edu/~stevet/cv.html
>                                    /\
>                                   /--|          FSU SCS / BioInfo 4U
>                                  /---/
>                                  |--/    Florida State University School of
>                                  \-/            Computational Science
>                                   /\
>                                  /--\            1st floor DIRAC 150G
>                                  |---\           Tallahassee, Florida
>                                   \---\          32306-4120
>                                    \--|          850-644-4490
>                                     \-/
>                                     /\           2538 Winnwood Circle
>                                    /--\          Valdosta, Georgia
>                                   /---|          31601-7953
>                                   |--/           229-249-9751
>
>
>
> ------------------------------
>
> Message: 2
> Date: Fri, 03 Aug 2007 22:55:10 +0100
> From: Peter Rice <pmr from ebi.ac.uk>
> Subject: [Bio-software] Re: GCG non-support
> To: Steve Thompson <stevet from bio.fsu.edu>
> Cc: mol-evol from magpie.bio.indiana.edu, bio-soft from magpie.bio.indiana.edu,
> info-gcg from magpie.bio.indiana.edu, comp-bio from magpie.bio.indiana.edu
> Message-ID: <46B3A43E.9010000 from ebi.ac.uk>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> Steve Thompson wrote:
> >> Can you make a quick list of the additional functionality you would
> >> like to see [in GDE]?  I seem to recall "rich sequence format" was one
> >> of GCG's major extensions.
> >
> > Yes, some have already been mentioned in this thread; I think these two
> > are the most important:
> >
> > 1) The ability to directly load sequence data from output sequence lists
> > from other programs such as BLAST, FastA, and a reference searching
> > program (in GCG that is the SRS derivative LookUp) with the option of
> > trimming that data to the length id'ed by a similarity search.  As Nick
> > mentioned this ability to handle "ad-hoc" databases can be very
> > powerful.
>
> That input should be simple (something like EMBL:X13776) and the code is
> relatively easy to do
>
> > 2) The ability to display FEATURE information from database entries in
> > colored and graphical representations.  This is especially helpful for
> > homology inference of active sites and secondary structure.
>
> That was the part that used "rich sequence format" to store rearranged
> features and markup.
>
> I expect it can be reproduced using GFF as the feature standard (EMBOSS
> uses GFF internally, it is a good fit with even the EMBL/Genbank/DDBJ
> feature table and extendable for colouring etc - Artemis does something
> similar)
>
> > Too bad, but I'm glad there are alternatives.  Perhaps some variation of
> > NCBI's stand alone Entrez, but it is designed for ASN.1 data . . . . .
>
> Or MRS from CMBI in Nijmegen. Or something using web services. There are
>   many possibilities.
>
> Peter
>
>
>
> ------------------------------
>
> _______________________________________________
> Bio-soft mailing list
> Bio-soft from net.bio.net
> http://www.bio.net/biomail/listinfo/bio-soft
>
> End of Bio-soft Digest, Vol 27, Issue 3
> ***************************************
>