[Bio-software] MIRA V2.4.0 assembler released

Bastien Chevreux bach at chevreux.org
Sun Sep 11 16:46:45 EST 2005


Hi all,

MIRA is a sequence assembly system suited for genome and EST sequences. 

Since V2.4.0rc1, the binary has no restrictions whatsoever concerning the
number of sequences it processes.

Highlights of the V2.4.0 version include:
- overall speedups in many parts thanks to new algorithms (read/read
  comparison, SW alignment, pathfinder module, contig build module, read
  extension)
- overall quality improvements: longer contigs with fewer errors remaining,
  reliable detection and resolving of misassemblies when using clone pair
  (also called templates or "double-barreled data") techniques, enhanced
  'probably true' consensus computation without gaps and with consensus
  quality files, improved automatic editor when using ABI 373, 377, 3100 and
  3700 trace files (MEGABACE should also be ok)
- assembly for whole genomes supported for up to 10 megabases (and more for
  really fast and big computers)
- EST assembly support: detection of SNPs; transcript assembly by strains
  according to detected SNP bases; special routines for extreme coverage
  that allow assembly of gene families with thousands of similar sequences
- additional and/or improved input and output formats: FASTA with quality,
  gap4 directed assembly, phrap/consed ACE format (output only) and others
- assembly options: a plethora of options to fine-tune the assembly; these
  can now also be loaded from parameter files
- data preprocessing routines for cases where external preprocessing
  programs did not do this step or did it incorrectly: clipping of potential
  vector leftovers in sequences, support for 'screened' bases in FASTA
  files, own quality clipping routines, tagging of poly-A or poly-T bases at
  the end of EST sequences
- full IUPAC support in input and output files (as well as internal
  computation)
- support for merging ancillary data present in EXP files or loaded from XML
  trace info files (in NCBI format)
- many assembly info files generated, containing machine- and human-readable
  statistics, cluster and assembly information
- optimised multiple alignments (no more gap base jiggling)
- possibility to load "backbones" and assemble against those sequences
- possibility to assemble several closely related strains in one go
- support for loading GenBank (gbf/gbk) files while retaining all features
  and transferring them to Staden GAP4 viewers
- improved documentation and examples
- a lot more
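
For readers unfamiliar with the IUPAC item above: IUPAC nucleotide codes let
a single character stand for a set of possible bases (for example R = A or
G). The following is only a minimal Python sketch of that idea, not MIRA's
code; the function names bases_compatible and consensus_code are
hypothetical and chosen for illustration.

```python
# Illustration only: this is NOT taken from MIRA.
# Standard IUPAC nucleotide codes, each mapped to the bases it stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def bases_compatible(x, y):
    """Return True if two IUPAC codes share at least one possible base.

    Hypothetical helper: an aligner could treat such a pair as a
    non-contradicting column instead of a plain mismatch.
    """
    return bool(set(IUPAC[x.upper()]) & set(IUPAC[y.upper()]))


def consensus_code(bases):
    """Return the single IUPAC code covering all bases seen in a column.

    Hypothetical helper: takes an iterable of IUPAC characters and returns
    the code whose base set equals the union of their base sets.
    """
    wanted = set()
    for b in bases:
        wanted |= set(IUPAC[b.upper()])
    for code, members in IUPAC.items():
        if set(members) == wanted:
            return code
    raise ValueError("no bases given")


if __name__ == "__main__":
    print(bases_compatible("R", "A"))
    print(bases_compatible("R", "Y"))
    print(consensus_code("AG"))
```

A real assembler would also weigh base qualities before settling on an
ambiguity code; the sketch leaves that out.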

The MIRA sequence assembler is available precompiled for 32 and 64 bit Linux
platforms at http://chevreux.org/projects_mira.html

Regards,
  Bastien

-- 
        -- The universe has its own cure for stupidity. --
         -- Unfortunately, it doesn't always apply it. --


