acedb future?

Ewan Birney birney at ebi.ac.uk
Tue Sep 5 04:12:43 EST 2000



(for the people who don't get constantly hassled by my email, I am one of
the technical leads for the Ensembl project and also the bioperl
coordinator. I played with acedb about 4 years ago; as my letter says, I
like some things and hate others in acedb...)




Where should ACEDB go?


ACeDB is a loveable, annoying, integrated genome database with many great
features and many large faults. A hard question is where acedb should go
over the next 2 to 3 years. These are my views, as biased as always:

(a) componentise heavily. ACeDB should become 4-5 separate projects,
each with its own release schedule and cvs module, something like:

	(i) database kernel

	(ii) bioobject layer

	(iii) graphics library

	(iv) fmap

	(v) other viewers

I would take the "chain saw" approach to componentising these things,
i.e. some pretty brutal tearing apart of the code, potentially copying
large numbers of files so that the projects can be separated
appropriately.

The new projects, in my mind, would not be run via the classic ACeDB
source code management system + makefile, but with a more vanilla cvs
and autoconf-style configuration, lowering the barrier to entry for new
developers to work with you guys.

From the little that I know of ACeDB, I think the code perhaps reuses code
*too much* at a fine granularity, which then causes problems in reusing
large components.

(b) plot a course for each component such that the end point is to be "one
of the best open projects" in this area. Take note of what is out there
already, and decide whether it is better to merge with other open 
source projects, discard projects or forge ahead. Here are some ideas:


  - make the database kernel the best open source XML database;
reading/writing XML and XMLSchema; bindings to perl/python/java; JDBC
bindings if it is mappable.

	- there are XML databases out there, but I don't know whether the
open source ones are any good. Is XMLSchema good? Or established? 
Certainly DTD support would have to be put in.


  - make the bio objects lean, mean tight objects able to sit on top of
acedb or another database; able to coexist with other schemes well

	- there is EMBOSS, an established C bio-objects system. There are
the bioperl/biopython/biojava projects.

  - make fmap the world's best, fast, large genome viewer. Look at the 
apollo server IDL and leverage that. Discuss what people like and hate
about fmap. Integrate GAZE and other 3rd party programs using everything
from a standardised GFF system call-out to CORBA-based services.
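
To make the "GFF system call-out" idea concrete, here is a rough sketch
in Python. It is only an illustration, not a description of how fmap or
GAZE actually work: the names GffFeature, parse_gff_line and call_out are
made up for this example, and it assumes the external program reads a
sequence on stdin and writes tab-separated GFF lines (seqname, source,
feature, start, end, score, strand, frame, optional attributes) on stdout.

```python
import subprocess
from typing import List, NamedTuple, Optional


class GffFeature(NamedTuple):
    """One GFF record (hypothetical container for this sketch)."""
    seqname: str
    source: str
    feature: str
    start: int
    end: int
    score: Optional[float]
    strand: str
    frame: str
    attributes: str


def parse_gff_line(line: str) -> Optional[GffFeature]:
    """Parse one tab-separated GFF line; return None for blanks/comments."""
    line = line.rstrip("\n")
    if not line or line.startswith("#"):
        return None
    fields = line.split("\t")
    if len(fields) < 8:
        raise ValueError("expected at least 8 tab-separated fields")
    # A "." in the score column means "no score".
    score = None if fields[5] == "." else float(fields[5])
    # The ninth (attributes/group) column is optional.
    attributes = fields[8] if len(fields) > 8 else ""
    return GffFeature(fields[0], fields[1], fields[2],
                      int(fields[3]), int(fields[4]),
                      score, fields[6], fields[7], attributes)


def call_out(command: List[str], sequence_text: str) -> List[GffFeature]:
    """Run an external program (assumed to emit GFF) and collect features."""
    result = subprocess.run(command, input=sequence_text,
                            capture_output=True, text=True, check=True)
    features = []
    for line in result.stdout.splitlines():
        feature = parse_gff_line(line)
        if feature is not None:
            features.append(feature)
    return features
```

The point of the design is that the viewer only needs to know the GFF
columns, not anything about the program on the other end of the pipe; a
CORBA service could hand back the same records through a different
transport.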



I would hate to use ACEDB at the moment, but that doesn't mean I can't see
the potential for many aspects of the software. I believe that it is time
for you guys to be bold and go forward. I have no doubt that this is
going to be difficult in many areas; some tough decisions await you guys,
but it is better to make bold moves than none at all.



ewan (an acedb admirer)




Apollo IDL 

http://www.ensembl.org/Docs/wiki/html/EnsemblDocs/ApolloIDL.html

EMBOSS

http://www.sanger.ac.uk/Software/EMBOSS

bioperl/biojava/biopython

bio.perl.org
biojava.org
biopython.org




-----------------------------------------------------------------
Ewan Birney. Mobile: +44 (0)7970 151230, Work: +44 1223 494420
<birney at ebi.ac.uk>. 
-----------------------------------------------------------------



