Different entry is same sequence (Re: same sequence is different in EMBL and GENBANK)
toms at fcsparc6.ncifcrf.gov
Mon May 9 12:00:50 EST 1994
In article <1994May8.084902.4583 at comp.bioz.unibas.ch> doelz at comp.bioz.unibas.ch
(Reinhard Doelz) writes:
| I haven't analyzed this systematically but I am afraid that inconsistencies
| like this make database provider's life difficult.
It makes the database user's life extremely difficult.
| As human intervention
| is extremely expensive (manpower) and we (customers) don't want to pay, the
| prediction that it will become worse in the future is a safe guess.
Yes, unless action is taken soon, there will eventually be a crisis.
| I think we all agree that databases are non-optimal. On the other hand,
| if you see those guys working, they don't feel lazy, nor do they enjoy
| being reminded that they do produce low-quality data. (I won't talk
| on proteins here but the situation there is even worse). The data need
| better MAINTENANCE!
| We could spend another XX M$ on both sides of the Atlantic to have a
| staff of workers clean up the past, and cope with the flood of the future.
| But still, this wouldn't help. I think that there's something severely
| wrong with responsibilities. The researchers don't do what they should, namely
| take care of their own entries or areas, and correct the entries as appropriate.
| And, for the future, the genome projects should adopt slightly more
| responsibility for what they produce. Just dumping thousands of low-quality
| data entries, generated by robots, into the databases and complaining afterwards
| doesn't help. The funding agencies must understand that a genome project
| is USELESS (read: wasted money) if the data are not integrated well into the
| data sets. The coordinators of the projects must refrain from cooking their
| own little databases, as they complain the loudest about the inability of the
| general database providers. We certainly don't need hundreds of small databases
| but rather one set which is complete and of high quality.
| ?We ?
| Who are 'We' that we tolerate these duplications without doing something
| ourselves? A change in culture is needed.
Duplication should not be tolerated; that's why it is the first principle in my
database philosophy paper. (anonymous ftp from
ftp.ncifcrf.gov/pub/delila/philgen* but in revision at the moment. If you
would like me to tell you when the next revision is out, please send me a
National Cancer Institute
Laboratory of Mathematical Biology
Frederick, Maryland 21702-1201
toms at ncifcrf.gov