From owner-embldatabank@net.bio.net Sun Dec 03 22:00:00 1995
Newsgroups: bionet.molbio.embldatabank
Path: biosci!bcm.tmc.edu!news.msfc.nasa.gov!newsfeed.internetmci.com!howland.reston.ans.net!EU.net!Germany.EU.net!news.dfn.de!news.embl-heidelberg.de!bioftp.unibas.ch!daresbury!hgmp.mrc.ac.uk!ebi.ac.uk!EMBL-EBI.ac.uk!stoesser
From: stoesser@EMBL-EBI.ac.uk (Guenther Stoesser)
Subject: Feature Table Definition Document v1.08
Sender: news@ebi.ac.uk (Mr news)
Message-ID: <DJ2Bwr.CGx@ebi.ac.uk>
Date: Mon, 4 Dec 1995 12:55:39 GMT
Lines: 33
Reply-To: stoesser@EMBL-EBI.ac.uk (Guenther Stoesser)
Organization: European Bioinformatics Institute (EMBL) - UK
X-Newsreader: mxrn 6.18-32


Dear Colleagues,

this is to announce that the DDBJ/EMBL/GenBank Feature Table Definition
Document v1.08 has been created.

The new version of this document is available by anonymous FTP as a
compressed tar-file with postscript parts from the European Bioinformatics
Institute (EBI) in Cambridge, UK.

FTP server at EBI:
------------------
Users should connect via anonymous FTP to FTP.EBI.AC.UK
Directory: pub/databases/embl/doc
File:      FTv1.08.tar.Z

WWW server at EBI:
------------------
The document is also available on the EBI WWW-server in URL:
http://www.ebi.ac.uk/ebi_docs/embl_db/ft/feature_table.html

I would like to thank colleagues at DDBJ and NCBI for their feedback in the
process of producing this document.
                                                                    
Regards,

Guenter Stoesser

      Guenter Stoesser                   E-mail:    stoesser@ebi.ac.uk
      European Bioinformatics Institute  URL:       http://www.ebi.ac.uk
      Hinxton Hall, Hinxton,             Telephone: +44 (0)1223 494 466
      Cambridge CB10 1RQ U.K.            Fax:       +44 (0)1223 494 468
                                                                               

From owner-embldatabank@net.bio.net Tue Dec 05 22:00:00 1995
Path: biosci!internet!biosci!not-for-mail
From: biohelp (BIOSCI Administrator)
Newsgroups: bionet.molbio.embldatabank
Subject: IMPORTANT: BIOSCI miniFAQ
Date: 6 Dec 1995 02:02:47 -0800
Organization: BIOSCI International Newsgroups for Molecular Biology
Lines: 196
Sender: daemon@net.bio.net
Distribution: world
Message-ID: <199512061000.CAA16980@net.bio.net>
NNTP-Posting-Host: net.bio.net


This is a new "miniFAQ" designed to answer the questions that come up
the *most frequently*.  The main BIOSCI FAQ (Frequently Asked
Questions) is accessible on the World Wide Web at URL
http://www.bio.net/.

	Contents:
	--------
	1) What to do about "spams," i.e., junk mail, ads, etc.

	2) Examples of subscribing and unsubscribing to the mailing lists.

	3) How to access BIOSCI/bionet newsgroup archives.

	4) The BIOSCI user address and research interest directory.


1) What to do about "spams," i.e., junk mail, ads, etc.
-------------------------------------------------------
BIOSCI is a set of parallel USENET newsgroups (the "bionet" groups)
and mailing lists.  The same postings are distributed on both media
(except for a small number of mailing-list-only groups at
net.bio.net).  Unfortunately it is becoming a despicable practice on
the Internet (by a few people out to make a fast buck) to do automated
mass postings to thousands of newsgroups and mailing lists.  These
attempts to grab free advertising are referred to as "spams" in the
usual, somewhat boneheaded, net terminology.  USENET is more
susceptible to this practice, and many spams originate on the USENET
groups and then are passed on to the mailing lists.  However, spammers
also get lists of mailing addresses and hit these too, so neither
medium is immune.

What should you do personally if you get junk mail?
---------------------------------------------------
Just delete it and move on without reading it further.  Filing a
protest is becoming increasingly useless because spammers are often
disguising the addresses where the messages are sent from.  Unless you
really understand Internet mail systems, your attempt at protest by
sending replies to the message will often end up being sent to the
address of an innocent person that the spammer is victimizing.

What can BIOSCI/bionet do to protect its newsgroups?
----------------------------------------------------
The only solution currently available is to moderate the newsgroup.
If this newsgroup is already moderated, then you are in good shape.
Moderation protects the newsgroups from about 95% of the spams that
are being sent to date.  This means that someone has to take the time
to review each message before it goes out.  We have set up software
here that simply allows the moderator to forward to an address at
net.bio.net messages that (s)he wishes to have distributed.  This
takes no more time than that needed to read the message and pass it
on, say about 1 min. per message.

Most newsgroups currently have a discussion leader who is responsible
for their newsgroup.  The discussion leaders and their e-mail
addresses are listed in the BIOSCI Information Sheet which is
available on the Web at http://www.bio.net/.  If a newsgroup is being
hit with too many junk postings, please contact the discussion leader
for that group and see if there is interest in moderating the group.
Please do not assume that by simply posting a complaint to the
newsgroup itself, anyone on the BIOSCI staff will act on your
complaint.  With close to 100 newsgroups to run, the BIOSCI staff has
to rely on the discussion leaders of each newsgroup to report problems
directly to us at biosci-help@net.bio.net.

We will moderate any of our newsgroups if the discussion leader tells
us that the readership of the group wishes to do so and if a moderator
is willing to do the work.  For most BIOSCI/bionet groups, this
entails only a few minutes of work each day.

Moderating a newsgroup will resolve probably 95% of the junk postings.
Unfortunately there are easy ways for determined spammers to override
the moderation mechanism.  We are working on new systems to provide
access to our newsgroups over the WWW.  These should be available
soon, probably November 1995, and will allow you to use your Web
browser to look at the news postings.  While this will not stop
spammers from trying to post to the groups, this will give you yet
another way, besides using USENET news, to keep the junk out of your
personal mail files.


2) Examples of subscribing and unsubscribing to the mailing lists.
------------------------------------------------------------------
PLEASE NOTE: The BIOSCI management does NOT act on
subscription/unsubscription requests that are posted improperly to the
newsgroups and mailing lists.  People who do this only bother everyone
on the lists to no avail.  Please be sure to follow the proper
procedures below.

Gory details are in the BIOSCI Information sheets on the Web at
http://www.bio.net.  Below we give an example utilizing the
METHODS-AND-REAGENTS list at both of our two BIOSCI sites:

Users in the Americas and Pacific Rim countries who use the BIOSCI
------------------------------------------------------------------
node at computer net.bio.net:
----------------------------

A) Determine the "listname" which is the <=8 character mail address
                                         ^^^^^^^^^^^^^
   for the group.  These can be found in the BIOSCI Info. Sheet.  For
   the METHODS-AND-REAGENTS group the mailing address is
   methods@net.bio.net.  The listname is the portion of the address to
   the left of the @ sign, i.e., "methods".  The listname is used with
   the "subscribe" and "unsubscribe" commands illustrated below.

B) Mail all commands in the body of a mail message addressed to
   biosci-server@net.bio.net.  Do NOT send commands to the newsgroup
   posting addresses!  Leave the Subject: line blank, any text on it
   will be ignored.

C) In the body of your message put one or more of the following
   commands with an "end" command on the last line, e.g.,

   subscribe methods
   unsubscribe methods
   end

   Do NOT put your e-mail address or other text on these lines.  The
   server only allows you to cancel your subscription if the address
   on your mail header matches the address on our mailing list.
   Please ask for help at biosci-help@net.bio.net if your address has
   changed, e.g., if you know you are on the list but the server tells
   you that you are not a member.
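
The steps A) to C) above can be sketched in code.  This is only a
minimal illustration, assuming Python's standard email package; the
function name build_biosci_request and the example sender address are
hypothetical, and actually sending the message is left to your own
mail setup.

```python
from email.message import EmailMessage

# Address and commands are taken from the instructions above.
SERVER_ADDRESS = "biosci-server@net.bio.net"
ALLOWED_ACTIONS = {"subscribe", "unsubscribe"}


def build_biosci_request(sender, commands):
    """Build a command message for the net.bio.net list server.

    `commands` is a sequence of (action, listname) pairs.  The function
    name and signature are hypothetical, not part of any BIOSCI software.
    """
    msg = EmailMessage()
    msg["From"] = sender          # must match the address on the list
    msg["To"] = SERVER_ADDRESS
    # No Subject header is set: the server ignores it.
    body = []
    for action, listname in commands:
        if action not in ALLOWED_ACTIONS:
            raise ValueError(f"unknown command: {action!r}")
        if not 1 <= len(listname) <= 8:
            raise ValueError(f"listname must be 1-8 characters: {listname!r}")
        body.append(f"{action} {listname}")
    body.append("end")            # required terminator on the last line
    msg.set_content("\n".join(body) + "\n")
    return msg
```

For example, build_biosci_request("reader@example.org",
[("subscribe", "methods")]) yields a message whose body holds the two
lines "subscribe methods" and "end".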


Users in Europe, Africa, and Central Asia who use the BIOSCI node at
--------------------------------------------------------------------
computer daresbury.ac.uk (also known as dl.ac.uk):
-------------------------------------------------

To subscribe and unsubscribe to/from the BIOSCI lists, you need to
specify the full USENET newsgroup name with "bionet-news." prepended.
The USENET newsgroup names are listed in the BIOSCI Information sheet
on the Web at http://www.bio.net/.  For the METHODS-AND-REAGENTS list
the USENET newsgroup name is bionet.molbio.methds-reagnts, thus the
appropriate commands are

    sub bionet-news.bionet.molbio.methds-reagnts

    unsub bionet-news.bionet.molbio.methds-reagnts

These commands are included in a message addressed to mxt@dl.ac.uk,
NOT to the newsgroup mailing addresses.  As usual, include the text in
the body of the message as text on the Subject: line is ignored.

To unsubscribe from all the lists at the UK node, use

    unsub bionet-news

Please note that if the address in the list is different than the one
in your mail message header, you will not be able to unsubscribe by
this method. If you have problems, please mail biosci@daresbury.ac.uk.


3) How to access BIOSCI/bionet newsgroup archives.
--------------------------------------------------
Back postings of all BIOSCI/bionet newsgroups can be found on the
World Wide Web at URL http://www.bio.net/.  There are several
searchable newsgroup indices at this site.  E-mail users can search
the BIOSCI archives by using our waismail e-mail server.  For
instructions send the message

help

to waismail@net.bio.net.  Leave the Subject: line blank (anything
entered on the Subject: line is ignored).


4) The BIOSCI user address and research interest directory.
-----------------------------------------------------------
Please take this opportunity to add your name, address, and research
interest information to the BIOSCI User Address Database if you have
not already done so.

You can fill out the address form directly through our Web page at URL
http://www.bio.net/adrform.html.

The address database is reindexed nightly for WWW access (the URL is
http://www.bio.net/).  If you are not directly on the Internet but can
reach it by e-mail, please use our waismail server to access the user
directory.  waismail use is described above.  You can also request a
user address form by e-mail from biosci-help@net.bio.net.

Please check your database entry from time-to-time to see if your
address information is still up-to-date.  Because of our limited
personnel resources, we ask that you resubmit a *complete* form to
revise your entry; we only replace complete entries and do not have
resources to edit old forms.

				Sincerely,

				Dave Kristofferson
				BIOSCI/bionet Manager

				biosci-help@net.bio.net

From owner-embldatabank@net.bio.net Wed Dec 20 22:00:00 1995
Newsgroups: bionet.molbio.embldatabank,embnet.general
Path: biosci!agate!newsxfer2.itd.umich.edu!newsfeed.internetmci.com!EU.net!peer-news.britain.eu.net!sunsite.doc.ic.ac.uk!hgmp.mrc.ac.uk!ebi.ac.uk!stoehr
From: stoehr@ebi.ac.uk (Peter Stoehr)
Subject: EMBL Release 45
Sender: news@ebi.ac.uk (Mr news)
Message-ID: <1995Dec21.115055@ebi.ac.uk>
Date: Thu, 21 Dec 1995 10:50:55 GMT
Lines: 159
Organization: European BioInformatics Institute

Release 45 of the EMBL Nucleotide Sequence Database is now available from the
network servers of the European Bioinformatics Institute:

Anonymous FTP server: ftp.ebi.ac.uk
                      Directory: pub/databases/embl/release

FASTA email server  : FASTA@ebi.ac.uk
Email retrieval     : NetServ@ebi.ac.uk
SRS WWW retrieval   : http://www.ebi.ac.uk/srs/srsc

We expect this release to become available in Europe via other EMBnet sites
very shortly, if not already.

The data files of release 45 total some 1.6GB in EMBL flat-file format.
From the release notes:

<excerpt from ftp.ebi.ac.uk:pub/databases/embl/release/relnotes.doc>

The EMBL nucleotide sequence database was frozen  to  make  Release  45  on  4th
December   1995.   The  release  contains  622566  sequence  entries  comprising
427,620,278 nucleotides.  This represents an increase of about 18% over  Release
44.  A breakdown of Release 45 by taxonomic division is shown below:

                  Division             Entries     Nucleotides
                  -----------------    -------     -----------
                  Bacteriophage           1140         1640505
                  ESTs                  370663       128422650
                  Fungi                  10456        27197150
                  Invertebrates          16592        45879687
                  Organelles             11687        12476847
                  Other Mammals           7445         8085941
                  Other Vertebrates       8658         9779295
                  Plants                 13681        17196932
                  Primates               54129        47934830
                  Prokaryotes            27650        49323813
                  Rodents                27347        30975411
                  STSs                   22072         7343968
                  Synthetic              11208         5368738
                  Unclassified           12234         5784538
                  Viruses                27604        30209973
                  -----------------    -------    ------------
                  Total                 622566       427620278

                  plus:
                  Other patents           3180          347998
                  -----------------    -------    ------------
                  Grand Total           625746       427968276




1.1  Database Cross-references

At this release we have introduced a new feature table qualifier  "/db_xref"  to
represent  cross-references to external databases.  This qualifier is valid, but
optional, for all feature keys.  There are two components to the cross-reference
value,  the  name  of the database and the identifier within that database being
referenced, formatted as follows:

                     /db_xref="database:identifier"

In this release, we have included cross-references using the "/db_xref"  on  CDS
features with the "database" values SWISS-PROT and PID described below.



1.1.1  SWISS-PROT

A cross-reference from a CDS feature to the database "SWISS-PROT" indicates that
this  feature  corresponds  to  the  entry  in  the  SWISS-PROT Protein Sequence
Database with the given accession number, eg.

                     /db_xref="SWISS-PROT:P22032"



1.1.2  PID

A cross-reference from a CDS feature to the database "PID" (protein  identifier)
contains an identifier for the translated product of that coding sequence.  This
identifier will remain the same, despite changes to the sequence, as long as the
translation  remains  the  same.  It can therefore be used by external databases
(such as SWISS-PROT) as an identifier onto which cross-references can be  built.
These  identifier  values  will be maintained in collaboration with NCBI so that
they will be the same in GenBank.  Eg.

                     /db_xref="PID:g220291"
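
For software developers, the "database:identifier" layout described in
section 1.1 can be picked apart with a short regular expression.  The
following is a minimal sketch, not part of the release notes; the names
DB_XREF_RE and parse_db_xrefs are hypothetical, and it assumes that the
database name contains no colon and that the value is not wrapped
across lines.

```python
import re

# Matches one qualifier of the form /db_xref="database:identifier".
# Assumption: the database name contains neither ':' nor '"'.
DB_XREF_RE = re.compile(r'/db_xref="([^:"]+):([^"]+)"')


def parse_db_xrefs(feature_text):
    """Return a list of (database, identifier) pairs found in the text."""
    return DB_XREF_RE.findall(feature_text)


example = (
    'FT                   /db_xref="SWISS-PROT:P22032"\n'
    'FT                   /db_xref="PID:g220291"\n'
)
for database, identifier in parse_db_xrefs(example):
    print(database, identifier)
```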




1.2  EST Database Files

In order to keep the size  of  the  data  files  within  reasonable  limits  for
handling  purposes,  we have split the EST division into several files.  At this
release we have created a fourth file of EST data  named  EST4.DAT.   Additional
files will be added in subsequent releases as appropriate.


2  FORTHCOMING CHANGES

2.1  *IMPORTANT* Notice Of Accession Number Format Change

Nucleotide Sequence Database Collaborative Agreement, 31 May 1995

Currently, accession numbers used by the nucleotide sequence  databases  consist
of  one  prefix  letter  followed by 5 digits.  EST projects and projects to add
patent data have accelerated the need to extend the accession number space.   It
is projected that the databases will run out of accession numbers within 8 to 10
months.

It is clear that:

* As much notice as possible should be given to users and software developers
* The change should make a large enough space that another change will not
  be necessary in the foreseeable future.
* The accession number should continue to be readily identifiable as a
  DDBJ/EMBL/GenBank accession number.

The collaborators concluded that:

* A new form of accession number will be created, defined as an
  8-character alphanumeric string, beginning with two upper case
  letters and followed only by digits (e.g., SR004562).  Leading and
  trailing zeros are significant.  The letter 'O' will not be used.

* Existing 6-character accession numbers will remain as they are, and will
  never be transformed to an 8-character form.

* New accession numbers will not be used before February 1, 1996. The groups
  agree to avoid using new accession numbers as long as possible after that.
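
Software that validates accession numbers will need to accept both
forms.  The sketch below is one possible reading of the rules above and
not an official specification: it assumes that the prefix letter of the
existing form is upper case, and that the exclusion of the letter 'O'
applies only to the new two-letter prefix.  The names OLD_FORM, NEW_FORM
and classify_accession are hypothetical.

```python
import re

# Existing form: one prefix letter followed by 5 digits (e.g. U35111).
# Assumption: the prefix letter is upper case.
OLD_FORM = re.compile(r"[A-Z][0-9]{5}")

# New form: 8 characters, two upper case letters (never 'O') followed
# only by digits (e.g. SR004562).
NEW_FORM = re.compile(r"[A-NP-Z]{2}[0-9]{6}")


def classify_accession(acc):
    """Return "old", "new", or None if `acc` fits neither form.

    The accession number is kept as a string throughout, because
    leading and trailing zeros are significant.
    """
    if OLD_FORM.fullmatch(acc):
        return "old"
    if NEW_FORM.fullmatch(acc):
        return "new"
    return None
```

With these patterns, "U35111" is classified as "old" and "SR004562" as
"new", while "SO004562" is rejected because of the letter 'O'.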


The International Nucleotide Sequence Databases
DDBJ/EMBL/GenBank




2.2  New Nucleic Acid Identifier (NI) Line

We intend to introduce a new line type NI to  contain  an  identifier  for  each
nucleic  acid  sequence.  While the sequence remains the same, so will the value
of this identifier.  When a sequence change occurs,  however  minor,  a  new  NI
value  will  be  assigned  whilst the accession number on the AC line may remain
unchanged.  These NI values are analogous to those to be represented in the  NID
lines  of  GenBank  entries,  and we will inherit GenBank NID values into our NI
lines.  Starting at release 47 (June 1996), each entry will have an NI  line  of
the form:

   AC   U35111;
   XX
   NI   g1006834

<end of excerpt>

From owner-embldatabank@net.bio.net Wed Dec 27 22:00:00 1995
Path: biosci!agate!ihnp4.ucsd.edu!library.ucla.edu!newsfeed.internetmci.com!uwm.edu!lll-winken.llnl.gov!nntp.coast.net!swidir.switch.ch!in2p3.fr!univ-lyon1.fr!jussieu.fr!pasteur.fr!infobiogen.fr!news
From: Jean-Marc Plaza <plaza@infobiogen.fr>
Newsgroups: bionet.software.srs,bionet.molbio.embldatabank
Subject: (no subject)
Date: 28 Dec 1995 11:40:46 GMT
Organization: INFOBIOGEN
Lines: 106
Message-ID: <4btvnu$9qk@lovelace.infobiogen.fr>
NNTP-Posting-Host: lovelace.infobiogen.fr
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: quoted-printable
X-Mailer: Mozilla 1.1N (X11; I; SunOS 5.4 sun4d)
X-URL: news:bionet.software.srs
Xref: biosci bionet.software.srs:205 bionet.molbio.embldatabank:582

Hi,
there are several problems with EMBL 45 when you
run trembl (Thure Etzold) on it.
First, in CDS complement locations, trembl does a core dump when it finds:
	"complement(>" instead of "complement("
Secondly, the codon start for HSACTSG7 is /codon_start=2129 in EMBL 45
instead of 1, 2 or 3. And trembl does not like it: core dump. :-)
In GenBank the codon start is /codon_start=1 for HUMACTSG7 and
the translated sequence is different.
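
To locate other entries with the same problem, a scan along the
following lines might help.  It is only a sketch, assuming the EMBL
flat-file layout shown below (entry name on the ID line, qualifiers on
FT lines); the name invalid_codon_starts is hypothetical and this is not
part of trembl.  Note that it would also match a /codon_start= string
that happens to occur inside a /note text.

```python
import re

# Captures whatever follows /codon_start= up to the next whitespace.
CODON_START_RE = re.compile(r"/codon_start=(\S+)")


def invalid_codon_starts(flat_file_lines):
    """Yield (entry_name, value) for each /codon_start other than 1, 2 or 3."""
    entry = None
    for line in flat_file_lines:
        if line.startswith("ID   "):
            fields = line.split()
            # The entry name is the second field of the ID line.
            entry = fields[1] if len(fields) > 1 else None
        elif line.startswith("FT"):
            match = CODON_START_RE.search(line)
            if match and match.group(1) not in {"1", "2", "3"}:
                yield entry, match.group(1)
```

Run over the lines of a data file, e.g.
list(invalid_codon_starts(open(path))) with path pointing at a local
copy of the release, it reports the name of each affected entry
together with the offending value.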

Any idea how to fix that?

Thanks

Jean-Marc

EMBL 45
ID   HSACTSG7   standard; DNA; PRI; 466 BP.
DE   Human enteric smooth muscle gamma-actin gene, exon 9 and 3' flank.
FT   CDS_pept        join(D00649:2129..2254,D00649:3135..3263,D00650:653..763,
FT                   D00650:1033..1117,D00651:66..227,D00652:55..246,D00653:357
FT                   ..538,89..232)
FT                   /note="gamma-actin precursor"
FT                   /codon_start=2129
FT                   /db_xref="PID:e34116"
FT                   /translation="PEWPQTASPWIRVVPWGIRRLLRFASMEYTKGKLPIFLAGNGMPV
FT                   GEEADLFDDSVRVNYFNWYINEVLKAVKEDLVDVRSYIVRSLIDGYEGPLGFSQRFGLY
FT                   HVNFNDSSRPRTPRKSAYLFTSIIEKNGFSAKKVKRNPLPVRADFTSRARVTDSLPSEV
FT                   PSKAKISVEKFSKQPRFERDLFYDGRFRDDFLWGVSSSPYQIEGGWNADGKGPSIWDNF
FT                   THTPGNGVKDNATGDVACDSYHQLDADLNILRTLKVKSYRFSISWSRIFPTGRNSTINK
FT                   QGVDYYNRLIDSLVDNNIFPMVTLFHWDLPQALQDIGGWENPSLIELFDSYADYCFKTF
FT                   GDRVKFWMTFNEPWCHVVLGYSSGIFPPSVQEPGWLPYKVSHIVIKAHARVYHTYDEKY
FT                   RSEQKGVISLSLNTHWAEPKDPGLQRDVEAADRMLQFTMGWFAHPIFKNGDYPDVMKWT
FT                   VGNRSELQHLASSRLPTFTEEEKNYVRGTADVFCHNTYTSVFVQHSTPRLNPPSYDDDM
FT                   ELKLIEMNSSTGVMHQDVPWGTRRLLNWIKEEYGNIPIYITENGQGLENPTLDDTERIF
FT                   YHKTYINEALKAYKLDGVDLRGYSAWTLMDDFEWLLGYTMRFGLYYVDFNHVSRPRTAR
FT                   ASARYYPDLIANNGMPLAREDEFLYGEFPKGFIWSAASASYQVEGAWRADGKGLSIWDT
FT                   FSHTPLRIGNDDNGDVACDSYHKIAEDVVALQNLGVSHYRFSIAWSRILPDGTTKFINE
FT                   AGLSYYVRFIDALLAAGITPQVTIYHWDLPQALQDVGGWENETIVQRFKEYADVLFQRL
FT                   GDRVKFWITLNEPFVIAAQGYGTGVSAPGISFRPGTAPYIAGHNLIKAHAEAWHLYNDV
FT                   YRARQGGTISITISSDWGEPRDPTNREHVEAARSYVQFMGGWFAHPIFKNGDYPEVMKT
FT                   RIRDRSLGAGLNKSRLPEFTESEKSRIKGTFDFFGFNHNTTVLAYNLDYPAAFSSFDAD
FT                   RGVASIADSSWPVSGSFWLKVTPFGFRRILNWLKEEYNNPPIYVTENGVSRRGEPELND
FT                   TDRIYYLRSYINEALKAVHDKVDLRGYTVWSIMDNFEWATGFAERFGVHFVNRSDPSLP
FT                   RIPRASAKFYATIVRCNGFPDPAQGPHPCLQQPEDAAPTASPVQSEVPFLGLMLGIAEA
FT                   QTALYVLFALLLLGACSLAFLTYNTGRRSKQGNAQPSQHQLSPISSF"
<H2>ERROR:</H2><P> syntax error, at "129
/db_xref="PID:e34116"
/translation="PEWPQTASPWIRVVPWGIRRLLRFASMEYTKGKLPIFLAGNGM

GENBANK 92
LOCUS       HUMACTSG7     466 bp    DNA             PRI       23-JAN-1992
DEFINITION  Human enteric smooth muscle gamma-actin gene, exon 9 and 3' flank.
ACCESSION   D00654
NID         g219424
KEYWORDS    actin; gamma-actin.
SEGMENT     7 of 7
SOURCE      Human peripheral blood genomic DNA, clone HACTSG-112.
  ORGANISM  Homo sapiens
            Eukaryota; Animalia; Metazoa; Chordata; Vertebrata; Mammalia;
            Theria; Eutheria; Primates; Haplorhini; Catarrhini; Hominidae.
REFERENCE   1  (bases 89 to 309)
  AUTHORS   Miwa,T. and Kamada,S.
  TITLE     The nucleotide sequence of a human smooth muscle (enteric type)
            gamma-actin cDNA
  JOURNAL   Nucleic Acids Res. 18, 4263-4263 (1990)
  MEDLINE   90332437
REFERENCE   2  (bases 1 to 466)
  AUTHORS   Miwa,T., Manabe,Y., Kurokawa,K., Kamada,S., Kanda,N., Bruns,G.,
            Ueyama,H. and Kakunaga,T.
  TITLE     Structure, chromosome location, and expression of the human smooth
            muscle (enteric type) gamma-actin gene: evolution of six human
            actin genes
  JOURNAL   Mol. Cell. Biol. 11, 3296-3306 (1991)
  MEDLINE   91246198
COMMENT     These data kindly submitted in computer readable form by: Takeshi
            Miwa  
            Department of Oncogene Research  
            Research Institute for Microbial Diseases  
            Osaka University  
            3-1 Yamadaoka  
            Suita  
            Osaka 565  
            Japan  
            Phone:  06-875-2470  
            Fax:    06-875-1292.
            
            NCBI gi: 219424
FEATURES             Location/Qualifiers
          1..466
                     /organism="Homo sapiens"
CDS_pept        join(D00649:2129..2254,D00649:3135..3263,D00650:653..763,
                     D00650:1033..1117,D00651:66..227,D00652:55..246,
                     D00653:357..538,89..232)
                     /note="gamma-actin precursor;  NCBI gi: 219426"
                     /codon_start=1
                     /db_xref="PID:g219426"
                     /translation="MCEEETTALVCDNGSGLCKAGFAGDDAPRAVFPSIVGRPRHQGV
                     MVGMGQKDSYVGDEAQSKRGILTLKYPIEHGIITNWDDMEKIWHHSFYNELRVAPEEH
                     PTLLTEAPLNPKANREKMTQIMFETFNVPAMYVAIQAVLSLYASGRTTGIVLDSGDGV
                     THNVPIYEGYALPHAIMRLDLAGRDLTDYLMKILTERGYSFVTTAEREIVRDIKEKLC
                     YVALDFENEMATAASSSSLEKSYELPDGQVITIGNERFRCPETLFQPSFIGMESAGIH
                     ETTYNSIMKCDIDIRKDLYANNVLSGGTTMYPGIADRMQKEITALAPSTMKIKIIAPP
                     ERKYSVWIGGSILASLSTFQQMWISKPEYDEAGPSIVHRKCF"

---------------------------------------------------
Jean-Marc PLAZA
INFOBIOGEN - CNRS
7, rue Guy Moquet BP8 94801 VILLEJUIF Cedex, France
tel: +33 45 59 52 39  fax: +33 45 59 52 50
e-mail: plaza@infobiogen.fr
---------------------------------------------------


From owner-embldatabank@net.bio.net Wed Dec 27 22:00:00 1995
Path: biosci!agate!ihnp4.ucsd.edu!swrinde!cs.utexas.edu!howland.reston.ans.net!nntp.coast.net!swidir.switch.ch!in2p3.fr!univ-lyon1.fr!jussieu.fr!pasteur.fr!infobiogen.fr!news
From: Jean-Marc Plaza <plaza@infobiogen.fr>
Newsgroups: bionet.molbio.embldatabank,bionet.software.srs
Subject: trembl and EMBL 45
Date: 28 Dec 1995 11:43:50 GMT
Organization: INFOBIOGEN
Lines: 123
Message-ID: <4btvtm$9qk@lovelace.infobiogen.fr>
NNTP-Posting-Host: lovelace.infobiogen.fr
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Mailer: Mozilla 1.1N (X11; I; SunOS 5.4 sun4d)
X-URL: news:bionet.molbio.embldatabank?ALL
Xref: biosci bionet.molbio.embldatabank:583 bionet.software.srs:206

Sorry, I forgot the subject, so I am posting it again:


Hi,
there are several problems with EMBL 45 when you
run trembl (Thure Etzold) on it.
First, in CDS complement locations, trembl does a core dump when it finds:
        "complement(>" instead of "complement("
Secondly, the codon start for HSACTSG7 is /codon_start=2129 in EMBL 45
instead of 1, 2 or 3. And trembl does not like it: core dump. :-)
In GenBank the codon start is /codon_start=1 for HUMACTSG7 and
the translated sequence is different.


Any idea how to fix that?


Thanks


Jean-Marc


EMBL 45
ID   HSACTSG7   standard; DNA; PRI; 466 BP.
DE   Human enteric smooth muscle gamma-actin gene, exon 9 and 3' flank.
FT   CDS_pept        join(D00649:2129..2254,D00649:3135..3263,D00650:653..763,
FT                   D00650:1033..1117,D00651:66..227,D00652:55..246,D00653:357
FT                   ..538,89..232)
FT                   /note="gamma-actin precursor"
FT                   /codon_start=2129
FT                   /db_xref="PID:e34116"
FT                   /translation="PEWPQTASPWIRVVPWGIRRLLRFASMEYTKGKLPIFLAGNGMPV
FT                   GEEADLFDDSVRVNYFNWYINEVLKAVKEDLVDVRSYIVRSLIDGYEGPLGFSQRFGLY
FT                   HVNFNDSSRPRTPRKSAYLFTSIIEKNGFSAKKVKRNPLPVRADFTSRARVTDSLPSEV
FT                   PSKAKISVEKFSKQPRFERDLFYDGRFRDDFLWGVSSSPYQIEGGWNADGKGPSIWDNF
FT                   THTPGNGVKDNATGDVACDSYHQLDADLNILRTLKVKSYRFSISWSRIFPTGRNSTINK
FT                   QGVDYYNRLIDSLVDNNIFPMVTLFHWDLPQALQDIGGWENPSLIELFDSYADYCFKTF
FT                   GDRVKFWMTFNEPWCHVVLGYSSGIFPPSVQEPGWLPYKVSHIVIKAHARVYHTYDEKY
FT                   RSEQKGVISLSLNTHWAEPKDPGLQRDVEAADRMLQFTMGWFAHPIFKNGDYPDVMKWT
FT                   VGNRSELQHLASSRLPTFTEEEKNYVRGTADVFCHNTYTSVFVQHSTPRLNPPSYDDDM
FT                   ELKLIEMNSSTGVMHQDVPWGTRRLLNWIKEEYGNIPIYITENGQGLENPTLDDTERIF
FT                   YHKTYINEALKAYKLDGVDLRGYSAWTLMDDFEWLLGYTMRFGLYYVDFNHVSRPRTAR
FT                   ASARYYPDLIANNGMPLAREDEFLYGEFPKGFIWSAASASYQVEGAWRADGKGLSIWDT
FT                   FSHTPLRIGNDDNGDVACDSYHKIAEDVVALQNLGVSHYRFSIAWSRILPDGTTKFINE
FT                   AGLSYYVRFIDALLAAGITPQVTIYHWDLPQALQDVGGWENETIVQRFKEYADVLFQRL
FT                   GDRVKFWITLNEPFVIAAQGYGTGVSAPGISFRPGTAPYIAGHNLIKAHAEAWHLYNDV
FT                   YRARQGGTISITISSDWGEPRDPTNREHVEAARSYVQFMGGWFAHPIFKNGDYPEVMKT
FT                   RIRDRSLGAGLNKSRLPEFTESEKSRIKGTFDFFGFNHNTTVLAYNLDYPAAFSSFDAD
FT                   RGVASIADSSWPVSGSFWLKVTPFGFRRILNWLKEEYNNPPIYVTENGVSRRGEPELND
FT                   TDRIYYLRSYINEALKAVHDKVDLRGYTVWSIMDNFEWATGFAERFGVHFVNRSDPSLP
FT                   RIPRASAKFYATIVRCNGFPDPAQGPHPCLQQPEDAAPTASPVQSEVPFLGLMLGIAEA
FT                   QTALYVLFALLLLGACSLAFLTYNTGRRSKQGNAQPSQHQLSPISSF"
<H2>ERROR:</H2><P> syntax error, at "129
/db_xref="PID:e34116"
/translation="PEWPQTASPWIRVVPWGIRRLLRFASMEYTKGKLPIFLAGNGM


GENBANK 92
LOCUS       HUMACTSG7     466 bp    DNA             PRI       23-JAN-1992
DEFINITION  Human enteric smooth muscle gamma-actin gene, exon 9 and 3' flank.
ACCESSION   D00654
NID         g219424
KEYWORDS    actin; gamma-actin.
SEGMENT     7 of 7
SOURCE      Human peripheral blood genomic DNA, clone HACTSG-112.
  ORGANISM  Homo sapiens
            Eukaryota; Animalia; Metazoa; Chordata; Vertebrata; Mammalia;
            Theria; Eutheria; Primates; Haplorhini; Catarrhini; Hominidae.
REFERENCE   1  (bases 89 to 309)
  AUTHORS   Miwa,T. and Kamada,S.
  TITLE     The nucleotide sequence of a human smooth muscle (enteric type)
            gamma-actin cDNA
  JOURNAL   Nucleic Acids Res. 18, 4263-4263 (1990)
  MEDLINE   90332437
REFERENCE   2  (bases 1 to 466)
  AUTHORS   Miwa,T., Manabe,Y., Kurokawa,K., Kamada,S., Kanda,N., Bruns,G.,
            Ueyama,H. and Kakunaga,T.
  TITLE     Structure, chromosome location, and expression of the human smooth
            muscle (enteric type) gamma-actin gene: evolution of six human
            actin genes
  JOURNAL   Mol. Cell. Biol. 11, 3296-3306 (1991)
  MEDLINE   91246198
COMMENT     These data kindly submitted in computer readable form by: Takeshi
            Miwa
            Department of Oncogene Research
            Research Institute for Microbial Diseases
            Osaka University
            3-1 Yamadaoka
            Suita
            Osaka 565
            Japan
            Phone:  06-875-2470
            Fax:    06-875-1292.
            

            NCBI gi: 219424
FEATURES             Location/Qualifiers
          1..466
                     /organism="Homo sapiens"
CDS_pept        join(D00649:2129..2254,D00649:3135..3263,D00650:653..763,
                     D00650:1033..1117,D00651:66..227,D00652:55..246,
                     D00653:357..538,89..232)
                     /note="gamma-actin precursor;  NCBI gi: 219426"
                     /codon_start=1
                     /db_xref="PID:g219426"
                     /translation="MCEEETTALVCDNGSGLCKAGFAGDDAPRAVFPSIVGRPRHQGV
                     MVGMGQKDSYVGDEAQSKRGILTLKYPIEHGIITNWDDMEKIWHHSFYNELRVAPEEH
                     PTLLTEAPLNPKANREKMTQIMFETFNVPAMYVAIQAVLSLYASGRTTGIVLDSGDGV
                     THNVPIYEGYALPHAIMRLDLAGRDLTDYLMKILTERGYSFVTTAEREIVRDIKEKLC
                     YVALDFENEMATAASSSSLEKSYELPDGQVITIGNERFRCPETLFQPSFIGMESAGIH
                     ETTYNSIMKCDIDIRKDLYANNVLSGGTTMYPGIADRMQKEITALAPSTMKIKIIAPP
                     ERKYSVWIGGSILASLSTFQQMWISKPEYDEAGPSIVHRKCF"


---------------------------------------------------
Jean-Marc PLAZA
INFOBIOGEN - CNRS
7, rue Guy Moquet BP8 94801 VILLEJUIF Cedex, France
tel: +33 45 59 52 39  fax: +33 45 59 52 50


