Sequin version 2.70

francis at NCBI.NLM.NIH.GOV
Mon Sep 28 13:36:06 EST 1998


Dear Sequin users,

We have recently released a new version of Sequin, the sequence
submission/editing tool from NCBI, for all platforms.

The current version of Sequin is now 2.70.

Please refer to the Sequin home page at:

    http://www.ncbi.nlm.nih.gov/Sequin/

for the latest developments, new questions in the Frequently Asked
Questions section, and the most recent version of the help
documentation.

Major changes for Sequin version 2.70
-----------------------------------------

Both of the major changes in this Sequin version will be useful for
genome centers annotating large records.

. This version is capable of editing complete bacterial chromosomes
or large eukaryotic chromosomal segments in a single record.  Because
report generation (e.g., the GenBank and Graphic views) and
validation are both much faster than before, chromosomes no longer
have to be split up into separate overlapping records.

. Sequin can now annotate features by reading in a tab-delimited
table.  The table specifies the location and type of feature, and
Sequin processes the feature intervals and translates any CDSs.  The
table is read in the record viewer (after the sequence has been
imported) using the File-->Open menu.  The table must follow a defined
format.  The first line starts with >Feature, a space, and then the
Sequence ID of the sequence you are annotating.  In the example below,
eIF4E is the Sequence ID.  The table is composed of five tab-separated
columns: start, stop, feature key, qualifier key, and qualifier value.
The first line of each feature has start, stop, and feature key.
Additional intervals of the same feature have just start and stop.
The qualifiers follow on lines starting with three tabs, which leave
the first three columns empty.
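
For anyone generating such a table from their own annotation data, the
layout described above could be written out along these lines.  This is
only a minimal sketch: the function name and the dictionary layout are
made up for illustration and are not part of Sequin.

```python
def format_feature_table(seq_id, features):
    """Render features as a five-column, tab-delimited table.

    `features` is a list of dicts (a hypothetical layout) with the keys
    "key" (feature key), "intervals" (list of (start, stop) pairs) and,
    optionally, "qualifiers" (list of (qualifier key, value) pairs).
    """
    # Header: ">Feature", a space, then the Sequence ID.
    lines = [f">Feature {seq_id}"]
    for feat in features:
        intervals = feat["intervals"]
        first_start, first_stop = intervals[0]
        # First line of a feature: start, stop, feature key.
        lines.append(f"{first_start}\t{first_stop}\t{feat['key']}")
        # Additional intervals: start and stop only.
        for start, stop in intervals[1:]:
            lines.append(f"{start}\t{stop}")
        # Qualifiers: three leading tabs, then qualifier key and value.
        for qual_key, qual_value in feat.get("qualifiers", []):
            lines.append(f"\t\t\t{qual_key}\t{qual_value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    demo = [
        {"key": "gene", "intervals": [(80, 2881)],
         "qualifiers": [("gene", "eIF4E")]},
        {"key": "CDS", "intervals": [(201, 224), (1550, 1920)],
         "qualifiers": [("product", "eukaryotic initiation factor 4E-II")]},
    ]
    print(format_feature_table("eIF4E", demo), end="")
```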

For example, a table which looks like this:

>Features eIF4E
80	2881	gene
			gene	eIF4E

201	224	CDS
1550	1920
1986	2085
2317	2404
2466	2629
			product	eukaryotic initiation factor 4E-II

1402	1458	CDS
1550	1920
1986	2085
2317	2404
2466	2629
			product	eukaryotic initiation factor 4E-I
			note	encoded by two messenger RNAs

80	224	mRNA
1550	1920
1986	2085
2317	2404
2466	2881
			product	eukaryotic initiation factor 4E-II

80	224	mRNA
892	1458
1550	1920
1986	2085
2317	2404
2466	2881
			product	eukaryotic initiation factor 4E-I

80	224	mRNA
1129	1458
1550	1920
1986	2085
2317	2404
2466	2881
			product	eukaryotic initiation factor 4E-I


will result in a GenBank flatfile which contains this:

     mRNA            join(80..224,1129..1458,1550..1920,1986..2085,2317..2404,
                     2466..2881)
                     /gene="eIF4E"
                     /product="eukaryotic initiation factor 4E-I"
     mRNA            join(80..224,892..1458,1550..1920,1986..2085,2317..2404,
                     2466..2881)
                     /gene="eIF4E"
                     /product="eukaryotic initiation factor 4E-I"
     mRNA            join(80..224,1550..1920,1986..2085,2317..2404,2466..2881)
                     /gene="eIF4E"
                     /product="eukaryotic initiation factor 4E-II"
     gene            80..2881
                     /gene="eIF4E"
     CDS             join(201..224,1550..1920,1986..2085,2317..2404,2466..2629)
                     /gene="eIF4E"
                     /codon_start=1
                     /product="eukaryotic initiation factor 4E-II"
                     /translation="MVVLETEKTSAPSTEQGRPEPPTSAAAPAEAKDVKPKEDPQETG
                     EPAGNTATTTAPAGDDAVRTEHLYKHPLMNVWTLWYLENDRSKSWEDMQNEITSFDTV
                     EDFWSLYNHIKPPSEIKLGSDYSLFKKNIRPMWEDAANKQGGRWVITLNKSSKTDLDN
                     LWLDVLLCLIGEAFDHSDQICGAVINIRGKSNKISIWTADGNNEEAALEIGHKLRDAL
                     RLGRNNSLQYQLHKDTMVKQGSNVKSIYTL"
     CDS             join(1402..1458,1550..1920,1986..2085,2317..2404,
                     2466..2629)
                     /gene="eIF4E"
                     /note="encoded by two messenger RNAs"
                     /codon_start=1
                     /product="eukaryotic initiation factor 4E-I"
                     /translation="MQSDFHRMKNFANPKSMFKTSAPSTEQGRPEPPTSAAAPAEAKD
                     VKPKEDPQETGEPAGNTATTTAPAGDDAVRTEHLYKHPLMNVWTLWYLENDRSKSWED
                     MQNEITSFDTVEDFWSLYNHIKPPSEIKLGSDYSLFKKNIRPMWEDAANKQGGRWVIT
                     LNKSSKTDLDNLWLDVLLCLIGEAFDHSDQICGAVINIRGKSNKISIWTADGNNEEAA
                     LEIGHKLRDALRLGRNNSLQYQLHKDTMVKQGSNVKSIYTL"
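
The correspondence between the table and the flatfile locations can be
sketched with a small reader.  This is not how Sequin itself is
implemented; the function names are hypothetical, the reader accepts any
header line beginning with ">Feature" (an assumption), and it only builds
location strings without translating CDSs.

```python
def parse_feature_table(text):
    """Parse a tab-delimited feature table into (seq_id, features)."""
    seq_id = None
    features = []
    for line in text.splitlines():
        if not line.strip():
            continue  # blank lines between features carry no data
        if line.startswith(">Feature"):
            parts = line.split()
            seq_id = parts[1] if len(parts) > 1 else None
            continue
        cols = line.split("\t")
        if cols[0] == "":
            # Qualifier line: three empty columns, then key and value.
            if len(cols) >= 4 and features:
                value = cols[4] if len(cols) > 4 else ""
                features[-1]["qualifiers"].append((cols[3], value))
            continue
        start, stop = int(cols[0]), int(cols[1])
        if len(cols) > 2 and cols[2]:
            # Start, stop and feature key: a new feature begins here.
            features.append({"key": cols[2],
                             "intervals": [(start, stop)],
                             "qualifiers": []})
        elif features:
            # Start and stop only: another interval of the same feature.
            features[-1]["intervals"].append((start, stop))
        else:
            raise ValueError("interval line before any feature key")
    return seq_id, features


def location(intervals):
    """Build a plus-strand location string such as join(a..b,c..d)."""
    spans = [f"{start}..{stop}" for start, stop in intervals]
    return spans[0] if len(spans) == 1 else "join(" + ",".join(spans) + ")"


if __name__ == "__main__":
    table = (">Features eIF4E\n"
             "80\t2881\tgene\n"
             "\t\t\tgene\teIF4E\n"
             "\n"
             "201\t224\tCDS\n"
             "1550\t1920\n"
             "\t\t\tproduct\teukaryotic initiation factor 4E-II\n")
    seq_id, feats = parse_feature_table(table)
    for feat in feats:
        print(feat["key"], location(feat["intervals"]), feat["qualifiers"])
```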


Note that if the gene feature spans the intervals of the CDS and mRNA
features for that gene, you don't need to include gene "qualifiers" in
those features, since they will be picked up by overlap.
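
As a rough illustration of "picked up by overlap", the sketch below
assigns a gene name to a feature when the gene's span contains all of
the feature's intervals.  The exact rule Sequin applies is not spelled
out here, so simple containment is an assumption, and the function name
is hypothetical.

```python
def inherit_gene(feature_intervals, genes):
    """Return the name of the first gene whose span contains the feature.

    `genes` is a list of (name, (start, stop)) pairs.  Containment is an
    assumed stand-in for whatever overlap test Sequin really uses.
    """
    # Overall extent of the feature, regardless of interval direction.
    lo = min(min(start, stop) for start, stop in feature_intervals)
    hi = max(max(start, stop) for start, stop in feature_intervals)
    for name, (g_start, g_stop) in genes:
        g_lo, g_hi = sorted((g_start, g_stop))
        if g_lo <= lo and hi <= g_hi:
            return name
    return None


if __name__ == "__main__":
    genes = [("eIF4E", (80, 2881))]
    cds = [(201, 224), (1550, 1920), (1986, 2085), (2317, 2404), (2466, 2629)]
    print(inherit_gene(cds, genes))
```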

Features on the complementary strand are indicated by reversing the
interval locations, so that start is greater than stop.  For example,
the table:

>Features dna2
2710	2639	tRNA
			note	codon recognized: GAA
			product	tRNA-Glu
			anticodon	(pos:2675..2677, aa:Glu)

will result in a GenBank flatfile containing:

     tRNA            complement(2639..2710)
                     /note="codon recognized: GAA"
                     /product="tRNA-Glu"
                     /anticodon=(pos:2675..2677, aa:Glu)
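
The strand convention can be sketched as follows.  The single-interval
case matches the tRNA example above; ordering multi-interval
complementary features in ascending position inside complement(join(...))
is an assumption based on the usual flatfile convention, and the function
name is hypothetical.

```python
def genbank_location(intervals):
    """Turn (start, stop) pairs into a flatfile location string.

    A pair with start > stop is taken to lie on the complementary strand.
    """
    if not intervals:
        raise ValueError("a feature needs at least one interval")
    minus = [start > stop for start, stop in intervals]
    if any(minus) and not all(minus):
        raise ValueError("mixed-strand features are not handled in this sketch")
    on_minus = all(minus)
    if on_minus:
        # Swap each pair back to low..high and list them in ascending order.
        spans = sorted((stop, start) for start, stop in intervals)
    else:
        spans = list(intervals)
    parts = [f"{low}..{high}" for low, high in spans]
    loc = parts[0] if len(parts) == 1 else "join(" + ",".join(parts) + ")"
    return f"complement({loc})" if on_minus else loc


if __name__ == "__main__":
    print(genbank_location([(2710, 2639)]))
    print(genbank_location([(80, 224), (1550, 1920)]))
```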


If the formatting of these tables is not reproduced correctly in your email,
you can also view them at:

http://www.ncbi.nlm.nih.gov/Sequin/log.html

regards to all,

francis, for the sequin development team.

--
| B.F. Francis Ouellette             
|
| francis at ncbi.nlm.nih.gov   

New Address: francis at cmmt.ubc.ca






