From owner-embldatabank@net.bio.net Wed Mar 01 22:00:00 1995
Path: biosci!biosci!not-for-mail
From: "Alexander J. Ropelewski" <ar1z+@andrew.cmu.edu>
Newsgroups: bionet.announce,bionet.molbio.proteins,bionet.molbio.embldatabank,bionet.molbio.genbank,bionet.molbio.evolution,sci.research,sci.bio.technology
Subject: Sequencing Analysis workshop Annoucement
Date: 2 Mar 1995 14:17:48 -0800
Organization: Pittsburgh Supercomputing Center, Carnegie Mellon, Pittsburgh, PA
Lines: 114
Sender: kristoff@net.bio.net
Approved: bionews-moderator@net.bio.net
Distribution: world
Message-ID: <AjJ=_t_00WB5M6cFUA@andrew.cmu.edu>
NNTP-Posting-Host: net.bio.net
Xref: biosci bionet.announce:1861 bionet.molbio.proteins:3876 bionet.molbio.embldatabank:460 bionet.molbio.genbank:1962 bionet.molbio.evolution:2508 sci.research:4024 sci.bio.technology:2392

                   NUCLEIC ACID AND PROTEIN SEQUENCE ANALYSIS   
                       WORKSHOP FOR BIOMEDICAL RESEARCHERS  
                              Pittsburgh, Pennsylvania  
                              June 4-9, 1995


  
Pittsburgh Supercomputing Center (PSC) is again offering a five-day workshop on
"Nucleic Acid and Protein Sequence Analysis," June 4-9, 1995.  It is  
funded by a grant from the National Center for Human Genome Research of 
the National Institutes of Health.     
  
The workshop will familiarize biomedical researchers with computational  
methods and provide practice in applying supercomputing resources to
problems of concern in macromolecular sequence analysis.  Emphasis will be
on alignment of and pattern extraction from multiple sequences.   
Participants will gain practical experience on PSC's Cray C-90 and T3D in 
(1) comparing and aligning sequences, (2) identifying informative patterns 
in a set of sequences, and (3) using extracted informative patterns to 
identify related sequences.  Researchers will also learn several approaches 
to database searching and multiple sequence alignment, how to use profile 
analysis effectively, and how to identify patterns in their sequences.   
Participants are encouraged to bring sequence analysis problems from their 
current research.  Extensive documentation will be given at the outset on 
the PSC computing environment as well as on the specific programs
to be employed in the workshop.  No prior supercomputing experience is 
required.
    
Workshop leaders are Dr. Gary Churchill, Cornell University; Dr. Michael 
Gribskov, San Diego Supercomputer Center; and Dr. Hugh Nicholas, PSC.
  
A limited number of grants to cover travel and hotel accommodations are
available for U.S. academic participants.  ALL PARTICIPANTS ARE REQUIRED 
TO PAY A $135 REGISTRATION FEE, IN ADVANCE, UPON ACCEPTANCE INTO THE WORKSHOP. 
The deadline for submitting applications is April 17, 1995.  Enrollment is
limited to 20 participants.  
   
Additional information about this workshop can be found in             
http://pscinfo.psc.edu/biomed/workshops95.html



				      * * * * *



                     PITTSBURGH SUPERCOMPUTING CENTER
                     NUCLEIC ACID AND PROTEIN SEQUENCE ANALYSIS 
                     WORKSHOP FOR BIOMEDICAL RESEARCHERS
                               June 4-9, 1995
     
                               APPLICATION


Name:	       ________________________________________________________________

Affiliation:   ________________________________________________________________

Address:       ________________________________________________________________
	       (Business)
	       ________________________________________________________________

	       ________________________________________________________________
	       (Home)
	       ________________________________________________________________

Telephone:  ____________________________         ______________________________
	           (Business)				     (Home)

*Social Security Number:  _______-_____-_______	Citizenship:___________________

Electronic Mail Address:_______________________________________________________

Status: ___Graduate  ___Post-doctoral Fellow  ___Faculty  ___Other (specify)

In order to attend the workshop, will you need funds for travel?___ lodging?___

Please indicate specifically any special housing, transportation or dietary 
arrangements you will need: __________________________________________

How did you learn about this workshop:_________________________________________

REQUIREMENTS:

Applicants must submit a completed application form and a cover letter.  The
letter should describe, in one or two paragraphs, the sequence analysis 
problems encountered in your research, and how participating in the workshop 
will enhance this research.  Please include a brief statement describing your 
level of experience with computers.  Faculty members, staff and post-docs 
should provide a curriculum vitae.  Graduate students must have a letter 
of recommendation from a faculty member. If you have requested travel funds, 
please include the cost of roundtrip air fare from your home to Pittsburgh and 
indicate the amount of travel funds you will need. ALL PARTICIPANTS WILL BE 
REQUIRED TO PAY A $135 ADVANCE REGISTRATION FEE UPON ACCEPTANCE INTO THE 
WORKSHOP.

Please return all application materials by APRIL 17, 1995 to:

  Biomedical Workshop Applications Committee
  Pittsburgh Supercomputing Center
  4400 Fifth Avenue, Suite 230C
  Pittsburgh, PA 15213

Direct inquiries to: Nancy Blankenstein, blankens@psc.edu or 412/268-4960.

*Disclosure of Social Security Number is voluntary.

PSC does not discriminate on the basis of race, color, religion, sex, age,
creed, national or ethnic origin, or handicap.






From owner-embldatabank@net.bio.net Fri Mar 03 22:00:00 1995
Path: biosci!agate!howland.reston.ans.net!news.sprintlink.net!uunet!newstf01.news.aol.com!newsbf02.news.aol.com!not-for-mail
From: fniner@aol.com (FNINER)
Newsgroups: bionet.molbio.embldatabank
Subject: Accession numbers for human JAKs and STATs
Date: 4 Mar 1995 06:27:01 -0500
Organization: America Online, Inc. (1-800-827-6364)
Lines: 7
Sender: root@newsbf02.news.aol.com
Message-ID: <3j9iq5$rqv@newsbf02.news.aol.com>
Reply-To: fniner@aol.com (FNINER)
NNTP-Posting-Host: newsbf02.mail.aol.com

I am interested in obtaining nucleotide sequences for HUMAN Janus family
of protein tyrosine kinases (JAK1, JAK2, JAK3, and Tyk2) and also various
HUMAN Signal Transducers and Activators of Transcription (STAT1 - STAT6)
proteins.  I would greatly appreciate any information concerning the
accession numbers for these genes.  Many thanks in advance.

FNINER@AOL.COM

From owner-embldatabank@net.bio.net Fri Mar 03 22:00:00 1995
Newsgroups: bionet.molbio.embldatabank
Path: biosci!daresbury!hgmp.mrc.ac.uk!ebi.ac.uk!jecop
From: jecop@ebi.ac.uk (Jeroen Coppieters)
Subject: Re: Accession numbers for human JAKs and STATs
Sender: news@ebi.ac.uk (Mr news)
Message-ID: <D4xGHo.744@ebi.ac.uk>
Date: Sat, 4 Mar 1995 17:56:12 GMT
References: <3j9iq5$rqv@newsbf02.news.aol.com>
Organization: European Bioinformatics Institute
X-Newsreader: TIN [version 1.2 PL2]
Lines: 94

FNINER (fniner@aol.com) wrote:
: I am interested in obtaining nucleotide sequences for HUMAN Janus family
: of protein tyrosine kinases (JAK1, JAK2, JAK3, and Tyk2) and also various
: HUMAN Signal Transducers and Activators of Transcription (STAT1 - STAT6)
: proteins.  I would greatly appreciate any information concerning the
: accession numbers for these genes.  Many thanks in advance.

: FNINER@AOL.COM
If you have access to Mosaic or Netscape, you can easily answer this
question yourself.
The Sequence Retrieval System (SRS) servers allow you to select information
from a wide range of sequence and sequence-related databases.
There are several SRS servers; I describe here how to work with the EBI one.
Except for the URL, the others will work similarly.
Access the following URL:
http://www.ebi.ac.uk/srs/srsc
or http://www.ebi.ac.uk
  select [database query/retrieval]
       select [Sequence Retrieval System]
         select [search sequence libraries]


select the database(s) you want to search, type in the keywords of interest
and select [do-query]

I selected embl+emnew
and the following keywords:
Definition: JAK
Organism: homo
Alltext: tyrosine

which gave me:
ID   HS09607    standard; RNA; PRI; 4064 BP.
XX
AC   U09607;
XX
DT   14-JUL-1994 (Rel. 40, Created)
DT   14-JUL-1994 (Rel. 40, Last updated, Version 1)
XX
DE   Human JAK family protein tyrosine kinase (JAK3) mRNA, complete cds.
...
ID   HSPTKJAK   standard; RNA; PRI; 3541 BP.
XX
AC   M64174; M35203;
XX
DT   12-APR-1991 (Rel. 28, Created)
DT   20-JAN-1993 (Rel. 34, Last updated, Version 2)
XX
DE   Human protein-tyrosine kinase (JAK1) mRNA, complete cds.
XX
...
ID   HS76112    standard; RNA; EST; 321 BP.
XX
AC   T29761;
XX
DT   09-JAN-1995 (Rel. 42, Created)
DT   09-JAN-1995 (Rel. 42, Last updated, Version 1)
XX
DE   EST93948 Homo sapiens cDNA 5' end similar to tyrosine kinase JAK1
DE   (HT:92).
XX
...
Replacing JAK with TYK in the query gave:
XX
AC   X54637;
XX
DT   05-NOV-1990 (Rel. 25, Created)
DT   12-SEP-1993 (Rel. 36, Last updated, Version 2)
XX
DE   Human tyk2 mRNA for non-receptor protein tyrosine kinase
XX
KW   protein tyrosine kinase.

I leave the query for the STATs to you; I do not want to take all the
fun out of it  :-)
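
The entries returned by the query above are in the EMBL flat-file layout: a two-character line code, three spaces, then the content. A minimal Python sketch of pulling the entry name, accession numbers and description out of such text follows. It is an illustration only, not part of SRS. The sample is abridged from the output above, and the "//" terminator lines are an assumption added here, because the post separates entries with "..." instead; the function names are hypothetical.

```python
from typing import Dict, Iterator, List, Tuple

# Abridged from the SRS output quoted in the post. The "//" terminators
# are an assumption: the post itself elides the rest of each entry.
SAMPLE = """\
ID   HS09607    standard; RNA; PRI; 4064 BP.
XX
AC   U09607;
XX
DE   Human JAK family protein tyrosine kinase (JAK3) mRNA, complete cds.
//
ID   HSPTKJAK   standard; RNA; PRI; 3541 BP.
XX
AC   M64174; M35203;
XX
DE   Human protein-tyrosine kinase (JAK1) mRNA, complete cds.
//
"""


def parse_entries(text: str) -> Iterator[Dict[str, List[str]]]:
    """Yield one dict per entry, mapping line code -> list of line contents."""
    entry: Dict[str, List[str]] = {}
    for line in text.splitlines():
        if line.startswith("//"):          # termination line closes an entry
            if entry:
                yield entry
            entry = {}
            continue
        code, content = line[:2], line[5:].strip()
        if code == "XX" or not code.strip():
            continue                       # spacer or blank line
        entry.setdefault(code, []).append(content)
    if entry:                              # tolerate a missing final "//"
        yield entry


def accessions(entry: Dict[str, List[str]]) -> List[str]:
    """Split the AC line(s) on ';' into individual accession numbers."""
    found: List[str] = []
    for line in entry.get("AC", []):
        found.extend(part.strip() for part in line.split(";") if part.strip())
    return found


def summarize(text: str) -> List[Tuple[str, List[str], str]]:
    """Return (entry name, accession numbers, description) for each entry."""
    rows = []
    for entry in parse_entries(text):
        name = entry.get("ID", ["?"])[0].split()[0]
        rows.append((name, accessions(entry), " ".join(entry.get("DE", []))))
    return rows


for name, accs, description in summarize(SAMPLE):
    print(name, accs, description)
```

The first accession number on the AC line is the one normally quoted; any further ones on the same line are listed after it, as in the JAK1 entry.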

--
Jeroen
======================================================================

         . O .                               Jeroen Coppieters
     . O O o   O .                            Software Support
   O O O O *o    O O               Jeroen.Coppieters@ebi.ac.uk
  O O O O(   *o  )O O                         ++44 1223 494422
  )O O O O   o*  O O(                        
  O O O O( o*    )O O
  )O O O O  *o   O O(                      EMBL Outstation EBI
  O O O O(   *o  )O O      (European Bioinformatics Institute)
  )O @ O O   o*  O O(                             Hinxton Hall
    O O O( o*   )O('                                   Hinxton
     ` O(   *o O  '                         Cambridge CB10 1RQ
         ` O '                                              UK
http://www.ebi.ac.uk
======================================================================

From owner-embldatabank@net.bio.net Sun Mar 05 22:00:00 1995
Path: biosci!genetics.com!mcolbert
From: mcolbert@genetics.com
Newsgroups: bionet.molbio.embldatabank
Subject: ftp site for full release of EMBL
Date: 6 Mar 1995 11:02:36 -0800
Organization: BIOSCI International Newsgroups for Molecular Biology
Lines: 11
Sender: daemon@net.bio.net
Distribution: world
Message-ID: <9503061857.AA21933@genetics.com>
NNTP-Posting-Host: net.bio.net

Hello,

I would like to know where there is an anonymous ftp site for the latest full
release of EMBL.  I am on the east coast of the U.S.  I know where the updates
are located.


Thanks.

			-Maureen Colbert
			 mcolbert@genetics.com

From owner-embldatabank@net.bio.net Tue Mar 07 22:00:00 1995
Path: biosci!bcm!cs.utexas.edu!swrinde!gatech!swiss.ans.net!paperboy.amoco.com!cronkite!usenet
From: wmmounts@amoco.com (Bill Mounts)
Newsgroups: bionet.molbio.embldatabank
Subject: SWISS-PROT File Formats...
Date: 8 Mar 1995 16:54:39 GMT
Organization: Amoco Corporation
Lines: 11
Message-ID: <3jkngf$60f@cronkite.amoco.com>
Reply-To: wmmounts@amoco.com

I am looking for definitions of the following line codes, which are currently in use with the SWISS-PROT database.  Any help is greatly appreciated.

GN
RP
RC
RM

Thanks again...

Bill


From owner-embldatabank@net.bio.net Thu Mar 09 22:00:00 1995
Newsgroups: bionet.molbio.embldatabank
Path: biosci!bcm!cs.utexas.edu!news.sprintlink.net!EU.net!Belgium.EU.net!idefix.CS.kuleuven.ac.be!reks.uia.ac.be!news
From: bone@reks.uia.ac.be (Darth Vader)
Subject: Re: IMPORTANT OPPORTUNITY
Message-ID: <1995Mar10.082301.11869@reks.uia.ac.be>
Sender: news@reks.uia.ac.be (USENET News System)
Organization: University of Antwerp
X-Newsreader: <WinQVT/Net v3.9>
Date: Fri, 10 Mar 1995 08:23:01 GMT
Lines: 2

..maybe we should ask mafia for a protection against bitch like that and any 
other idiot that puts a message scoring more than 90 on the bullshit meter?

From owner-embldatabank@net.bio.net Thu Mar 09 22:00:00 1995
Newsgroups: bionet.software,bionet.molbio.genbank,bionet.molbio.embldatabank,bionet.molbio.proteins
Path: biosci!bcm!cs.utexas.edu!swrinde!gatech!newsxfer.itd.umich.edu!zip.eecs.umich.edu!caen!hearst.acc.Virginia.EDU!murdoch!reed0.med.Virginia.EDU!wrp
From: wrp@reed0.med.Virginia.EDU (Bill Pearson)
Subject: FASTA 2.0x avaiable
X-Nntp-Posting-Host: reed0.med.virginia.edu
Message-ID: <D58EFt.9Gx@murdoch.acc.Virginia.EDU>
Sender: usenet@murdoch.acc.Virginia.EDU
Organization: University of Virginia
Date: Fri, 10 Mar 1995 15:45:29 GMT
Lines: 41
Xref: biosci bionet.software:11426 bionet.molbio.genbank:1970 bionet.molbio.embldatabank:467 bionet.molbio.proteins:3942


A new (experimental) release of the FASTA program package is now
available from virginia.edu in pub/fasta/fasta20x.shar(.Z). Version
2.0x incorporates several major improvements in the FASTA and SSEARCH
(Smith-Waterman) sequence searching programs:

(1) Explicit statistical estimates are now available from FASTA,
TFASTA, and SSEARCH.  Expectation values (a la BLAST) are provided for
library similarity scores.  In addition, raw similarity scores are
normalized to correct for the expected length-dependence of similarity
scores.  The statistical estimates are very accurate for SSEARCH and
for FASTA if the "-o" option is used.  The "-o" option dramatically
improves the performance of FASTA - at a cost in execution time - and I
strongly recommend that you use it as often as possible for proteins.
For DNA it serves no purpose.

(2) FASTA now uses the rigorous Smith-Waterman algorithm to produce
alignments.  Thus, there is no limit on the length of an alignment.

(3) The BLOSUM50 matrix is now used by default.

(4) It is easy to change the penalties for the first residue (now -12
by default) and each additional residue (-2) in a gap.

(5) A new alignment option "-m 4", makes it easy to display the part
of the query sequence that is aligned to the library sequence.  For
many protein families, one expects the alignment to extend from
one end of the sequence to the other.  This display makes it easy to
see when the alignments become much shorter.
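
Items (2) and (4) above can be illustrated with a short Python sketch. It assumes that the "first residue" penalty is charged once per gap and the "additional residue" penalty once for every further residue in that gap, so a gap of length k costs -12 + (k - 1) * (-2) with the defaults. The local-alignment scorer is a generic Smith-Waterman recurrence with affine gaps (Gotoh's formulation), not the FASTA source code, and it uses a simple match/mismatch score (+5/-4, chosen here for illustration) in place of the BLOSUM50 matrix.

```python
def gap_penalty(length: int, first: int = -12, additional: int = -2) -> int:
    """Cost of one gap of `length` residues under the assumed affine scheme."""
    if length < 0:
        raise ValueError("gap length must be non-negative")
    if length == 0:
        return 0
    return first + additional * (length - 1)


def local_score(a: str, b: str, match: int = 5, mismatch: int = -4,
                first: int = -12, additional: int = -2) -> int:
    """Best Smith-Waterman local alignment score with affine gap penalties."""
    neg_inf = float("-inf")
    rows, cols = len(a) + 1, len(b) + 1
    # H: best score of an alignment ending at (i, j)
    # E: best score ending with a gap in `a`; F: ending with a gap in `b`
    H = [[0] * cols for _ in range(rows)]
    E = [[neg_inf] * cols for _ in range(rows)]
    F = [[neg_inf] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            E[i][j] = max(H[i][j - 1] + first, E[i][j - 1] + additional)
            F[i][j] = max(H[i - 1][j] + first, F[i - 1][j] + additional)
            diagonal = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            H[i][j] = max(0, diagonal, E[i][j], F[i][j])
            best = max(best, H[i][j])
    return best


print(gap_penalty(1))   # -12
print(gap_penalty(3))   # -16
# Twenty matches (100) minus one gap of three residues (16) gives 84.
print(local_score("A" * 10 + "C" * 10, "A" * 10 + "GGG" + "C" * 10))
```

Because the matrices cover the full length of both sequences, the alignment can be as long as the sequences themselves, which is the point made in item (2).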


This version has only been tested for a few weeks, thus the version
number 2.0x. Please let me know if you have any problems with it.

Bill Pearson
wrp@virginia.edu
-- 
wrp@virginia.EDU
Dept. of Biochemistry #440
U. of Virginia
Charlottesville, VA 22908

From owner-embldatabank@net.bio.net Fri Mar 10 22:00:00 1995
Newsgroups: bionet.software,bionet.molbio.genbank,bionet.molbio.embldatabank,bionet.molbio.proteins
Path: biosci!rutgers!rockyd!notes
From: "Dr. Kent L. Nastiuk" <nastiuk@rockvax.rockefeller.edu>
Subject: Re: FASTA 2.0x avaiable
X-Nntp-Posting-Host: nastiuk_pc.rockefeller.edu
Message-ID: <D5AHto.HJ3@rockyd.rockefeller.edu>
Sender: notes@rockyd.rockefeller.edu (News Administrator)
Organization: Rockefeller University
References:  <D58EFt.9Gx@murdoch.acc.Virginia.EDU>
Date: Sat, 11 Mar 1995 18:53:47 GMT
Lines: 14
Xref: biosci bionet.software:11437 bionet.molbio.genbank:1973 bionet.molbio.embldatabank:470 bionet.molbio.proteins:3952

wrp@reed0.med.Virginia.EDU (Bill Pearson) wrote:
>
> 
> A new (experimental) release of the FASTA program package is now
> available from virginia.edu in pub/fasta/fasta20x.shar(.Z). Version
> 2.0x incorporates several major improvements in the FASTA and SSEARCH
> (Smith-Waterman) sequence searching programs:
> 
i tried to login to 'virginia.edu' and 'reed0...', but could access neither

could you provide better ftp site information, including how to login
(ie guest, anonymous, or whatever)?

thanks

From owner-embldatabank@net.bio.net Fri Mar 10 22:00:00 1995
Path: biosci!bcm!cs.utexas.edu!howland.reston.ans.net!news2.near.net!das-news2.harvard.edu!oitnews.harvard.edu!hsdndev!purdue!mozo.cc.purdue.edu!macg203d.bio.purdue.edu!user
From: shall@bilbo.bio.purdue.edu (Stephen Hall)
Newsgroups: bionet.molbio.embldatabank
Subject: Re: IMPORTANT OPPORTUNITY
Date: Sat, 11 Mar 1995 14:40:21 -0800
Organization: Purdue University
Lines: 12
Message-ID: <shall-1103951440210001@macg203d.bio.purdue.edu>
References: <1995Mar10.082301.11869@reks.uia.ac.be>
NNTP-Posting-Host: macg203d.bio.purdue.edu

In article <1995Mar10.082301.11869@reks.uia.ac.be>, bone@reks.uia.ac.be
(Darth Vader) wrote:

> ..maybe we should ask mafia for a protection against bitch like that and any 
> other idiot that puts a message scoring more than 90 on the bullshit meter?

One responded that this type of posting that Vader refers to is evidence
that brothers and sisters shouldn't mate.  But people like Mr. Haythorn,
one of the originators of this posting, and their families have worked at
inbreeding for many generations.  I mean they should feel a real sense of
accomplishment that they have managed to bring the recessive mutation for
idiocy into the homozygous state.

From owner-embldatabank@net.bio.net Fri Mar 10 22:00:00 1995
Path: biosci!biosci!not-for-mail
From: Bill Pearson <wrp@reed0.med.virginia.edu>
Newsgroups: bionet.software,bionet.molbio.embldatabank,bionet.molbio.genbank,bionet.announce,bionet.molbio.proteins
Subject: FASTA 2.0x available
Date: 11 Mar 1995 07:30:55 -0800
Organization: University of Virginia
Lines: 40
Sender: kristoff@net.bio.net
Approved: bionews-moderator@net.bio.net
Distribution: world
Message-ID: <D57Gxr.LD0@murdoch.acc.Virginia.EDU>
NNTP-Posting-Host: net.bio.net
Xref: biosci bionet.software:11434 bionet.molbio.embldatabank:468 bionet.molbio.genbank:1972 bionet.announce:1887 bionet.molbio.proteins:3949

A new (experimental) release of the FASTA program package is now
available from virginia.edu in pub/fasta/fasta20x.shar(.Z). Version
2.0x incorporates several major improvements in the FASTA and SSEARCH
(Smith-Waterman) sequence searching programs:

(1) Explicit statistical estimates are now available from FASTA,
TFASTA, and SSEARCH.  Expectation values (a la BLAST) are provided for
library similarity scores.  In addition, raw similarity scores are
normalized to correct for the expected length-dependence of similarity
scores.  The statistical estimates are very accurate for SSEARCH and
for FASTA if the "-o" option is used.  The "-o" option dramatically
improves the performance of FASTA - at a cost in execution time - and I
strongly recommend that you use it as often as possible for proteins.
For DNA it serves no purpose.

(2) FASTA now uses the rigorous Smith-Waterman algorithm to produce
alignments.  Thus, there is no limit on the length of an alignment.

(3) The BLOSUM50 matrix is now used by default.

(4) It is easy to change the penalties for the first residue (now -12
by default) and each additional residue (-2) in a gap.

(5) A new alignment option "-m 4", makes it easy to display the part
of the query sequence that is aligned to the library sequence.  For
many protein families, one expects the alignment to extend from
one end of the sequence to the other.  This display makes it easy to
see when the alignments become much shorter.


This version has only been tested for a few weeks, thus the version
number 2.0x. Please let me know if you have any problems with it.

Bill Pearson
wrp@virginia.edu
-- 
wrp@virginia.EDU
Dept. of Biochemistry #440
U. of Virginia
Charlottesville, VA 22908

From owner-embldatabank@net.bio.net Sat Mar 11 22:00:00 1995
Newsgroups: bionet.software,bionet.molbio.genbank,bionet.molbio.embldatabank,bionet.molbio.proteins
Path: biosci!lhc!borduas!francis
From: francis@borduas.nlm.nih.gov (Francis Ouellette)
Subject: Re: FASTA 2.0x avaiable
Message-ID: <1995Mar12.220021.15425@nlm.nih.gov>
Followup-To: bionet.software,bionet.molbio.genbank,bionet.molbio.embldatabank,bionet.molbio.proteins
Sender: news@nlm.nih.gov
Organization: National Library of Medicine
X-Newsreader: TIN [version 1.2 PL2]
References: <D58EFt.9Gx@murdoch.acc.Virginia.EDU> <D5AHto.HJ3@rockyd.rockefeller.edu>
Date: Sun, 12 Mar 95 22:00:21 GMT
Lines: 50
Xref: biosci bionet.software:11444 bionet.molbio.genbank:1974 bionet.molbio.embldatabank:471 bionet.molbio.proteins:3956

Dr. Kent L. Nastiuk (nastiuk@rockvax.rockefeller.edu) wrote:

> wrp@reed0.med.Virginia.EDU (Bill Pearson) wrote:
> >
> > 
> > A new (experimental) release of the FASTA program package is now
> > available from virginia.edu in pub/fasta/fasta20x.shar(.Z). Version
> > 2.0x incorporates several major improvements in the FASTA and SSEARCH
> > (Smith-Waterman) sequence searching programs:
> > 
> i tried to login to 'virginia.edu' and 'reed0...', but could access neither

> could you provide better ftp site information, including how to login
> (ie guest, anonymous, or whatever)?

If you look in Amos Bairoch's 'serv_ftp.txt' document, you
will find these coordinates (login anonymous, use your e-mail
address as password):

--------------------------------------------------------------------
Organiz.: University of Virginia / USA
Address : uvaarpa.virginia.edu (128.143.2.7)
Programs: Name=FASTA; Desc=database search program; OS=UNIX and DOS;
Directory= /pub/fasta
Contact : William Pearson; wrp@virginia.edu
Status  : Tested (30 Sep 1992).
--------------------------------------------------------------------

I went to check, and all the files were in the right place
... so if Amos sees this mail, he can update the 'Status' to

>> Status  : Tested (12 Mar 1995).

The latest version of Amos' document is available by anonymous
FTP from expasy.hcuge.ch (IP address 129.195.254.61) in the
/database/info directory as 'serv_ftp.txt'.

This (and many others) is also available from the WWW 
at this URL:

http://expasy.hcuge.ch/info/serv_ftp.txt
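
For readers who would rather script the anonymous login described above, here is a minimal Python sketch using the standard ftplib module. The host and directory are taken from the serv_ftp.txt record quoted in this post; they date from 1995 and may well no longer be reachable, so treat them as placeholders. The function names and the e-mail address are hypothetical.

```python
from ftplib import FTP
from typing import List

HOST = "uvaarpa.virginia.edu"   # from the quoted record; may no longer exist
DIRECTORY = "/pub/fasta"        # from the quoted record


def list_directory(ftp, directory: str, email: str) -> List[str]:
    """Log in anonymously (e-mail address as password) and list a directory.

    `ftp` is any object offering ftplib.FTP's login/cwd/nlst methods, which
    keeps the function easy to try out without a network connection.
    """
    ftp.login(user="anonymous", passwd=email)
    ftp.cwd(directory)
    return ftp.nlst()


def fetch_listing(email: str, host: str = HOST,
                  directory: str = DIRECTORY) -> List[str]:
    """Open a real connection; only works if the server still answers."""
    with FTP(host, timeout=30) as ftp:
        return list_directory(ftp, directory, email)
```

Calling fetch_listing("you@example.org") would attempt the real connection; list_directory can be exercised on its own with a stand-in object.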

regards,

francis

--
| B.F. Francis Ouellette  
|
| francis@ncbi.nlm.nih.gov   

From owner-embldatabank@net.bio.net Sat Mar 11 22:00:00 1995
Newsgroups: bionet.molbio.embldatabank
Path: biosci!lhc!borduas!francis
From: francis@borduas.nlm.nih.gov (Francis Ouellette)
Subject: Re: SWISS-PROT File Formats...
Message-ID: <1995Mar12.222054.15798@nlm.nih.gov>
Sender: news@nlm.nih.gov
Organization: National Library of Medicine
X-Newsreader: TIN [version 1.2 PL2]
References: <3jkngf$60f@cronkite.amoco.com>
Date: Sun, 12 Mar 95 22:20:54 GMT
Lines: 58

Bill Mounts (wmmounts@amoco.com) wrote:

> I am looking for definitions of the following 
> line codes, which are currently in use with the SWISS-PROT 
> database.  Any help is greatly appreciated.

> GN
> RP
> RC
> RM

Bill,

If you look at the SWISS-PROT user manual, you will find the 
information below (and lots more too).

regards,

francis

--
| B.F. Francis Ouellette  
|
| francis@ncbi.nlm.nih.gov   



Each line  begins with  a two-character  line code, which
indicates the type of  data contained  in the  line. The  current
line types and line codes and the order in which they appear in an
entry, are shown below:

ID     - Identification.
AC     - Accession number(s).
DT     - Date.
DE     - Description.
GN     - Gene name(s).
OS     - Organism species.
OG     - Organelle.
OC     - Organism classification.
RN     - Reference number.
RP     - Reference position.  
RC     - Reference comments.
RM     - Reference Medline.  
RA     - Reference authors.  
RL     - Reference location.  
CC     - Comments or notes.  
DR     - Database cross-references.  
KW     - Keywords. 
FT     - Feature table data.  
SQ     - Sequence header.  
       - (blanks) sequence data.  
//     - Termination line.

rest of document available from:

http://expasy.hcuge.ch/txt/userman.txt
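
The line-code table above maps directly onto a lookup table. The following Python sketch labels each line of an entry with the meaning of its code. The dictionary is copied from the list in this post; the sample lines are made up purely for illustration and are not a real SWISS-PROT entry, and the function names are hypothetical.

```python
from typing import Iterator, Tuple

# Line codes and meanings as listed in the user manual excerpt above.
LINE_CODES = {
    "ID": "Identification",
    "AC": "Accession number(s)",
    "DT": "Date",
    "DE": "Description",
    "GN": "Gene name(s)",
    "OS": "Organism species",
    "OG": "Organelle",
    "OC": "Organism classification",
    "RN": "Reference number",
    "RP": "Reference position",
    "RC": "Reference comments",
    "RM": "Reference Medline",
    "RA": "Reference authors",
    "RL": "Reference location",
    "CC": "Comments or notes",
    "DR": "Database cross-references",
    "KW": "Keywords",
    "FT": "Feature table data",
    "SQ": "Sequence header",
    "  ": "Sequence data",
    "//": "Termination line",
}


def describe(code: str) -> str:
    """Return the meaning of a two-character line code."""
    return LINE_CODES.get(code, "unknown line code")


def annotate(entry_text: str) -> Iterator[Tuple[str, str, str]]:
    """Yield (code, meaning, content) for every non-empty line of an entry."""
    for line in entry_text.splitlines():
        if not line.strip():
            continue
        code = line[:2]
        yield code, describe(code), line[2:].strip()


# Hypothetical lines, invented for this example only.
EXAMPLE = """\
ID   EXAMPLE_ENTRY
GN   EXAMPLEGENE.
RN   [1]
     MAAAK
//
"""

for code, meaning, content in annotate(EXAMPLE):
    print(repr(code), meaning, content)
```

The four codes asked about in the original question resolve as describe("GN"), describe("RP"), describe("RC") and describe("RM").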


From owner-embldatabank@net.bio.net Sun Mar 12 22:00:00 1995
Newsgroups: bionet.molbio.genome-program,bionet.molbio.embldatabank,embnet.general
Path: biosci!agate!howland.reston.ans.net!pipex!oleane!jussieu.fr!citi2.fr!bioftp.unibas.ch!doelz
From: doelz@comp.bioz.unibas.ch (Reinhard Doelz)
Subject: Re: an EMBL gopher
Message-ID: <1995Mar12.085904.8295@comp.bioz.unibas.ch>
Organization: EMBnet Switzerland [Basel]
X-Newsreader: TIN [version 1.2 PL2]
References: <consaleg-110395154538@192.167.193.111>
Date: Sun, 12 Mar 1995 08:59:04 GMT
Lines: 54
Xref: biosci bionet.molbio.genome-program:1261 bionet.molbio.embldatabank:473

G. Giacomo Consalez (consaleg@dibit.hsr.it) wrote:
: I have been looking for a functioning EMBL gopher server to retrieve
: genembl and genpept entries, just like I regularly do for genbank,
: swissprot and PIR entries. None seems to be running. A www server would be
: fine, although gopher products are much faster.


To my knowledge, we run the only database gopher on EMBL and EMBL updates 
world-wide. There are on average more than 300 requests served per day.
Peak requests are more than 1200 daily which is due to some unfortunate 
ambition to run automatic scripts.  

There are two issues to be kept in mind.
	(1) The server runs on an Indigo3000, which is the same machine 
            which we introduced the service on in 1992. This machine 
            runs brilliantly but deserves a major CPU and memory upgrade 
            which is unfortunately unrealistic for such a service. 
            Updating GOPHER indices is a pretty resource-consuming step. 
            We currently need more than an hour to create the waisindex.
            If you approach the server at that time you will get rejected.
            Depending on load, it can take up to a minute if you do get 
            a result in peak hours. 

        (2) WAIS-indexing is a full-text search. This means that a full 
            text is analyzed for keywords. The more keywords there are 
            the slower WAIS will become. Worse, it will not allow searches
            for trivial words ('protein', 'mrna') as it has a built-in 
            limitation to not incorporate words occurring more than a given
            number of times.
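
The effect described in point (2) can be reproduced with a toy inverted index in Python. This is only a sketch of the general idea (dropping any word that occurs in more than a cutoff number of documents), not WAIS's actual indexing algorithm; the cutoff value, the function names and the three sample documents are all assumptions made up for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Set


def build_index(docs: Dict[str, str], max_doc_count: int) -> Dict[str, Set[str]]:
    """Map each word to the documents containing it, dropping frequent words."""
    postings: Dict[str, Set[str]] = defaultdict(set)
    for doc_id, text in docs.items():
        for word in set(text.lower().split()):
            postings[word].add(doc_id)
    # Words found in more than max_doc_count documents are not indexed at all.
    return {w: ids for w, ids in postings.items() if len(ids) <= max_doc_count}


def search(index: Dict[str, Set[str]], word: str) -> List[str]:
    """Return the sorted ids of documents indexed under `word`."""
    return sorted(index.get(word.lower(), set()))


# Hypothetical mini-database of three description lines.
DOCS = {
    "e1": "human jak3 protein tyrosine kinase mrna",
    "e2": "human jak1 protein tyrosine kinase mrna",
    "e3": "human tyk2 protein mrna",
}

index = build_index(DOCS, max_doc_count=2)
print(search(index, "jak1"))      # a rare word is still found
print(search(index, "protein"))   # a 'trivial' word returns nothing
```

With the cutoff at two documents, 'protein' and 'mrna' occur in all three entries and vanish from the index, so a search for them comes back empty even though every entry contains them.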

: Anybody with a good suggestion?

Use the WWW-based SRS browser from Thure Etzold; on many EMBnet nodes, 
including the one in Bari, you'll find local versions also. 
http://www.embl-heidelberg.de/srs/status.html is the overview on the 
available SRS servers on the network. 

: Thanks.

Question for the community: How long will this service need to run? 
Despite the disk space resources the machine which it is running on 
has been frozen in its OS. The code has been modified to an extent which 
makes it highly questionable whether it will ever run on a more advanced
OS version. This implies that, if the box dies, the service will die as 
well with short notice. Would you bother? 

Regards
Reinhard Doelz
EMBnet Switzerland
-- 
 R.Doelz         Klingelbergstr.70| Tel. x41 61 267 2247  Fax x41 61 267 2078|
 Biocomputing        CH 4056 Basel| electronic Mail    doelz@ubaclu.unibas.ch|
 Biozentrum der Universitaet Basel|-------------- Switzerland ---------------|
<a href=http://beta.embnet.unibas.ch/>EMBnet Switzerland:info@ch.embnet.org</a> 

From owner-embldatabank@net.bio.net Sun Mar 12 22:00:00 1995
Newsgroups: bionet.molbio.embldatabank,embnet.general
Path: biosci!agate!howland.reston.ans.net!news.sprintlink.net!pipex!oleane!jussieu.fr!citi2.fr!bioftp.unibas.ch!doelz
From: doelz@comp.bioz.unibas.ch (Reinhard Doelz)
Subject: Re: an EMBL gopher
Message-ID: <1995Mar12.100054.9038@comp.bioz.unibas.ch>
Followup-To: bionet.molbio.embldatabank,embnet.general
Organization: EMBnet Switzerland [Basel]
X-Newsreader: TIN [version 1.2 PL2]
References: <consaleg-110395154538@192.167.193.111> <1995Mar12.085904.8295@comp.bioz.unibas.ch>
Date: Sun, 12 Mar 1995 10:00:54 GMT
Lines: 78

Reinhard Doelz (doelz@comp.bioz.unibas.ch) wrote:

:         (2) WAIS-indexing is a full-text search. This means that a full 
:             text is analyzed for keywords. The more keywords there are 
:             the slower WAIS will become. Worse, it will not allow searches
:             for trivial words ('protein', 'mrna') as it has a built-in 
:             limitation to not incorporate words occurring more than a given
:             number of times.

Colleagues, to give you an example, look at the xembl index and see that 
the following words are affected (about 300). This is not a real word
but a phenomenon caused by not merging the index block of this word.
(xembl is only the daily updates. Real 'embl' has a much higher exclusion
rate.)

adams, adamskerlavage, animalia, animaliametazoa, assessment, based,
basedupon, basepairs, bednarik, bednarikcao, blake, blakebrandon, bp,
brandon, brandonchiu, bult, bultlee, cao, caocepeda, catarrhini,
catarrhinihominidae, cdna, cepeda, cepedacoleman, chiu, chiuclayton,
chordata, chordatavertebrata, clark, clarkdubuque, clayton, claytoncline,
cline, clinecotton, clone, coleman, colemancollins, collins, collinsdimke,
cotton, cottonearle, dillon, dillonfannon, dimke, dimkefeng, diversity,
dubuque, dubuqueelliston, earle, earlehughes, elliston, ellistonhawkins,
est, esthomo, estproject, eukaryota, eukaryotaanimalia, eukaryotaplantae,
eutheria, eutheriaprimates, expression, expressionpatterns, fannon,
fannonrosen, feng, fengferrie, ferrie, ferriefischer, fields,
fieldsfraser, fine, finefitzgerald, fischer, fischerhastings, fitzgerald,
fitzgeraldfitzhugh, fitzhugh, fitzhughfritchman, fleischmanfuldner,
fleischmann, fragment, fraser, fraserventer, fritchman,
fritchmangeoghagen, fuldner, fuldnerbult, gene, genediversity, genexpress,
genexpressgenexpress, genexpressthe, geoghagen, geoghagenglodek, glodek,
glodekgnehm, gnehm, gnehmhanna, gocayne, gocaynewhite, greene,
greenegruber, gruber, gruberhudson, hanna, hannahedblom, haplorhini,
haplorhinicatarrhini, haseltine, haseltinefields, hastings, hastingshe,
hawkins, hawkinsholman, hedblom, hedblomhinkle, hehu, hillier,
hillierclark, hinkle, hinklejr, holman, holmanhultman, hominidae,
hominidaeadams, hominidaegenexpress, hominidaehillier, homo, hu, hudson,
hudsonkim, hughes, hughesfine, hugreene, hultman, hultmankucaba, human,
humangene, initial, initialassessment, ji, jili, jr, jrkelley, kelley,
kelleyklimek, kelleyliu, kerlavage, kerlavagefleischman, kim, kimkozak,
kirkness, kirknessweinstock, klimek, klimekkelley, kozak, kozakkunsch,
kucaba, kucabale, kunsch, kunschji, le, lee, leekirkness, lelennon,
lennon, lennonmarra, li, libednarik, limeissner, liu, liumarmaros,
mammalia, mammaliatheria, marmaros, marmarosmerrick, marra, marraparsons,
mcdonald, mcdonaldnguyen, meissner, meissnerolsen, merck, merckest,
merrick, merrickmoreno, metazoa, metazoachordata, millionbasepairs,
moreno, morenopalanques, nguyen, nguyenpellegrino, olsen, olsenraymond,
palanques, palanquesmcdonald, parsons, parsonsrifkin, partial, patterns,
patternsbased, pellegrino, pellegrinophillips, phillips, phillipsryder,
plantae, primates, primateshaplorhini, program, project, raymond,
raymondwei, rifkin, rifkinrohlfing, rna, rnaest, rohlfing, rohlfingtan,
rosen, rosenhaseltine, ruben, rubendillon, ryder, ryderscott, sapiens,
saudek, saudekshirley, scott, scottsaudek, sequence, shirley,
shirleysmall, similar, small, smallspriggs, spriggs, spriggsutterback,
standard, sutton, suttonblake, tan, tantrevaskis, thegenexpress, theria,
theriaeutheria, transcribed, trevaskis, trevaskiswaterston, utterback,
utterbackweidman, venter, venterinitial, vertebrata, vertebratamammalia,
washu, washumerck, waterston, waterstonwilliamson, wei, weidman,
weidmanli, weinstock, weinstockgocayne, weiwing, white, whitesutton,
williamson, williamsonwohldmann, wilson, wilsonwashu, wing, wingxu,
wohldmann, wohldmannwilson, xu, xuyu, yu, yuruben


As you can see, GOPHER  built on WAIS indices might fail to retrieve
some authors heavily involved in publishing :-)

More severely, retrieval expressions like 'mrna and mammalia' will 
not give the expected results. 


Regards
Reinhard

-- 
 R.Doelz         Klingelbergstr.70| Tel. x41 61 267 2247  Fax x41 61 267 2078|
 Biocomputing        CH 4056 Basel| electronic Mail    doelz@ubaclu.unibas.ch|
 Biozentrum der Universitaet Basel|-------------- Switzerland ---------------|
<a href=http://beta.embnet.unibas.ch/>EMBnet Switzerland:info@ch.embnet.org</a> 

From owner-embldatabank@net.bio.net Sun Mar 12 22:00:00 1995
Path: biosci!agate!howland.reston.ans.net!news.moneng.mei.com!uwm.edu!news.alpha.net!solaris.cc.vt.edu!swiss.ans.net!paperboy.amoco.com!cronkite!usenet
From: wmmounts@amoco.com (Bill Mounts)
Newsgroups: bionet.molbio.embldatabank
Subject: Re: SWISS-PROT File Formats...
Date: 13 Mar 1995 14:25:32 GMT
Organization: Amoco Corporation
Lines: 29
Message-ID: <3k1kks$2ak@cronkite.amoco.com>
References: <1995Mar12.222054.15798@nlm.nih.gov>
Reply-To: wmmounts@amoco.com

In article 15798@nlm.nih.gov, francis@borduas.nlm.nih.gov (Francis Ouellette) writes:
> Bill Mounts (wmmounts@amoco.com) wrote:
> > I am looking for definitions of the following
> > line codes which are currently in use with the SWISS-PROT 
> > database.  Any help is greatly appreciated.
> 
> > GN
> > RP
> > RC
> > RM
> 
> Bill,
> 
> if you look at the swiss-prot user manual you will find the 
> information below (and lots more too)
> 
> regards,
> 
> francis

My primary reason for writing was that I didn't have the most recent set of docs.  Thanks for your info, though.

Bill
 






From owner-embldatabank@net.bio.net Sun Mar 12 22:00:00 1995
Path: biosci!biosci!not-for-mail
From: Bill Pearson <wrp@reed0.med.virginia.edu>
Newsgroups: bionet.software,bionet.molbio.embldatabank,bionet.molbio.genbank,bionet.announce,bionet.molbio.proteins
Subject: Re: FASTA 2.0x available
Date: 13 Mar 1995 15:21:20 -0800
Organization: University of Virginia
Lines: 30
Sender: biohelp@net.bio.net
Approved: bionews-moderator@net.bio.net
Distribution: world
Message-ID: <D5C41w.GAJ@murdoch.acc.Virginia.EDU>
References: <D57Gxr.LD0@murdoch.acc.Virginia.EDU>
NNTP-Posting-Host: net.bio.net
Xref: biosci bionet.software:11456 bionet.molbio.embldatabank:476 bionet.molbio.genbank:1976 bionet.announce:1891 bionet.molbio.proteins:3964

In article <D57Gxr.LD0@murdoch.acc.Virginia.EDU>,
Bill Pearson  <wrp@reed0.med.virginia.edu> wrote:
>A new (experimental) release of the FASTA program package is now
>available from virginia.edu in pub/fasta/fasta20x.shar(.Z). Version
>2.0x incorporates several major improvements in the FASTA and SSEARCH
>(Smith-Waterman) sequence searching programs:

>(2) FASTA now uses the rigorous Smith-Waterman algorithm to produce
>alignments.  Thus, there is no limit to gap size in alignments.
                                         ^^^

Several people have reported problems in compiling the programs due to
oversights on my part. I believe that those problems have been fixed.

Another person reported some problem downloading the programs.  They
are available via anonymous ftp from the machine
"uvaarpa.virginia.edu", also known as "virginia.edu" and
"128.143.2.7".  FTP in as anonymous using your email address as the
password.

Keep those cards and letters coming. (If you have problems, an email
message is more likely to get a prompt response than a newsgroup
posting).

Bill Pearson
-- 
wrp@virginia.EDU
Dept. of Biochemistry #440
U. of Virginia
Charlottesville, VA 22908

From owner-embldatabank@net.bio.net Mon Mar 13 22:00:00 1995
Path: biosci!adam.cc.sunysb.edu!news.nysernet.net!news.sprintlink.net!uunet!fdn.fr!jussieu.fr!univ-lyon1.fr!serra.unipi.it!sirio.cineca.it!maya.dei.unipd.it!civ!stdvolp
From: stdvolp@civ.bio.unipd.it (Studenti Volpin)
Newsgroups: bionet.molbio.embldatabank
Subject: 3D-reconstruction
Date: 14 Mar 1995 16:01:39 GMT
Organization: Biology Dept. - University of Padova - Italy
Lines: 17
Distribution: eunet
Message-ID: <3k4el3$7i6@maya.dei.unipd.it>
NNTP-Posting-Host: civ.bio.unipd.it

Hallo out there,
I'm a PhD doing research on mouse development.
I need software that lets me build a three-dimensional
picture of the material under study starting from simple histological sections.
I'm thinking of a scanning device combined with a mathematical
reconstruction model.
I understand that some people are working with this kind
of program.
Any suggestions?

Please reply by e-mail to stdvolp@cribi1.unipd.it

Thank you in advance.

                                      Donatella



From owner-embldatabank@net.bio.net Tue Mar 14 22:00:00 1995
Path: biosci!daresbury!bioftp.unibas.ch!citi2.fr!jussieu.fr!fdn.fr!uunet!news.tele.fi!news.funet.fi!news.cc.tut.fi!vuokko!bltihi
From: bltihi@uta.fi (Timo Hiltunen)
Newsgroups: bionet.molbio.genome-program,bionet.molbio.embldatabank,embnet.general
Subject: Re: an EMBL gopher
Followup-To: bionet.molbio.embldatabank
Date: 14 Mar 1995 10:05:55 GMT
Organization: University of Tampere, Finland
Lines: 15
Distribution: world
Message-ID: <3k3pq3$7fe@cc.tut.fi>
References: <consaleg-110395154538@192.167.193.111> <1995Mar12.085904.8295@comp.bioz.unibas.ch>
NNTP-Posting-Host: vuokko.uta.fi
X-Newsreader: TIN [version 1.2 PL1]
Xref: biosci bionet.molbio.genome-program:1269 bionet.molbio.embldatabank:478

Reinhard Doelz (doelz@comp.bioz.unibas.ch) wrote:
: To my knowledge, we run the only database gopher on EMBL and EMBL updates 
: world-wide. There are on average more than 300 requests served per day.
: Peak requests are more than 1200 daily which is due to some unfortunate 
: ambition to run automatic scripts.  

: Question for the community: How long will this service need to run? 

I, for example, use EMBL databank (in Switzerland) rather often, and think
that there should be sequence databases that are also accessible via
text-based (= non-WWW) methods.

Timo Hiltunen
from the Univ. of Tampere, Finland


From owner-embldatabank@net.bio.net Wed Mar 15 22:00:00 1995
Newsgroups: bionet.software.srs,bionet.molbio.embldatabank
Path: biosci!daresbury!hgmp.mrc.ac.uk!ebi.ac.uk!jecop
From: jecop@ebi.ac.uk (Jeroen Coppieters)
Subject: SRS-FTP gateway (beta release 1.1b)
Sender: news@ebi.ac.uk (Mr news)
Message-ID: <D5Jrq6.IHq@ebi.ac.uk>
Date: Thu, 16 Mar 1995 19:06:06 GMT
Organization: European Bioinformatics Institute
X-Newsreader: TIN [version 1.2 PL2]
Lines: 210
Xref: biosci bionet.software.srs:25 bionet.molbio.embldatabank:479

I sent this out a week ago, but due to local distribution problems, most of
the world never saw it.
For those who did see the previous announcement,
an important new feature has been added, so do read on.
People looking for non-WWW access to the databases may also be interested.

Included is a short manual on the use of an FTP-to-SRS gateway, developed
at the EMBL Outstation - EBI. It can be used right now, but I cannot yet
guarantee that the service will be officially supported.
Please try it, and if you find any problems or have any suggestions, let
me know (jecop@ebi.ac.uk).
The idea is that it can be used for automated checking and retrieval of new
sequences of interest.

Jeroen

-
USING SRS (Sequence Retrieval System [1]) from Anonymous FTP
------------------------------------------------------------
(Jeroen Coppieters, 17-Mar-1995, srsftp version 1.1beta)

This document describes SRSFTP as it is maintained at the
EMBL Outstation, the European Bioinformatics Institute,
hereafter referred to as "the EBI".

The anonymous FTP server of the EBI has a gateway to SRS,
developed by Jeroen Coppieters and RJ White.
This allows the retrieval of sequences from Swissprot and EMBL (as well as
all other databases that are maintained in SRS at the EBI [2]), using
the power of SRS.
A query can be executed from any directory, after connecting to the
anonymous ftp server.

All results of the query will be stored in the file you name.
The sequences are stored in flat-file EMBL format by default.

The * wildcard can be used in any query.

Three query formats are available:
 - simple sequence retrieval
 - simple srs query
 - full srs query

1) SIMPLE SEQUENCE RETRIEVAL
---------------------------
FORMAT: 
get DB:INDEX:QUERYSTRING FILENAME

DB is one of the following:
embl, emblnew, emblall, nuc
swissprot, swissnew, swissall, pep
If one of the xxall databases (emblall, swissall) is specified, both the release
and the updates are searched. If a sequence has been updated since the last
release, both the old and new entry will be returned.

INDEX    DATABASE SEARCHFIELD        QUERYSTRING
acc      accession number            accession nr (e.g. X07888)
id       identifier                  identifier (e.g. ATP6_YEAST)
dat      date                        date (e.g. 20-NOV-1994)
fts      feature                     feature name (e.g. intron)
ref      reference                   Journal reference
                                     (e.g. Plant Mol. Biol. 10:91-104(1987))
sl       sequence length             number (e.g. 2400)
                                     or range (e.g. 2300:5000)
def      definition                  string
aut      author                      string
cc       comment                     string
org      organism                    string
tit      reference title             string
all      all text fields             string

EXAMPLES:
get emblall:acc:x07888 x07888.seq
retrieve sequence from embl/emnew with accession number x07888
store in the file x07888.seq

get swissprot:all:nitrate* nitrate.pep
get all swissprot entries that have a word starting with nitrate
in any of the text fields.
Store in the file nitrate.pep
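
The simple retrieval format above can be sketched in a few lines of Python.
This is only an illustration that assembles the "get" line from its parts;
the function name simple_get is hypothetical, the database and index names
are copied from the lists above, and nothing here contacts the FTP server.

```python
# Database and index names copied from the lists in the manual above.
DATABASES = {"embl", "emblnew", "emblall", "nuc",
             "swissprot", "swissnew", "swissall", "pep"}
INDEXES = {"acc", "id", "dat", "fts", "ref", "sl",
           "def", "aut", "cc", "org", "tit", "all"}


def simple_get(db, index, query, filename):
    """Return the ftp 'get' line for a simple sequence retrieval.

    Hypothetical helper: it only checks the database and index names
    and joins the parts as DB:INDEX:QUERYSTRING FILENAME.
    """
    if db not in DATABASES:
        raise ValueError(f"unknown database: {db}")
    if index not in INDEXES:
        raise ValueError(f"unknown index: {index}")
    return f"get {db}:{index}:{query} {filename}"


# The two examples from the manual, rebuilt from their parts.
print(simple_get("emblall", "acc", "x07888", "x07888.seq"))
print(simple_get("swissprot", "all", "nitrate*", "nitrate.pep"))
```

The sketch does no quoting of the query string, so how a query containing
spaces should be passed through a given ftp client is left open here.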


2) SIMPLE SRS QUERY
------------------
FORMAT:
get srs:QUERYSTRING FILENAME

This form allows several databases to be linked in one query.

For more information on SRS queries, have a look at the SRS manual.
This form does not, however, give access to the complete functionality of SRS.
Restrictions are: - NO SPACES are allowed in the query.
                  - no command line parameters can be included

EXAMPLES
get srs:[embl-fts:intron]>parent&[embl-org:arabidopsis*] arain.seq
retrieve all sequences from Arabidopsis spec. that contain an intron.

get srs:[prosite-id:PROTEIN_KINASE_TYR]>swissprot kinase.pep
retrieve all proteins, that contain a tyrosine kinase motif (PROSITE)

get srs:[prosite-id:PROTEIN_KINASE_TYR]>swissprot>pdb kinase.pdb
retrieve 3D structures (if known) from the above proteins

3) FULL SRS QUERY
-----------------
If you want to have access to command line options (e.g. to change the
output format), a full blown getz command is available.

FORMAT:
From UNIX, VMS+Multinet, MSDOS+NCSA ftp:
get getz+commandline+options+querystring filename

VMS+UCX:
get "getz+commandline+options+querystring" filename

- All spaces in the query should be replaced by plus (+)
- If you want to include a + in a query, use a double plus (++)
- do not put quotes round the querystring 
- if you want to escape a character with a backslash (\), duplicate
  the backslash (\\)
- contrary to forms 1 and 2, there is no colon in the command

- the getz program has a new command line option -pipe "string"
  where string can be "/bin/compress" or "/bin/gzip"
  this will compress the results before transmitting the data,
  thus minimizing network traffic
  Do not forget to use BINARY TRANSFER if you use this option

- On UNIX systems, whenever ftp expects a filename,
  this can be replaced by - (dash)
  the retrieved file will be printed on stdout 

EXAMPLES
get getz+-help file
retrieve help on the usage of getz

get getz+-libs libs.lst
get a list of all available databases

get getz+-d+-sf+fasta+-pipe+"/bin/gzip"+[emnew-org:drosophila*] dro.seq.gz
retrieve all Drosophila sp. sequences from emnew in fasta format. Gzip
before transmitting

get getz+-l+swissprot+-l+swissnew+[sq-id:*_ecoli] ecoli.lst
generate a list of all ecoli proteins in swissprot and swissnew
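
The escaping rules for the full query form can be sketched as follows. This
is a minimal illustration in Python, not part of the gateway itself: the
function name encode_getz is hypothetical, and it assumes the getz arguments
are available as a list of separate strings.

```python
def encode_getz(args):
    """Join getz arguments into the single remote name used in form 3.

    Applies the rules listed above: a backslash is doubled, a literal
    plus is doubled, and spaces become a single plus.
    """
    encoded = []
    for arg in args:
        arg = arg.replace("\\", "\\\\")  # backslash is sent doubled
        arg = arg.replace("+", "++")     # literal plus is sent doubled
        arg = arg.replace(" ", "+")      # spaces inside an argument become plus
        encoded.append(arg)
    # The separators between arguments were spaces, so they become plus too.
    return "+".join(["getz"] + encoded)


# Rebuilds two of the example commands above (remote name only).
print(encode_getz(["-help"]))
print(encode_getz(["-l", "swissprot", "-l", "swissnew", "[sq-id:*_ecoli]"]))
```

The literal plus must be doubled before spaces are converted, otherwise the
separators introduced in the last step would be doubled as well.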

============================================================
NOTES:
-----
[1] More information on SRS is available from the author: Thure Etzold
Via WWW: http://www.embl-heidelberg.de/srs/srsman.html

References:
Thure Etzold and Patrick Argos, SRS - an indexing and retrieval tool for flat 
file data libraries. Comput. Appl. Biosci. 9:49-57, 1993 

Thure Etzold and Patrick Argos, Transforming a set of biological flat file 
libraries to a fast access network. Comput. Appl. Biosci. 9:59-64, 1993 

[2] On March 14th 1995, these databases were:
                 Library   Group              Entries    Index Date
-------------------------------------------------------------------
               SWISSPROT   Sequence             40292       2/28/95
                SWISSNEW   Sequence              9036        3/8/95
                     PIR   Sequence             71995       2/28/95
                    EMBL   Sequence            234501       2/28/95
                   EMNEW   Sequence            102576       3/14/95
                   NRL3D   Sequence              4153        3/1/95
                   NRSUB   Sequence               248        3/1/95
                     PDB   ProteinStruct         3391        3/1/95
                    HSSP   ProteinStruct         3070        3/4/95
                    DSSP   ProteinStruct         2968        3/2/95
                     ALI   ProteinStruct           84        3/1/95
                    FSSP   ProteinStruct          436        3/4/95
                 PROSITE   SeqRelated            1029        3/1/95
              PROSITEDOC   SeqRelated             786        3/1/95
                  BLOCKS   SeqRelated             770        3/1/95
                     EPD   SeqRelated            1251        3/1/95
                    ECDC   SeqRelated            3571        3/1/95
                  ENZYME   SeqRelated            3546        3/1/95
                  REBASE   SeqRelated            2454        3/1/95
                  PRODOM   SeqRelated           23105        3/1/95
                 FLYGENE   SeqRelated            7126       3/10/95
                SWISSDOM   SeqRelated           28224        3/1/95
                  PIRALN   SeqRelated            1183        3/1/95
              SEQANALREF   Literature            2579        3/1/95
                 MEDLINE   Literature          179262        3/1/95
                    LIMB   Others                 120        3/1/95
                  TFSITE   TransFac              4042       3/14/95
                TFFACTOR   TransFac              1412       3/14/95
                   DBEST   TaggedSites           3264        3/8/95


--
======================================================================
         . O .                               Jeroen Coppieters
     . O O o   O .                            Software Support
   O O O O *o    O O          Jeroen.Coppieters@embl-ebi.ac.uk
  O O O O(   *o  )O O                     (or jecop@ebi.ac.uk)
  )O O O O   o*  O O(                         ++44 1223 494422 
  O O O O( o*    )O O
  )O O O O  *o   O O(                      EMBL Outstation EBI
  O O O O(   *o  )O O      (European Bioinformatics Institute)
  )O @ O O   o*  O O(                             Hinxton Hall
    O O O( o*   )O('                                   Hinxton
     ` O(   *o O  '                         Cambridge CB10 1RQ
         ` O '                                              UK
http://www.ebi.ac.uk
======================================================================

From owner-embldatabank@net.bio.net Thu Mar 16 22:00:00 1995
Newsgroups: bionet.molbio.embldatabank
Path: biosci!daresbury!hgmp.mrc.ac.uk!ebi.ac.uk!jecop
From: jecop@ebi.ac.uk (Jeroen Coppieters)
Subject: Re: an EMBL gopher
Sender: news@ebi.ac.uk (Mr news)
Message-ID: <D5Js9D.H9M@ebi.ac.uk>
Date: Thu, 16 Mar 1995 19:17:37 GMT
References: <consaleg-110395154538@192.167.193.111> <1995Mar12.085904.8295@comp.bioz.unibas.ch> <3k3pq3$7fe@cc.tut.fi>
Organization: European Bioinformatics Institute
X-Newsreader: TIN [version 1.2 PL2]
Lines: 50

Timo Hiltunen (bltihi@uta.fi) wrote:
: Reinhard Doelz (doelz@comp.bioz.unibas.ch) wrote:
: : To my knowledge, we run the only database gopher on EMBL and EMBL updates 
: : world-wide. There are on average more than 300 requests served per day.
: : Peak requests are more than 1200 daily which is due to some unfortunate 
: : ambition to run automatic scripts.  

: : Question for the community: How long will this service need to run? 

: I, for example, use EMBL databank (in Switzerland) rather often, and think
: that there should be sequence databases that are also accessible via
: text-based (= non-WWW) methods.
In another post I just announced an ftp-based access method. Would a similar
gopher system be of more use?
I could develop a similar gateway from gopher to SRS.
At the moment I've set up a simple one (only allowing retrieval by
accession number or entry name) as a quick test.
Have a look at gopher.ebi.ac.uk
 -->   9.  (Experimental) Sequence retrieval from EMBL/SWISSPROT database/
  -->  1.  Sequence retrieval from Swissprot <??>
       2.  Sequence retrieval from EMBL <??>
But, please, do not hardcode this. As I said, it's just a quick test, so
 - it can still go wrong
 - it might have to move somewhere else in the gopher-tree
 - I have no authorization from EMBL/EBI to support this as an official
   service

Let's hear some comments.

: Timo Hiltunen
: from the Univ. of Tampere, Finland

Jeroen Coppieters

--
======================================================================
         . O .                               Jeroen Coppieters
     . O O o   O .                            Software Support
   O O O O *o    O O          Jeroen.Coppieters@embl-ebi.ac.uk
  O O O O(   *o  )O O                     (or jecop@ebi.ac.uk)
  )O O O O   o*  O O(                         ++44 1223 494422 
  O O O O( o*    )O O
  )O O O O  *o   O O(                      EMBL Outstation EBI
  O O O O(   *o  )O O      (European Bioinformatics Institute)
  )O @ O O   o*  O O(                             Hinxton Hall
    O O O( o*   )O('                                   Hinxton
     ` O(   *o O  '                         Cambridge CB10 1RQ
         ` O '                                              UK
http://www.ebi.ac.uk
======================================================================

From owner-embldatabank@net.bio.net Tue Mar 21 22:00:00 1995
Newsgroups: bionet.molbio.embldatabank,ebi.general,embnet.general
Path: biosci!daresbury!hgmp.mrc.ac.uk!ebi.ac.uk!stoehr
From: stoehr@ebi.ac.uk (Peter Stoehr)
Subject: EMBL Release 42 built
Sender: news@ebi.ac.uk (Mr news)
Message-ID: <1995Mar22.123438@ebi.ac.uk>
Date: Wed, 22 Mar 1995 11:34:38 GMT
Lines: 96
Organization: European BioInformatics Institute

Release 42 of the EMBL Nucleotide Sequence Database (March 1995) is built and
installed on the EBI's e-mail, anonymous FTP, FASTA and WWW servers. It is also
installed already at some EMBnet sites.

Some extracts from the release notes are appended below.

Regards,
Peter Stoehr
EMBL - EBI
-----------

1  RELEASE 42

The EMBL nucleotide sequence database was frozen to make Release 42 on 8th March
1995.  The release contains 303,206 sequence entries comprising 262,559,786
nucleotides.  This represents an increase of about 16% over Release 41.  A
breakdown of Release 42 by taxonomic division is shown below:

                  Division             Entries    Nucleotides
                  -----------------    -------    ------------
                  Bacteriophage           1066         1493417
                  ESTs                  123526        39332522
                  Fungi                   8420        19940449
                  Invertebrates          13831        27610495
                  Organelles              8195         9364254
                  Other Mammals           6272         6976315
                  Other Vertebrates       7041         8144622
                  Plants                 11105        14145431
                  Primates               35290        36665648
                  Prokaryotes            21427        37074154
                  Rodents                23626        26850022
                  STSs                    7232         2288477
                  Synthetic               8597         4295284
                  Unclassified            6082         3577630
                  Viruses                21496        24801066
                  -----------------    -------    ------------
                  Total                 303206       262559786

                  plus:
                  Other patents           6686         2507063
                  -----------------    -------    ------------
                  Grand Total           309892       265066849



1.1  Literature Reference Identifiers

As previously announced, we have introduced identifiers into journal
references at this release.  We have created a new RX line type, which is
optional for any reference in the database, with the following format:

RX   database_name; identifier.

e.g.

RN   [1]
RP   1-549
RX   MEDLINE; 82196900.
RA   Hennighausen L.G., Sippel A.E.;
RT   "Mouse whey acidic protein is a novel member of the family of
RT   'four-disulfide core' proteins";
RL   Nucleic Acids Res. 10:2677-2684(1982).

In this release, there are links to over 50,000 MEDLINE records.
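
As an illustration of the RX format, the line can be split into its two
fields with a few lines of Python. This is a sketch only: the function name
parse_rx is hypothetical, and it assumes exactly the layout shown above
(line code, database name, semicolon, identifier, trailing full stop).

```python
def parse_rx(line):
    """Split an RX line into (database_name, identifier).

    Assumes the layout 'RX   database_name; identifier.' shown above.
    """
    if not line.startswith("RX"):
        raise ValueError("not an RX line")
    body = line[2:].strip()                      # drop the line code and padding
    database, _, identifier = body.partition(";")
    return database.strip(), identifier.strip().rstrip(".")


# The example line from the release notes.
print(parse_rx("RX   MEDLINE; 82196900."))
```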


2  FORTHCOMING CHANGES

2.1  Accession Numbers

It will soon be necessary to extend the range of possible accession numbers
available for the nucleotide sequence databases.  This will be an important
topic for an upcoming collaborative meeting between EMBL, GenBank and DDBJ, and
we do not wish to preempt the result of that discussion.  Inevitably though,
there must be a change to the present structure of accession numbers, which
consist of one prefix letter followed by 5 digits (e.g. X12399), and it is very
likely that accession numbers will become longer and contain more prefix
letters.  Existing accession numbers will remain valid as they are.  We will
announce such a significant change as widely, and with as much advance notice,
as possible.
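
For software that validates accession numbers, the present structure (one
prefix letter followed by 5 digits) can be written as a pattern. The Python
sketch below is an illustration only: the function name is hypothetical, it
covers the current format and nothing else, and accepting lower case is an
assumption made here. Since the structure is expected to change, such a
check is best kept in one place rather than hardcoded throughout a program.

```python
import re

# One prefix letter followed by exactly five digits, e.g. X12399.
# Assumption: lower-case input such as x07888 is accepted as well.
PRESENT_ACCESSION = re.compile(r"[A-Za-z][0-9]{5}")


def is_present_format(accession):
    """Return True if the string matches the present accession structure."""
    return PRESENT_ACCESSION.fullmatch(accession) is not None


print(is_present_format("X12399"))    # present format
print(is_present_format("AB123456"))  # longer form, not the present format
```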


2.2  EST Database Divisions

The number of EST sequences is growing rapidly and will continue to do so for
some time.  In order to keep the size of the data files within reasonable limits
for handling purposes, we propose to split the EST division into several files
named EST1.DAT, EST2.DAT etc. at the next release (Release 43, June 1995).

2.3  Feature Identifiers

We are investigating ways of assigning unique identifiers to sequence features
described within the Feature Table.  We will initially focus on CDS features to
enable a finer level of cross-referencing than at present between the nucleotide
and protein sequence databases.  We are hoping to adopt a common approach with
our collaborators at DDBJ and GenBank.

From owner-embldatabank@net.bio.net Fri Mar 24 22:00:00 1995
Path: biosci!daresbury!not-for-mail
From: <kumarvl@medinst.ernet.in>
Newsgroups: bionet.molbio.embldatabank
Subject: help
Date: 25 Mar 1995 03:19:09 -0000
Lines: 1
Sender: lpddist@mserv1.dl.ac.uk
Distribution: bionet
Message-ID: <3l023d$r4m@mserv1.dl.ac.uk>
Original-To: embl-db@vikram!dl.ac.uk.ncst.ernet.in



From owner-embldatabank@net.bio.net Tue Mar 28 23:00:00 1995
Path: biosci!daresbury!trane.uninett.no!nntp.uio.no!usenet
From: yihena@ifi.uio.no(YichengAn)
Newsgroups: bionet.molbio.embldatabank
Subject: What'd you expect ?
Date: 29 Mar 1995 09:38:41 GMT
Organization: University of Oslo
Lines: 28
Message-ID: <3lb9r1$180@hermod.uio.no>
Reply-To: yichena@ifi.uio.no
NNTP-Posting-Host: biotek08.uio.no
X-Newsreader: WinVN 0.92.3

Hi,

I've started to work on a DNA sequence analysis software tool, without being asked
by anybody, basically. So here I am to collect your suggestions, which can be anything,
from output formats to functionality. It's going to work online through the WWW, and
to be a sort of free-access-ware, so what you suggest is probably what you'll get.

Another thing: I need your explanation of what is meant by sequence repeats in DNA
(it's a shame, because I'm a Master's student in protein studies myself). If you have
a sequence as follows:

	????ATTGCG??????GTTGCA??????ATTGCG?????

then the longer repeats are:
	    |<-->|                  |<-->|
	????ATTGCG??????GTTGCA??????ATTGCG?????

but then, would you also expect to see the shorter, redundant ones, like:
	     |<>|        |<>|        |<>|
	????ATTGCG??????GTTGCA??????ATTGCG?????

Contact me, Thanks!



Regards,

Yicheng

From owner-embldatabank@net.bio.net Fri Mar 31 23:00:00 1995
Path: biosci!mahidol.ac.th!gsctlr
From: gsctlr@mahidol.ac.th (Thianchai Lakornrach - SCBT - 3636209)
Newsgroups: bionet.molbio.embldatabank
Subject: artificial lipase sequence
Date: 1 Apr 1995 06:57:46 -0800
Organization: BIOSCI International Newsgroups for Molecular Biology
Lines: 14
Sender: daemon@net.bio.net
Distribution: world
Message-ID: <Pine.D-G.3.91.950401212156.6622A@mucc.mahidol.ac.th>
NNTP-Posting-Host: net.bio.net

         1 I would like to know what is meant by an artificial gene.
           Where is one derived from, for example an artificial gene for lipase?

         2 How can I get the details of the following sequences?
               embl A 02815  
               embl A 02816
                    U 17036
               NCBI gi:608524, 608525, 608526
            
Thank you for your help.

THIANCHAI LAKORNRACH
MAHIDOL UNIVERSITY
BANGKOK THAILAND

