From k.gillinder at optushome.com.au  Fri Oct 25 00:30:54 2002
From: k.gillinder at optushome.com.au (zoonose)
Date: Sun Jun 12 19:43:21 2005
Subject: bioinformatics
Message-ID: <20021029131359.9489F7D14E@mercury.hgmp.mrc.ac.uk>

Hi,

I have a clone sequence for which I am trying to get the full-length cDNA.
Using NCBI UniGene I have obtained an mRNA transcript, but I want to verify
that it is correct... does anyone know?

-R

---
=====
Moderated bionet.genome.gene-structure

From victor at softberry.com  Wed Oct 23 02:12:52 2002
From: victor at softberry.com (Victor)
Date: Sun Jun 12 19:43:21 2005
Subject: FGENESV - Finding Genes in genomes of RNA
Message-ID: <20021029131523.D88207D1B8@mercury.hgmp.mrc.ac.uk>

program is available for online use at:
http://www.softberry.com/berry.phtml?topic=gfindv

Method description:

The FGENESV algorithm is based on pattern recognition of different types of
signals and on Markov chain models of coding regions. The optimal combination
of these features is then found by dynamic programming, and a set of gene
models is constructed along the given sequence. FGENESV is the fastest
ab initio viral gene prediction program available.

We have developed 2 variants of gene prediction: FGENESV0 (suited to small
genomes, < 10000 bp) uses generic parameters of coding regions, while FGENESV
learns genome-specific parameters from the input viral genome sequence alone.

FGENESV predicts all intron-less genes of viruses. However, a few percent of
viral genes contain intron sequences, and such genes are often alternatives
to the intron-less variant. To find them, please use a standard eukaryotic
gene-finding program (such as FGENESH) in addition to FGENESV.
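As a rough illustration of what "Markov chain models of coding regions" can mean, the sketch below trains two first-order Markov chains, one on coding-like sequence and one on background sequence, and scores a candidate region by its log-odds under the two models. This is a toy under stated assumptions and not FGENESV's actual model: the names `train_markov` and `log_odds`, the first-order chain, the pseudocount, and the tiny training strings are all hypothetical choices made for the example.

```python
import math

ALPHABET = "ACGT"


def train_markov(seqs, pseudocount=1.0):
    """Estimate first-order transition probabilities P(next | prev).

    Hypothetical helper for illustration; real gene finders typically use
    higher-order, reading-frame-aware models trained on many genes.
    """
    counts = {a: {b: pseudocount for b in ALPHABET} for a in ALPHABET}
    for seq in seqs:
        for prev, nxt in zip(seq, seq[1:]):
            if prev in counts and nxt in counts[prev]:
                counts[prev][nxt] += 1
    model = {}
    for a in ALPHABET:
        total = sum(counts[a].values())
        model[a] = {b: counts[a][b] / total for b in ALPHABET}
    return model


def log_odds(seq, coding, background):
    """Sum of log(P_coding / P_background) over adjacent base pairs.

    Positive values mean the region looks more like the coding training set.
    """
    score = 0.0
    for prev, nxt in zip(seq, seq[1:]):
        if prev in coding and nxt in coding[prev]:
            score += math.log(coding[prev][nxt] / background[prev][nxt])
    return score


if __name__ == "__main__":
    # Made-up training data, chosen only so the two models differ clearly.
    coding_model = train_markov(["GCCGCCGCCGCC"])
    background_model = train_markov(["ATATATATATAT"])
    print(log_odds("GCCGCC", coding_model, background_model))
    print(log_odds("ATATAT", coding_model, background_model))
```

With this made-up training data the first score is positive and the second is negative, which is all the example is meant to show.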
As additional parameters you can choose the Linear or Circular form of your
virus and select an alternative genetic code (the Standard code is the
default):

  The Bacterial and Plant Plastid Code (transl_table=11)
  or
  The Mold, Protozoan, and Coelenterate Mitochondrial Code and the
  Mycoplasma/Spiroplasma Code (transl_table=4)

FgenesV output:

FGENESV: Prediction of potential genes in viral genomes
Time: Tue Oct 22 16:17:25 2002
Seq name: NC_001838 Common chimpanzee papillomavirus 1, complete genome.
Length of sequence - 7889 bp
Number of predicted genes - 8

   N   S          Start       End    Score

   1   +   CDS      101 -     559      693
   2   -   CDS      551 -     907      232
   3   +   CDS      840 -    2786     3253
   4   +   CDS     2728 -    3858      938
   5   +   CDS     3901 -    4185      298
   6   +   CDS     4195 -    4335      131
   7   +   CDS     4371 -    5759     2263
   8   +   CDS     5746 -    7251     1943

Predicted protein(s):
>GENE 1 101 - 559 152 aa, chain +
ESVNASTPAKTIDQLCKDCNLCMHSLQILCVFCKKTLSTAAAEVYSFEYKDLYIVWRGN
PFAACAYCLELQGKVNQYRHFDYAAYAVTVEEETNKSIFDIRIRCYLCHKPLCAVEKVR
HILEKARFIKLNCEWKGRCFHCWTSCMENILP
>GENE 2 551 - 907 118 aa, chain -
STKNHPEHPVPSLSVPVSSAILLKFVEHTTGTLYSMSPAASCVGVECQVLYTPQPDAR
CYYSDHNWSLFGKLVGFVAWLAWLAWLVRPPHLLSCLIAHCNVDLQGQDSGVTQCPLR
>GENE 3 840 - 2786 648 aa, chain +
MADDTGTDNEGTGCSGWFLVEAIVDKTTGEQVSDDEDETVEDSGLDMVDFIDDRPITHNS
LEAQALLNEQEADAHYAAVQDLKRKYLGSPYVSPLGHIEQSVDCDISPRLDAIQLSRKPK
KVKRRLFQSREITDSGYGYSEVETATQVERYGEPENGCGGGGDGREKEGEGQVHTEVHTE
SEIEQHTGTTRVLELLKCKDVRATLHGKFKECYGLSFKDLTREFKSDKTTCGDWVVAGFG
VHHSVSEAFQKLIQPLSTYSHIQWLTNYKCMGMVLLVLLRFKVNKNRCTVARTLATLLNI
PEDHMLIEPPKIQSSVAALYWFRTSISNASIVTGDTPEWIARQTIVEHGLADNQFKLTEM
VQWAYDNDYCDESDIAFEYAQRADFDSNAKAFLNSNCQAKYVKDCATMCKHYKNAEMKKM
SIKQWIKYRSNKIDETGNWKPIVQFLRHQGIEFISFLSKLKLWLHGTPKKNCIAIVGPPD
TGKSAFCMSLIKFLGGTVISYVNSSSHFWLQPLCNAKVALLDDATQSCWGYMDTYMRNLL
DGNPMSIDRKHKSLALIKCPPLLVTSNIDITTEERYKYLYSRVTLFKFPNPFPFDSNGNA
VYELCDANWKCFFARLSASLDIQDSEDEDDGDTSQAFRCVPGTVVRTV
>GENE 4 2728 - 3858 376 aa, chain +
METLAKHLDACQEQLLELYEENSNELKKHIQHWKCVRYENVLLHKARQMGISHIGPQVVP
PLQVSQTKGHEAIEMQMRIETLLKSQFGMEPWTLQDTSFEMWLTPPKHCFKKQGKTVEVK
YDCNAENTMHYVLWKYIYVYNTEKEIWLKVKGMVDYKGLYYMMEQCKTYYVDFEKEAKQY
GKTLQWEVCFDSTVICSPASVSSTVQEVSNAGPTSYSTTLAQATYTVPSSVSEECVQAPP
SKRQRGPSQSAGKTQHTCNIVCDTDCATLDSANNNINNNSYSSNNGRNNSYCTGTPIVQL
QGDSNNLKCFRYRLHSNYKHLFFACISTWHWTASSNSPKTAIVTLTYVNEQQRQEFLNTV
KIPGTITHKLGFVAIM
>GENE 5 3901 - 4185 94 aa, chain +
MELQVVPVDVTTTTTNASLLPLLIALTVCLISIILLVFVSEFVIYSSVLVLTLLIYLLLW
LLLTTHLQFYLLTLSLCFIPAFSVHQYILQTQQL
>GENE 6 4195 - 4335 46 aa, chain +
MLTCSFDDGDTWLLLWLLASLIVAILGLLLLYLKAVHIHSHSCCSK
>GENE 7 4371 - 5759 462 aa, chain +
MAHSRPRRRKRASATQLYQTCKASGTCPDIIPKVEQNTLADKILKWGSLGVFFGGLGIGT
GSGTGGRTGYVPLESAPRPAIPFGPTARPPIVVDTVGPTDSSIVSLVEDSAIINSGASDL
VPSIHGGFEISTSESTTPAILDVSITTHNTTSTSIFRNPAFAEPSIVQSQPSVEAGGHLL
TSTFTSTISPHSVEEIPLDTFIVSSSNSNPASSTPVPTTVARPRLGLYSKALHQVQVTDP
AFLSSPQRLITFDNPVYEGEDISLHFEHNSIHEPPNEAFMDIIRLHRPAITSRRGVVRFS
RIGQRGSMYTRSGKHIGGRVHFFTDISPISADAQDIELQPLVAAAQDDSDLFDIYVDPDT
TPVAVDNIPSANSTLFIKSSIFDTSWGNTTIPLSLPNNIFVQPGPDILFPTTPAVPPYGP
VISPLPVGPVFISGSEFYLHPSLYFARKRRKRVSLFFSDVAA
>GENE 8 5746 - 7251 501 aa, chain +
MWRPSDNKLYVPPPAPVSKVLTTDAYVTRTKIFYHASSSRLLAVGNPYFPIRKANKTIVP
KVSGFQFRVFKIVLPDPNKFALPDTSIFDSTSQRLVWACIGLEVGRGQPLGVGYCGHPCL
NKFDDVENSASYAVNPGQDNRVNVAMDYKQTQLCLVGCAPPLGEHWGKGKQCSGVSVQDG
DCPPLELVTSVIQDGDMVDTGFGAMDFAELQSNKSDVPLDICTSTCKYPDYLQMAADPYG
DRLFFYLRKEQMFARHFFNRAGTVGEQIPDELFVKGTTSRATVSSNIYFNTPSGSLVSSE
AQLFNKPYWLHKAQGHNNGICWGNTLFVTVVDTTRSTNMTVCASTTSSPSATYTASEYKQ
YMRHVEEFDLQFIFQLCTIKLTAELMAYIHTMNPTVLEEWNFGLSPPPNGTLEDTYRYVQ
SQAITCQKPTPDKEKQDPYAGLSFWEVNLKEKFSSELEQYPLGRKFLLQTGVQSTSLARA
GTKRAASTSTATPTRKKVKRK

From victor at softberry.com  Thu Oct 31 02:25:53 2002
From: victor at softberry.com (Victor)
Date: Sun Jun 12 19:43:24 2005
Subject: New Plasmodium falciparum finding Genes
Message-ID: <20021101044248.712167D166@mercury.hgmp.mrc.ac.uk>

New Plasmodium falciparum gene-finding parameters for FGENESH

The program, with parameters for major model organisms, is available for
online use at:
http://www.softberry.com/berry.phtml?topic=gfind

Method description:

A new parameter set for gene prediction in Plasmodium falciparum has been
developed for the FGENESH program. Prediction accuracy for Plasmodium
falciparum protein-coding genes is about 98% at the nucleotide level; exact
exon prediction accuracy is about 80%.

The FGENESH algorithm is based on pattern recognition of different types of
signals and on Markov chain models of coding regions. The optimal combination
of these features is then found by dynamic programming, and a set of gene
models is constructed along the given sequence. FGENESH is the fastest and
most accurate ab initio gene prediction program available.
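The "optimal combination ... found by dynamic programming" step can be pictured with a much simpler textbook problem: given candidate segments that each carry a score, pick the non-overlapping subset with the highest total score (weighted interval scheduling). The sketch below is that swapped-in technique only, not FGENESH's algorithm, which by the description above also combines signal patterns with coding-region models. The function name `best_chain` and the candidate coordinates and scores are hypothetical.

```python
from bisect import bisect_right


def best_chain(candidates):
    """Pick the best-scoring set of non-overlapping segments.

    candidates: iterable of (start, end, score) with 1-based inclusive
    coordinates. Returns (total_score, chosen_segments).
    Hypothetical helper for illustration only.
    """
    cands = sorted(candidates, key=lambda c: c[1])  # sort by end position
    ends = [c[1] for c in cands]
    # best[i] holds the optimum over the first i candidates (in end order).
    best = [(0.0, [])]
    for i, (start, end, score) in enumerate(cands):
        # j = number of earlier candidates that end strictly before `start`.
        j = bisect_right(ends, start - 1, 0, i)
        take = best[j][0] + score
        if take > best[i][0]:
            best.append((take, best[j][1] + [(start, end, score)]))
        else:
            best.append(best[i])
    return best[-1]


if __name__ == "__main__":
    # Made-up candidate segments: (start, end, score).
    demo = [(1, 10, 5.0), (5, 20, 8.0), (11, 30, 4.0), (21, 40, 6.0)]
    total, chosen = best_chain(demo)
    print(total, chosen)
```

For the made-up candidates in the demo, the best chain is (5, 20) followed by (21, 40), with a total score of 14.0; taking (1, 10) and (11, 30) instead would give only 9.0.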
Fgenesh output:

fgenesh Wed Oct 30 23:05:15 EST 2002
FGENESH 1.1 Prediction of potential genes in Plasmodium genomic DNA
Time : Wed Oct 30 23:05:15 2002
Seq name: MAL7P1.27 chr7 chloroquine resistance transporter
Length of sequence: 4095
Number of predicted genes 1 in +chain 1 in -chain 0
Number of predicted exons 13 in +chain 13 in -chain 0
Positions of predicted genes and exons:

   G Str   Feature   Start        End    Score         ORF             Len

   1 +      TSS        130                -4.02
   1 +    1 CDSf       501 -       591    18.40       501 -       590     90
   1 +    2 CDSi       769 -      1037    15.02       771 -      1037    267
   1 +    3 CDSi      1217 -      1389    18.26      1217 -      1387    171
   1 +    4 CDSi      1562 -      1694    21.53      1563 -      1694    132
   1 +    5 CDSi      1848 -      1919    26.11      1848 -      1919     72
   1 +    6 CDSi      2043 -      2118    12.63      2043 -      2117     75
   1 +    7 CDSi      2215 -      2297    19.19      2217 -      2297     81
   1 +    8 CDSi      2425 -      2475    25.92      2425 -      2475     51
   1 +    9 CDSi      2613 -      2669    19.99      2613 -      2669     57
   1 +   10 CDSi      2818 -      2910    14.46      2818 -      2910     93
   1 +   11 CDSi      3104 -      3148    19.99      3104 -      3148     45
   1 +   12 CDSi      3295 -      3349    23.81      3295 -      3348     54
   1 +   13 CDSl      3519 -      3595     9.78      3521 -      3595     75
   1 +      PolA      3691                 2.25

Predicted protein(s):
>FGENESH: 1 13 exon (s) 501 - 3595 424 aa, chain +
KFASKKNNQKNSSKNDERYRELDNLVQEGNGSRLGGGSCLGKCAHVFKLIFKEIKDNIF
YILSIIYLSVCVMNKIFAKRTLNKIGNYSFVTSETHNFICMIMFFIVYSLFGNKKGNSK
RHRSFNLQFFAISMLDACSVILAFIGLTRTTGNIQSFVLQLSIPINMFFCFLILRYRYH
LYNYLGAVIIVVTIALVEMKLSFETQEENSIIFNLVLISALIPVCFSNMTREIVFKKYKI
DILRLNAMVSFFQLFTSCLILPVYTLPFLKQLHLPYNEIWTNIKNGFACLFLGRNTVVEN
CGLGMAKLCDDCDGAWKTFALFSFFNICDNLITSYIIDKFSTMTYTIVSCIQGPAIAIAY
YFKFLAGDVVREPRLLDFVTLFGYLFGSIIYRVGNIILERKKMRNEENEDSEGELTNVDS
IITQ
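One quick sanity check on an exon table like the FGENESH one above: the CDS exon lengths should add up to a multiple of three and agree with the reported protein length. The sketch below does that arithmetic for the 13 exons listed in the output. It assumes that the last exon (CDSl) includes the stop codon, which is an assumption made here and not something the announcement states; the function names are hypothetical.

```python
# Exon coordinates (1-based, inclusive) copied from the Start/End columns
# of the FGENESH table for MAL7P1.27.
EXONS = [
    (501, 591), (769, 1037), (1217, 1389), (1562, 1694), (1848, 1919),
    (2043, 2118), (2215, 2297), (2425, 2475), (2613, 2669), (2818, 2910),
    (3104, 3148), (3295, 3349), (3519, 3595),
]


def coding_length(exons):
    """Total number of coding nucleotides across all exons."""
    return sum(end - start + 1 for start, end in exons)


def expected_protein_length(exons):
    """Protein length implied by the exons.

    Assumption for this sketch: the last exon ends with a stop codon,
    so one codon is subtracted from the total.
    """
    total = coding_length(exons)
    if total % 3 != 0:
        raise ValueError("coding length %d is not a multiple of 3" % total)
    return total // 3 - 1


if __name__ == "__main__":
    print(coding_length(EXONS))            # 1275
    print(expected_protein_length(EXONS))  # 424
```

The exons sum to 1275 nt, that is 425 codons, which matches the "424 aa" in the FASTA header once the stop codon is set aside.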