brinkman at sfu.ca
Tue Jun 11 04:34:22 EST 2002
Keith James <kdj at adnah.sanger.ac.uk> wrote in message news:<sc4u1ohtr99.fsf at adnah.sanger.ac.uk>...
> >>>>> "Charlie" == Charlie <cckim at stanford.edu> writes:
> Charlie> This is an interesting problem, but as far as I know it
> Charlie> has not been addressed computationally. The primary
> Charlie> problem is that "pathogenicity islands" are not a
> Charlie> precisely defined entity; rather, it is a somewhat
> Charlie> arbitrarily defined region. I'm sure you already know
> Charlie> some of the common characteristics, but I will list a few
> Charlie> for the benefit of other readers of this post:
> Charlie> 1) Flanked by tRNA sequences
> Charlie> 2) %GC is low relative to the rest of the genome
> Charlie> 3) Contain sequences that are unique to the organism
> Charlie> 4) Often large (>20kb)
> Charlie> The problem is that very few of the many designated
> Charlie> pathogenicity islands out there actually have all of
> Charlie> these features. My belief is that the term is used
> Charlie> fairly loosely in order to spice up the data a bit.
> As Charlie points out, how you might try to accomplish this depends on
> what you mean by a "pathogenicity island". For example, if you were
> looking for such regions in enteric pathogens you might try Blasting
> against a database of known "islands" or classes of genes (iron
> uptake, type III/IV secretion systems, prophage) in addition to the
> indicators he mentioned.
> Tools which we use to help us include:
> tRNAscan-SE (Todd Lowe & Sean Eddy,
> http://www.genetics.wustl.edu/eddy/software/#trnascan) for identifying
> those tRNAs which may flank these regions.
> Artemis (Kim Rutherford, http://www.sanger.ac.uk/Software/Artemis) for
> visualising anomalous %GC / dinucleotide frequencies + tRNAs
> (identified by tRNAscan-SE) and other genomic context.
> ACT (Kim Rutherford, http://www.sanger.ac.uk/Software/ACT) for
> visualising whole genome comparisons (some of these types of
> regions appear to be horizontally transferred).
> Artemis and ACT are Java applications which run on Unix/Linux, Mac and
> Windows and can be downloaded for free from the above URLs. Hope this
> is of some use.
We also recommend Artemis as a great tool for such analyses, and we
share the caution about the looseness of the term "pathogenicity
island" (or even "genomic island"). That said, certain subsets of such
islands do share common features, which we are investigating. In the
meantime we have developed a tool for our own use which you may also
find useful:
It is called "IslandPath", and an example can be found at:
It performs an analysis of G+C content (with flexible cutoffs) and a
dinucleotide bias analysis (using a gene-cluster method we developed
to optimize island identification), and it checks for the presence of
tRNA genes (NCBI annotations and tRNAscan-SE) and of mobility genes
(NCBI annotations and a COG-based analysis). Note that the
documentation is still incomplete. For example, regions of high
dinucleotide bias (more than 1 S.D. from the mean for an analysis of
gene clusters) are marked with a strikethrough line, but this isn't
indicated yet in the help file. More information regarding the
methodology and validation will appear in a planned paper.
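
For readers who want to experiment with the G+C and dinucleotide bias
ideas described above, here is a minimal Python sketch. It is not the
IslandPath code. It assumes a commonly used relative-abundance measure
(rho_xy = f_xy / (f_x * f_y), averaged as an absolute difference over
the 16 dinucleotides), reads "high" as more than 1 S.D. above the mean,
and uses a hypothetical cluster size of 6 genes. All function names
and defaults are ours, for illustration only.

```python
from collections import Counter
from statistics import mean, pstdev

BASES = "ACGT"


def gc_fraction(seq):
    """Fraction of G+C among unambiguous bases (0.0 for an empty sequence)."""
    seq = seq.upper()
    counts = Counter(b for b in seq if b in BASES)
    total = sum(counts.values())
    return (counts["G"] + counts["C"]) / total if total else 0.0


def dinucleotide_rho(seq):
    """Relative abundance rho_xy = f_xy / (f_x * f_y) for all 16 dinucleotides."""
    seq = seq.upper()
    mono = Counter(b for b in seq if b in BASES)
    di = Counter(
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if seq[i] in BASES and seq[i + 1] in BASES
    )
    n_mono = sum(mono.values())
    n_di = sum(di.values())
    rho = {}
    for x in BASES:
        for y in BASES:
            if n_di and mono[x] and mono[y]:
                f_xy = di[x + y] / n_di
                rho[x + y] = f_xy / ((mono[x] / n_mono) * (mono[y] / n_mono))
            else:
                rho[x + y] = 0.0  # undefined when a base is absent; treated as 0
    return rho


def dinucleotide_bias(segment, genome_rho):
    """Mean absolute difference in rho between a segment and the genome."""
    seg_rho = dinucleotide_rho(segment)
    return sum(abs(seg_rho[k] - genome_rho[k]) for k in genome_rho) / 16


def flag_high(values, n_sd=1.0):
    """Indices of values more than n_sd standard deviations above the mean."""
    mu = mean(values)
    sd = pstdev(values)
    return [i for i, v in enumerate(values) if v > mu + n_sd * sd]


def score_clusters(genes, genome, size=6):
    """Return (G+C fraction, dinucleotide bias) per run of `size` genes.

    `size=6` is an assumed default, not a documented IslandPath setting.
    """
    genome_rho = dinucleotide_rho(genome)
    scores = []
    for start in range(len(genes) - size + 1):
        # Joining genes creates a few artificial dinucleotides at the junctions.
        cluster = "".join(genes[start:start + size])
        scores.append(
            (gc_fraction(cluster), dinucleotide_bias(cluster, genome_rho))
        )
    return scores
```

Given a list of gene sequences in genome order, score_clusters(genes,
genome) returns one pair per cluster, and flag_high applied to the
bias values picks out the clusters that stand out from the rest of
the genome. A G+C cutoff can be applied to the first value of each
pair in the same way.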
Meanwhile, we would be happy to let you know more about this
application or the results for other genomes if you wish. Hopefully
you'll find it helpful and feedback is always appreciated!
Fiona Brinkman and Will Hsiao