Tools for Prokaryotic Comparative Genomics
Ultimate Project Goals
The goal is to build a software tool that takes as input the genomes
of closely related prokaryotic species and automates comparative
genomic analyses of these species. The tool should be easy for microbiologists
to use. Here are some of the analyses the tool will do:
-
Provide a good genome browser that allows comparative sequence
exploration of the genomes. Mauve does much of this very nicely, with
many wonderful features. However, Mauve does not show multiple sequence
alignments when orthologous elements have opposite orientations, and it
may not be aggressive enough about finding all orthologies for some of
the analyses listed below.
-
For each species, find all unique genes, that is, those that do not
occur in any of the other species.
-
For each species, find all genes that the other species have but that
are missing from this species. Classify each of these genes either as
entirely absent or as a pseudogene, which indicates recent loss.
-
Mauve provides a phylogenetic guide tree of the genomes it aligns.
Generalizing the two points above, consider the guide tree's partition
of species into most closely related subsets and determine the genes
that are peculiar to each subset.
-
Alternatively, the user supplies a partition of the species into two
subsets and the tool deduces a "barcode" of genes that can be used to
classify further species into one of these two subsets, as in
O'Sullivan et al. Such a barcode could be used to classify species by
symptoms, host, or niche.
-
Label Mauve's guide tree with gene gain and loss along each branch, as
in Lefebure and Stanhope,
Figure 4.
-
Investigate unique (unalignable) regions of each species for evidence
of lateral transfer (by BLAST to sequenced genomes, by unusual G+C
content, by unusual codon usage, and by presence of sequence uptake
signals, as in O'Sullivan et al., page 15).
-
Classify the functions of core and dispensable genes, as
in Tettelin et al.
-
Plot trends in core and dispensable gene counts, as
in Tettelin et al.
-
Investigate phylogenetic footprints in noncoding regions in an attempt
to identify functional regulatory elements.
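
Several of the analyses listed above (unique genes, missing genes, core
and dispensable genes) reduce to set operations once genes have been
grouped into ortholog families. The following is a minimal Python
sketch under stated assumptions: the `presence` mapping from family name
to the set of species containing that family, the function name
`classify_families`, and the family and species labels are all
hypothetical and are not part of Mauve. It also assumes one reading of
"missing", namely a family present in every other species but absent
from this one.

```python
from typing import Dict, List, Set, Tuple


def classify_families(
    presence: Dict[str, Set[str]], species: List[str]
) -> Tuple[List[str], List[str], Dict[str, List[str]], Dict[str, List[str]]]:
    """Split ortholog families into core, dispensable, unique, and missing.

    presence maps a (hypothetical) family name to the set of species
    in which at least one member of that family was found.
    """
    all_species = set(species)

    # Core: present in every species.
    core = sorted(f for f, s in presence.items() if s == all_species)

    # Dispensable: present in at least one species but not in all.
    dispensable = sorted(
        f for f, s in presence.items() if s and s != all_species
    )

    # Unique to a species: present in that species and in no other.
    unique = {
        sp: sorted(f for f, s in presence.items() if s == {sp})
        for sp in species
    }

    # Missing from a species (assumed reading): present in every
    # other species but absent from this one.
    missing = {
        sp: sorted(f for f, s in presence.items() if s == all_species - {sp})
        for sp in species
    }
    return core, dispensable, unique, missing


if __name__ == "__main__":
    example = {
        "famA": {"sp1", "sp2", "sp3"},
        "famB": {"sp1"},
        "famC": {"sp2", "sp3"},
    }
    core, dispensable, unique, missing = classify_families(
        example, ["sp1", "sp2", "sp3"]
    )
    print("core:", core)
    print("dispensable:", dispensable)
    print("unique:", unique)
    print("missing:", missing)
```

Deciding whether a "missing" family is entirely absent or survives as a
pseudogene needs sequence-level evidence and is outside the scope of
this sketch. The same presence sets could also feed the subset-specific
and "barcode" analyses by comparing each set against a user-supplied
partition.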
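
One of the lateral-transfer signals mentioned in the list, unusual G+C
content, can be illustrated with a sliding-window scan. This is only a
sketch: the function names, the window and step sizes, and the z-score
cutoff are illustrative assumptions rather than settings taken from
O'Sullivan et al. or from Mauve, and a real analysis would combine this
signal with the BLAST, codon usage, and uptake-signal evidence.

```python
from statistics import mean, pstdev
from typing import List, Tuple


def gc_fraction(seq: str) -> float:
    """Fraction of G and C among the unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / acgt


def unusual_gc_windows(
    genome: str, window: int = 1000, step: int = 500, z_cutoff: float = 2.0
) -> List[Tuple[int, float]]:
    """Return (start, gc) for windows whose G+C content deviates from
    the mean over all windows by at least z_cutoff standard deviations.
    """
    last_start = max(len(genome) - window, 0)
    values = [
        (start, gc_fraction(genome[start:start + window]))
        for start in range(0, last_start + 1, step)
    ]
    gcs = [gc for _, gc in values]
    mu = mean(gcs)
    sd = pstdev(gcs)
    if sd == 0:
        # Every window has the same G+C content, so nothing stands out.
        return []
    return [(start, gc) for start, gc in values if abs(gc - mu) / sd >= z_cutoff]


if __name__ == "__main__":
    # Toy sequence: 900 bases at 50% G+C followed by 100 bases of pure G.
    toy = "ACGT" * 225 + "G" * 100
    for start, gc in unusual_gc_windows(toy, window=100, step=100):
        print(f"window at {start}: G+C = {gc:.2f}")
```

Restricting the scan to the unalignable regions, rather than the whole
genome, would match the goal as stated; the whole-genome mean is used
here only as a simple baseline.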