Web-based services:
- (NEW) PerPlot: Performs the Periodicity Plot as described in Mrázek (2010) and Mrázek et al. (2011). The program was written by Jan Mrázek and the interface was designed by Tejas Chaudhari and Aryabrata Basu.
- (NEW) PerScan: Performs the Periodicity Scan as described in Mrázek (2010) and Mrázek et al. (2011). The program was written by Jan Mrázek and the interface was designed by Tejas Chaudhari and Aryabrata Basu.
- Ab Initio Motif Identification Environment: the purpose of this environment is to provide tools for discovery and interpretation of significantly overrepresented DNA sequence motifs in prokaryotic genomes. The web interface was designed by Shaohua Xie and the programs were written by Jan Mrázek, Xiangxue Guo, Shaohua Xie, and Anuj Srivastava with the use of ClustalW and John Brzustowski's qclust.
- Find PHX and PA genes: this service identifies Predicted Highly eXpressed (PHX) genes and Putative Alien (PA) genes in a complete annotated genome by the techniques described in Karlin and Mrázek (2000) and Mrázek and Karlin (1999). These methods were developed in the Karlin lab by Samuel Karlin and Jan Mrázek. The programs were written by Jan Mrázek and the web interface was designed by Shaohua Xie.
- Pattern Locator: a new tool for finding sequence patterns in long DNA sequences. The program was written by Jan Mrázek and the web interface was designed by Shaohua Xie. Note: this web-based service uses a restricted version of Pattern Locator, which estimates the time needed to complete the search and stops if the estimated CPU time exceeds a limit (currently 90 seconds). The CPU time limit protects the web server from being overloaded by requests involving overly complex sequence patterns. The unrestricted version of Pattern Locator is available for download (see "Programs for download").
- Motif Locator: Uses an aligned set of DNA sequence motifs (no gaps, all motifs of the same length) and finds similar motifs in the analyzed sequence. The motif is represented as a position-specific score matrix (PSSM). The output is similar to that of Pattern Locator and includes r-scan statistics and pattern vicinity analysis. The program was written by Jan Mrázek and the web interface was designed by Shaohua Xie.
- Analysis of sequence heterogeneity - sliding window plots: this service provides access to tools described in Karlin (2001) and Mrázek and Karlin (1998). The program was written by Jan Mrázek and the web interface was designed by Shaohua Xie.
- Simple Sequence Repeats: this tool counts SSRs of different lengths in a DNA sequence and makes plots such as those used in Mrázek (2006) and Mrázek et al. (2007). The analyzed sequence must not be longer than 12,000,000 nucleotides. The program was written by Jan Mrázek and the web interface was designed by Shaohua Xie.
- Genome signature comparisons (δ*-differences): this program was developed in the Karlin lab and used to generate data presented in Campbell et al. (1999), Karlin et al. (1998), Karlin and Mrázek (1997), Karlin et al. (2002), and elsewhere. The program was written by Jan Mrázek and the web interface was designed by Shaohua Xie.
- Amino acid and codon usage statistics: reads an annotated DNA sequence in GenBank format, extracts all annotated genes (CDS features), and calculates the amino acid frequencies, codon frequencies, and other statistics about the gene collection. The program was written by Jan Mrázek and the web interface was designed by Shaohua Xie.
- Find under- and over-represented short oligonucleotides in a genome. Calculates relative abundances of all di-, tri- and tetranucleotides using the genome signature formula proposed by Karlin and coworkers (1994, 1997) as well as the Markov formula. The program was written by Jan Mrázek and the web interface was designed by Shaohua Xie.
- Find frequent words (oligonucleotides) in a genome. This program applies the frequent word statistics described in Karlin et al. (1996) and Karlin and Mrázek (1996). The program was written by Jan Mrázek and the web interface was designed by Shaohua Xie.
- Find frequent words (oligopeptides) in a proteome. This program applies the frequent word statistics described in Karlin et al. (1996) and Karlin and Mrázek (1996). The program was written by Jan Mrázek and the web interface was designed by Shaohua Xie.
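For readers unfamiliar with the genome signature used by the oligonucleotide and δ*-difference services above, the following is a minimal Python sketch of the dinucleotide case. It is not the code behind these services. It assumes the commonly cited form of the signature, ρ*(XY) = f(XY) / (f(X) f(Y)) with frequencies pooled over both strands, and δ* as the mean absolute difference over the 16 dinucleotides. The function names are hypothetical, and the input is assumed to be a non-empty DNA string.

```python
from collections import Counter
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")
BASES = set("ACGT")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def dinucleotide_signature(seq):
    """Illustrative rho*(XY) = f(XY) / (f(X) * f(Y)) for all 16 dinucleotides.

    Counts are pooled over the sequence and its reverse complement
    (an assumption of this sketch); positions with ambiguous bases are skipped.
    """
    seq = seq.upper()
    mono, di = Counter(), Counter()
    for strand in (seq, reverse_complement(seq)):
        mono.update(c for c in strand if c in BASES)
        for i in range(len(strand) - 1):
            pair = strand[i:i + 2]
            if set(pair) <= BASES:
                di[pair] += 1
    n_mono, n_di = sum(mono.values()), sum(di.values())
    signature = {}
    for x, y in product("ACGT", repeat=2):
        fx, fy = mono[x] / n_mono, mono[y] / n_mono
        fxy = di[x + y] / n_di
        # Report 0.0 when a base is absent, to avoid dividing by zero.
        signature[x + y] = fxy / (fx * fy) if fx and fy else 0.0
    return signature


def delta_star(seq_a, seq_b):
    """Mean absolute difference between two dinucleotide signatures."""
    sig_a = dinucleotide_signature(seq_a)
    sig_b = dinucleotide_signature(seq_b)
    return sum(abs(sig_a[k] - sig_b[k]) for k in sig_a) / 16
```

Under these assumptions, values of ρ* well below or above 1 indicate under- or over-represented dinucleotides, and `delta_star(a, b)` is 0 when both sequences have the same signature.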
Programs for download:
- Genome Randomizer: a simple utility to generate random sequences from complete prokaryotic genomes using 11 different stochastic models.
- Pattern Locator: this is the unrestricted version.
Computational Microbiology laboratory, Department of Microbiology,
550 Biological Sciences,
Athens, GA 30602-2605