Databases for Microbiologists

http://nar.oxfordjournals.org/content/43/D1/D1.short Databases for microbiologists, Igor B. Zhulin, American Society for Microbiology

  • Global resources
    • NCBI genome databases
    • Sequence Read Archive (SRA) (repository for NGS data) http://www.ncbi.nlm.nih.gov/sra
    • GEO (Gene Expression Omnibus) http://www.ncbi.nlm.nih.gov/geo/
    • NCBI Virus Variation Resource http://www.ncbi.nlm.nih.gov/genome/viruses/variation/
  • Comprehensive databases
    • Classify proteins from their sequence or structure
      • Pfam http://pfam.xfam.org/
      • SMART http://smart.embl-heidelberg.de/
      • TIGRFAM http://www.jcvi.org/cgi-bin/tigrfams/index.cgi
      • COGs (Phylogenetic classification of proteins encoded in complete genomes) http://www.ncbi.nlm.nih.gov/COG/
      • CDD (a collection of multiple sequence alignment models for domains and proteins; includes 3D-structure information as well as domain models from external source databases: Pfam, SMART, COG, PRK, TIGRFAM) http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml
      • InterPro http://www.ebi.ac.uk/interpro/
    • Structure of proteins
      • RCSB PDB http://www.rcsb.org/pdb/home/home.do
      • SCOP (Structural Classification of Proteins) http://scop2.mrc-lmb.cam.ac.uk/
      • CATH (Class, Architecture, Topology and Homologous superfamily) http://www.cathdb.info/
    • Finding orthologs (genes and their products in different species)
      • COG
      • OrthoDB http://orthodb.org/
    • Reconstruction of metabolism from genome sequence
      • KEGG
      • MetaCyc (http://metacyc.org/)
      • BioCyc http://biocyc.org/
    • Protein Interaction
      • STRING http://string-db.org/
      • BioGRID (protein and genetic interactions in many model organisms, including E. coli, B. subtilis, eukaryotic microorganisms and viruses) http://thebiogrid.org/
      • IMG (integrates genomes and related metadata from bacteria, archaea, eukaryotic microbes and viruses. It incorporates newly available proteomics and RNA-seq datasets and contains information on biosynthetic clusters, i.e. sets of genes encoding pathways for secondary metabolite production in selected bacterial genomes.) http://img.jgi.doe.gov/
      • xBASE (unavailable) http://xbase.bham.ac.uk
      • MBGD http://mbgd.genome.ad.jp/ (The aim of MBGD is to facilitate comparative genomics from various points of view such as ortholog identification, paralog clustering, motif analysis and gene order comparison.)
      • MicrobesOnline http://www.microbesonline.org/ (MicrobesOnline not only contains information on thousands of bacterial, archaeal and fungal genomes, but also provides access to gene expression and fitness data.)
    • Annotation and comparative analysis
      • SEED (popular platform for automated microbial genome annotation) http://www.theseed.org/wiki/Main_Page
      • RAST (annotating bacterial and archaeal genomes) http://www.nmpdr.org/FIG/wiki/view.cgi/FIG/RapidAnnotationServer
      • MicroScope (aims at improving annotation by enabling groups of investigators to collectively curate various aspects of a given microbial genome and enables cross-genome comparisons.) https://www.genoscope.cns.fr/agc/microscope/home/
      • COMBREX (annotate microbial genes) http://combrex.bu.edu/
      • PROKKA (quick annotation of a bacterial or archaeal genome) http://www.vicbioinformatics.com/software.prokka.shtml
    • Identify orthologs
      • POGO-DB (Database of Pairwise-comparisons Of Genomes to find Orthologous genes) http://pogo.ece.drexel.edu/
      • Ortholuge database http://www.pathogenomics.sfu.ca/ortholugedb/?page=about
      • ATGC http://atgc.lbl.gov/atgc/
    • Visualize bacterial genomes
      • BacMap http://bacmap.wishartlab.com/ is an interactive visual database containing hundreds of fully labeled, zoomable, and searchable maps of bacterial genomes.
    • Repeated events
      • PSORTdb http://db.psort.org/ a database of protein subcellular localizations for bacteria and archaea that contains both information determined through laboratory experimentation (ePSORTdb dataset) and computational predictions (cPSORTdb dataset).
      • DoriC http://tubic.tju.edu.cn/doric/ catalogs experimentally identified and computationally predicted oriC regions in bacterial and archaeal genomes
      • ICEberg (a database for integrative and conjugative elements found in Bacteria) http://db-mml.sjtu.edu.cn/ICEberg/
      • MICAS (finds non-redundant microsatellites, i.e. simple sequence repeats or short tandem repeats, in a selected bacterial or archaeal genome sequence) http://micas.cdfd.org.in/
  • Special purpose or community databases
    • E. coli
      • EcoGene http://ecogene.org/ (comprehensive coverage of genome information, literature-based curation and experimental data on E. coli)
      • GenoBase (unavailable) same as above
      • EcoCyc http://ecocyc.org/ scientific database for the bacterium Escherichia coli K-12 MG1655. The EcoCyc project performs literature-based curation of the entire genome, and of transcriptional regulation, transporters, and metabolic pathways.
      • PortEco http://porteco.org/ (portal for discussion of E. coli)
      • ASAP (stores and distributes genome information and experimental data from functional genomics) http://www.genome.wisc.edu/tools/asap.htm
      • RegulonDB (curated information on gene organization and regulation in E. coli) http://ecoliwiki.net/colipedia/index.php/RegulonDB
      • Bacteriome Database http://www.compsysbio.org/bacteriome/ (all known and predicted novel protein-protein interactions in E. coli)
      • STEPdb http://www.stepdb.eu/step2/ (catalogs the sub-cellular localization and topology of all E. coli proteins)
      • EcoProDB (unavailable) (experimental information on E. coli proteins as well as 2D maps)
      • ECMDB (annotation and detailed information about the E. coli metabolome) http://www.ecmdb.ca/
    • Bacillus subtilis
      • refer to the original paper
    • Cyanobacteria
      • refer to the original paper
    • Saccharomyces
      • refer to the original paper
  • Diversity and metagenomics
    • The Ribosomal Database Project (RDP) maintains the largest collection of aligned and annotated rRNA gene sequences from bacteria, archaea and fungi
    • SILVA (offers a set of rRNA gene sequence databases for bacteria, archaea and eukaryota based on representative phylogenetic trees) http://www.arb-silva.de/
    • Greengenes (monitoring microbial communities within complex environments) http://greengenes.lbl.gov/cgi-bin/nph-index.cgi
    • PhylOPDb (a large collection of regular and explorative rRNA targeted probes) http://g2im.u-clermont1.fr/phylopdb/
    • rrnDB (means for interpreting rRNA gene abundance in bacteria and archaea) https://rrndb.umms.med.umich.edu/
    • BacDive (contains detailed information on various aspects of more than 20,000 strains of bacteria and archaea, which includes taxonomy, physiology, sampling and environmental conditions) http://bacdive.dsmz.de/
    • Global Catalog of Microorganisms (GCM) (a database for retrieval and analysis of relevant information for hundreds of thousands of microbial strains from different sources) http://gcm.wfcc.info/
    • BIGSdb (Bacterial Isolate Genome Sequence Database) http://pubmlst.org/software/database/bigsdb/
    • PubMLST (public databases for molecular typing and microbial genome diversity; stores sequence data for bacterial isolates and enables analysis of genome variation at the population level) http://pubmlst.org/
    • MetaMicrobesOnline http://meta.microbesonline.org/
    • MetaRef (a resource to comprehensively catalog and characterize clade-specific microbial genes. It identifies and provides all core genes associated with all microbial species and genera with available reference genomes (final or draft). A subset of these gene families is consistently present in one or more taxonomic clades, which allows them to be further indicated as marker genes.) http://metaref.org/
    • Metaprox (can be used to find metagenomic genes, as well as their predicted functional interactions and corresponding annotations.) http://metaprox.uwaterloo.ca/
    • MetaBioME (find novel homologs for known Commercially Useful Enzymes (CUEs) in metagenomic datasets and completed bacterial genomes) http://metasystems.riken.jp/metabiome/
    • FOAM (Functional Ontology Assignments for Metagenomes; offers classification of gene functions in environmental metagenomes based on ontology and orthologous relationships) http://portal.nersc.gov/project/m1317/FOAM
  • Pathogenesis
    • refer to the original paper
  • Transport, secretion and metabolism
    • TCDB
    • TransportDB (curated annotations and detailed genomic comparisons of membrane transport proteins across selected genomes of bacteria, archaea and eucarya)
    • Microme (microbial metabolism, main goal is to support the large scale inference of metabolic flux directly from genome sequence.) http://microme.eu
    • MicroScope
    • CAZy
  • Mine
    • BAGEL (bacteriocin database) http://bagel.molgenrug.nl/

This is a summary of the paper.

