In biology, a protein structure database is a database that is modeled around the various experimentally determined protein structures. The aim of most protein structure databases is to organize and annotate the protein structures, providing the biological community access to the experimental data in a useful way. Data included in protein structure databases often include three-dimensional coordinates as well as experimental information, such as unit cell dimensions and angles for structures determined by X-ray crystallography. Most entries, which represent either a protein or a specific structure determination of a protein, also contain sequence information, and some databases even provide means for performing sequence-based queries. Nevertheless, the primary attribute of a structure database is structural information, whereas sequence databases focus on sequence information and contain no structural information for the majority of entries. Protein structure databases are critical for many efforts in computational biology, such as structure-based drug design, both in developing the computational methods used and in providing a large experimental dataset used by some methods to provide insights about the function of a protein.[1]
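As a rough illustration of the kind of data such an entry holds, the following Python sketch reads unit cell parameters and atomic coordinates from two records in the legacy fixed-column PDB flat-file format, in which a CRYST1 record carries the cell lengths and angles and each ATOM record carries one set of x, y, z coordinates. The names `UnitCell`, `parse_cryst1` and `parse_atom_xyz` are hypothetical, and the sample records use invented values rather than data from any real entry.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class UnitCell:
    a: float      # cell lengths, in angstroms
    b: float
    c: float
    alpha: float  # cell angles, in degrees
    beta: float
    gamma: float


def parse_cryst1(line: str) -> UnitCell:
    """Read unit cell lengths and angles from a CRYST1 record."""
    return UnitCell(
        a=float(line[6:15]),
        b=float(line[15:24]),
        c=float(line[24:33]),
        alpha=float(line[33:40]),
        beta=float(line[40:47]),
        gamma=float(line[47:54]),
    )


def parse_atom_xyz(line: str) -> Tuple[float, float, float]:
    """Read x, y, z coordinates (angstroms) from an ATOM or HETATM record."""
    return float(line[30:38]), float(line[38:46]), float(line[46:54])


# Hypothetical sample records, built with fixed-width formatting so that
# every field lands in the column range the parsers expect.
cryst1 = "CRYST1%9.3f%9.3f%9.3f%7.2f%7.2f%7.2f P 1" % (
    50.0, 60.0, 70.0, 90.0, 90.0, 90.0)
atom = "ATOM  %5d %-4s %3s %1s%4d    %8.3f%8.3f%8.3f" % (
    1, " CA", "ALA", "A", 1, 11.5, -2.25, 3.0)

cell = parse_cryst1(cryst1)
xyz = parse_atom_xyz(atom)
print(cell)
print(xyz)
```

The sketch covers only these two record types. Real entries contain many more, and current archives also distribute structures in the PDBx/mmCIF format, which this example does not handle.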
The Protein Data Bank (PDB) was established in 1971 as the central archive of all experimentally determined protein structure data. Today the PDB is maintained by an international consortium known as the Worldwide Protein Data Bank (wwPDB). The mission of the wwPDB is to maintain a single archive of macromolecular structural data that is freely and publicly available to the global community.[2][3]
Because the PDB releases data into the public domain, the data has been used in various other protein structure databases.
Examples of protein structure databases include (in alphabetical order):
- Database of Macromolecular Movements
- describes the motions that occur in proteins and other macromolecules, particularly using movies
- Dynameomics
- a data warehouse of molecular dynamics simulations and analyses of proteins representing all known protein fold families
- JenaLib
- the Jena Library of Biological Macromolecules is aimed at a better dissemination of information on three-dimensional biopolymer structures with an emphasis on visualization and analysis.
- ModBase
- a database of three-dimensional protein models calculated by comparative modeling
- OCA
- a browser-database for protein structure and function. OCA integrates information from KEGG, OMIM, PDBselect, Pfam, PubMed, SCOP, SwissProt, and others.
- OPM
- provides spatial positions of protein three-dimensional structures with respect to the lipid bilayer.
- PDB Lite
- derived from OCA, PDB Lite was provided to make it as easy as possible to find and view a macromolecule within the PDB
- PDBsum
- provides an overview of macromolecular structures in the PDB, giving schematic diagrams of the molecules in each structure and of the interactions between them
- PDBTM
- the Protein Data Bank of Transmembrane Proteins — a selection of the PDB.
- PDBWiki
- a community-annotated knowledge base of biological molecular structures
- ProtCID
- The Protein Common Interface Database (ProtCID) is a database of similar protein–protein interfaces in crystal structures of homologous proteins.
- Protein
- the NIH protein database, a collection of sequences from several sources, including translations from annotated coding regions in GenBank, RefSeq and Third Party Annotation, as well as records from SwissProt, PIR, PRF, and PDB
- ProteinLounge
- a protein database that includes visuals of protein structure, as well as protein pathways, gene sequences, and other tools.
- Proteopedia
- the collaborative, 3D encyclopedia of proteins and other molecules. A wiki that contains a page for every entry in the PDB (>100,000 pages), with a Jmol view that highlights functional sites and ligands. It offers a scene-authoring tool so that users can create customized molecular scenes without learning the Jmol scripting language. Custom scenes can be attached to "green links" in descriptive text, which display those scenes in Jmol.
- SCOP
- the Structural Classification of Proteins, a detailed and comprehensive description of the structural and evolutionary relationships between all proteins whose structure is known.
- SWISS-MODEL Repository
- a database of annotated protein models calculated by homology modeling
- TOPSAN
- the Open Protein Structure Annotation Network — a wiki designed to collect, share and distribute information about protein three-dimensional structures.