wwPDB 2008 News


12/01/2008 Announcement: PDB Archive Version 3.15 to be Released
11/19/2008 Announcement: New Releases to Follow Format Guide Version 3.20
09/15/2008 Announcement: Comprehensive Format Guide Version 3.2
08/14/2008 IUCr: wwPDB Exhibition Stand and Presentations
08/11/2008 Download Statistics Available by Structure ID
05/08/2008 Workshop on Next Generation Validation Tools for the wwPDB
04/21/2008 Recent wwPDB Papers
04/08/2008 PDB Archives More Than 50,000 Structures
01/22/2008 Time-stamped Copies of PDB Archive Available via FTP
01/07/2008 Announcement: Data Processing Versioning Procedures


Announcement: PDB Archive Version 3.15 to be Released

A new standardized version of the PDB archive will be available from ftp://ftp.wwpdb.org in early 2009. All entries released prior to December 2, 2008 will be re-released as PDB Format Version 3.15 files. This release will overwrite all existing files. A snapshot of the archive before this release will be available from ftp://snapshots.wwpdb.org/.

For documentation, please see File Format Documentation.

Questions may be sent to info@wwpdb.org.


Announcement: New Releases to Follow Format Guide Version 3.20

Beginning December 2, 2008, all newly released PDB entries will follow PDB File Format Contents Guide Version 3.20 (PDF | HTML). This format includes site and assembly annotation, and supports the nomenclature introduced in 2007 in the Chemical Component Dictionary. The Version 3.20 Changes Guide lists the changes in format from Version 3.1.

Please send any questions to info@wwpdb.org.


Announcement: Comprehensive Format Guide Version 3.2

During the past year, the wwPDB annotators have collaborated on a project to clarify the details and procedures related to data processing and annotation. The result is PDB Contents Guide Version 3.2, which describes the PDB file format more fully. This document is available as a PDF and in HTML, and is accompanied by a document highlighting these clarifications.

In the coming months, all files released by the wwPDB will follow the format as described in this document. Details will be made available on this website and at www.wwpdb.org.


IUCr: wwPDB Exhibition Stand and Presentations

The wwPDB partners will be exhibiting at the XXI Congress & General Assembly of the International Union of Crystallography (IUCr; August 23 - 31 in Osaka, Japan) at booth #14. Please stop by for website demonstrations and to meet with wwPDB members from around the globe.

Helen M. Berman (RCSB PDB) will present a keynote lecture entitled "What the Protein Data Bank tells us about the past, present, and future of structural biology" on Sunday August 24.

On Saturday, August 30, John Westbrook (RCSB PDB) will present "Data Quality in the PDB Archive".


Download Statistics Available by Structure ID

Downloads from the PDB archive are one of the primary means of accessing scientific structure results. While there are cross-links between the corresponding scientific publication and the PDB entry, in many cases it is the structure file that is accessed and downloaded more frequently.

The wwPDB website has recently added statistics for FTP and HTTP (web) downloads and views for each PDB structure. The high volume of data downloaded around the world underscores the importance of including informative, accurate, and annotated PDB data in the archive. Data are available by month, starting from August 2007, for each wwPDB site. These statistics can be accessed in a number of ways:

  • A database can be searched by ID or group of IDs. Results can display the wwPDB site and month accessed, and include a line chart illustrating FTP and HTTP activity over time.
  • Tables provide full and summary statistics. The summary table offers an overall view of activity. For example, 18,051,769 FTP downloads and 4,122,104 HTTP downloads/web page views were made across all wwPDB sites in July 2008. Full Download Complete Reports can be downloaded in CSV and TAB formats.
  • The Top 10 Download Statistics page offers a quick look at the structures downloaded most often in the most recent month and overall since August 2007. For example, 1crn was the #1 structure viewed and downloaded via HTTP from August 2007 - June 2008. In this table, mouse over the PDB ID to view the structure title.
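As a rough illustration of how a downloaded report might be summarized, the sketch below totals FTP and HTTP counts per structure ID. The column names (pdb_id, month, ftp_downloads, http_downloads) and every count in the sample are invented for this example; the actual layout of the CSV and TAB reports is not described here and may differ.

```python
import csv
import io
from collections import Counter

# Hypothetical report layout: one row per structure ID and month.
# Column names and counts are made up for illustration only; they are
# not taken from the real wwPDB reports.
SAMPLE = """pdb_id,month,ftp_downloads,http_downloads
1crn,2008-06,120,340
1crn,2008-07,100,310
4hhb,2008-07,90,150
"""


def total_downloads(report, delimiter=","):
    """Sum FTP and HTTP counts per structure ID across all months."""
    totals = Counter()
    for row in csv.DictReader(report, delimiter=delimiter):
        count = int(row["ftp_downloads"]) + int(row["http_downloads"])
        totals[row["pdb_id"].lower()] += count
    return totals


totals = total_downloads(io.StringIO(SAMPLE))
for pdb_id, count in totals.most_common():
    print(pdb_id, count)
```

Passing delimiter="\t" would let the same function read a tab-separated report, assuming it carries the same (hypothetical) columns.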

All download statistics are updated monthly, and collected on an aggregate, rather than individual, basis. The wwPDB does not share server log information with third parties for marketing or other purposes.

To access these features, select Statistics>Downloads from the top menu bar at www.wwpdb.org. The wwPDB website also offers links to member sites, documentation, news, and deposition and processing statistics. Questions may be sent to info@wwpdb.org.


Workshop on Next Generation Validation Tools for the wwPDB

A meeting of the wwPDB X-ray Validation Task Force was held to collect recommendations and develop consensus on additional validation that should be performed on PDB entries, and to identify software applications to perform validation tasks.

The workshop was organized by Randy Read (Cambridge University), and sponsored by the RCSB PDB & PDBe. Detailed information about the workshop is available at Workshop on Next Generation Validation Tools for the wwPDB.


Recent wwPDB Papers

• wwPDB deposition tools, methods (including validation), and policies are described in

Data deposition and annotation at the Worldwide Protein Data Bank. Shuchismita Dutta, Kyle Burkhardt, Ganesh J. Swaminathan, Takashi Kosada, Kim Henrick, Haruki Nakamura, Helen M. Berman (2008) in Methods in Molecular Biology, vol. 426: Structural Proteomics: High-Throughput Methods (Bostjan Kobe, Mitchell Guss, Thomas Huber, eds.), pp. 81-101.

• Issues relating to NMR depositions are discussed in

BioMagResBank (BMRB) as a partner in the Worldwide Protein Data Bank (wwPDB): new policies affecting biomolecular NMR depositions. John L. Markley, Eldon L. Ulrich, Helen M. Berman, Kim Henrick, Haruki Nakamura, and Hideo Akutsu (2008) J Biomol NMR 40(3): 153-155.


PDB Archives More Than 50,000 Structures


With this week's update, the PDB archive reached a significant milestone in its 37-year history. The 50,000th molecular structure was released into the archive, joining other structures vital to pharmacology, bioinformatics, and education.

The Worldwide Protein Data Bank (wwPDB) has seen the archive double in size since 2004. The PDB was founded in 1971 with seven structures at Brookhaven National Laboratory. Today, the wwPDB receives approximately 25 new experimentally determined structures from scientists each day for inclusion in the archive. More than 5 million files are downloaded from the PDB archive every month. Users include structural biologists, computational biologists, biochemists, and molecular biologists in academia, government, and industry, as well as educators and students.

It is estimated that the size of the PDB archive will triple to 150,000 structures by the year 2014.

The 50,000th structure was released a week after another milestone event--the publication of the 100th edition of the Molecule of the Month.

Proteins, one of the main building blocks for living organisms, come in a variety of shapes, with the form of a protein corresponding to its function. The structures housed in the PDB demonstrate great diversity in size, complexity, and function, including:

  • Insulin, the protein deficient in diabetic patients
  • p53 tumor suppressor, a protein often implicated in cancer
  • Anthrax toxin, the disease-causing protein made by anthrax
  • Amyloid peptide, a protein implicated in Alzheimer's disease
  • Influenza proteins, structures which may help scientists design medicines to combat the flu
  • Prion proteins, misshapen proteins that are the cause of many diseases, including mad cow disease


Time-stamped Copies of PDB Archive Available via FTP

A time-stamped snapshot of the PDB archive (ftp://ftp.wwpdb.org) as of January 7, 2008 has been added to ftp://snapshots.rcsb.org/.

Snapshots of the PDB have been archived annually since 2004. It is hoped that these snapshots will provide readily identifiable data sets for research on the PDB archive.

The script at ftp://snapshots.rcsb.org/rsyncSnapshots.sh may be used to make a local copy of a snapshot or sections of the snapshot.

The directory 20080107 includes the 48,161 experimentally-determined coordinate files that were current as of January 7, 2008. Coordinate data are available in PDB, mmCIF, and XML formats. The date and time stamp of each file indicates the last time the file was modified.
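For scripted access, a snapshot location can be derived from its date. The sketch below assumes that every snapshot directory is named in YYYYMMDD form, as the single example 20080107 suggests, and that it sits directly under the snapshot host named above; neither assumption is confirmed beyond that one example, and the function name is illustrative.

```python
from datetime import date

# Host named in the announcement; the directory layout beneath it is assumed.
SNAPSHOT_BASE = "ftp://snapshots.rcsb.org/"


def snapshot_url(snapshot_date: date) -> str:
    """Build the assumed URL of a snapshot directory named YYYYMMDD."""
    return f"{SNAPSHOT_BASE}{snapshot_date:%Y%m%d}/"


print(snapshot_url(date(2008, 1, 7)))
```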


Announcement: Data Processing Versioning Procedures

Data in the PDB archive currently follow either PDB File Format Version 3.0 or 3.1. This is indicated in REMARK 4 of the file.
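A minimal sketch of reading the version from REMARK 4 follows. It assumes the REMARK 4 line contains the text "FORMAT V." followed by the version number; the sample lines and the function name are illustrative and are not taken from a real entry.

```python
import re

# Assumed shape of the line: "REMARK   4 1ABC COMPLIES WITH FORMAT V. 3.1, ..."
# The pattern relies only on "REMARK", the number 4, and "FORMAT V. <version>".
_REMARK4 = re.compile(r"^REMARK\s+4\s.*?FORMAT\s+V\.\s*(\d+(?:\.\d+)?)")


def format_version(lines):
    """Return the format version string found in REMARK 4, or None."""
    for line in lines:
        match = _REMARK4.match(line)
        if match:
            return match.group(1)
    return None


# Illustrative lines, not copied from an actual PDB entry.
example = [
    "HEADER    HYPOTHETICAL ENTRY",
    "REMARK   4",
    "REMARK   4 1ABC COMPLIES WITH FORMAT V. 3.1, 1-AUG-2007",
]
print(format_version(example))
```

The version is returned as a string rather than a float so that "3.1" and "3.10" would stay distinguishable.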

Version 3.0 is the format used for files released as a result of the Remediation Project.

Since August 1, 2007, all files processed and released into the archive follow Version 3.1. When modifications are made to files released prior to that date, they are then re-released in Version 3.1.

Version 3.1 differs in its descriptions of the biological unit (REMARK 300/350), geometry (REMARK 500), atoms/residues modeled with zero occupancy (REMARK 475/480), non-polymer residues with missing atoms (REMARK 610), and metal coordination (REMARK 620). Documentation describing the differences between these versions is available at http://www.wwpdb.org/docs.html.

Beginning March 4, 2008, when a Version 3.0 file is re-released as Version 3.1, this will be indicated in the REVDAT record with the name "VERSN".

For example, if the journal record is updated in an entry that still follows Version 3.0, the REVDAT would appear as:

REVDAT 1 13-FEB-07 1ABC 0
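To check programmatically whether a revision carries the VERSN marker described above, a REVDAT line can be split into its fields. The sketch below is a simplification: it splits on whitespace rather than on the fixed columns of the PDB format, and it does not handle continuation lines. The second sample line is hypothetical, constructed only to exercise the check, and the names Revdat, parse_revdat, and marks_version_change are illustrative.

```python
from typing import NamedTuple, Tuple


class Revdat(NamedTuple):
    mod_num: int               # modification number
    mod_date: str              # date of the modification, e.g. "13-FEB-07"
    id_code: str               # PDB ID
    mod_type: int              # 0 for the initial release
    records: Tuple[str, ...]   # names of corrected records, if any


def parse_revdat(line: str) -> Revdat:
    """Parse one whitespace-separated REVDAT line (no continuation lines)."""
    fields = line.split()
    if len(fields) < 5 or fields[0] != "REVDAT":
        raise ValueError(f"not a REVDAT line: {line!r}")
    return Revdat(int(fields[1]), fields[2], fields[3],
                  int(fields[4]), tuple(fields[5:]))


def marks_version_change(line: str) -> bool:
    """True if the REVDAT line lists the name VERSN."""
    return "VERSN" in parse_revdat(line).records


initial = "REVDAT 1 13-FEB-07 1ABC 0"                  # line from the announcement
hypothetical = "REVDAT 2 04-MAR-08 1ABC 1 JRNL VERSN"  # invented for this sketch

print(marks_version_change(initial))
print(marks_version_change(hypothetical))
```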

There is no change to how depositors submit their files. Any required changes in nomenclature can be made automatically by the wwPDB during the annotation process.

Documentation about file formats and the Remediation Project is available at www.wwpdb.org.