Preparing PDBx/mmCIF files for Depositing Structures

To better support the increasing complexity and size of data submitted to the PDB archive, the wwPDB Deposition & Biocuration system is based on the PDBx/mmCIF data dictionary and file format. The system accepts, processes and distributes PDBx/mmCIF data files.

Depositors are encouraged to use the PDBx/mmCIF format for coordinate files whenever possible.

Generating PDBx/mmCIF format files automatically

PDBx/mmCIF is the official working format of the wwPDB for coordinate files. It is flexible, extensible, and can accommodate structures of any size.

PDBx/mmCIF files ready for deposition are generated by most structure refinement programs. If a refinement program cannot be used to generate PDBx/mmCIF files, the pdb_extract program may be used instead.

Additional information about the PDBx/mmCIF format can be found in this FAQ.

PDBx/mmCIF format is especially useful:

When a PDBx/mmCIF file is the output of a final refinement. The developers of REFMAC, Phenix, and Buster are involved in the development of the PDBx/mmCIF format, and these programs will output PDBx/mmCIF format files that can be deposited without additional modification.
When the structure to be deposited is large. In this context, a large structure is defined as having more than 99,999 atoms and/or more than 62 polymer chains. These are the restrictions of the traditional PDB format. PDBx/mmCIF has no atom number restriction and virtually no chain number restriction. Please consult the "Using pdb_extract" sections of this guide for more information on depositing large structures.
When it is useful to avoid manual entry of additional information via the deposition interface. In addition to converting PDB to PDBx/mmCIF format, pdb_extract can be used to add sequence and other information to a coordinate file prior to deposition.
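The size criterion in the list above can be written as a small check. This is a minimal sketch, not part of any wwPDB tool: the function name `exceeds_pdb_limits` is hypothetical, and the remark that 62 matches the number of single-character chain identifiers (A-Z, a-z, 0-9) is an assumption about where the limit comes from, not something stated in this guide.

```python
import string

# Limits of the traditional PDB format, as quoted in the text above.
MAX_PDB_ATOMS = 99_999
MAX_PDB_CHAINS = 62

# Assumption: the chain limit corresponds to one-character chain IDs
# drawn from upper-case letters, lower-case letters, and digits.
SINGLE_CHAR_CHAIN_IDS = string.ascii_uppercase + string.ascii_lowercase + string.digits


def exceeds_pdb_limits(atom_count: int, chain_count: int) -> bool:
    """Return True if a structure is 'large' in the sense used above,
    i.e. it has more than 99,999 atoms and/or more than 62 polymer chains."""
    return atom_count > MAX_PDB_ATOMS or chain_count > MAX_PDB_CHAINS


if __name__ == "__main__":
    # Hypothetical structures: (atom count, polymer chain count).
    for atoms, chains in [(4_500, 2), (150_000, 12), (80_000, 70)]:
        verdict = "large" if exceeds_pdb_limits(atoms, chains) else "within PDB format limits"
        print(f"{atoms} atoms, {chains} chains: {verdict}")
```

A structure flagged as large by this check cannot be represented in a single traditional PDB format file, which is why PDBx/mmCIF is the appropriate choice for it.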

a) Refinement packages (Preferred)

Depositors are encouraged to use the latest version of their refinement software package to output up-to-date, mmCIF-compliant deposition files.

Recent versions of the refinement packages Phenix, REFMAC, and Buster generate PDBx/mmCIF files ready for deposition:

Phenix: Instructions are available at the Phenix website, https://www.phenix-online.org/documentation/overviews/xray-structure-deposition.html

CCP4: Instructions are available for CCP4i2 (http://www.ccp4.ac.uk/deposition_ccp4i2) or CCP4 Cloud (http://www.ccp4.ac.uk/deposition_ccp4cloud)

REFMAC (when used outside of CCP4i2 or CCP4 Cloud): To output a PDBx/mmCIF file from REFMAC, add a keyword card that reads "pdbout format mmcif". REFMAC can also read a PDBx/mmCIF coordinate file when it is supplied as the XYZIN argument.
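As an illustration of where the "pdbout format mmcif" card goes, the sketch below assembles a keyword script and a command line for a standalone REFMAC run. Only the card itself comes from this guide. The executable name `refmac5`, the logical names XYZIN, HKLIN, XYZOUT, and HKLOUT, the closing END keyword, and all file names are assumptions based on general CCP4 conventions, and the helper functions are hypothetical. Check them against your own installation before use.

```python
import shlex


def refmac_keywords(extra_cards=()):
    """Build keyword input that requests PDBx/mmCIF coordinate output.

    The 'pdbout format mmcif' card is taken from the text above; the
    trailing END card is an assumed CCP4 convention for ending input.
    """
    cards = ["pdbout format mmcif", *extra_cards, "END"]
    return "\n".join(cards) + "\n"


def refmac_command(files, executable="refmac5"):
    """Build an argv list of alternating logical names and file paths."""
    argv = [executable]
    for logical_name, path in files.items():
        argv += [logical_name, path]
    return argv


# Hypothetical file names, for illustration only.
files = {
    "XYZIN": "model.cif",     # input coordinates
    "HKLIN": "data.mtz",      # input reflection data
    "XYZOUT": "refined.cif",  # output coordinates, written as PDBx/mmCIF
    "HKLOUT": "refined.mtz",  # output reflection data
}

keywords = refmac_keywords()
command = refmac_command(files)

print(shlex.join(command))
print(keywords, end="")

# To actually run the job (requires a working CCP4 installation):
#   import subprocess
#   subprocess.run(command, input=keywords, text=True, check=True)
```

The sketch only prints the command and the keyword input; the keyword text is intended to be passed to the program on standard input, as in the commented-out `subprocess.run` call.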

Buster: Instructions are available at the Buster website (https://www.globalphasing.com/buster/wiki/index.cgi?DepositionMmCif)

b) pdb_extract

For non-crystallography depositions, the pdb_extract program is available as an online interface and as a standalone command-line program to convert PDB format files to PDBx/mmCIF format. It extracts and harvests data in PDBx/mmCIF format from the output of structure determination programs. To prepare PDB format data files for use with pdb_extract and the wwPDB deposition tool, please follow the instructions in the OneDep FAQ.