Dataset Open Access

Single mutation protein structure pairs extracted from the PDB with MicroMiner

Sieg Jochen; Rarey Matthias


Dublin Core Export

<?xml version='1.0' encoding='utf-8'?>
<oai_dc:dc xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
  <dc:creator>Sieg Jochen</dc:creator>
  <dc:creator>Rarey Matthias</dc:creator>
  <dc:date>2023-09-30</dc:date>
  <dc:description>This page provides the single mutation data extracted from the PDB with MicroMiner. The data consists of amino acid pairs from PDB protein structures that exemplify the local structural changes of single mutations, both within single chains and at protein–protein interfaces (PPIs). Mutations to non-standard residues are also provided.
See the MicroMiner publication for details:


Sieg, J.; Rarey, M. Searching similar local 3D micro-environments in protein structure databases with MicroMiner, 2023 (accepted in Briefings in Bioinformatics)


Data content:


	pdb_all_monomer.tsv
		all single mutations in monomer/single chains
		255853767 pairs/lines
		15 GB

	filtered_single_mutations_pdb_monomer.tsv
		redundancy- and similarity-filtered version of pdb_all_monomer.tsv
		4868765 pairs/lines
		324 MB

	single_mutations_pdb_monomer_non_standard_aa.tsv
		only single mutations containing non-standard residues in monomer/single chains
		350969 pairs/lines
		21 MB

	pdb_all_ppi.tsv
		all single mutations at PPIs
		45752145 pairs/lines
		2.7 GB

	filtered_single_mutations_pdb_ppi.tsv
		redundancy- and similarity-filtered version of pdb_all_ppi.tsv
		799130 pairs/lines
		54 MB

	single_mutations_pdb_ppi_non_standard_aa.tsv
		only single mutations containing non-standard residues at PPIs
		114671 pairs/lines
		6.9 MB


Each row in the TSV files describes the residue position of a single mutation in the wild-type (query) and the mutant (hit). Several local structural and sequence similarity measures are provided, computed from the 3D micro-environments of the residues. The column fullSeqId contains the global sequence identity. The first two lines of a TSV file (header and first data row) look like this:

queryName    queryChain    queryAA    queryPos    hitName    hitChain    hitAA    hitPos    siteIdentity    siteBackBoneRMSD    siteAllAtomRMSD    nofSiteResidues    alignmentLDDT    fullSeqId
10GS    A    CYS    47    2J9H    A    ALA    48    0.938    0.223    0.431    16.0    0.996    0.976    0.976

queryName: query PDB-ID

queryChain: query chain ID

queryAA: query amino acid type (three letter code)

queryPos: query sequence position of the amino acid residue

hitName: hit PDB-ID

hitChain: hit chain ID

hitAA: hit amino acid type (three letter code)

hitPos: hit sequence position of the amino acid residue

siteIdentity: sequence identity of the aligned micro-environments

siteBackBoneRMSD: Cα-RMSD of the aligned micro-environments

siteAllAtomRMSD: all-atom-RMSD of the aligned micro-environments

nofSiteResidues: number of residues in the micro-environments

alignmentLDDT: mean LDDT score of all residues in the aligned micro-environments

fullSeqId: global sequence identity of the query chain and hit chain (as specified by the chain IDs)
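
The column layout above can be read with a few lines of Python. The following is only a minimal sketch, not part of the MicroMiner distribution: it assumes the files are tab-delimited and start with the header line shown above, the function names iter_pairs and close_pairs are hypothetical, and the two thresholds are arbitrary illustration values. Rows are streamed one at a time because the unfiltered files are several gigabytes in size.

```python
import csv
import io

# Column names as listed in the description above.
COLUMNS = [
    "queryName", "queryChain", "queryAA", "queryPos",
    "hitName", "hitChain", "hitAA", "hitPos",
    "siteIdentity", "siteBackBoneRMSD", "siteAllAtomRMSD",
    "nofSiteResidues", "alignmentLDDT", "fullSeqId",
]
# The six similarity measures are numeric; positions are kept as strings.
NUMERIC = COLUMNS[8:]


def iter_pairs(handle):
    """Yield one dict per mutation pair, reading the file row by row.

    Assumption: the file is tab-delimited and its first line is the header.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    for row in reader:
        # Fields beyond the header columns are stored under the key None;
        # drop them so every row has the same keys.
        row.pop(None, None)
        for name in NUMERIC:
            row[name] = float(row[name])
        yield row


def close_pairs(handle, max_all_atom_rmsd=0.5, min_lddt=0.9):
    """Keep pairs whose micro-environments are structurally very similar.

    The default thresholds are arbitrary values chosen for illustration.
    """
    for row in iter_pairs(handle):
        if (max_all_atom_rmsd >= row["siteAllAtomRMSD"]
                and row["alignmentLDDT"] >= min_lddt):
            yield row


# Small in-memory sample built from the example row above.
sample = "\t".join(COLUMNS) + "\n" + "\t".join([
    "10GS", "A", "CYS", "47", "2J9H", "A", "ALA", "48",
    "0.938", "0.223", "0.431", "16.0", "0.996", "0.976",
]) + "\n"

for pair in close_pairs(io.StringIO(sample)):
    print(pair["queryName"], pair["queryAA"], "to", pair["hitName"], pair["hitAA"])
```

For a downloaded file, the in-memory sample would be replaced by an opened file handle, for example open("filtered_single_mutations_pdb_monomer.tsv", newline="").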

 

This work was supported by the German Federal Ministry of Education and Research as part of de.NBI [grant number 031L0105] and protP.S.I. [grant number 031B0405B].

 </dc:description>
  <dc:identifier>https://www.fdr.uni-hamburg.de/record/13411</dc:identifier>
  <dc:identifier>10.25592/uhhfdm.13411</dc:identifier>
  <dc:identifier>oai:fdr.uni-hamburg.de:13411</dc:identifier>
  <dc:relation>doi:10.25592/uhhfdm.13410</dc:relation>
  <dc:rights>info:eu-repo/semantics/openAccess</dc:rights>
  <dc:rights>https://creativecommons.org/licenses/by/4.0/legalcode</dc:rights>
  <dc:title>Single mutation protein structure pairs extracted from the PDB with MicroMiner</dc:title>
  <dc:type>info:eu-repo/semantics/other</dc:type>
  <dc:type>dataset</dc:type>
</oai_dc:dc>
