Single mutation protein structure pairs extracted from the PDB with MicroMiner

Sieg Jochen; Rarey Matthias

doi:10.25592/uhhfdm.13411

September 30, 2023 Dataset Open Access

DataCite XML Export

<?xml version='1.0' encoding='utf-8'?>
<resource xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://datacite.org/schema/kernel-3" xsi:schemaLocation="http://datacite.org/schema/kernel-3 http://schema.datacite.org/meta/kernel-3/metadata.xsd">
  <identifier identifierType="DOI">10.25592/uhhfdm.13411</identifier>
  <creators>
    <creator>
      <creatorName>Sieg Jochen</creatorName>
      <nameIdentifier nameIdentifierScheme="ORCID" schemeURI="http://orcid.org/">0000-0001-5343-7255</nameIdentifier>
      <affiliation>Universität Hamburg, ZBH - Center for Bioinformatics, Bundesstraße 43, 20146 Hamburg, Germany</affiliation>
    </creator>
    <creator>
      <creatorName>Rarey Matthias</creatorName>
      <nameIdentifier nameIdentifierScheme="ORCID" schemeURI="http://orcid.org/">0000-0002-9553-6531</nameIdentifier>
      <affiliation>Universität Hamburg, ZBH - Center for Bioinformatics, Bundesstraße 43, 20146 Hamburg, Germany</affiliation>
    </creator>
  </creators>
  <titles>
    <title>Single mutation protein structure pairs extracted from the PDB with MicroMiner</title>
  </titles>
  <publisher>Universität Hamburg</publisher>
  <publicationYear>2023</publicationYear>
  <dates>
    <date dateType="Issued">2023-09-30</date>
  </dates>
  <resourceType resourceTypeGeneral="Dataset"/>
  <alternateIdentifiers>
    <alternateIdentifier alternateIdentifierType="url">https://www.fdr.uni-hamburg.de/record/13411</alternateIdentifier>
  </alternateIdentifiers>
  <relatedIdentifiers>
    <relatedIdentifier relatedIdentifierType="DOI" relationType="IsPartOf">10.25592/uhhfdm.13410</relatedIdentifier>
  </relatedIdentifiers>
  <rightsList>
    <rights rightsURI="https://creativecommons.org/licenses/by/4.0/legalcode">Creative Commons Attribution 4.0 International</rights>
    <rights rightsURI="info:eu-repo/semantics/openAccess">Open Access</rights>
  </rightsList>
  <descriptions>
    <description descriptionType="Abstract">&lt;p&gt;This page provides the single mutation data extracted from the PDB with MicroMiner. The data contains pairs of amino acid residues in PDB protein structures that exemplify the local structural changes caused by single mutations, both within single chains and at protein&amp;ndash;protein interfaces. Mutations to non-standard residues are also provided.&lt;br&gt;
See the MicroMiner publication for details:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Sieg, J.; Rarey, M. Searching similar local 3D micro-environments in protein structure databases with MicroMiner, 2023 (accepted in Briefings in Bioinformatics)&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Data content:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
	&lt;li&gt;&lt;strong&gt;pdb_all_monomer.tsv&lt;/strong&gt;

	&lt;ul&gt;
		&lt;li&gt;all single mutations in monomer/single chains&lt;/li&gt;
		&lt;li&gt;255853767 pairs/lines&lt;/li&gt;
		&lt;li&gt;15GB&lt;/li&gt;
	&lt;/ul&gt;
	&lt;/li&gt;
	&lt;li&gt;&lt;strong&gt;filtered_single_mutations_pdb_monomer.tsv&lt;/strong&gt;
	&lt;ul&gt;
		&lt;li&gt;redundancy and similarity filtered pdb_all_monomer.tsv&lt;/li&gt;
		&lt;li&gt;4868765 pairs/lines&lt;/li&gt;
		&lt;li&gt;324MB&lt;/li&gt;
	&lt;/ul&gt;
	&lt;/li&gt;
	&lt;li&gt;&lt;strong&gt;single_mutations_pdb_monomer_non_standard_aa.tsv&lt;/strong&gt;
	&lt;ul&gt;
		&lt;li&gt;only single mutations containing non-standard residues in monomer/single chains&lt;/li&gt;
		&lt;li&gt;350969 pairs/lines&lt;/li&gt;
		&lt;li&gt;21MB&lt;/li&gt;
	&lt;/ul&gt;
	&lt;/li&gt;
	&lt;li&gt;&lt;strong&gt;pdb_all_ppi.tsv&lt;/strong&gt;
	&lt;ul&gt;
		&lt;li&gt;all single mutations at PPIs&lt;/li&gt;
		&lt;li&gt;45752145 pairs/lines&lt;/li&gt;
		&lt;li&gt;2.7GB&lt;/li&gt;
	&lt;/ul&gt;
	&lt;/li&gt;
	&lt;li&gt;&lt;strong&gt;filtered_single_mutations_pdb_ppi.tsv&lt;/strong&gt;
	&lt;ul&gt;
		&lt;li&gt;redundancy and similarity filtered pdb_all_ppi.tsv&lt;/li&gt;
		&lt;li&gt;799130 pairs/lines&lt;/li&gt;
		&lt;li&gt;54MB&lt;/li&gt;
	&lt;/ul&gt;
	&lt;/li&gt;
	&lt;li&gt;&lt;strong&gt;single_mutations_pdb_ppi_non_standard_aa.tsv&lt;/strong&gt;
	&lt;ul&gt;
		&lt;li&gt;only single mutations containing non-standard residues at PPIs&lt;/li&gt;
		&lt;li&gt;114671 pairs/lines&lt;/li&gt;
		&lt;li&gt;6.9MB&lt;/li&gt;
	&lt;/ul&gt;
	&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each row in the TSV files describes the residue position of a single mutation in the wild-type (query) and the mutant (hit). Several local structural and sequence similarity measures are provided, computed from the 3D micro-environments of the residues. The column fullSeqId contains the global sequence identity. The first two rows of a TSV file look like this:&lt;/p&gt;

&lt;pre&gt;&lt;code class="language-bash"&gt;queryName    queryChain    queryAA    queryPos    hitName    hitChain    hitAA    hitPos    siteIdentity    siteBackBoneRMSD    siteAllAtomRMSD    nofSiteResidues    alignmentLDDT    fullSeqId
10GS    A    CYS    47    2J9H    A    ALA    48    0.938    0.223    0.431    16.0    0.996    0.976    0.976&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;em&gt;queryName&lt;/em&gt;: query PDB-ID&lt;/p&gt;

&lt;p&gt;&lt;em&gt;queryChain&lt;/em&gt;: query chain ID&lt;/p&gt;

&lt;p&gt;&lt;em&gt;queryAA&lt;/em&gt;: query amino acid type (three letter code)&lt;/p&gt;

&lt;p&gt;&lt;em&gt;queryPos&lt;/em&gt;: query sequence position of the amino acid residue&lt;/p&gt;

&lt;p&gt;&lt;em&gt;hitName&lt;/em&gt;: hit PDB-ID&lt;/p&gt;

&lt;p&gt;&lt;em&gt;hitChain&lt;/em&gt;: hit chain ID&lt;/p&gt;

&lt;p&gt;&lt;em&gt;hitAA&lt;/em&gt;: hit amino acid type (three letter code)&lt;/p&gt;

&lt;p&gt;&lt;em&gt;hitPos&lt;/em&gt;: hit sequence position of the amino acid residue&lt;/p&gt;

&lt;p&gt;&lt;em&gt;siteIdentity&lt;/em&gt;: sequence identity of the aligned micro-environments&lt;/p&gt;

&lt;p&gt;&lt;em&gt;siteBackBoneRMSD&lt;/em&gt;: Calpha-RMSD of the aligned micro-environments&lt;/p&gt;

&lt;p&gt;&lt;em&gt;siteAllAtomRMSD&lt;/em&gt;: all-atom-RMSD of the aligned micro-environments&lt;/p&gt;

&lt;p&gt;&lt;em&gt;nofSiteResidues&lt;/em&gt;: number of residues in the micro-environments&lt;/p&gt;

&lt;p&gt;&lt;em&gt;alignmentLDDT&lt;/em&gt;: mean LDDT score of all residues in the aligned micro-environments&lt;/p&gt;

&lt;p&gt;&lt;em&gt;fullSeqId&lt;/em&gt;: global sequence identity of the query chain and hit chain (as specified by the chain IDs)&lt;/p&gt;

&lt;p&gt;&amp;nbsp;&lt;/p&gt;

&lt;p&gt;This work was supported by the German Federal Ministry of Education and Research as part of de.NBI [grant number 031L0105] and protP.S.I. [grant number 031B0405B].&lt;/p&gt;

&lt;p&gt;&amp;nbsp;&lt;/p&gt;</description>
  </descriptions>
</resource>
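
As a rough illustration of how the TSV files described above might be consumed, the following Python sketch streams rows with the standard-library `csv` module and filters on two of the documented columns. It assumes the files are tab-separated and start with the header line shown in the description. The function names `iter_pairs` and `select_pairs` and the threshold values are hypothetical and not part of MicroMiner. The sample row in the description carries one more value than the header has names, so the sketch collects any surplus values under an `extra` key rather than guessing their meaning.

```python
import csv
import io

# Sample copied from the record's description, joined with tabs
# (assumption: the real files use tab as the field separator).
SAMPLE = (
    "queryName\tqueryChain\tqueryAA\tqueryPos\thitName\thitChain\thitAA\thitPos\t"
    "siteIdentity\tsiteBackBoneRMSD\tsiteAllAtomRMSD\tnofSiteResidues\t"
    "alignmentLDDT\tfullSeqId\n"
    "10GS\tA\tCYS\t47\t2J9H\tA\tALA\t48\t0.938\t0.223\t0.431\t16.0\t0.996\t0.976\t0.976\n"
)

# Columns documented as similarity measures or counts; parsed as floats.
NUMERIC_COLUMNS = (
    "siteIdentity",
    "siteBackBoneRMSD",
    "siteAllAtomRMSD",
    "nofSiteResidues",
    "alignmentLDDT",
    "fullSeqId",
)


def iter_pairs(handle):
    """Yield one dict per mutation pair, reading the file line by line.

    Positions stay as strings because their exact format is not
    specified in the description. Values beyond the named columns
    end up in a list under the key "extra".
    """
    reader = csv.DictReader(handle, delimiter="\t", restkey="extra")
    for row in reader:
        for key in NUMERIC_COLUMNS:
            row[key] = float(row[key])
        yield row


def select_pairs(handle, max_all_atom_rmsd=0.5, min_full_seq_id=0.9):
    """Keep pairs with a similar local environment and similar chains.

    Both thresholds are illustrative defaults, not recommended values.
    """
    for row in iter_pairs(handle):
        if (row["siteAllAtomRMSD"] <= max_all_atom_rmsd
                and row["fullSeqId"] >= min_full_seq_id):
            yield row


if __name__ == "__main__":
    for pair in select_pairs(io.StringIO(SAMPLE)):
        print(pair["queryName"], pair["queryAA"], pair["queryPos"],
              "->", pair["hitName"], pair["hitAA"], pair["hitPos"])
```

For a downloaded file, the in-memory sample would be replaced by something like `open("filtered_single_mutations_pdb_monomer.tsv", newline="")`. Streaming row by row keeps memory use flat, which matters for the 15GB `pdb_all_monomer.tsv`.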

Publication date:

September 30, 2023

License (for files):

Creative Commons Attribution 4.0 International

Versions

Version 1 10.25592/uhhfdm.13411

Sep 30, 2023

Cite all versions? You can cite all versions by using the DOI 10.25592/uhhfdm.13410. This DOI represents all versions, and will always resolve to the latest one.
