Chemical Composition

class ursgal.ChemicalComposition(sequence=None, aa_compositions=None, isotopic_distributions=None)

Chemical composition class. The actual sequence or formula can be reset using the use function.

Keyword Arguments:
 
  • sequence (str) – Peptide or chemical formula sequence
  • aa_compositions (Optional[dict]) – amino acid compositions
  • isotopic_distributions (Optional[dict]) – isotopic distributions

Keyword argument examples:

sequence - Currently this can, for example, be any of::
[
    '+H2O2H2-OH',
    '+{0}'.format('H2O'),
    '{peptide}'.format(peptide='ELVISLIVES'),
    '{peptide}+{0}'.format('PO3', peptide='ELVISLIVES'),
    '{peptide}#{unimod}:{pos}'.format(peptide='ELVISLIVES', unimod='Oxidation', pos=1),
]
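The sequence forms listed above can be told apart with a few string operations. The following is a minimal sketch, not ursgal's own parser: `split_sequence` is a hypothetical helper, and it assumes that a peptide is a leading run of upper-case letters, that a modification follows a `#` as `name:position`, and that formula terms always carry an explicit `+` or `-` sign.

```python
import re


def split_sequence(sequence):
    """Split a sequence string into (peptide, modifications, formula_terms).

    Hypothetical helper for illustration only; not part of ursgal.
    """
    # Assumption: the peptide is the leading run of upper-case letters.
    peptide = re.match(r"[A-Z]*", sequence).group(0)
    rest = sequence[len(peptide):]

    modifications = []
    formula_terms = []
    if rest.startswith("#"):
        # 'Name:pos' after the '#', e.g. 'Oxidation:1'
        name, _, pos = rest[1:].partition(":")
        modifications.append((name, int(pos)))
    elif rest:
        # Signed formula terms such as '+H2O2H2' and '-OH'
        formula_terms = re.findall(r"([+-])([A-Za-z0-9]+)", rest)
    return peptide, modifications, formula_terms


print(split_sequence("ELVISLIVES#Oxidation:1"))
print(split_sequence("+H2O2H2-OH"))
```

Because a bare formula such as `H2O` would be mistaken for a peptide under these assumptions, the sketch relies on the leading `+` that every formula in the list above carries.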
Examples::
>>> c = ursgal.ChemicalComposition()
>>> c.use("ELVISLIVES#Acetyl:1")
>>> c.hill_notation()
'C52H90N10O18'
>>> c.hill_notation_unimod()
'C(52)H(90)N(10)O(18)'
>>> c
{'O': 18, 'H': 90, 'C': 52, 'N': 10}
>>> c.composition_of_mod_at_pos[1]
defaultdict(<class 'int'>, {'O': 1, 'H': 2, 'C': 2})
>>> c.composition_of_aa_at_pos[1]
{'O': 3, 'H': 7, 'C': 5, 'N': 1}
>>> c.composition_at_pos[1]
defaultdict(<class 'int'>, {'O': 4, 'H': 9, 'C': 7, 'N': 1})
>>> c = ursgal.ChemicalComposition('+H2O2H2')
>>> c
{'O': 2, 'H': 4}
>>> c.subtract_chemical_formula('H3')
>>> c
{'O': 2, 'H': 1}

Note

We did not include mass calculation, since pyQms will calculate masses much more accurately using unimod and other element enrichments.

add_chemical_formula(chemical_formula)

Adds chemical formula to the instance
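To illustrate what adding a chemical formula amounts to, here is a minimal, self-contained sketch that parses a formula into element counts and applies them to a plain dict. It is not ursgal's implementation: `parse_formula` and `apply_formula` are hypothetical names, and the sketch assumes formulas consist only of element symbols followed by optional integer counts.

```python
import re
from collections import defaultdict

# One element symbol (e.g. 'H', 'Na') followed by an optional count.
ELEMENT_PATTERN = re.compile(r"([A-Z][a-z]?)(\d*)")


def parse_formula(formula):
    """Return element counts for a formula; repeated elements are summed."""
    counts = defaultdict(int)
    for element, number in ELEMENT_PATTERN.findall(formula):
        counts[element] += int(number) if number else 1
    return counts


def apply_formula(composition, formula, sign=1):
    """Add (sign=1) or subtract (sign=-1) a formula in place."""
    for element, count in parse_formula(formula).items():
        composition[element] = composition.get(element, 0) + sign * count
    return composition


composition = apply_formula({}, "H2O2H2")   # H appears twice and is summed
apply_formula(composition, "H3", sign=-1)   # subtraction uses the same path
print(composition)
```

The two calls mirror the '+H2O2H2' and 'H3' steps of the class example: the first yields four hydrogens and two oxygens, the second leaves one hydrogen.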

add_peptide(peptide)

Adds peptide sequence to the instance

clear()

Resets all lookup dictionaries and self

One class instance can be used to analyse a series of sequences, thereby avoiding class instantiation overhead.

composition_at_pos = None

dict – chemical composition at a given peptide position, including modifications (available if a peptide sequence was passed as input or set with the use function)

Note

Numbering starts at position 1, since all PSM search engines use this nomenclature.
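The relationship between the per-position dictionaries can be sketched as follows. This is an illustration under stated assumptions rather than ursgal's code: `RESIDUES` is a hand-written subset of residue compositions (amino acid minus one water) covering only ELVISLIVES, `ACETYL` is the Acetyl composition shown in the class example, and `compositions_by_position` is a hypothetical helper.

```python
from collections import defaultdict

# Illustrative subset of residue compositions; not ursgal's lookup table.
RESIDUES = {
    "E": {"C": 5, "H": 7, "N": 1, "O": 3},
    "L": {"C": 6, "H": 11, "N": 1, "O": 1},
    "V": {"C": 5, "H": 9, "N": 1, "O": 1},
    "I": {"C": 6, "H": 11, "N": 1, "O": 1},
    "S": {"C": 3, "H": 5, "N": 1, "O": 2},
}
ACETYL = {"C": 2, "H": 2, "O": 1}


def compositions_by_position(peptide, mods_at_pos):
    """Return (aa_at_pos, total_at_pos), both keyed by 1-based position."""
    aa_at_pos, total_at_pos = {}, {}
    # start=1 gives the 1-based numbering used by PSM search engines.
    for pos, residue in enumerate(peptide, start=1):
        aa_at_pos[pos] = dict(RESIDUES[residue])
        combined = defaultdict(int, aa_at_pos[pos])
        for element, count in mods_at_pos.get(pos, {}).items():
            combined[element] += count
        total_at_pos[pos] = dict(combined)
    return aa_at_pos, total_at_pos


aa_at_pos, total_at_pos = compositions_by_position("ELVISLIVES", {1: ACETYL})
print(aa_at_pos[1])
print(total_at_pos[1])
```

Position 1 reproduces the numbers from the class example: the glutamic acid residue alone, and the residue plus Acetyl. There is no key 0.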

composition_of_aa_at_pos = None

dict – chemical composition of the amino acid at a given peptide position (available if a peptide sequence was passed as input or set with the use function)

Note

Numbering starts at position 1, since all PSM search engines use this nomenclature.

composition_of_mod_at_pos = None

dict – chemical composition of the unimod modifications at a given position (available if a peptide sequence was passed as input or set with the use function)

Note

Numbering starts at position 1, since all PSM search engines use this nomenclature.

hill_notation(include_ones=False, cc=None)

Formats chemical composition into Hill notation string.

Parameters: cc (dict, optional) – can format other element dicts as well.
Returns:
Hill notation format of self.
For example::
C50H88N10O17
Return type: str
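For readers unfamiliar with the Hill system, the ordering rule can be sketched in a few lines: carbon first, hydrogen second, then all other elements alphabetically; without carbon, every element (including hydrogen) is sorted alphabetically. The function below is a stand-alone illustration, not ursgal's method. Its handling of `include_ones` (omit a count of 1 unless the flag is set) and of zero counts (skipped) are assumptions.

```python
def hill_notation(composition, include_ones=False):
    """Format an element-count dict as a Hill notation string (sketch)."""
    # Assumption: elements with a zero count are left out.
    elements = sorted(e for e, n in composition.items() if n != 0)
    if "C" in elements:
        # Carbon first, hydrogen second, the rest alphabetically.
        rest = [e for e in elements if e not in ("C", "H")]
        ordered = ["C"] + (["H"] if "H" in elements else []) + rest
    else:
        # No carbon: strictly alphabetical, hydrogen included.
        ordered = elements

    parts = []
    for element in ordered:
        count = composition[element]
        if count == 1 and not include_ones:
            parts.append(element)            # assumption: 'CH4', not 'C1H4'
        else:
            parts.append(f"{element}{count}")
    return "".join(parts)


print(hill_notation({"O": 18, "H": 90, "C": 52, "N": 10}))
```

Applied to the acetylated ELVISLIVES composition from the class example, the sketch gives the same string as shown there.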
hill_notation_unimod(cc=None)

Formats chemical composition into Hill notation string adding unimod features.

Parameters: cc (dict, optional) – can format other element dicts as well.
Returns:
Hill notation format including unimod format rules of self.
For example::
C(50)H(88)N(10)O(17)
Return type: str
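The difference from the plain Hill string is only that each count is wrapped in parentheses. As a rough sketch, a plain Hill string can be converted with a single regular expression substitution. `to_parenthesised` is a hypothetical helper and covers only the pattern visible in the example above; any further unimod formatting rules are not handled.

```python
import re


def to_parenthesised(hill_string):
    """Wrap every element count in parentheses, e.g. 'C50' -> 'C(50)'."""
    # Elements without an explicit count are left unchanged.
    return re.sub(r"([A-Z][a-z]?)(\d+)", r"\1(\2)", hill_string)


print(to_parenthesised("C50H88N10O17"))
```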
subtract_chemical_formula(chemical_formula)

Subtract chemical formula from instance

subtract_peptide(peptide)

Subtract peptide from instance

use(sequence)

Re-initialize the class with a new sequence

This is helpful if one wants to use the same class instance for multiple sequences, since it removes class instantiation overhead.

Parameters: sequence (str) – Peptide or chemical formula sequence