Chemical Composition
-
class ursgal.ChemicalComposition(sequence=None, aa_compositions=None, isotopic_distributions=None, monosaccharide_compositions=None)
Chemical composition class. The sequence or formula held by an instance can be reset with the use function.
Keyword Arguments:
- sequence (str) – Peptide sequence or chemical formula
- aa_compositions (Optional[dict]) – amino acid compositions
- isotopic_distributions (Optional[dict]) – isotopic distributions
- monosaccharide_compositions (Optional[dict]) – monosaccharide compositions
Keyword argument examples:
- sequence – currently this can be, for example, any of::
    [
        '+H2O2H2-OH',
        '+{0}'.format('H2O'),
        '{peptide}'.format(peptide='ELVISLIVES'),
        '{peptide}+{0}'.format('PO3', peptide='ELVISLIVES'),
        '{peptide}#{unimod}:{pos}'.format(
            peptide='ELVISLIVES', unimod='Oxidation', pos=1
        ),
    ]
Examples::
    >>> c = ursgal.ChemicalComposition()
    >>> c.use("ELVISLIVES#Acetyl:1")
    >>> c.hill_notation()
    'C52H90N10O18'
    >>> c.hill_notation_unimod()
    'C(52)H(90)N(10)O(18)'
    >>> c
    {'O': 18, 'H': 90, 'C': 52, 'N': 10}
    >>> c.composition_of_mod_at_pos[1]
    defaultdict(<class 'int'>, {'O': 1, 'H': 2, 'C': 2})
    >>> c.composition_of_aa_at_pos[1]
    {'O': 3, 'H': 7, 'C': 5, 'N': 1}
    >>> c.composition_at_pos[1]
    defaultdict(<class 'int'>, {'O': 4, 'H': 9, 'C': 7, 'N': 1})

    >>> c = ursgal.ChemicalComposition('+H2O2H2')
    >>> c
    {'O': 2, 'H': 4}
    >>> c.subtract_chemical_formula('H3')
    >>> c
    {'O': 2, 'H': 1}
Note
We did not include mass calculation, since pyQms will calculate masses much more accurately using unimod and other element enrichments.
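The `peptide#unimod:pos` input format listed above can be illustrated with a small standalone sketch. It does not use ursgal and is not the library's parser. The helper name `split_sequence` is hypothetical, and the `;` separator for multiple modifications is an assumption that the text above does not confirm. Formula suffixes such as `+PO3` are deliberately not handled here.

```python
def split_sequence(sequence):
    """Split 'PEPTIDE#Mod:pos' into the peptide and a list of (mod, pos)."""
    # Everything before the first '#' is the peptide; the rest describes modifications.
    peptide, _, mod_string = sequence.partition("#")
    modifications = []
    # Assumption: several modifications would be separated by ';'.
    for entry in filter(None, mod_string.split(";")):
        # Split on the last ':' so that the position is always the final field.
        name, _, position = entry.rpartition(":")
        modifications.append((name, int(position)))
    return peptide, modifications


print(split_sequence("ELVISLIVES#Acetyl:1"))
```

Positions are kept 1-based, matching the numbering used by the position-indexed attributes of the class.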
-
add_chemical_formula(chemical_formula, factor=1)
Adds a chemical formula to the instance.
Parameters: chemical_formula (str) – chemical composition given in Hill notation
Keyword Arguments: factor (int) – multiplication factor, used to add the same chemical formula multiple times
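To make the effect of `factor` concrete, here is a minimal standalone sketch of adding a formula to an element dict. It is not ursgal's implementation. The helpers `parse_formula` and `add_formula` are hypothetical, and the parser only understands plain element symbols followed by optional counts, with no isotope or bracket syntax.

```python
import re
from collections import defaultdict

# Hypothetical helper pattern: an element symbol followed by an optional count.
_ELEMENT = re.compile(r"([A-Z][a-z]?)(\d*)")


def parse_formula(formula):
    """Turn a formula such as 'H2O2H2' into a dict of element counts."""
    counts = defaultdict(int)
    for element, number in _ELEMENT.findall(formula):
        # A missing count means the element occurs once.
        counts[element] += int(number) if number else 1
    return dict(counts)


def add_formula(composition, formula, factor=1):
    """Add `formula` to `composition` in place, `factor` times."""
    for element, count in parse_formula(formula).items():
        composition[element] = composition.get(element, 0) + factor * count
    return composition


composition = add_formula({}, "H2O2H2")
add_formula(composition, "H2O", factor=2)
print(composition)
```

Repeated elements in the input are summed, which is why `H2O2H2` yields four hydrogens, as in the class-level example.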
-
add_glycan(glycan)
Adds a glycan to the instance.
Parameters: glycan (str) – sequence of monosaccharides given in unimod format, e.g. HexNAc(2)Hex(3)dHex(1)Pent(1). Available monosaccharides are listed in chemical_composition_kb.
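The unimod glycan format can be read as a list of name and count pairs. The sketch below is standalone and is not the library's code. The function names are hypothetical, and the small `MONOSACCHARIDES` table holds commonly used residue compositions for illustration only. The table shipped in chemical_composition_kb may differ in content and naming.

```python
import re
from collections import Counter

# Illustrative residue compositions (assumption: not copied from chemical_composition_kb).
MONOSACCHARIDES = {
    "Hex": {"C": 6, "H": 10, "O": 5},
    "HexNAc": {"C": 8, "H": 13, "N": 1, "O": 5},
    "dHex": {"C": 6, "H": 10, "O": 4},
    "Pent": {"C": 5, "H": 8, "O": 4},
}


def parse_glycan(glycan):
    """Parse 'HexNAc(2)Hex(3)' into {'HexNAc': 2, 'Hex': 3}."""
    counts = Counter()
    for name, count in re.findall(r"([A-Za-z]+)\((\d+)\)", glycan):
        counts[name] += int(count)
    return dict(counts)


def glycan_composition(glycan):
    """Sum the element counts of all monosaccharides in the glycan."""
    total = Counter()
    for name, count in parse_glycan(glycan).items():
        for element, n in MONOSACCHARIDES[name].items():
            total[element] += count * n
    return dict(total)


print(glycan_composition("HexNAc(2)Hex(3)dHex(1)Pent(1)"))
```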
-
add_peptide(peptide)
Adds a peptide sequence to the instance.
-
clear()
Resets all lookup dictionaries and the instance itself.
One class instance can be used to analyse a series of sequences, thereby avoiding class instantiation overhead.
-
composition_at_pos = None
Chemical composition at a given peptide position, including modifications (available if a peptide sequence was passed as input or set with the use function).
Note
Numbering starts at position 1, since all PSM search engines use this nomenclature.
Type: dict
-
composition_of_aa_at_pos = None
Chemical composition of the amino acid at a given peptide position (available if a peptide sequence was passed as input or set with the use function).
Note
Numbering starts at position 1, since all PSM search engines use this nomenclature.
Type: dict
-
composition_of_mod_at_pos = None
Chemical composition of the unimod modifications at a given position (available if a peptide sequence was passed as input or set with the use function).
Note
Numbering starts at position 1, since all PSM search engines use this nomenclature.
Type: dict
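The three position-indexed attributes relate to each other by simple addition: the composition at a position is the amino acid composition plus any modification composition at that position. The standalone sketch below reproduces this with the values from the class-level example (`ELVISLIVES#Acetyl:1`, position 1). It uses plain dicts and makes no claim about how the library builds these lookups internally.

```python
from collections import defaultdict

# Values taken from the documented example; keys are 1-based positions.
composition_of_aa_at_pos = {1: {"C": 5, "H": 7, "N": 1, "O": 3}}
composition_of_mod_at_pos = {1: {"C": 2, "H": 2, "O": 1}}

composition_at_pos = {}
for pos, aa in composition_of_aa_at_pos.items():
    combined = defaultdict(int, aa)
    # Positions without a modification contribute nothing extra.
    for element, count in composition_of_mod_at_pos.get(pos, {}).items():
        combined[element] += count
    composition_at_pos[pos] = dict(combined)

print(composition_at_pos[1])
```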
-
hill_notation(include_ones=False, cc=None)
Formats the chemical composition into a Hill notation string.
Parameters: cc (dict, optional) – can format other element dicts as well.
Returns: Hill notation format of self, for example C50H88N10O17
Return type: str
-
hill_notation_unimod(cc=None)
Formats the chemical composition into a Hill notation string, adding unimod features.
Parameters: cc (dict, optional) – can format other element dicts as well.
Returns: Hill notation format of self following unimod format rules, for example C(50)H(88)N(10)O(17)
Return type: str
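Both output formats can be sketched with a few lines of standalone Python. This is an illustration of the standard Hill system (carbon first, hydrogen second, then all other elements alphabetically; purely alphabetical when no carbon is present), not the library's implementation. The function names are hypothetical. Treating `include_ones` as "write a count of 1 explicitly" is an assumption, as is always writing parentheses in the unimod-style variant.

```python
def hill_order(composition):
    """Return the element symbols of `composition` in Hill order."""
    elements = sorted(composition)
    if "C" not in composition:
        # Without carbon, the Hill system is purely alphabetical.
        return elements
    rest = [e for e in elements if e not in ("C", "H")]
    return ["C"] + (["H"] if "H" in composition else []) + rest


def format_hill(composition, include_ones=False):
    """Format as e.g. 'C50H88N10O17' (assumed meaning of include_ones)."""
    parts = []
    for element in hill_order(composition):
        count = composition[element]
        if count == 1 and not include_ones:
            parts.append(element)
        else:
            parts.append(f"{element}{count}")
    return "".join(parts)


def format_hill_unimod(composition):
    """Format as e.g. 'C(50)H(88)N(10)O(17)'."""
    return "".join(f"{e}({composition[e]})" for e in hill_order(composition))


peptide = {"O": 17, "H": 88, "C": 50, "N": 10}
print(format_hill(peptide))
print(format_hill_unimod(peptide))
```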
-
subtract_chemical_formula(chemical_formula, factor=1)
Subtracts a chemical formula from the instance.
Parameters: chemical_formula (str) – chemical composition given in Hill notation
Keyword Arguments: factor (int) – multiplication factor, used to subtract the same chemical formula multiple times
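Subtraction mirrors addition with the sign flipped. The standalone sketch below works on already-parsed element dicts and reproduces the class-level example, in which `H3` is removed from `H4O2`. The helper `subtract_formula` is hypothetical. The documentation does not say how the library treats counts that reach zero or go negative, so this sketch simply keeps them.

```python
from collections import Counter


def subtract_formula(composition, formula_counts, factor=1):
    """Return `composition` minus `factor` times `formula_counts`."""
    result = Counter(composition)
    for element, count in formula_counts.items():
        # Zero or negative counts are kept as-is (assumption, see lead-in).
        result[element] -= factor * count
    return dict(result)


print(subtract_formula({"O": 2, "H": 4}, {"H": 3}))
```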
-
subtract_peptide(peptide)
Subtracts a peptide from the instance.
-
use(sequence)
Re-initializes the class with a new sequence.
This is helpful if one wants to use the same class instance for multiple sequences, since it avoids class instantiation overhead.
Parameters: sequence (str) – see the class description above for possible input formats.