Parse adduct

Description

Parses a mass spectrometry adduct notation string into its components: multimer count, isotope shift, modifications, charge state, and charge sign. Adducts that combine several additions and losses, such as "[M-H2O+H]+", are supported.

Usage

parse_adduct(adduct_string, regex = ADDUCT_REGEX_PATTERN)

Arguments

adduct_string Character string representing the adduct in standard notation (e.g., "[M+H]+", "[2M+Na]+", "[M-H2O+H]+")
regex Character string giving the regular expression used for parsing (default: ADDUCT_REGEX_PATTERN from constants)
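
To give a feel for what a pattern passed as regex has to capture, here is a minimal sketch in base R. The pattern and the helper sketch_parse() are hypothetical and written only for illustration; they are not the package's ADDUCT_REGEX_PATTERN and will not cover every notation the real parser accepts. The sketch splits the string into multimer count, isotope shift, raw modification text, charge count, and sign, and leaves the mass arithmetic aside.

```r
# Illustrative pattern only; NOT the package's ADDUCT_REGEX_PATTERN.
# Groups: multimer count, isotope shift, modifications, charge count, sign.
sketch_pattern <- "^\\[([0-9]*)M([0-9]*)(.*)\\]([0-9]*)([+-])$"

# Hypothetical helper: returns a list of components, or NULL on no match.
sketch_parse <- function(adduct_string) {
  if (!is.character(adduct_string) || length(adduct_string) != 1) {
    return(NULL)
  }
  m <- regmatches(adduct_string, regexec(sketch_pattern, adduct_string))[[1]]
  if (length(m) == 0) {
    return(NULL)
  }
  # An empty capture means the count was omitted, so fall back to a default.
  to_count <- function(x, default) if (nzchar(x)) as.integer(x) else default
  list(
    n_mer = to_count(m[2], 1L),
    n_iso = to_count(m[3], 0L),
    modifications = m[4],
    n_charges = to_count(m[5], 1L),
    charge = if (m[6] == "+") 1L else -1L
  )
}

complex_parts <- sketch_parse("[2M1-C6H12O6 (hexose)+NaCl+H]2+")
simple_parts <- sketch_parse("[M-H]-")
no_match <- sketch_parse("invalid")
```

In this sketch a failed match yields NULL, whereas parse_adduct() is documented to return all zeros; the modification text would still need to be converted to a mass to obtain los_add_clu.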

Value

Named numeric vector containing:

n_mer Integer number of monomers (e.g., 2 for dimer, 1 for monomer)
n_iso Integer isotope shift (e.g., 1 for M+1 isotopologue, 0 for monoisotopic)
los_add_clu Numeric total mass change in Daltons from all modifications
n_charges Integer absolute number of charges (always positive)
charge Integer charge polarity (+1 for positive mode, -1 for negative mode)

If parsing fails (for example, for NULL input or an unrecognized string), every element is 0.
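
As an illustration of how the five elements fit together, the sketch below computes an expected m/z from a vector shaped like the documented return value. Everything here is an assumption made for the example: the helper adduct_mz() is hypothetical and not part of the package, the electron mass and isotope spacing constants are supplied by hand, and the code assumes los_add_clu sums neutral-atom masses so that the electron correction is applied separately. The input vector is built by hand rather than taken from an actual parse_adduct() call.

```r
# Assumed constants (not taken from the tima package).
electron_mass <- 0.00054858 # Da
isotope_spacing <- 1.003355 # Da, approximate 13C - 12C mass difference

# Hypothetical helper: expected m/z for a neutral monoisotopic mass,
# given a vector shaped like the documented return value.
adduct_mz <- function(parsed, neutral_mass) {
  # An all-zero vector is the documented failure signal.
  if (all(parsed == 0)) {
    return(NA_real_)
  }
  total_mass <- parsed[["n_mer"]] * neutral_mass +
    parsed[["n_iso"]] * isotope_spacing +
    parsed[["los_add_clu"]]
  # Assumption: los_add_clu counts neutral atoms, so electrons are
  # removed for cations and added for anions.
  signed_charge <- parsed[["charge"]] * parsed[["n_charges"]]
  (total_mass - signed_charge * electron_mass) / parsed[["n_charges"]]
}

# Hand-built stand-in for what "[2M+Na]+" might parse to.
parsed <- c(n_mer = 2, n_iso = 0, los_add_clu = 22.98977, n_charges = 1, charge = 1)
mz <- adduct_mz(parsed, neutral_mass = 180.06339)

# A failed parse (all zeros) propagates as NA instead of dividing by zero.
zeros <- c(n_mer = 0, n_iso = 0, los_add_clu = 0, n_charges = 0, charge = 0)
failed <- adduct_mz(zeros, neutral_mass = 180.06339)
```

Checking for the all-zero vector before dividing matters because n_charges is 0 in the failure case.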

Examples

library("tima")

# Simple adducts
parse_adduct("[M+H]+") # Protonated molecule
parse_adduct("[M-H]-") # Deprotonated molecule
parse_adduct("[M+Na]+") # Sodium adduct

# Complex adducts
parse_adduct("[2M+Na]+") # Dimer with sodium
parse_adduct("[M+H-H2O]+") # Protonated with water loss
parse_adduct("[M1+H]+") # M+1 isotopologue
parse_adduct("[2M1-C6H12O6 (hexose)+NaCl+H]2+") # Complex modification

# Error cases
parse_adduct(NULL) # Returns all zeros
parse_adduct("invalid") # Returns all zeros with warning