Annotate masses

Description

Mass-based MS1 annotation. The pipeline is a sequence of clearly bounded steps, each documented inline. In short:

1. **Pairs in RT windows.** For every feature, find all other features
   in the same RT tolerance window (per sample) and compute the m/z
   delta. The pair is always oriented `(lower_mz, higher_mz)` so that
   `delta = mz_higher - mz_lower >= 0`.

2. **Adduct edges.** Match each pair's `delta` against the table of
   precomputed pairwise differences between known mode-specific
   adducts. A match labels the edge `adduct_low _ adduct_high` and
   tentatively assigns the corresponding adduct to each endpoint.

3. **Cluster edges.** Match `delta` against cluster masses (e.g. ACN,
   MeOH, Na). A cluster adds mass to the *higher* m/z peak, so the
   cluster suffix `+<cluster>` is attached to the adduct hypotheses of
   the **dest** (higher m/z) node.

4. **Neutral-loss edges.** Match `delta` against neutral-loss masses
   (e.g. H2O, CO2). For an NL pair, the **higher** m/z peak is the
   precursor and the **lower** m/z peak is the product. The loss
   suffix `-<loss>` is attached to the precursor's adduct hypotheses
   (so the same neutral M is inferred from both peaks).

5. **Node hypotheses.** Gather, per feature, **all** plausible adduct
   labels: (a) what we inferred from adduct/cluster/loss edges, (b)
   any adduct supplied upstream by the preprocessing tool, and
   (c) the universal baseline `[M+H]+` / `[M-H]-`. Hypotheses are
   never dropped at this stage.

6. **Library match.** For every `(feature, candidate_adduct)` pair,
   compute the implied neutral mass M and look it up in the library
   within the ppm tolerance.

7. **Network-consensus pruning.** If a feature ends up with several
   library hits, drop a candidate only if its adduct has *zero*
   support in the adduct edge graph **and** dropping it still leaves
   a supported alternative. Ties are kept and drops are logged.

8. **Keep unmatched adducts.** Adduct hypotheses are exported even
   when no library structure matches, so downstream tools still see
   the adduct annotation.
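
Steps 1, 2, and 6 above can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's implementation: the toy feature table, the two-entry adduct table with approximate mass shifts, the one-entry library, the choice to scale the ppm window by the higher m/z, and the helper names `match_adduct`, `neutral_mass`, and `ppm_error` are all hypothetical. Only singly charged monomer adducts are considered.

```r
# Illustrative sketch only: data, names, and tolerance handling are assumptions.
features <- data.frame(
  feature_id = c("f1", "f2", "f3"),
  rt = c(5.00, 5.01, 7.50),                # minutes
  mz = c(181.07066, 203.05261, 181.07066),
  stringsAsFactors = FALSE
)
# Approximate mass shifts relative to the neutral mass M (positive mode).
adducts <- c("[M+H]+" = 1.007276, "[M+Na]+" = 22.989218)
tolerance_rt <- 0.05   # minutes
tolerance_ppm <- 10

# Step 1: every pair inside the RT window, oriented (lower_mz, higher_mz).
idx <- t(combn(nrow(features), 2))
in_window <- abs(features$rt[idx[, 1]] - features$rt[idx[, 2]]) <= tolerance_rt
idx <- idx[in_window, , drop = FALSE]
swap <- features$mz[idx[, 1]] > features$mz[idx[, 2]]
low <- ifelse(swap, idx[, 2], idx[, 1])
high <- ifelse(swap, idx[, 1], idx[, 2])
edges <- data.frame(
  source = features$feature_id[low],
  target = features$feature_id[high],
  mz_low = features$mz[low],
  mz_high = features$mz[high],
  stringsAsFactors = FALSE
)
edges$delta <- edges$mz_high - edges$mz_low   # always >= 0

# Step 2: precomputed pairwise differences between adducts (positive only).
grid <- expand.grid(
  low = names(adducts), high = names(adducts), stringsAsFactors = FALSE
)
grid$delta <- unname(adducts[grid$high] - adducts[grid$low])
grid <- grid[grid$delta > 0, ]

# Assumption: the ppm window is converted to Da using the higher m/z.
match_adduct <- function(delta, mz_high) {
  tol_da <- tolerance_ppm * 1e-6 * mz_high
  hit <- grid[abs(grid$delta - delta) <= tol_da, ]
  if (nrow(hit) == 0) NA_character_ else paste(hit$low[1], "_", hit$high[1])
}
edges$label <- mapply(match_adduct, edges$delta, edges$mz_high)

# Step 6: implied neutral mass and ppm lookup in a hypothetical library.
neutral_mass <- function(mz, adduct) mz - adducts[[adduct]]
ppm_error <- function(observed, reference) {
  (observed - reference) / reference * 1e6
}
library_masses <- c(glucose = 180.06339)
m_low <- neutral_mass(edges$mz_low[1], "[M+H]+")
m_high <- neutral_mass(edges$mz_high[1], "[M+Na]+")
hit <- names(library_masses)[
  abs(ppm_error(m_low, library_masses)) <= tolerance_ppm
]
```

In this toy input, only `f1` and `f2` co-elute within the RT window, so a single edge is produced. Both endpoints imply nearly the same neutral mass, which is the property the adduct edge is meant to capture.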

Usage

annotate_masses(
  features = get_params(step = "annotate_masses")$files$features$prepared,
  output_annotations = get_params(step =
    "annotate_masses")$files$annotations$prepared$structural$ms1,
  output_edges = get_params(step =
    "annotate_masses")$files$networks$spectral$edges$raw$ms1,
  name_source = get_params(step = "annotate_masses")$names$source,
  name_target = get_params(step = "annotate_masses")$names$target,
  library = get_params(step = "annotate_masses")$files$libraries$sop$merged$keys,
  str_stereo = get_params(step =
    "annotate_masses")$files$libraries$sop$merged$structures$stereo,
  str_met = get_params(step =
    "annotate_masses")$files$libraries$sop$merged$structures$metadata,
  str_tax_cla = get_params(step =
    "annotate_masses")$files$libraries$sop$merged$structures$taxonomies$cla,
  str_tax_npc = get_params(step =
    "annotate_masses")$files$libraries$sop$merged$structures$taxonomies$npc,
  adducts_list = get_params(step = "annotate_masses")$ms$adducts,
  clusters_list = get_params(step = "annotate_masses")$ms$clusters,
  neutral_losses_list = get_params(step = "annotate_masses")$ms$neutral_losses,
  ms_mode = get_params(step = "annotate_masses")$ms$polarity,
  tolerance_ppm = get_params(step = "annotate_masses")$ms$tolerances$mass$ppm$ms1,
  tolerance_rt = get_params(step = "annotate_masses")$ms$tolerances$rt$adducts,
  adduct_consistency = get_params(step = "annotate_masses")$ms$adducts$consistency$type,
  adduct_min_support = get_params(step =
    "annotate_masses")$ms$adducts$consistency$min_support,
  adduct_consistency_min_degree = get_params(step =
    "annotate_masses")$ms$adducts$consistency$min_degree
)

Arguments

features Table containing the previous annotations to complement
output_annotations Output for mass-based structural annotations
output_edges Output for mass-based edges
name_source Name of the source features column
name_target Name of the target features column
library Library containing the keys
str_stereo File containing structure stereochemistry
str_met File containing structures metadata
str_tax_cla File containing ClassyFire taxonomy
str_tax_npc File containing NPClassifier taxonomy
adducts_list List of adducts to be used
clusters_list List of clusters to be used
neutral_losses_list List of neutral losses to be used
ms_mode Ionization mode. Must be ‘pos’ or ‘neg’
tolerance_ppm Tolerance to perform annotation. Should be <= 20 ppm
tolerance_rt Tolerance to group adducts. Should be <= 0.05 minutes
adduct_consistency Consistency mode for adduct edge filtering. Must be ‘off’, ‘conditional’, or ‘strict’
adduct_min_support Minimum number of independent supporting neighbors for an adduct assignment in consistency-filtered regions
adduct_consistency_min_degree In conditional mode, minimum local graph degree at which support filtering is activated

Value

Named character vector of paths to the annotations and edges files.

See Also

Other annotation: annotate_spectra(), filter_annotations(), weight_annotations(), write_mztab()

Examples

library("tima")

annotate_masses()