Filter annotations

Description

This function filters initial annotations by removing MS1-only annotations that also have quality spectral matches (gated on similarity and matched peaks), and joins retention time library data when available. RT errors are computed but no hard cutoff is applied; the downstream scoring system uses a sigmoid penalty to handle RT deviations gracefully.
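
The two behaviours described above can be illustrated on toy data. The following is a minimal base R sketch, not the package's implementation. All column names (candidate, similarity, matched_peaks, rt_library, error_rt) and both thresholds are hypothetical. It also assumes that an MS1-only annotation is dropped when the same feature/candidate pair has a spectral match passing both gates.

```r
# Toy annotation table; column names and values are hypothetical.
annotations <- data.frame(
  feature_id = c("F1", "F1", "F2", "F3", "F3"),
  candidate = c("A", "A", "B", "C", "C"),
  similarity = c(NA, 0.85, NA, NA, 0.10),
  matched_peaks = c(NA, 8L, NA, NA, 1L),
  stringsAsFactors = FALSE
)

min_similarity <- 0.5 # assumed gate, not a package default
min_matched_peaks <- 3L # assumed gate, not a package default

# An annotation is treated as MS1-only when it has no spectral similarity.
is_ms1_only <- is.na(annotations$similarity)

# A spectral match counts as "quality" when it passes both gates.
is_quality <- !is_ms1_only &
  annotations$similarity >= min_similarity &
  annotations$matched_peaks >= min_matched_peaks

# Drop MS1-only rows whose feature/candidate pair has a quality match.
key <- paste(annotations$feature_id, annotations$candidate, sep = "|")
quality_keys <- unique(key[is_quality])
filtered <- annotations[!(is_ms1_only & key %in% quality_keys), ]

# Join feature RTs and a toy RT library, then compute the RT error.
features <- data.frame(
  feature_id = c("F1", "F2", "F3"),
  rt = c(1.20, 3.50, 7.00)
)
rt_library <- data.frame(
  candidate = c("A", "C"),
  rt_library = c(1.25, 9.00)
)
filtered <- merge(filtered, features, by = "feature_id", all.x = TRUE)
filtered <- merge(filtered, rt_library, by = "candidate", all.x = TRUE)

# No hard cutoff: every row is kept, whatever its RT error.
filtered$error_rt <- filtered$rt - filtered$rt_library
```

In this sketch only the MS1-only row for F1/A is removed, because F1/A also has a spectral match that passes both gates. The MS1-only row for F3/C is kept, since its spectral counterpart fails the gates, and candidates absent from the RT library simply get a missing RT error.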

Usage

filter_annotations(
  annotations = get_params(step = "filter_annotations")$files$annotations$prepared$structural,
  features = get_params(step = "filter_annotations")$files$features$prepared,
  rts = get_params(step = "filter_annotations")$files$libraries$temporal$prepared,
  output = get_params(step = "filter_annotations")$files$annotations$filtered,
  tolerance_rt = get_params(step = "filter_annotations")$ms$tolerances$rt$library
)

Arguments

annotations Character vector or list of paths to prepared annotation files
features Character string path to prepared features file. Must contain a feature_id column. The rt column is optional; if absent, RT matching is skipped even when an RT library is provided.
rts Character string path to prepared retention time library (optional)
output Character string path for filtered annotations output
tolerance_rt Numeric RT tolerance in minutes. Used only to deduplicate multiple RT library matches; no hard cutoff is applied.

Value

Character string path to the filtered annotations file

See Also

Other annotation: annotate_masses(), annotate_spectra(), weight_annotations()

Examples

library("tima")

copy_backbone()
go_to_cache()
github <- "https://raw.githubusercontent.com/"
repo <- "taxonomicallyinformedannotation/tima-example-files/main/"
dir <- paste0(github, repo)
ann <- get_params(step =
    "filter_annotations")$files$annotations$prepared$structural[[2L]] |>
  gsub(pattern = ".gz", replacement = "", fixed = TRUE)
features <- get_params(step = "filter_annotations")$files$features$prepared |>
  gsub(pattern = ".gz", replacement = "", fixed = TRUE)
rts <- get_params(step =
    "filter_annotations")$files$libraries$temporal$prepared |>
  gsub(pattern = ".gz", replacement = "", fixed = TRUE)
get_file(url = paste0(dir, ann), export = ann)
get_file(url = paste0(dir, features), export = features)
get_file(url = paste0(dir, rts), export = rts)
filter_annotations(
  annotations = ann,
  features = features,
  rts = rts
)
unlink("data", recursive = TRUE)