Prepare merged structure-organism pair libraries

Description

This function merges all structure-organism pair (SOP) libraries (LOTUS, HMDB, ECMDB, etc.) into a single comprehensive library. It can optionally filter the merged library by taxonomic level to create biologically focused subsets. It also splits structures into separate metadata tables.

Usage

prepare_libraries_sop_merged(
  files = get_params(step = "prepare_libraries_sop_merged")$files$libraries$sop$prepared,
  filter = get_params(step = "prepare_libraries_sop_merged")$organisms$filter$mode,
  level = get_params(step = "prepare_libraries_sop_merged")$organisms$filter$level,
  value = get_params(step = "prepare_libraries_sop_merged")$organisms$filter$value,
  cache = get_params(step =
    "prepare_libraries_sop_merged")$files$libraries$sop$merged$structures$processed,
  output_key = get_params(step =
    "prepare_libraries_sop_merged")$files$libraries$sop$merged$keys,
  output_org_tax_ott = get_params(step =
    "prepare_libraries_sop_merged")$files$libraries$sop$merged$organisms$taxonomies$ott,
  output_str_stereo = get_params(step =
    "prepare_libraries_sop_merged")$files$libraries$sop$merged$structures$stereo,
  output_str_met = get_params(step =
    "prepare_libraries_sop_merged")$files$libraries$sop$merged$structures$metadata,
  output_str_nam = get_params(step =
    "prepare_libraries_sop_merged")$files$libraries$sop$merged$structures$names,
  output_str_tax_cla = get_params(step =
    "prepare_libraries_sop_merged")$files$libraries$sop$merged$structures$taxonomies$cla,
  output_str_tax_npc = get_params(step =
    "prepare_libraries_sop_merged")$files$libraries$sop$merged$structures$taxonomies$npc
)

Arguments

files (character): Character vector or list of paths to prepared library files.
filter (logical): Whether to filter the merged library by taxonomy.
level (character): Taxonomic rank used for filtering (kingdom, phylum, family, genus, etc.).
value (character): Taxon name(s) to keep. Use | to separate multiple names, e.g., "Gentianaceae|Apocynaceae".
cache (character): Path to the cache directory for processed SMILES.
output_key (character): Path for the output keys file.
output_org_tax_ott (character): Path for the organisms taxonomy (OTT) file.
output_str_stereo (character): Path for the structures stereochemistry file.
output_str_met (character): Path for the structures metadata file.
output_str_nam (character): Path for the structures names file.
output_str_tax_cla (character): Path for the ClassyFire taxonomy file.
output_str_tax_npc (character): Path for the NPClassifier taxonomy file.
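
When several taxa should be kept, the value string can be built from a character vector rather than typed by hand. The snippet below is a minimal sketch in base R; the variable name taxa_to_keep is hypothetical, and it assumes (as the | separator in the argument description suggests) that value is interpreted as a pattern with alternatives.

```r
# Hypothetical vector of taxa to keep (illustrative values only)
taxa_to_keep <- c("Gentianaceae", "Apocynaceae")

# Join the names with "|" to obtain a single string for `value`
value <- paste(taxa_to_keep, collapse = "|")

print(value)
```

The resulting string can then be passed as the value argument, together with filter = TRUE and a matching level such as "family".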

Details

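Creates a merged library by combining all available SOP sources, optionally filtering by taxonomic criteria (e.g., keeping only Gentianaceae). The output is split into separate files: keys, organism taxonomy (OTT), and structure stereochemistry, metadata, names, and taxonomies (ClassyFire and NPClassifier).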
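
The merge, filter, and split steps can be pictured with a small base R sketch. This is not the package's implementation: the data frames, column names (structure, organism, family), and the use of grepl() for matching are assumptions made purely for illustration.

```r
# Two toy "prepared" libraries (hypothetical columns and values)
lib_a <- data.frame(
  structure = c("S1", "S2"),
  organism = c("Gentiana lutea", "Catharanthus roseus"),
  family = c("Gentianaceae", "Apocynaceae"),
  stringsAsFactors = FALSE
)
lib_b <- data.frame(
  structure = c("S2", "S3"),
  organism = c("Catharanthus roseus", "Arabidopsis thaliana"),
  family = c("Apocynaceae", "Brassicaceae"),
  stringsAsFactors = FALSE
)

# Merge: stack the libraries and drop duplicated pairs
merged <- unique(rbind(lib_a, lib_b))

# Filter: keep rows whose rank column matches the pattern in `value`
# (assumption: matching behaves like a regular expression with "|")
level <- "family"
value <- "Gentianaceae|Apocynaceae"
filtered <- merged[grepl(pattern = value, x = merged[[level]]), ]

# Split: one table of unique structures, one of unique organisms
structures <- unique(filtered["structure"])
organisms <- unique(filtered[c("organism", "family")])

print(nrow(merged))
print(filtered$structure)
```

In this toy case the duplicated pair is removed during the merge, and the Brassicaceae row is dropped by the filter. The real function additionally writes each table to its own output path.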

Value

Character string path to the prepared merged SOP library

See Also

Other preparation: prepare_annotations_gnps(), prepare_annotations_mzmine(), prepare_annotations_sirius(), prepare_annotations_spectra(), prepare_features_components(), prepare_features_edges(), prepare_features_tables(), prepare_libraries_rt(), prepare_libraries_sop_bigg(), prepare_libraries_sop_closed(), prepare_libraries_sop_ecmdb(), prepare_libraries_sop_hmdb(), prepare_libraries_sop_lotus(), prepare_libraries_spectra(), prepare_params(), prepare_taxa()

Examples

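library("tima")

# Copy the package backbone and switch to the cache directory
copy_backbone()
go_to_cache()
# Build the base URL of the example files repository
github <- "https://raw.githubusercontent.com/"
repo <- "taxonomicallyinformedannotation/tima-example-files/main/"
dir <- paste0(github, repo)
# Path of the prepared LOTUS library, with the ".gz" suffix removed
files <- get_params(step = "prepare_libraries_sop_merged")$files$libraries$sop$prepared$lotus |>
  gsub(pattern = ".gz", replacement = "", fixed = TRUE)
# Download the example file to that path
get_file(url = paste0(dir, files), export = files)
# Merge the prepared libraries, using defaults for all other arguments
prepare_libraries_sop_merged(files = files)
# Remove the files created by the example
unlink("data", recursive = TRUE)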