library("tima")
copy_backbone()
go_to_cache()
github <- "https://raw.githubusercontent.com/"
repo <- "taxonomicallyinformedannotation/tima-example-files/main/"
dir <- paste0(github, repo)
files <- get_params(step = "prepare_libraries_sop_merged")$files$libraries$sop$prepared$lotus |>
  gsub(pattern = ".gz", replacement = "", fixed = TRUE)
get_file(url = paste0(dir, files), export = files)
prepare_libraries_sop_merged(files = files)
unlink("data", recursive = TRUE)

Prepare merged structure organism pairs libraries
Description
This function merges all structure-organism pair (SOP) libraries (LOTUS, HMDB, ECMDB, etc.) into a single comprehensive library. It can optionally filter the result by taxonomic level to create biologically focused subsets, and it splits structures into separate metadata tables.
Usage
prepare_libraries_sop_merged(
  files = get_params(step = "prepare_libraries_sop_merged")$files$libraries$sop$prepared,
  filter = get_params(step = "prepare_libraries_sop_merged")$organisms$filter$mode,
  level = get_params(step = "prepare_libraries_sop_merged")$organisms$filter$level,
  value = get_params(step = "prepare_libraries_sop_merged")$organisms$filter$value,
  cache = get_params(step =
    "prepare_libraries_sop_merged")$files$libraries$sop$merged$structures$processed,
  output_key = get_params(step =
    "prepare_libraries_sop_merged")$files$libraries$sop$merged$keys,
  output_org_tax_ott = get_params(step =
    "prepare_libraries_sop_merged")$files$libraries$sop$merged$organisms$taxonomies$ott,
  output_str_stereo = get_params(step =
    "prepare_libraries_sop_merged")$files$libraries$sop$merged$structures$stereo,
  output_str_met = get_params(step =
    "prepare_libraries_sop_merged")$files$libraries$sop$merged$structures$metadata,
  output_str_nam = get_params(step =
    "prepare_libraries_sop_merged")$files$libraries$sop$merged$structures$names,
  output_str_tax_cla = get_params(step =
    "prepare_libraries_sop_merged")$files$libraries$sop$merged$structures$taxonomies$cla,
  output_str_tax_npc = get_params(step =
    "prepare_libraries_sop_merged")$files$libraries$sop$merged$structures$taxonomies$npc
)
Arguments
files
    Character vector or list of paths to prepared library files
filter
    Logical whether to filter the merged library by taxonomy
level
    Character string taxonomic rank for filtering (kingdom, phylum, family, genus, etc.)
value
    Character string taxon name(s) to keep (can use | for multiple, e.g., ‘Gentianaceae|Apocynaceae’)
cache
    Character string path to cache directory for processed SMILES
output_key
    Character string path for output keys file
output_org_tax_ott
    Character string path for organisms taxonomy (OTT) file
output_str_stereo
    Character string path for structures stereochemistry file
output_str_met
    Character string path for structures metadata file
output_str_nam
    Character string path for structures names file
output_str_tax_cla
    Character string path for ClassyFire taxonomy file
output_str_tax_npc
    Character string path for NPClassifier taxonomy file
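
As an illustration of how level and value might interact, the following base R sketch filters a toy table with a regular expression. It assumes that value is applied as a pattern against the column holding the chosen rank; the table, its column names, and the matching step are hypothetical and are not taken from the tima internals.

```r
# Toy structure-organism pairs. Column names are hypothetical, not tima's schema.
sop <- data.frame(
  structure_id = c("S1", "S2", "S3", "S4"),
  organism_name = c(
    "Gentiana lutea",
    "Catharanthus roseus",
    "Arabidopsis thaliana",
    "Swertia chirayita"
  ),
  organism_taxonomy_family = c(
    "Gentianaceae",
    "Apocynaceae",
    "Brassicaceae",
    "Gentianaceae"
  ),
  stringsAsFactors = FALSE
)

# Same shape as the documented arguments: one rank, one pattern.
level <- "family"
value <- "Gentianaceae|Apocynaceae"

# Assumption: the rank selects a column and the value is matched as a regex,
# so "|" acts as alternation between taxon names.
rank_column <- paste0("organism_taxonomy_", level)
keep <- grepl(pattern = value, x = sop[[rank_column]])
sop_filtered <- sop[keep, ]

print(sop_filtered$organism_name)
```

Under that assumption, only rows whose family matches one of the two names are kept, so the Brassicaceae row is dropped.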
Details
Creates the merged library by combining all available SOP sources, optionally filtering by taxonomic criteria (e.g., only Gentianaceae). The output is split into separate tables for structures metadata, names, taxonomy, and organisms.
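
The split described above can be pictured with a small base R sketch. The merged table, its column names, and the resulting objects are made up for illustration and do not reflect the actual file schema written by the function.

```r
# Toy merged table with placeholder values. All names here are hypothetical.
merged <- data.frame(
  structure_id = c("S1", "S1", "S2"),
  structure_name = c("compound_1", "compound_1", "compound_2"),
  structure_formula = c("formula_1", "formula_1", "formula_2"),
  organism_name = c("organism_A", "organism_B", "organism_A"),
  stringsAsFactors = FALSE
)

# Keys table: one row per structure-organism pair.
keys <- unique(merged[, c("structure_id", "organism_name")])

# Structure tables: one row per structure, each holding one kind of information.
structures_metadata <- unique(merged[, c("structure_id", "structure_formula")])
structures_names <- unique(merged[, c("structure_id", "structure_name")])

# Organisms table: one row per organism.
organisms <- unique(merged["organism_name"])

print(nrow(keys))
print(nrow(structures_metadata))
print(nrow(organisms))
```

Keeping the pairs in a narrow keys table and the descriptive columns in their own tables avoids repeating structure information for every organism in which a structure occurs.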
Value
Character string path to the prepared merged SOP library