3 Performing Taxonomically Informed Metabolite Annotation

Author

Adriano Rutz

Published

June 15, 2026

This vignette describes how to perform Taxonomically Informed Metabolite Annotation. If you completed all previous steps successfully, this should be a piece of cake. You deserve it!
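The pipeline downloads several external libraries, and some sources may be unavailable when you run it. In the run logged below, a few downloads fail (HTTP 403, expired certificate) and are replaced by placeholder files, so the corresponding prepared tables are exported with `n_rows=0`. The sketch below is one possible way to check, once the run has finished, which prepared tables ended up empty. `count_prepared_rows()` is a hypothetical helper, not a function of the tima package, and the default directory is an assumption taken from the paths printed in the log.

```r
# Hypothetical helper (not part of tima): count the data rows in every
# "*_prepared.tsv.gz" file of a directory, assuming one header line per file.
count_prepared_rows <- function(dir = "data/interim/libraries/sop") {
  files <- list.files(dir, pattern = "_prepared\\.tsv\\.gz$", full.names = TRUE)
  counts <- vapply(
    files,
    function(path) {
      con <- gzfile(path, open = "rt")
      on.exit(close(con), add = TRUE)
      # Number of lines minus the header; never below zero for an empty file
      max(length(readLines(con, warn = FALSE)) - 1L, 0L)
    },
    integer(1)
  )
  names(counts) <- basename(files)
  counts
}

# Usage, once tima::run_tima() has finished:
# rows <- count_prepared_rows()
# names(rows)[rows == 0]  # libraries that ended up empty
```

Counting lines instead of parsing the tables keeps the check quick and dependency-free. It assumes that no field contains an embedded line break.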

tima::run_tima()
#> + par_def_cre_com dispatched
#> ✔ par_def_cre_com completed [6ms, 375 B]
#> + par_def_exp_mzt dispatched
#> ✔ par_def_exp_mzt completed [1ms, 1.76 kB]
#> + par_def_pre_lib_spe dispatched
#> ✔ par_def_pre_lib_spe completed [1ms, 1.58 kB]
#> + par_def_pre_lib_sop_lot dispatched
#> ✔ par_def_pre_lib_sop_lot completed [1ms, 494 B]
#> + par_def_ann_mas dispatched
#> ✔ par_def_ann_mas completed [1ms, 10.61 kB]
#> + par_def_pre_lib_sop_ecm dispatched
#> ✔ par_def_pre_lib_sop_ecm completed [2ms, 492 B]
#> + par_def_pre_ann_mzm dispatched
#> ✔ par_def_pre_ann_mzm completed [1ms, 1.32 kB]
#> + par_def_pre_lib_rt dispatched
#> ✔ par_def_pre_lib_rt completed [1ms, 2.19 kB]
#> + par_def_pre_ann_gnp dispatched
#> ✔ par_def_pre_ann_gnp completed [1ms, 1.31 kB]
#> + yaml_paths dispatched
#> ✔ yaml_paths completed [0ms, 18.20 kB]
#> + par_def_pre_fea_edg dispatched
#> ✔ par_def_pre_fea_edg completed [1ms, 706 B]
#> + par_def_pre_lib_sop_big dispatched
#> ✔ par_def_pre_lib_sop_big completed [1ms, 314 B]
#> + par_def_pre_fea_tab dispatched
#> ✔ par_def_pre_fea_tab completed [0ms, 857 B]
#> + par_def_pre_ann_mzt dispatched
#> ✔ par_def_pre_ann_mzt completed [1ms, 1.18 kB]
#> + par_def_pre_fea_com dispatched
#> ✔ par_def_pre_fea_com completed [0ms, 358 B]
#> + par_def_pre_ann_sir dispatched
#> ✔ par_def_pre_ann_sir completed [1ms, 1.97 kB]
#> + par_def_ann_spe dispatched
#> ✔ par_def_ann_spe completed [1ms, 2.52 kB]
#> + par_def_pre_lib_sop_mer dispatched
#> ✔ par_def_pre_lib_sop_mer completed [0ms, 7.04 kB]
#> + par_def_pre_lib_sop_pub dispatched
#> ✔ par_def_pre_lib_sop_pub completed [1ms, 527 B]
#> + par_def_pre_tax dispatched
#> ✔ par_def_pre_tax completed [1ms, 1.51 kB]
#> + par_def_wei_ann dispatched
#> ✔ par_def_wei_ann completed [1ms, 5.33 kB]
#> + par_def_pre_ann_spe dispatched
#> ✔ par_def_pre_ann_spe completed [1ms, 1.35 kB]
#> + par_def_pre_lib_sop_hmd dispatched
#> ✔ par_def_pre_lib_sop_hmd completed [1ms, 492 B]
#> + par_def_cre_edg_spe dispatched
#> ✔ par_def_cre_edg_spe completed [1ms, 1.52 kB]
#> + par_def_pre_lib_sop_clo dispatched
#> ✔ par_def_pre_lib_sop_clo completed [1ms, 523 B]
#> + par_def_fil_ann dispatched
#> ✔ par_def_fil_ann completed [1ms, 1.34 kB]
#> + paths dispatched
#> ✔ paths completed [1ms, 3.32 kB]
#> + lib_sop_hmd_fam_raw dispatched
#> [2026-06-15 12:45:38.745] [INFO ] > Starting: download_file [url=https://www.csfmetabolome.ca/system/downloads/current/csf_metabolites_structures.zip, destination=data/source/libraries/sop/csfmetabolome/structures.zip]
#> [2026-06-15 12:45:38.976] [INFO ] [OK] Completed: download_file [size_bytes=251502] (194ms)
#> [2026-06-15 12:45:38.983] [INFO ] > Starting: download_file [url=https://www.fecalmetabolome.ca/system/downloads/current/feces_metabolites_structures.zip, destination=data/source/libraries/sop/fecalmetabolome/structures.zip]
#> [2026-06-15 12:45:39.341] [INFO ] [OK] Completed: download_file [size_bytes=3201305] (358ms)
#> [2026-06-15 12:45:39.342] [INFO ] > Starting: download_file [url=https://www.salivametabolome.ca/system/downloads/current/saliva_metabolites_structures.zip, destination=data/source/libraries/sop/salivametabolome/structures.zip]
#> [2026-06-15 12:45:39.555] [INFO ] [OK] Completed: download_file [size_bytes=622845] (212ms)
#> [2026-06-15 12:45:39.557] [INFO ] > Starting: download_file [url=https://www.serummetabolome.ca/system/downloads/current/serum_metabolites_structures.zip, destination=data/source/libraries/sop/serummetabolome/structures.zip]
#> [2026-06-15 12:45:39.918] [INFO ] [OK] Completed: download_file [size_bytes=12023792] (361ms)
#> [2026-06-15 12:45:39.920] [INFO ] > Starting: download_file [url=https://www.sweatmetabolome.ca/system/downloads/current/sweat_metabolites_structures.zip, destination=data/source/libraries/sop/sweatmetabolome/structures.zip]
#> [2026-06-15 12:45:40.022] [INFO ] [OK] Completed: download_file [size_bytes=42618] (102ms)
#> [2026-06-15 12:45:40.024] [INFO ] > Starting: download_file [url=https://www.urinemetabolome.ca/system/downloads/current/urine_metabolites_structures.zip, destination=data/source/libraries/sop/urinemetabolome/structures.zip]
#> [2026-06-15 12:45:40.266] [INFO ] [OK] Completed: download_file [size_bytes=2654043] (242ms)
#> [2026-06-15 12:45:40.267] [INFO ] > Starting: download_file [url=https://mcdb.ca/system/downloads/current/milk_metabolites_structures.zip, destination=data/source/libraries/sop/mcdb/structures.zip]
#> [2026-06-15 12:45:40.373] [WARN ] file download failed (attempt 1/3), retrying in 1s: HTTP 403 Forbidden.
#> [2026-06-15 12:45:41.418] [WARN ] file download failed (attempt 2/3), retrying in 2s: HTTP 403 Forbidden.
#> [2026-06-15 12:45:43.519] [WARN ] HMDB family download failed: file download failed
#> ✖ x file download failed after retries Expected: Successful operation Received:
#>   HTTP 403 Forbidden. Reason: Tried 3 times with exponential backoff Fix:
#>   Possible solutions: 1. Check network connection 2. Verify server/service is
#>   available 3. Check authentication credentials 4. Try again later if service
#>   is down 5. Increase max_attempts if transient failures are common
#> [2026-06-15 12:45:43.521] [WARN ] HMDB download failed. Creating minimal placeholder SDF file.
#> [2026-06-15 12:45:43.547] [INFO ] > Starting: download_file [url=https://smpdb.ca/downloads/smpdb_structures.zip, destination=data/source/libraries/sop/smpdb/structures.zip]
#> [2026-06-15 12:45:44.741] [INFO ] [OK] Completed: download_file [size_bytes=23382536] (1.2s)
#> [2026-06-15 12:45:44.743] [INFO ] > Starting: download_file [url=https://mimedb.org/system/downloads/2.0/mimedb.sdf.zip, destination=data/source/libraries/sop/mimedb/structures.zip]
#> [2026-06-15 12:45:44.979] [WARN ] file download failed (attempt 1/3), retrying in 1s: HTTP 403 Forbidden.
#> [2026-06-15 12:45:46.023] [WARN ] file download failed (attempt 2/3), retrying in 2s: HTTP 403 Forbidden.
#> [2026-06-15 12:45:48.122] [WARN ] HMDB family download failed: file download failed
#> ✖ x file download failed after retries Expected: Successful operation Received:
#>   HTTP 403 Forbidden. Reason: Tried 3 times with exponential backoff Fix:
#>   Possible solutions: 1. Check network connection 2. Verify server/service is
#>   available 3. Check authentication credentials 4. Try again later if service
#>   is down 5. Increase max_attempts if transient failures are common
#> [2026-06-15 12:45:48.124] [WARN ] HMDB download failed. Creating minimal placeholder SDF file.
#> [2026-06-15 12:45:48.128] [INFO ] > Starting: download_file [url=https://t3db.ca/system/downloads/current/structures.zip, destination=data/source/libraries/sop/t3db/structures.zip]
#> [2026-06-15 12:45:48.238] [WARN ] file download failed (attempt 1/3), retrying in 1s: HTTP 403 Forbidden.
#> [2026-06-15 12:45:49.283] [WARN ] file download failed (attempt 2/3), retrying in 2s: HTTP 403 Forbidden.
#> [2026-06-15 12:45:51.378] [WARN ] HMDB family download failed: file download failed
#> ✖ x file download failed after retries Expected: Successful operation Received:
#>   HTTP 403 Forbidden. Reason: Tried 3 times with exponential backoff Fix:
#>   Possible solutions: 1. Check network connection 2. Verify server/service is
#>   available 3. Check authentication credentials 4. Try again later if service
#>   is down 5. Increase max_attempts if transient failures are common
#> [2026-06-15 12:45:51.380] [WARN ] HMDB download failed. Creating minimal placeholder SDF file.
#> [2026-06-15 12:45:51.384] [INFO ] > Starting: download_file [url=https://bovinedb.ca/system/downloads/current/structures.zip, destination=data/source/libraries/sop/bovinedb/structures.zip]
#> [2026-06-15 12:45:51.961] [INFO ] [OK] Completed: download_file [size_bytes=19260214] (577ms)
#> [2026-06-15 12:45:51.963] [INFO ] > Starting: download_file [url=https://www.ymdb.ca/system/downloads/current/ymdb.sdf.zip, destination=data/source/libraries/sop/ymdb/structures.zip]
#> [2026-06-15 12:45:52.063] [INFO ] [OK] Completed: download_file [size_bytes=1200611] (100ms)
#> [2026-06-15 12:45:52.066] [INFO ] > Starting: download_file [url=https://cannabisdatabase.ca/simple/download_compound_as_sdf, destination=data/source/libraries/sop/cannabisdatabase/compounds.sdf]
#> [2026-06-15 12:45:52.166] [WARN ] file download failed (attempt 1/3), retrying in 1s: Failed to perform HTTP request.
#> Caused by error in `curl::curl_fetch_disk()`:
#> ! SSL peer certificate or SSH remote key was not OK [cannabisdatabase.ca]:
#> SSL certificate problem: certificate has expired
#> [2026-06-15 12:45:53.243] [WARN ] file download failed (attempt 2/3), retrying in 2s: Failed to perform HTTP request.
#> Caused by error in `curl::curl_fetch_disk()`:
#> ! SSL peer certificate or SSH remote key was not OK [cannabisdatabase.ca]:
#> SSL certificate problem: certificate has expired
#> [2026-06-15 12:45:55.385] [WARN ] HMDB family download failed: file download failed
#> ✖ x file download failed after retries Expected: Successful operation Received:
#>   Failed to perform HTTP request. Caused by error in `curl::curl_fetch_disk()`:
#>   ! SSL peer certificate or SSH remote key was not OK [cannabisdatabase.ca]:
#>   SSL certificate problem: certificate has expired Reason: Tried 3 times with
#>   exponential backoff Fix: Possible solutions: 1. Check network connection 2.
#>   Verify server/service is available 3. Check authentication credentials 4. Try
#>   again later if service is down 5. Increase max_attempts if transient failures
#>   are common
#> [2026-06-15 12:45:55.386] [WARN ] HMDB download failed. Creating minimal placeholder SDF file.
#> [2026-06-15 12:45:55.389] [WARN ] Failed to create zip file, trying alternative method
#>  zip warning: missing end signature--probably not a zip file (did you
#>  zip warning: remember to use binary mode when you transferred it?)
#>  zip warning: (if you are trying to read a damaged archive try -F)
#> 
#> zip error: Zip file structure invalid (compounds.sdf)
#> ✔ lib_sop_hmd_fam_raw completed [16.7s, 62.64 MB]
#> + lib_spe_is_wik_pre_sop dispatched
#> [2026-06-15 12:45:55.523] [INFO ] > Starting: download_file [url=https://github.com/taxonomicallyinformedannotation/tima-example-files/raw/main/wikidata_spectral_5607185_prepared.tsv.gz, destination=data/interim/libraries/sop/wikidata_5607185_prepared.tsv.gz]
#> [2026-06-15 12:45:55.839] [INFO ] [OK] Completed: download_file [size_bytes=15074639] (316ms)
#> ✔ lib_spe_is_wik_pre_sop completed [318ms, 15.07 MB]
#> + lib_sop_pub dispatched
#> [2026-06-15 12:45:55.942] [INFO ] > Starting: download_file [url=https://zenodo.org/records/20439802/files/PubChemLite_CCSbase_20260529.csv?download=1, destination=data/source/libraries/sop/pubchemlite.csv]
#> [2026-06-15 12:47:10.708] [INFO ] [OK] Completed: download_file [size_bytes=293613581] (1m 15s)
#> ✔ lib_sop_pub completed [1m 14.8s, 293.61 MB]
#> + lib_spe_is_wik_pre_pos dispatched
#> [2026-06-15 12:47:10.914] [INFO ] > Starting: download_file [url=https://github.com/taxonomicallyinformedannotation/tima-isdb-pos/raw/main/wikidata_5607185_pos.rds, destination=data/interim/libraries/spectra/is/wikidata_5607185_pos.rds]
#> [2026-06-15 12:47:31.038] [INFO ] [OK] Completed: download_file [size_bytes=1097022129] (20.1s)
#> ✔ lib_spe_is_wik_pre_pos completed [20.1s, 1.10 GB]
#> + lib_xrefs dispatched
#> [2026-06-15 12:47:31.534] [INFO ] Fetching compound cross-references from Wikidata / QLever
#> [2026-06-15 12:47:31.536] [INFO ] > Starting: get_compounds_xrefs [(no parameters)]
#> [2026-06-15 12:47:33.140] [WARN ] QLever request failed (possibly transient upstream error). Writing empty xrefs file: compounds.tsv.gz
#> [2026-06-15 12:47:33.157] [INFO ] > Starting: export_output [file=data/interim/xrefs/compounds.tsv.gz, n_rows=0]
#> [2026-06-15 12:47:33.159] [INFO ] [OK] Completed: export_output [size_bytes=35] (2ms)
#> ✔ lib_xrefs completed [1.6s, 35 B]
#> + lib_spe_is_wik_pre_neg dispatched
#> [2026-06-15 12:47:33.287] [INFO ] > Starting: download_file [url=https://github.com/taxonomicallyinformedannotation/tima-isdb-neg/raw/main/wikidata_5607185_neg.rds, destination=data/interim/libraries/spectra/is/wikidata_5607185_neg.rds]
#> [2026-06-15 12:47:49.574] [INFO ] [OK] Completed: download_file [size_bytes=874199749] (16.3s)
#> ✔ lib_spe_is_wik_pre_neg completed [16.3s, 874.20 MB]
#> + lib_spe_exp_mer_pre_pos dispatched
#> [2026-06-15 12:47:49.988] [INFO ] > Starting: download_file [url=https://github.com/adafede/SpectRalLibRaRies/raw/main/data/interim/libraries/spectra/exp/merlin_16984129_pos.rds, destination=data/interim/libraries/spectra/exp/merlin_16984129_pos.rds]
#> [2026-06-15 12:47:53.082] [INFO ] [OK] Completed: download_file [size_bytes=158718768] (3.1s)
#> ✔ lib_spe_exp_mer_pre_pos completed [3.1s, 158.72 MB]
#> + lib_spe_is_nor_pre_neg dispatched
#> [2026-06-15 12:47:53.245] [INFO ] > Starting: download_file [url=https://github.com/adafede/SpectRalLibRaRies/raw/main/data/interim/libraries/spectra/exp/isdbnormansusdat_14854025_neg.rds, destination=data/interim/libraries/spectra/is/isdbnormansusdat_14854025_neg.rds]
#> [2026-06-15 12:47:54.234] [INFO ] [OK] Completed: download_file [size_bytes=34220848] (989ms)
#> ✔ lib_spe_is_nor_pre_neg completed [991ms, 34.22 MB]
#> + lib_sop_lot dispatched
#> [2026-06-15 12:47:54.350] [INFO ] Retrieving latest version from Zenodo: 10.5281/zenodo.5794106
#> [2026-06-15 12:47:55.117] [INFO ] Downloading 260413_frozen_metadata.csv.gz from https://doi.org/10.5281/zenodo.5794106
#> [2026-06-15 12:47:55.118] [INFO ] > Starting: download_file [url=https://zenodo.org/api/records/19360665/files/260413_frozen_metadata.csv.gz/content, destination=data/source/libraries/sop/lotus.csv.gz]
#> [2026-06-15 12:49:04.264] [INFO ] [OK] Completed: download_file [size_bytes=90298678] (1m 9s)
#> [2026-06-15 12:49:04.265] [INFO ] Download completed: data/source/libraries/sop/lotus.csv.gz
#> ✔ lib_sop_lot completed [1m 9.9s, 90.30 MB]
#> + lib_spe_is_nor_pre_sop dispatched
#> [2026-06-15 12:49:04.395] [INFO ] > Starting: download_file [url=https://github.com/Adafede/SpectRalLibRaRies/raw/main/data/interim/libraries/sop/isdbnormansusdat_14854025_prepared.tsv.gz, destination=data/interim/libraries/sop/isdbnormansusdat_14854025_prepared.tsv.gz]
#> [2026-06-15 12:49:04.669] [INFO ] [OK] Completed: download_file [size_bytes=1236540] (274ms)
#> ✔ lib_spe_is_nor_pre_sop completed [276ms, 1.24 MB]
#> + lib_spe_is_nor_pre_pos dispatched
#> [2026-06-15 12:49:04.770] [INFO ] > Starting: download_file [url=https://github.com/adafede/SpectRalLibRaRies/raw/main/data/interim/libraries/spectra/exp/isdbnormansusdat_14854025_pos.rds, destination=data/interim/libraries/spectra/is/isdbnormansusdat_14854025_pos.rds]
#> [2026-06-15 12:49:05.914] [INFO ] [OK] Completed: download_file [size_bytes=47223884] (1.1s)
#> ✔ lib_spe_is_nor_pre_pos completed [1.1s, 47.22 MB]
#> + lib_spe_exp_mer_pre_sop dispatched
#> [2026-06-15 12:49:06.035] [INFO ] > Starting: download_file [url=https://github.com/Adafede/SpectRalLibRaRies/raw/main/data/interim/libraries/sop/merlin_16984129_prepared.tsv.gz, destination=data/interim/libraries/sop/merlin_16984129_prepared.tsv.gz]
#> [2026-06-15 12:49:06.224] [INFO ] [OK] Completed: download_file [size_bytes=823107] (189ms)
#> ✔ lib_spe_exp_mer_pre_sop completed [191ms, 823.11 kB]
#> + lib_spe_exp_gnp_pre_pos dispatched
#> [2026-06-15 12:49:06.327] [INFO ] > Starting: download_file [url=https://github.com/adafede/SpectRalLibRaRies/raw/main/data/interim/libraries/spectra/exp/gnps_11566051_pos.rds, destination=data/interim/libraries/spectra/exp/gnps_11566051_pos.rds]
#> [2026-06-15 12:49:12.090] [INFO ] [OK] Completed: download_file [size_bytes=341237933] (5.8s)
#> ✔ lib_spe_exp_gnp_pre_pos completed [5.8s, 341.24 MB]
#> + lib_spe_exp_env_pre_neg dispatched
#> [2026-06-15 12:49:12.316] [INFO ] > Starting: download_file [url=https://github.com/adafede/SpectRalLibRaRies/raw/main/data/interim/libraries/spectra/exp/enveda180_neg.rds, destination=data/interim/libraries/spectra/exp/enveda180_neg.rds]
#> [2026-06-15 12:49:15.822] [INFO ] [OK] Completed: download_file [size_bytes=215056852] (3.5s)
#> ✔ lib_spe_exp_env_pre_neg completed [3.5s, 215.06 MB]
#> + lib_spe_exp_gnp_pre_neg dispatched
#> [2026-06-15 12:49:16.009] [INFO ] > Starting: download_file [url=https://github.com/adafede/SpectRalLibRaRies/raw/main/data/interim/libraries/spectra/exp/gnps_11566051_neg.rds, destination=data/interim/libraries/spectra/exp/gnps_11566051_neg.rds]
#> [2026-06-15 12:49:17.878] [INFO ] [OK] Completed: download_file [size_bytes=91828026] (1.9s)
#> ✔ lib_spe_exp_gnp_pre_neg completed [1.9s, 91.83 MB]
#> + lib_spe_exp_mb_pre_neg dispatched
#> [2026-06-15 12:49:18.012] [INFO ] > Starting: download_file [url=https://github.com/adafede/SpectRalLibRaRies/raw/main/data/interim/libraries/spectra/exp/massbank_202510_neg.rds, destination=data/interim/libraries/spectra/exp/massbank_202510_neg.rds]
#> [2026-06-15 12:49:18.495] [INFO ] [OK] Completed: download_file [size_bytes=5972761] (482ms)
#> ✔ lib_spe_exp_mb_pre_neg completed [485ms, 5.97 MB]
#> + lib_spe_exp_mer_pre_neg dispatched
#> [2026-06-15 12:49:18.599] [INFO ] > Starting: download_file [url=https://github.com/adafede/SpectRalLibRaRies/raw/main/data/interim/libraries/spectra/exp/merlin_16984129_neg.rds, destination=data/interim/libraries/spectra/exp/merlin_16984129_neg.rds]
#> [2026-06-15 12:49:19.912] [INFO ] [OK] Completed: download_file [size_bytes=53975862] (1.3s)
#> ✔ lib_spe_exp_mer_pre_neg completed [1.3s, 53.98 MB]
#> + lib_sop_hmd dispatched
#> [2026-06-15 12:49:20.032] [INFO ] > Starting: download_file [url=https://hmdb.ca/system/downloads/current/structures.zip, destination=data/source/libraries/sop/hmdb/structures.zip]
#> [2026-06-15 12:49:20.134] [WARN ] file download failed (attempt 1/3), retrying in 1s: HTTP 403 Forbidden.
#> [2026-06-15 12:49:21.174] [WARN ] file download failed (attempt 2/3), retrying in 2s: HTTP 403 Forbidden.
#> [2026-06-15 12:49:23.260] [WARN ] HMDB download failed: file download failed
#> ✖ x file download failed after retries Expected: Successful operation Received:
#>   HTTP 403 Forbidden. Reason: Tried 3 times with exponential backoff Fix:
#>   Possible solutions: 1. Check network connection 2. Verify server/service is
#>   available 3. Check authentication credentials 4. Try again later if service
#>   is down 5. Increase max_attempts if transient failures are common
#> [2026-06-15 12:49:23.261] [WARN ] HMDB download failed. Creating minimal placeholder SDF file.
#> ✔ lib_sop_hmd completed [3.2s, 340 B]
#> + lib_spe_exp_gnp_pre_sop dispatched
#> [2026-06-15 12:49:23.376] [INFO ] > Starting: download_file [url=https://github.com/Adafede/SpectRalLibRaRies/raw/main/data/interim/libraries/sop/gnps_11566051_prepared.tsv.gz, destination=data/interim/libraries/sop/gnps_11566051_prepared.tsv.gz]
#> [2026-06-15 12:49:23.512] [INFO ] [OK] Completed: download_file [size_bytes=493387] (136ms)
#> ✔ lib_spe_exp_gnp_pre_sop completed [138ms, 493.39 kB]
#> + lib_sop_ecm dispatched
#> [2026-06-15 12:49:23.615] [INFO ] > Starting: download_file [url=https://ecmdb.ca/download/ecmdb.json.zip, destination=data/source/libraries/sop/ecmdb.json.zip]
#> [2026-06-15 12:49:23.887] [INFO ] [OK] Completed: download_file [size_bytes=1334921] (272ms)
#> ✔ lib_sop_ecm completed [274ms, 1.33 MB]
#> + par_pre_par dispatched
#> ✔ par_pre_par completed [0ms, 1.69 kB]
#> + par_pre_par2 dispatched
#> ✔ par_pre_par2 completed [0ms, 33.06 kB]
#> + lib_spe_exp_mb_pre_sop dispatched
#> [2026-06-15 12:49:24.194] [INFO ] > Starting: download_file [url=https://github.com/Adafede/SpectRalLibRaRies/raw/main/data/interim/libraries/sop/massbank_202510_prepared.tsv.gz, destination=data/interim/libraries/sop/massbank_202510_prepared.tsv.gz]
#> [2026-06-15 12:49:24.339] [INFO ] [OK] Completed: download_file [size_bytes=158982] (145ms)
#> ✔ lib_spe_exp_mb_pre_sop completed [148ms, 158.98 kB]
#> + lib_spe_exp_mb_pre_pos dispatched
#> [2026-06-15 12:49:24.439] [INFO ] > Starting: download_file [url=https://github.com/adafede/SpectRalLibRaRies/raw/main/data/interim/libraries/spectra/exp/massbank_202510_pos.rds, destination=data/interim/libraries/spectra/exp/massbank_202510_pos.rds]
#> [2026-06-15 12:49:27.829] [INFO ] [OK] Completed: download_file [size_bytes=17559329] (3.4s)
#> ✔ lib_spe_exp_mb_pre_pos completed [3.4s, 17.56 MB]
#> + test_spectra_mini dispatched
#> ✔ test_spectra_mini completed [0ms, 7.77 MB]
#> + lib_spe_exp_env_pre_pos dispatched
#> [2026-06-15 12:49:28.058] [INFO ] > Starting: download_file [url=https://github.com/adafede/SpectRalLibRaRies/raw/main/data/interim/libraries/spectra/exp/enveda180_pos.rds, destination=data/interim/libraries/spectra/exp/enveda180_pos.rds]
#> [2026-06-15 12:49:53.150] [INFO ] [OK] Completed: download_file [size_bytes=1466741248] (25.1s)
#> ✔ lib_spe_exp_env_pre_pos completed [25.1s, 1.47 GB]
#> + lib_spe_exp_env_pre_sop dispatched
#> [2026-06-15 12:49:53.769] [INFO ] > Starting: download_file [url=https://github.com/Adafede/SpectRalLibRaRies/raw/main/data/interim/libraries/sop/enveda180_prepared.tsv.gz, destination=data/interim/libraries/sop/enveda180_prepared.tsv.gz]
#> [2026-06-15 12:49:54.196] [INFO ] [OK] Completed: download_file [size_bytes=2876546] (427ms)
#> ✔ lib_spe_exp_env_pre_sop completed [429ms, 2.88 MB]
#> + lib_sop_hmd_fam_pre dispatched
#> [2026-06-15 12:49:54.295] [INFO ] > Starting: prepare_libraries_sop_hmdb_like [source=CSFMETABOLOME, input=data/source/libraries/sop/csfmetabolome/structures.zip, tag=csf]
#> [2026-06-15 12:49:54.411] [INFO ] > Starting: export_output [file=data/interim/libraries/sop/csfmetabolome_prepared.tsv.gz, n_rows=445]
#> [2026-06-15 12:49:54.414] [INFO ] [OK] Completed: export_output [size_bytes=19485] (4ms)
#> [2026-06-15 12:49:54.415] [INFO ] [OK] Completed: prepare_libraries_sop_hmdb_like [n_pairs=445] (120ms)
#> [2026-06-15 12:49:54.416] [INFO ] > Starting: prepare_libraries_sop_hmdb_like [source=FECALMETABOLOME, input=data/source/libraries/sop/fecalmetabolome/structures.zip, tag=fecal]
#> [2026-06-15 12:49:56.165] [INFO ] > Starting: export_output [file=data/interim/libraries/sop/fecalmetabolome_prepared.tsv.gz, n_rows=6810]
#> [2026-06-15 12:49:56.195] [INFO ] [OK] Completed: export_output [size_bytes=237060] (30ms)
#> [2026-06-15 12:49:56.196] [INFO ] [OK] Completed: prepare_libraries_sop_hmdb_like [n_pairs=6810] (1.8s)
#> [2026-06-15 12:49:56.197] [INFO ] > Starting: prepare_libraries_sop_hmdb_like [source=SALIVAMETABOLOME, input=data/source/libraries/sop/salivametabolome/structures.zip, tag=saliva]
#> [2026-06-15 12:49:56.438] [INFO ] > Starting: export_output [file=data/interim/libraries/sop/salivametabolome_prepared.tsv.gz, n_rows=1245]
#> [2026-06-15 12:49:56.446] [INFO ] [OK] Completed: export_output [size_bytes=47303] (7ms)
#> [2026-06-15 12:49:56.447] [INFO ] [OK] Completed: prepare_libraries_sop_hmdb_like [n_pairs=1245] (249ms)
#> [2026-06-15 12:49:56.448] [INFO ] > Starting: prepare_libraries_sop_hmdb_like [source=SERUMMETABOLOME, input=data/source/libraries/sop/serummetabolome/structures.zip, tag=serum]
#> [2026-06-15 12:50:04.687] [INFO ] > Starting: export_output [file=data/interim/libraries/sop/serummetabolome_prepared.tsv.gz, n_rows=25411]
#> [2026-06-15 12:50:04.755] [INFO ] [OK] Completed: export_output [size_bytes=812712] (68ms)
#> [2026-06-15 12:50:04.756] [INFO ] [OK] Completed: prepare_libraries_sop_hmdb_like [n_pairs=25411] (8.3s)
#> [2026-06-15 12:50:04.757] [INFO ] > Starting: prepare_libraries_sop_hmdb_like [source=SWEATMETABOLOME, input=data/source/libraries/sop/sweatmetabolome/structures.zip, tag=sweat]
#> [2026-06-15 12:50:04.809] [INFO ] > Starting: export_output [file=data/interim/libraries/sop/sweatmetabolome_prepared.tsv.gz, n_rows=89]
#> [2026-06-15 12:50:04.811] [INFO ] [OK] Completed: export_output [size_bytes=4110] (2ms)
#> [2026-06-15 12:50:04.812] [INFO ] [OK] Completed: prepare_libraries_sop_hmdb_like [n_pairs=89] (55ms)
#> [2026-06-15 12:50:04.813] [INFO ] > Starting: prepare_libraries_sop_hmdb_like [source=URINEMETABOLOME, input=data/source/libraries/sop/urinemetabolome/structures.zip, tag=urine]
#> [2026-06-15 12:50:05.617] [INFO ] > Starting: export_output [file=data/interim/libraries/sop/urinemetabolome_prepared.tsv.gz, n_rows=4364]
#> [2026-06-15 12:50:05.634] [INFO ] [OK] Completed: export_output [size_bytes=209222] (16ms)
#> [2026-06-15 12:50:05.635] [INFO ] [OK] Completed: prepare_libraries_sop_hmdb_like [n_pairs=4364] (822ms)
#> [2026-06-15 12:50:05.636] [INFO ] > Starting: prepare_libraries_sop_hmdb_like [source=MCDB, input=data/source/libraries/sop/mcdb/structures.zip, tag=milk]
#> [2026-06-15 12:50:05.667] [WARN ] Empty dataframe in select_sop_columns
#> [2026-06-15 12:50:05.672] [INFO ] > Starting: export_output [file=data/interim/libraries/sop/mcdb_prepared.tsv.gz, n_rows=0]
#> [2026-06-15 12:50:05.674] [INFO ] [OK] Completed: export_output [size_bytes=256] (1ms)
#> [2026-06-15 12:50:05.675] [INFO ] [OK] Completed: prepare_libraries_sop_hmdb_like [n_pairs=0] (39ms)
#> [2026-06-15 12:50:05.676] [INFO ] > Starting: prepare_libraries_sop_hmdb_like [source=SMPDB, input=data/source/libraries/sop/smpdb/structures.zip, tag=pathway]
#> [2026-06-15 12:50:18.562] [INFO ] > Starting: export_output [file=data/interim/libraries/sop/smpdb_prepared.tsv.gz, n_rows=49817]
#> [2026-06-15 12:50:18.679] [INFO ] [OK] Completed: export_output [size_bytes=1443937] (117ms)
#> [2026-06-15 12:50:18.681] [INFO ] [OK] Completed: prepare_libraries_sop_hmdb_like [n_pairs=49817] (13s)
#> [2026-06-15 12:50:18.682] [INFO ] > Starting: prepare_libraries_sop_hmdb_like [source=MIMEDB, input=data/source/libraries/sop/mimedb/structures.zip, tag=microbiome]
#> [2026-06-15 12:50:18.712] [WARN ] Empty dataframe in select_sop_columns
#> [2026-06-15 12:50:18.718] [INFO ] > Starting: export_output [file=data/interim/libraries/sop/mimedb_prepared.tsv.gz, n_rows=0]
#> [2026-06-15 12:50:18.719] [INFO ] [OK] Completed: export_output [size_bytes=256] (1ms)
#> [2026-06-15 12:50:18.720] [INFO ] [OK] Completed: prepare_libraries_sop_hmdb_like [n_pairs=0] (38ms)
#> [2026-06-15 12:50:18.721] [INFO ] > Starting: prepare_libraries_sop_hmdb_like [source=T3DB, input=data/source/libraries/sop/t3db/structures.zip, tag=toxin]
#> [2026-06-15 12:50:18.750] [WARN ] Empty dataframe in select_sop_columns
#> [2026-06-15 12:50:18.755] [INFO ] > Starting: export_output [file=data/interim/libraries/sop/t3db_prepared.tsv.gz, n_rows=0]
#> [2026-06-15 12:50:18.757] [INFO ] [OK] Completed: export_output [size_bytes=256] (1ms)
#> [2026-06-15 12:50:18.758] [INFO ] [OK] Completed: prepare_libraries_sop_hmdb_like [n_pairs=0] (37ms)
#> [2026-06-15 12:50:18.759] [INFO ] > Starting: prepare_libraries_sop_hmdb_like [source=BOVINEDB, input=data/source/libraries/sop/bovinedb/structures.zip, tag=NA]
#> [2026-06-15 12:50:31.902] [INFO ] > Starting: export_output [file=data/interim/libraries/sop/bovinedb_prepared.tsv.gz, n_rows=51684]
#> [2026-06-15 12:50:32.033] [INFO ] [OK] Completed: export_output [size_bytes=1568975] (131ms)
#> [2026-06-15 12:50:32.035] [INFO ] [OK] Completed: prepare_libraries_sop_hmdb_like [n_pairs=51684] (13.3s)
#> [2026-06-15 12:50:32.036] [INFO ] > Starting: prepare_libraries_sop_hmdb_like [source=YMDB, input=data/source/libraries/sop/ymdb/structures.zip, tag=NA]
#> [2026-06-15 12:50:32.425] [INFO ] > Starting: export_output [file=data/interim/libraries/sop/ymdb_prepared.tsv.gz, n_rows=2024]
#> [2026-06-15 12:50:32.436] [INFO ] [OK] Completed: export_output [size_bytes=83615] (12ms)
#> [2026-06-15 12:50:32.438] [INFO ] [OK] Completed: prepare_libraries_sop_hmdb_like [n_pairs=2024] (402ms)
#> [2026-06-15 12:50:32.439] [INFO ] > Starting: prepare_libraries_sop_hmdb_like [source=CANNABISDATABASE, input=data/source/libraries/sop/cannabisdatabase/compounds.sdf, tag=NA]
#> [2026-06-15 12:50:32.467] [WARN ] Empty dataframe in select_sop_columns
#> [2026-06-15 12:50:32.473] [INFO ] > Starting: export_output [file=data/interim/libraries/sop/cannabisdatabase_prepared.tsv.gz, n_rows=0]
#> [2026-06-15 12:50:32.474] [INFO ] [OK] Completed: export_output [size_bytes=256] (1ms)
#> [2026-06-15 12:50:32.475] [INFO ] [OK] Completed: prepare_libraries_sop_hmdb_like [n_pairs=0] (36ms)
#> ✔ lib_sop_hmd_fam_pre completed [38.2s, 4.43 MB]
#> + par_fin_par dispatched
#> ✔ par_fin_par completed [1ms, 341 B]
#> + par_fin_par2 dispatched
#> ✔ par_fin_par2 completed [2ms, 4.16 kB]
#> + par_usr_cre_com dispatched
#> ✔ par_usr_cre_com completed [1.8s, 200 B]
#> + par_usr_ann_mas dispatched
#> ✔ par_usr_ann_mas completed [1.7s, 4.40 kB]
#> + par_usr_pre_lib_sop_lot dispatched
#> ✔ par_usr_pre_lib_sop_lot completed [1.7s, 174 B]
#> + par_usr_pre_lib_sop_ecm dispatched
#> ✔ par_usr_pre_lib_sop_ecm completed [1.6s, 176 B]
#> + par_usr_pre_ann_spe dispatched
#> ✔ par_usr_pre_ann_spe completed [1.7s, 656 B]
#> + par_usr_pre_fea_edg dispatched
#> ✔ par_usr_pre_fea_edg completed [1.7s, 328 B]
#> + par_usr_wei_ann dispatched
#> ✔ par_usr_wei_ann completed [1.7s, 1.80 kB]
#> + par_usr_pre_fea_com dispatched
#> ✔ par_usr_pre_fea_com completed [1.7s, 200 B]
#> + par_usr_pre_ann_mzm dispatched
#> ✔ par_usr_pre_ann_mzm completed [1.8s, 635 B]
#> + par_usr_pre_fea_tab dispatched
#> ✔ par_usr_pre_fea_tab completed [1.7s, 274 B]
#> + par_usr_pre_ann_mzt dispatched
#> ✔ par_usr_pre_ann_mzt completed [1.7s, 546 B]
#> + par_usr_cre_edg_spe dispatched
#> ✔ par_usr_cre_edg_spe completed [1.7s, 475 B]
#> + par_usr_pre_lib_spe dispatched
#> ✔ par_usr_pre_lib_spe completed [1.7s, 322 B]
#> + par_usr_pre_lib_sop_mer dispatched
#> ✔ par_usr_pre_lib_sop_mer completed [1.8s, 3.02 kB]
#> + par_usr_pre_lib_sop_big dispatched
#> ✔ par_usr_pre_lib_sop_big completed [1.7s, 107 B]
#> + par_usr_ann_spe dispatched
#> ✔ par_usr_ann_spe completed [1.7s, 1.33 kB]
#> + par_usr_pre_lib_rt dispatched
#> ✔ par_usr_pre_lib_rt completed [1.7s, 487 B]
#> + par_usr_pre_ann_gnp dispatched
#> ✔ par_usr_pre_ann_gnp completed [1.7s, 633 B]
#> + par_usr_pre_lib_sop_pub dispatched
#> ✔ par_usr_pre_lib_sop_pub completed [1.7s, 195 B]
#> + par_usr_exp_mzt dispatched
#> ✔ par_usr_exp_mzt completed [1.7s, 425 B]
#> + par_usr_fil_ann dispatched
#> ✔ par_usr_fil_ann completed [1.7s, 808 B]
#> + par_usr_pre_lib_sop_clo dispatched
#> ✔ par_usr_pre_lib_sop_clo completed [1.7s, 267 B]
#> + par_usr_pre_lib_sop_hmd dispatched
#> ✔ par_usr_pre_lib_sop_hmd completed [1.7s, 178 B]
#> + par_usr_pre_ann_sir dispatched
#> ✔ par_usr_pre_ann_sir completed [1.8s, 859 B]
#> + par_usr_pre_tax dispatched
#> ✔ par_usr_pre_tax completed [1.7s, 438 B]
#> + par_cre_com dispatched
#> ✔ par_cre_com completed [2ms, 191 B]
#> + par_ann_mas dispatched
#> ✔ par_ann_mas completed [3ms, 1.85 kB]
#> + par_pre_lib_sop_lot dispatched
#> ✔ par_pre_lib_sop_lot completed [1ms, 185 B]
#> + par_pre_lib_sop_ecm dispatched
#> ✔ par_pre_lib_sop_ecm completed [1ms, 190 B]
#> + par_pre_ann_spe dispatched
#> ✔ par_pre_ann_spe completed [2ms, 322 B]
#> + par_pre_fea_edg dispatched
#> ✔ par_pre_fea_edg completed [1ms, 243 B]
#> + par_wei_ann dispatched
#> ✔ par_wei_ann completed [2ms, 958 B]
#> + par_pre_fea_com dispatched
#> ✔ par_pre_fea_com completed [1ms, 183 B]
#> + par_pre_ann_mzm dispatched
#> ✔ par_pre_ann_mzm completed [2ms, 329 B]
#> + par_pre_fea_tab dispatched
#> ✔ par_pre_fea_tab completed [1ms, 278 B]
#> + par_pre_ann_mzt dispatched
#> ✔ par_pre_ann_mzt completed [1ms, 313 B]
#> + par_cre_edg_spe dispatched
#> ✔ par_cre_edg_spe completed [2ms, 404 B]
#> + par_pre_lib_spe dispatched
#> ✔ par_pre_lib_spe completed [1ms, 407 B]
#> + par_pre_lib_sop_mer dispatched
#> ✔ par_pre_lib_sop_mer completed [2ms, 831 B]
#> + par_pre_lib_sop_big dispatched
#> ✔ par_pre_lib_sop_big completed [1ms, 153 B]
#> + par_ann_spe dispatched
#> ✔ par_ann_spe completed [2ms, 559 B]
#> + par_pre_lib_rt dispatched
#> ✔ par_pre_lib_rt completed [1ms, 375 B]
#> + par_pre_ann_gnp dispatched
#> ✔ par_pre_ann_gnp completed [1ms, 324 B]
#> + par_pre_lib_sop_pub dispatched
#> ✔ par_pre_lib_sop_pub completed [2ms, 192 B]
#> + par_exp_mzt dispatched
#> ✔ par_exp_mzt completed [2ms, 268 B]
#> + par_fil_ann dispatched
#> ✔ par_fil_ann completed [2ms, 373 B]
#> + par_pre_lib_sop_clo dispatched
#> ✔ par_pre_lib_sop_clo completed [1ms, 233 B]
#> + par_pre_lib_sop_hmd dispatched
#> ✔ par_pre_lib_sop_hmd completed [2ms, 192 B]
#> + par_pre_ann_sir dispatched
#> ✔ par_pre_ann_sir completed [2ms, 435 B]
#> + par_pre_tax dispatched
#> ✔ par_pre_tax completed [1ms, 330 B]
#> + lib_sop_lot_pre dispatched
#> [2026-06-15 12:51:21.395] [INFO ] > Starting: prepare_libraries_sop_lotus [input=data/source/libraries/sop/lotus.csv.gz]
#> [2026-06-15 12:51:30.877] [INFO ] [OK] Completed: prepare_libraries_sop_lotus [n_pairs=677545] (9.5s)
#> [2026-06-15 12:51:30.880] [INFO ] > Starting: export_output [file=data/interim/libraries/sop/lotus_prepared.tsv.gz, n_rows=677545]
#> [2026-06-15 12:51:34.433] [INFO ] [OK] Completed: export_output [size_bytes=49541873] (3.6s)
#> ✔ lib_sop_lot_pre completed [13s, 49.54 MB]
#> + lib_sop_ecm_pre dispatched
#> [2026-06-15 12:51:34.755] [INFO ] Preparing ECMDB structure-organism pairs
#> [2026-06-15 12:51:35.410] [INFO ] Exporting parameters to: data/interim/params/260615_125135_prepare_libraries_sop_ecmdb.yaml
#> [2026-06-15 12:51:35.412] [INFO ] > Starting: export_output [file=data/interim/libraries/sop/ecmdb_prepared.tsv.gz, n_rows=3760]
#> [2026-06-15 12:51:35.427] [INFO ] [OK] Completed: export_output [size_bytes=165776] (15ms)
#> ✔ lib_sop_ecm_pre completed [674ms, 165.78 kB]
#> + input_features dispatched
#> ✔ input_features completed [0ms, 451.55 kB]
#> + lib_spe_exp_int_pre dispatched
#> [2026-06-15 12:51:35.717] [INFO ] > Starting: prepare_libraries_spectra [library_name=internal, n_input_files=1]
#> [2026-06-15 12:51:35.722] [WARN ] Input file(s) not found; creating empty library template
#> [2026-06-15 12:51:37.145] [INFO ] > Starting: export_output [file=data/interim/libraries/sop/internal_prepared.tsv.gz, n_rows=1]
#> [2026-06-15 12:51:37.147] [INFO ] [OK] Completed: export_output [size_bytes=79] (2ms)
#> [2026-06-15 12:51:37.240] [INFO ] Exporting parameters to: data/interim/params/260615_125137_prepare_libraries_spectra.yaml
#> [2026-06-15 12:51:37.242] [INFO ] [OK] Completed: prepare_libraries_spectra [n_structures=1, n_spectra_total=2, files_exported=3] (1.5s)
#> ✔ lib_spe_exp_int_pre completed [1.5s, 1.28 kB]
#> + lib_sop_mer_cla_cache dispatched
#> [2026-06-15 12:51:37.544] [INFO ] > Starting: download_file [url=https://github.com/Adafede/marimo/raw/refs/heads/main/apps/public/classyfire/classyfire_cache.csv, destination=data/interim/libraries/sop/merged/structures/taxonomies/classyfire_cache.csv]
#> Downloading  45% ■■■■■■■■■■■■■■■                   1s
#> Downloading 100% ■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■   0s
#> [2026-06-15 12:51:40.154] [INFO ] [OK] Completed: download_file [size_bytes=143087266] (2.6s)
#> ✔ lib_sop_mer_cla_cache completed [2.6s, 143.09 MB]
#> + lib_sop_mer_npc_cache dispatched
#> [2026-06-15 12:51:40.447] [INFO ] > Starting: download_file [url=https://github.com/Adafede/marimo/raw/refs/heads/main/apps/public/npclassifier/npclassifier_cache.csv, destination=data/interim/libraries/sop/merged/structures/taxonomies/npc.tsv.gz]
#> Downloading  32% ■■■■■■■■■■                        2s
#> Downloading  80% ■■■■■■■■■■■■■■■■■■■■■■■■■         1s
#> Downloading 100% ■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■   0s
#> [2026-06-15 12:51:43.877] [INFO ] [OK] Completed: download_file [size_bytes=201875293] (3.4s)
#> ✔ lib_sop_mer_npc_cache completed [3.4s, 201.88 MB]
#> + lib_sop_mer_str_pro dispatched
#> [2026-06-15 12:51:44.191] [INFO ] > Starting: download_file [url=https://github.com/taxonomicallyinformedannotation/tima-example-files/raw/main/processed.csv.gz, destination=data/interim/libraries/sop/merged/structures/processed.csv.gz]
#> [2026-06-15 12:51:45.118] [INFO ] [OK] Completed: download_file [size_bytes=96591814] (926ms)
#> ✔ lib_sop_mer_str_pro completed [928ms, 96.59 MB]
#> + lib_sop_big_pre dispatched
#> [2026-06-15 12:51:45.396] [INFO ] Preparing BiGG structure-organism pairs
#> [2026-06-15 12:52:10.999] [INFO ] > Starting: process_smiles [n_structures=1420]
#> [2026-06-15 12:52:11.000] [INFO ] Processing SMILES with RDKit
#> Downloading uv...Done!
#> Downloading cpython-3.12.13-linux-x86_64-gnu (download) (32.6MiB)
#>  Downloaded cpython-3.12.13-linux-x86_64-gnu (download)
#> Downloading numpy (15.9MiB)
#> Downloading pillow (6.8MiB)
#> Downloading rdkit (35.5MiB)
#>  Downloaded pillow
#>  Downloaded numpy
#>  Downloaded rdkit
#> Installed 5 packages in 33ms
#> [2026-06-15 12:52:15.647] [INFO ] Processing 1419 new SMILES with RDKit
#> [2026-06-15 12:52:15.649] [INFO ] Starting SMILES processing pipeline
#> [2026-06-15 12:52:15.649] [INFO ] Input: /tmp/RtmppBog3o/file270d23e4b542.smi
#> [2026-06-15 12:52:15.650] [INFO ] Output: /tmp/RtmppBog3o/file270d2fea4e84.csv.gz
#> [2026-06-15 12:52:15.650] [INFO ] Input file validated: /tmp/RtmppBog3o/file270d23e4b542.smi
#> [2026-06-15 12:52:15.650] [INFO ] Output file validated: /tmp/RtmppBog3o/file270d2fea4e84.csv.gz
#> [2026-06-15 12:52:15.650] [INFO ] Processing parameters: workers=8, batch_size=1000, progress_interval=10000
#> [2026-06-15 12:52:15.650] [INFO ] SMILES supplier initialized
#> [2026-06-15 12:52:17.639] [INFO ] Processing complete. Total molecules processed: 1419
#> [2026-06-15 12:52:17.677] [INFO ] Successfully processed 1419 SMILES
#> [2026-06-15 12:52:17.690] [INFO ] [OK] Completed: process_smiles [n_processed=1419] (6.7s)
#> [2026-06-15 12:52:24.130] [INFO ] > Starting: process_smiles [n_structures=2085]
#> [2026-06-15 12:52:24.131] [INFO ] Processing SMILES with RDKit
#> [2026-06-15 12:52:24.142] [INFO ] Processing 1242 new SMILES with RDKit
#> [2026-06-15 12:52:24.144] [INFO ] Starting SMILES processing pipeline
#> [2026-06-15 12:52:24.144] [INFO ] Input: /tmp/RtmppBog3o/file270d19879cbb.smi
#> [2026-06-15 12:52:24.144] [INFO ] Output: /tmp/RtmppBog3o/file270debdb079.csv.gz
#> [2026-06-15 12:52:24.144] [INFO ] Input file validated: /tmp/RtmppBog3o/file270d19879cbb.smi
#> [2026-06-15 12:52:24.144] [INFO ] Output file validated: /tmp/RtmppBog3o/file270debdb079.csv.gz
#> [2026-06-15 12:52:24.144] [INFO ] Processing parameters: workers=8, batch_size=1000, progress_interval=10000
#> [2026-06-15 12:52:24.145] [INFO ] SMILES supplier initialized
#> [2026-06-15 12:52:25.910] [INFO ] Processing complete. Total molecules processed: 1242
#> [2026-06-15 12:52:25.949] [INFO ] Successfully processed 1242 SMILES
#> [2026-06-15 12:52:25.960] [INFO ] [OK] Completed: process_smiles [n_processed=1242] (1.8s)
#> [2026-06-15 12:52:26.061] [INFO ] > Starting: export_output [file=data/interim/libraries/sop/bigg_prepared.tsv.gz, n_rows=2355]
#> [2026-06-15 12:52:26.074] [INFO ] [OK] Completed: export_output [size_bytes=81935] (13ms)
#> ✔ lib_sop_big_pre completed [40.7s, 81.94 kB]
#> + input_spectra dispatched
#> ✔ input_spectra completed [0ms, 7.77 MB]
#> + lib_rt dispatched
#> [2026-06-15 12:52:26.853] [INFO ] Preparing retention time libraries
#> [2026-06-15 12:52:26.865] [WARN ] No retention time library found, returning empty retention time and sop tables.
#> [2026-06-15 12:52:26.913] [INFO ] Exporting parameters to: data/interim/params/260615_125226_prepare_libraries_rt.yaml
#> [2026-06-15 12:52:26.915] [INFO ] > Starting: export_output [file=data/interim/libraries/rt/prepared.tsv.gz, n_rows=1]
#> [2026-06-15 12:52:26.917] [INFO ] [OK] Completed: export_output [size_bytes=86] (2ms)
#> [2026-06-15 12:52:26.921] [INFO ] > Starting: export_output [file=data/interim/libraries/sop/rt_prepared.tsv.gz, n_rows=1]
#> [2026-06-15 12:52:26.922] [INFO ] [OK] Completed: export_output [size_bytes=105] (2ms)
#> ✔ lib_rt completed [72ms, 191 B]
#> + lib_sop_pub_pre dispatched
#> [2026-06-15 12:52:27.298] [INFO ] > Starting: prepare_libraries_sop_pubchemlite [input=data/source/libraries/sop/pubchemlite.csv]
#> [2026-06-15 12:52:35.558] [INFO ] > Starting: export_output [file=data/interim/libraries/sop/pubchemlite_prepared.tsv.gz, n_rows=566689]
#> [2026-06-15 12:52:37.389] [INFO ] [OK] Completed: export_output [size_bytes=29385634] (1.8s)
#> [2026-06-15 12:52:37.391] [INFO ] [OK] Completed: prepare_libraries_sop_pubchemlite [n_pairs=566689] (10.1s)
#> ✔ lib_sop_pub_pre completed [10.1s, 29.39 MB]
#> + lib_sop_clo_pre dispatched
#> [2026-06-15 12:52:38.239] [INFO ] Preparing closed structure-organism pairs library
#> [2026-06-15 12:52:38.240] [WARN ] Closed resource not accessible at: ~/Git/lotus-processor/data/processed/240412_closed_metadata.csv.gz. Returning empty template instead.
#> [2026-06-15 12:52:38.258] [INFO ] Exporting parameters to: data/interim/params/260615_125238_prepare_libraries_sop_closed.yaml
#> [2026-06-15 12:52:38.260] [INFO ] > Starting: export_output [file=data/interim/libraries/sop/closed_prepared.tsv.gz, n_rows=1]
#> [2026-06-15 12:52:38.261] [INFO ] [OK] Completed: export_output [size_bytes=277] (2ms)
#> ✔ lib_sop_clo_pre completed [25ms, 277 B]
#> + lib_sop_hmd_pre dispatched
#> [2026-06-15 12:52:38.662] [INFO ] > Starting: prepare_libraries_sop_hmdb_like [source=HMDB, input=data/source/libraries/sop/hmdb/structures.zip, tag=NA]
#> [2026-06-15 12:52:38.692] [WARN ] Empty dataframe in select_sop_columns
#> [2026-06-15 12:52:38.698] [INFO ] > Starting: export_output [file=data/interim/libraries/sop/hmdb_prepared.tsv.gz, n_rows=0]
#> [2026-06-15 12:52:38.699] [INFO ] [OK] Completed: export_output [size_bytes=256] (2ms)
#> [2026-06-15 12:52:38.701] [INFO ] [OK] Completed: prepare_libraries_sop_hmdb_like [n_pairs=0] (39ms)
#> ✔ lib_sop_hmd_pre completed [41ms, 256 B]
#> + fea_pre dispatched
#> [2026-06-15 12:52:39.095] [INFO ] > Starting: prepare_features_tables [input=data/source/example_features.csv, candidates=1]
#> [2026-06-15 12:52:39.216] [INFO ] Prepared 5328 feature-sample pairs
#> [2026-06-15 12:52:39.217] [INFO ] [OK] Completed: prepare_features_tables [n_features=5328] (123ms)
#> [2026-06-15 12:52:39.243] [INFO ] Exporting parameters to: data/interim/params/260615_125239_prepare_features_tables.yaml
#> [2026-06-15 12:52:39.245] [INFO ] > Starting: export_output [file=data/interim/features/example_features.tsv.gz, n_rows=5328]
#> [2026-06-15 12:52:39.260] [INFO ] [OK] Completed: export_output [size_bytes=95629] (14ms)
#> ✔ fea_pre completed [167ms, 95.63 kB]
#> + lib_spe_exp_int_pre_sop dispatched
#> ✔ lib_spe_exp_int_pre_sop completed [0ms, 79 B]
#> + lib_spe_exp_int_pre_pos dispatched
#> ✔ lib_spe_exp_int_pre_pos completed [0ms, 600 B]
#> + lib_spe_exp_int_pre_neg dispatched
#> ✔ lib_spe_exp_int_pre_neg completed [0ms, 600 B]
#> + fea_edg_spe dispatched
#> [2026-06-15 12:52:40.827] [INFO ] > Starting: create_edges_spectra [method=gnps, threshold=0.7, n_input_files=1]
#> [2026-06-15 12:52:40.828] [INFO ] Creating spectral similarity network edges
#> [2026-06-15 12:52:40.830] [INFO ] Importing spectra from: data/source/example_spectra.mgf
#> [2026-06-15 12:52:40.856] [INFO ] Reading MGF file (7.41 MB) with optimized parser: data/source/example_spectra.mgf
#> [2026-06-15 12:52:42.791] [INFO ] Processed 10000 spectra...
#> [2026-06-15 12:52:44.083] [INFO ] Total spectra read: 16282
#> [2026-06-15 12:52:50.230] [INFO ] Loaded 16282 spectra from file
#> [2026-06-15 12:52:50.251] [INFO ] Combining replicate spectra by FEATURE_ID
#> [2026-06-15 12:52:52.923] [INFO ] Combined replicates: 12195 -> 4087 spectra
#> [2026-06-15 12:52:52.956] [INFO ] Sanitizing 4087 spectra (cutoff: 0)
#> [2026-06-15 12:52:54.051] [INFO ] Sanitization complete: 3999/4087 spectra retained (97.8%, 88 removed)
#> [2026-06-15 12:52:54.052] [INFO ] Import complete: 3999 spectra ready for analysis
#> [2026-06-15 12:52:54.053] [INFO ] ======================================
#> [2026-06-15 12:52:54.054] [INFO ] Take yourself a break, you deserve it.
#> [2026-06-15 12:52:54.055] [INFO ] ======================================
#> [2026-06-15 12:52:54.056] [INFO ] > Starting: create_edges [n_spectra=3999, method=gnps, threshold=0.7, min_peaks=6]
#> [2026-06-15 12:53:08.413] [INFO ] Processed 500 / 3998 queries
#> [2026-06-15 12:53:20.703] [INFO ] Processed 1000 / 3998 queries
#> [2026-06-15 12:53:31.222] [INFO ] Processed 1500 / 3998 queries
#> [2026-06-15 12:53:39.372] [INFO ] Processed 2000 / 3998 queries
#> [2026-06-15 12:53:45.717] [INFO ] Processed 2500 / 3998 queries
#> [2026-06-15 12:53:50.196] [INFO ] Processed 3000 / 3998 queries
#> [2026-06-15 12:53:52.994] [INFO ] Processed 3500 / 3998 queries
#> [2026-06-15 12:53:53.924] [INFO ] Here is the distribution of edge similarity scores (0.1 bins) BEFORE filtering:
#> [2026-06-15 12:53:53.926] [INFO ] 
#>        bin       N    Pct
#>    [0,0.1] 5759390 72.05%
#>  (0.1,0.2] 1201848 15.03%
#>  (0.2,0.3]  509850  6.38%
#>  (0.3,0.4]  239674  3.00%
#>  (0.4,0.5]  126023  1.58%
#>  (0.5,0.6]   68810  0.86%
#>  (0.6,0.7]   39955  0.50%
#>  (0.7,0.8]   23824  0.30%
#>  (0.8,0.9]   10727  0.13%
#>    (0.9,1]   13900  0.17%
#> [2026-06-15 12:53:53.929] [INFO ] [OK] Completed: create_edges [n_edges=7265, n_comparisons=7994001, pass_rate=0.1%] (59.9s)
#> [2026-06-15 12:53:54.008] [INFO ] Exporting parameters to: data/interim/params/260615_125354_create_edges_spectra.yaml
#> [2026-06-15 12:53:54.010] [INFO ] > Starting: export_output [file=data/interim/features/example_edgesSpectra.tsv, n_rows=9905]
#> [2026-06-15 12:53:54.013] [INFO ] [OK] Completed: export_output [size_bytes=454521] (3ms)
#> [2026-06-15 12:53:54.014] [INFO ] [OK] Completed: create_edges_spectra [n_edges=9905] (1m 13s)
#> ✔ fea_edg_spe completed [1m 13.2s, 454.52 kB]
#> + lib_rt_sop dispatched
#> ✔ lib_rt_sop completed [0ms, 105 B]
#> + lib_rt_rts dispatched
#> ✔ lib_rt_rts completed [1ms, 86 B]
#> + lib_sop_mer dispatched
#> [2026-06-15 12:53:55.339] [INFO ] > Starting: prepare_libraries_sop_merged [n_libraries=27, filter_enabled=FALSE, filter_level=none]
#> [2026-06-15 12:54:07.548] [INFO ] Splitting SOP library into standardized components
#> [2026-06-15 12:54:10.770] [INFO ] > Starting: process_smiles [n_structures=2203115]
#> [2026-06-15 12:54:10.771] [INFO ] Processing SMILES with RDKit
#> [2026-06-15 12:54:25.610] [INFO ] Processing 21 new SMILES with RDKit
#> [2026-06-15 12:54:25.612] [INFO ] Starting SMILES processing pipeline
#> [2026-06-15 12:54:25.612] [INFO ] Input: /tmp/RtmppBog3o/file270d65fcf61c.smi
#> [2026-06-15 12:54:25.612] [INFO ] Output: /tmp/RtmppBog3o/file270d7c259be5.csv.gz
#> [2026-06-15 12:54:25.613] [INFO ] Input file validated: /tmp/RtmppBog3o/file270d65fcf61c.smi
#> [2026-06-15 12:54:25.613] [INFO ] Output file validated: /tmp/RtmppBog3o/file270d7c259be5.csv.gz
#> [2026-06-15 12:54:25.613] [INFO ] Processing parameters: workers=8, batch_size=1000, progress_interval=10000
#> [2026-06-15 12:54:25.613] [INFO ] SMILES supplier initialized
#> [12:54:25] Explicit valence for atom # 1 N, 3, is greater than permitted
#> [12:54:25] ERROR: Could not sanitize molecule on line 1
#> [12:54:25] ERROR: Explicit valence for atom # 1 N, 3, is greater than permitted
#> [12:54:25] Explicit valence for atom # 1 Cl, 7, is greater than permitted
#> [12:54:25] ERROR: Could not sanitize molecule on line 4
#> [12:54:25] ERROR: Explicit valence for atom # 1 Cl, 7, is greater than permitted
#> [12:54:25] Explicit valence for atom # 1 Br, 3, is greater than permitted
#> [12:54:25] ERROR: Could not sanitize molecule on line 5
#> [12:54:25] ERROR: Explicit valence for atom # 1 Br, 3, is greater than permitted
#> [12:54:25] Explicit valence for atom # 1 Br, 5, is greater than permitted
#> [12:54:25] ERROR: Could not sanitize molecule on line 6
#> [12:54:25] ERROR: Explicit valence for atom # 1 Br, 5, is greater than permitted
#> [12:54:25] Explicit valence for atom # 1 Cl, 3, is greater than permitted
#> [12:54:25] ERROR: Could not sanitize molecule on line 7
#> [12:54:25] ERROR: Explicit valence for atom # 1 Cl, 3, is greater than permitted
#> [12:54:25] Explicit valence for atom # 1 Cl, 5, is greater than permitted
#> [12:54:25] ERROR: Could not sanitize molecule on line 8
#> [12:54:25] ERROR: Explicit valence for atom # 1 Cl, 5, is greater than permitted
#> [12:54:25] Explicit valence for atom # 1 I, 7, is greater than permitted
#> [12:54:25] ERROR: Could not sanitize molecule on line 9
#> [12:54:25] ERROR: Explicit valence for atom # 1 I, 7, is greater than permitted
#> [12:54:25] Explicit valence for atom # 1 Cl, 3, is greater than permitted
#> [12:54:25] ERROR: Could not sanitize molecule on line 10
#> [12:54:25] ERROR: Explicit valence for atom # 1 Cl, 3, is greater than permitted
#> [12:54:25] Explicit valence for atom # 8 Br, 2, is greater than permitted
#> [12:54:25] ERROR: Could not sanitize molecule on line 13
#> [12:54:25] ERROR: Explicit valence for atom # 8 Br, 2, is greater than permitted
#> [12:54:25] Explicit valence for atom # 9 Cl, 2, is greater than permitted
#> [12:54:25] ERROR: Could not sanitize molecule on line 14
#> [12:54:25] ERROR: Explicit valence for atom # 9 Cl, 2, is greater than permitted
#> [12:54:25] Explicit valence for atom # 6 C, 5, is greater than permitted
#> [12:54:25] ERROR: Could not sanitize molecule on line 16
#> [12:54:25] ERROR: Explicit valence for atom # 6 C, 5, is greater than permitted
#> [12:54:25] Explicit valence for atom # 31 O, 3, is greater than permitted
#> [12:54:25] ERROR: Could not sanitize molecule on line 17
#> [12:54:25] ERROR: Explicit valence for atom # 31 O, 3, is greater than permitted
#> [12:54:25] Explicit valence for atom # 4 N, 4, is greater than permitted
#> [12:54:25] ERROR: Could not sanitize molecule on line 18
#> [12:54:25] ERROR: Explicit valence for atom # 4 N, 4, is greater than permitted
#> [12:54:25] Explicit valence for atom # 26 N, 4, is greater than permitted
#> [12:54:25] ERROR: Could not sanitize molecule on line 19
#> [12:54:25] ERROR: Explicit valence for atom # 26 N, 4, is greater than permitted
#> [12:54:25] Explicit valence for atom # 0 P, 11, is greater than permitted
#> [12:54:25] ERROR: Could not sanitize molecule on line 20
#> [12:54:25] ERROR: Explicit valence for atom # 0 P, 11, is greater than permitted
#> [12:54:25] Can't kekulize mol.  Unkekulized atoms: 6 7 8 9 10 11 12 13 14
#> [12:54:25] ERROR: Could not sanitize molecule on line 21
#> [12:54:25] ERROR: Can't kekulize mol.  Unkekulized atoms: 6 7 8 9 10 11 12 13 14
#> [12:54:25] Explicit valence for atom # 56 P, 7, is greater than permitted
#> [2026-06-15 12:54:25.618] [WARNING] Failed to process SMILES 'CC(C)=CCCC(C)=CCCC(C)=CCCC(C)=CCCC(C)=CCCC(C)=CCCC(C)=CCCC(C)=CCCC(C)=CCCC(C)=CCCC(C)=CCO[P-]([O])(=O)=O': Explicit valence for atom # 56 P, 7, is greater than permitted
#> [12:54:25] Explicit valence for atom # 4 P, 7, is greater than permitted
#> [2026-06-15 12:54:25.618] [WARNING] Failed to process SMILES '[H][C@](O)(CO[P-]([O])(=O)=O)C=O': Explicit valence for atom # 4 P, 7, is greater than permitted
#> [12:54:25] Explicit valence for atom # 6 Si, 6, is greater than permitted
#> [2026-06-15 12:54:25.619] [WARNING] Failed to process SMILES 'C1=CC=C(C=C1)[Si-](C2=CC=CC=C2)(C3=CC=CC=C3)(F)F': Explicit valence for atom # 6 Si, 6, is greater than permitted
#> [12:54:25] Explicit valence for atom # 4 P, 7, is greater than permitted
#> [2026-06-15 12:54:25.620] [WARNING] Failed to process SMILES 'C(C(F)(F)[P-](C(C(F)(F)F)(F)F)(C(C(F)(F)F)(F)F)(F)(F)F)(F)(F)F': Explicit valence for atom # 4 P, 7, is greater than permitted
#> [12:54:25] Explicit valence for atom # 7 Si, 6, is greater than permitted
#> [2026-06-15 12:54:25.621] [WARNING] Failed to process SMILES 'C1=CC=C2C(=C1)O[Si-]3(O2)(OC4=CC=CC=C4O3)CI': Explicit valence for atom # 7 Si, 6, is greater than permitted
#> [2026-06-15 12:54:25.621] [WARNING] Batch processing: 5/5 molecules failed
#> [2026-06-15 12:54:25.621] [INFO ] Processing complete. Total molecules processed: 0
#> [2026-06-15 12:54:25.650] [INFO ] Successfully processed 0 SMILES
#> [2026-06-15 12:54:40.477] [INFO ] [OK] Completed: process_smiles [n_processed=2109922] (29.7s)
#> [2026-06-15 12:55:07.143] [INFO ] Referenced structure-organism pairs (1,328,977)
#> [2026-06-15 12:55:18.307] [INFO ] Structures: 430,332 stereoisomers, 1,376,478 without stereochemistry, 1,538,577 constitutional isomers
#> [2026-06-15 12:55:56.833] [INFO ] Unique organisms (37,469)
#> [2026-06-15 12:55:56.993] [INFO ] Processing 813 organism name(s) for OTT taxonomy lookup
#> [2026-06-15 12:55:57.591] [INFO ] Querying OTT API in 9 batches
#> [2026-06-15 12:56:01.464] [INFO ] Retrieving detailed taxonomy for 4 unique OTT IDs
#> [2026-06-15 12:56:02.035] [INFO ] Got OTTaxonomy!
#> [2026-06-15 12:56:02.510] [INFO ] Enriching NPClassifier taxonomy from additional cache: data/interim/libraries/sop/merged/structures/taxonomies/npc.tsv.gz
#> [2026-06-15 12:56:12.806] [INFO ] Enriched NPClassifier taxonomy with 1105898 entries from additional cache (1105898 missing keys matched)
#> [2026-06-15 12:56:19.114] [INFO ] Updated additional NPClassifier cache (1783925 total entries): data/interim/libraries/sop/merged/structures/taxonomies/npc.tsv.gz
#> [2026-06-15 12:56:19.284] [INFO ] Enriching ClassyFire taxonomy from additional cache: data/interim/libraries/sop/merged/structures/taxonomies/classyfire_cache.csv
#> [2026-06-15 12:56:24.550] [INFO ] Enriched ClassyFire taxonomy with 181659 entries from additional cache (181659 missing keys matched)
#> [2026-06-15 12:56:26.293] [INFO ] Updated additional ClassyFire cache (1106056 total entries): data/interim/libraries/sop/merged/structures/taxonomies/classyfire_cache.csv
#> [2026-06-15 12:56:26.320] [INFO ] Exporting parameters to: data/interim/params/260615_125626_prepare_libraries_sop_merged.yaml
#> [2026-06-15 12:56:26.322] [INFO ] > Starting: export_output [file=data/interim/libraries/sop/merged/keys.tsv.gz, n_rows=1328977]
#> [2026-06-15 12:56:28.386] [INFO ] [OK] Completed: export_output [size_bytes=31597147] (2.1s)
#> [2026-06-15 12:56:28.388] [INFO ] > Starting: export_output [file=data/interim/libraries/sop/merged/organisms/taxonomies/ott.tsv.gz, n_rows=36758]
#> [2026-06-15 12:56:28.482] [INFO ] [OK] Completed: export_output [size_bytes=1013371] (93ms)
#> [2026-06-15 12:56:28.484] [INFO ] > Starting: export_output [file=data/interim/libraries/sop/merged/structures/canonical.tsv.gz, n_rows=2107210]
#> [2026-06-15 12:56:33.255] [INFO ] [OK] Completed: export_output [size_bytes=37368926] (4.8s)
#> [2026-06-15 12:56:33.257] [INFO ] > Starting: export_output [file=data/interim/libraries/sop/merged/structures/stereo.tsv.gz, n_rows=1806810]
#> [2026-06-15 12:56:40.138] [INFO ] [OK] Completed: export_output [size_bytes=91140050] (6.9s)
#> [2026-06-15 12:56:40.140] [INFO ] > Starting: export_output [file=data/interim/libraries/sop/merged/structures/metadata.tsv.gz, n_rows=1541926]
#> [2026-06-15 12:56:41.826] [INFO ] [OK] Completed: export_output [size_bytes=30593742] (1.7s)
#> [2026-06-15 12:56:41.828] [INFO ] > Starting: export_output [file=data/interim/libraries/sop/merged/structures/taxonomies/classyfire.tsv.gz, n_rows=405842]
#> [2026-06-15 12:56:42.254] [INFO ] [OK] Completed: export_output [size_bytes=8389638] (426ms)
#> [2026-06-15 12:56:42.256] [INFO ] > Starting: export_output [file=data/interim/libraries/sop/merged/structures/taxonomies/npc.tsv.gz, n_rows=1326516]
#> [2026-06-15 12:56:44.479] [INFO ] [OK] Completed: export_output [size_bytes=19172627] (2.2s)
#> [2026-06-15 12:56:44.480] [INFO ] [OK] Completed: prepare_libraries_sop_merged [n_pairs=1328977, n_structures=1806810, n_organisms=36758, files_exported=7] (2m 49s)
#> ✔ lib_sop_mer completed [2m 49.1s, 219.28 MB]
#> + lib_mer_str_met dispatched
#> ✔ lib_mer_str_met completed [0ms, 30.59 MB]
#> + lib_mer_str_stereo dispatched
#> ✔ lib_mer_str_stereo completed [0ms, 91.14 MB]
#> + lib_mer_str_tax_cla dispatched
#> ✔ lib_mer_str_tax_cla completed [1ms, 8.39 MB]
#> + lib_mer_str_tax_npc dispatched
#> ✔ lib_mer_str_tax_npc completed [0ms, 19.17 MB]
#> + lib_mer_org_tax_ott dispatched
#> ✔ lib_mer_org_tax_ott completed [0ms, 1.01 MB]
#> + lib_mer_key dispatched
#> ✔ lib_mer_key completed [0ms, 31.60 MB]
#> + ann_spe_exp_mzt_pre dispatched
#> [2026-06-15 12:56:49.187] [WARN ] No mzTab input provided for prepare_annotations_mztab, exporting empty annotations
#> [2026-06-15 12:56:49.190] [INFO ] > Starting: export_output [file=data/interim/annotations/example_mztabPrepared.tsv.gz, n_rows=1]
#> [2026-06-15 12:56:49.192] [INFO ] [OK] Completed: export_output [size_bytes=308] (2ms)
#> ✔ ann_spe_exp_mzt_pre completed [8ms, 308 B]
#> + ann_spe_exp_mzm_pre dispatched
#> [2026-06-15 12:56:49.565] [INFO ] > Starting: prepare_annotations_mzmine [n_files=1]
#> [2026-06-15 12:56:49.566] [WARN ] No mzmine annotations found, returning an empty file instead
#> [2026-06-15 12:56:49.569] [INFO ] [OK] Completed: prepare_annotations_mzmine [n_annotations=1] (4ms)
#> [2026-06-15 12:56:49.583] [INFO ] Exporting parameters to: data/interim/params/260615_125649_prepare_annotations_mzmine.yaml
#> [2026-06-15 12:56:49.585] [INFO ] > Starting: export_output [file=data/interim/annotations/example_mzminePrepared.tsv.gz, n_rows=1]
#> [2026-06-15 12:56:49.587] [INFO ] [OK] Completed: export_output [size_bytes=308] (1ms)
#> ✔ ann_spe_exp_mzm_pre completed [24ms, 308 B]
#> + ann_sir_pre dispatched
#> [2026-06-15 12:56:49.959] [INFO ] > Starting: prepare_annotations_sirius [version=6]
#> [2026-06-15 12:56:50.127] [INFO ] > Starting: process_smiles [n_structures=2563]
#> [2026-06-15 12:56:50.128] [INFO ] Processing SMILES with RDKit
#> [2026-06-15 12:57:01.146] [INFO ] Processing 9 new SMILES with RDKit
#> [2026-06-15 12:57:01.147] [INFO ] Starting SMILES processing pipeline
#> [2026-06-15 12:57:01.147] [INFO ] Input: /tmp/RtmppBog3o/file270d7ad25eec.smi
#> [2026-06-15 12:57:01.147] [INFO ] Output: /tmp/RtmppBog3o/file270d20ffeea5.csv.gz
#> [2026-06-15 12:57:01.147] [INFO ] Input file validated: /tmp/RtmppBog3o/file270d7ad25eec.smi
#> [2026-06-15 12:57:01.148] [INFO ] Output file validated: /tmp/RtmppBog3o/file270d20ffeea5.csv.gz
#> [2026-06-15 12:57:01.148] [INFO ] Processing parameters: workers=8, batch_size=1000, progress_interval=10000
#> [2026-06-15 12:57:01.148] [INFO ] SMILES supplier initialized
#> [12:57:01] Explicit valence for atom # 8 Cl, 3, is greater than permitted
#> [12:57:01] ERROR: Could not sanitize molecule on line 1
#> [12:57:01] ERROR: Explicit valence for atom # 8 Cl, 3, is greater than permitted
#> [12:57:01] Explicit valence for atom # 4 P, 7, is greater than permitted
#> [12:57:01] ERROR: Could not sanitize molecule on line 2
#> [12:57:01] ERROR: Explicit valence for atom # 4 P, 7, is greater than permitted
#> [12:57:01] Explicit valence for atom # 2 P, 7, is greater than permitted
#> [12:57:01] ERROR: Could not sanitize molecule on line 3
#> [12:57:01] ERROR: Explicit valence for atom # 2 P, 7, is greater than permitted
#> [12:57:01] Explicit valence for atom # 4 P, 7, is greater than permitted
#> [12:57:01] ERROR: Could not sanitize molecule on line 4
#> [12:57:01] ERROR: Explicit valence for atom # 4 P, 7, is greater than permitted
#> [12:57:01] Explicit valence for atom # 2 P, 7, is greater than permitted
#> [12:57:01] ERROR: Could not sanitize molecule on line 5
#> [12:57:01] ERROR: Explicit valence for atom # 2 P, 7, is greater than permitted
#> [12:57:01] Explicit valence for atom # 6 P, 7, is greater than permitted
#> [12:57:01] ERROR: Could not sanitize molecule on line 6
#> [12:57:01] ERROR: Explicit valence for atom # 6 P, 7, is greater than permitted
#> [12:57:01] Explicit valence for atom # 6 P, 7, is greater than permitted
#> [12:57:01] ERROR: Could not sanitize molecule on line 7
#> [12:57:01] ERROR: Explicit valence for atom # 6 P, 7, is greater than permitted
#> [12:57:01] Explicit valence for atom # 4 P, 7, is greater than permitted
#> [12:57:01] ERROR: Could not sanitize molecule on line 8
#> [12:57:01] ERROR: Explicit valence for atom # 4 P, 7, is greater than permitted
#> [12:57:01] Explicit valence for atom # 2 P, 7, is greater than permitted
#> [12:57:01] ERROR: Could not sanitize molecule on line 9
#> [12:57:01] ERROR: Explicit valence for atom # 2 P, 7, is greater than permitted
#> [2026-06-15 12:57:01.148] [INFO ] Processing complete. Total molecules processed: 0
#> [2026-06-15 12:57:01.178] [INFO ] Successfully processed 0 SMILES
#> [2026-06-15 12:57:08.974] [INFO ] [OK] Completed: process_smiles [n_processed=2554] (18.8s)
#> [2026-06-15 12:57:08.995] [INFO ] > Starting: complement_metadata [n_input=2571]
#> [2026-06-15 12:57:27.966] [INFO ] [OK] Completed: complement_metadata [n_enriched=2571] (19s)
#> [2026-06-15 12:57:27.978] [INFO ] [OK] Completed: prepare_annotations_sirius [n_canopus=15, n_formulas=19, n_structures=2571] (38s)
#> [2026-06-15 12:57:28.006] [INFO ] Exporting parameters to: data/interim/params/260615_125728_prepare_annotations_sirius.yaml
#> [2026-06-15 12:57:28.008] [INFO ] > Starting: export_output [file=data/interim/annotations/example_canopusPrepared.tsv.gz, n_rows=15]
#> [2026-06-15 12:57:28.009] [INFO ] [OK] Completed: export_output [size_bytes=830] (2ms)
#> [2026-06-15 12:57:28.011] [INFO ] > Starting: export_output [file=data/interim/annotations/example_formulaPrepared.tsv.gz, n_rows=19]
#> [2026-06-15 12:57:28.013] [INFO ] [OK] Completed: export_output [size_bytes=521] (2ms)
#> [2026-06-15 12:57:28.015] [INFO ] > Starting: export_output [file=data/interim/annotations/example_siriusPrepared.tsv.gz, n_rows=2571]
#> [2026-06-15 12:57:28.029] [INFO ] [OK] Completed: export_output [size_bytes=98016] (14ms)
#> ✔ ann_sir_pre completed [38.1s, 99.37 kB]
#> + ann_spe_exp_gnp_pre dispatched
#> [2026-06-15 12:57:29.885] [INFO ] > Starting: prepare_annotations_gnps [n_files=1]
#> [2026-06-15 12:57:29.887] [WARN ] No GNPS annotations found, returning an empty file instead
#> [2026-06-15 12:57:29.889] [INFO ] [OK] Completed: prepare_annotations_gnps [n_annotations=1] (4ms)
#> [2026-06-15 12:57:29.909] [INFO ] Exporting parameters to: data/interim/params/260615_125729_prepare_annotations_gnps.yaml
#> [2026-06-15 12:57:29.911] [INFO ] > Starting: export_output [file=data/interim/annotations/example_gnpsPrepared.tsv.gz, n_rows=1]
#> [2026-06-15 12:57:29.912] [INFO ] [OK] Completed: export_output [size_bytes=308] (1ms)
#> ✔ ann_spe_exp_gnp_pre completed [32ms, 308 B]
#> + tax_pre dispatched
#> [2026-06-15 12:57:30.371] [INFO ] > Starting: prepare_taxa [taxon=NULL]
#> [2026-06-15 12:57:30.538] [INFO ] Processing 2 organism name(s) for OTT taxonomy lookup
#> [2026-06-15 12:57:30.791] [INFO ] Querying OTT API in 1 batches
#> [2026-06-15 12:57:30.998] [INFO ] Retrying failed queries using genus names only
#> [2026-06-15 12:57:31.005] [INFO ] Retrying with 1 genus names: blk 
#> [2026-06-15 12:57:31.194] [INFO ] Retrieving detailed taxonomy for 1 unique OTT IDs
#> [2026-06-15 12:57:31.313] [INFO ] Got OTTaxonomy!
#> [2026-06-15 12:57:31.771] [INFO ] [OK] Completed: prepare_taxa [n_features=5328] (1.4s)
#> [2026-06-15 12:57:31.804] [INFO ] Exporting parameters to: data/interim/params/260615_125731_prepare_taxa.yaml
#> [2026-06-15 12:57:31.806] [INFO ] > Starting: export_output [file=data/interim/taxa/example_taxed.tsv.gz, n_rows=5328]
#> [2026-06-15 12:57:31.813] [INFO ] [OK] Completed: export_output [size_bytes=19697] (7ms)
#> ✔ tax_pre completed [1.4s, 19.70 kB]
#> + ann_ms1_pre dispatched
#> [2026-06-15 12:57:32.290] [INFO ] > Starting: annotate_masses [ms_mode=pos, tolerance_ppm=10, tolerance_dalton=0.01, tolerance_rt=0.05]
#> [2026-06-15 12:57:32.292] [INFO ] Starting mass-based annotation
#> [2026-06-15 12:57:32.293] [INFO ] ============================================================
#> [2026-06-15 12:57:32.294] [INFO ] Data Sanitizing: Pre-flight Checks
#> [2026-06-15 12:57:32.295] [INFO ] ============================================================
#> [2026-06-15 12:57:32.295] [INFO ] Checking features file...
#> [2026-06-15 12:57:32.332] [INFO ] [OK] Features file: 5328 rows, 5 columns
#> [2026-06-15 12:57:32.333] [INFO ] ============================================================
#> [2026-06-15 12:57:32.334] [INFO ] [OK] All pre-flight checks passed!
#> [2026-06-15 12:57:32.335] [INFO ] Data validation complete. Ready to proceed.
#> [2026-06-15 12:57:32.336] [INFO ] ============================================================
#> [2026-06-15 12:57:32.372] [INFO ] Processing 5328 features for annotation
#> [2026-06-15 12:57:32.374] [INFO ] > Starting: harmonize_adducts [n_rows=5328]
#> [2026-06-15 12:57:32.399] [INFO ] [OK] Completed: harmonize_adducts [n_unique_before=13, n_unique_after=13] (25ms)
#> [2026-06-15 12:57:32.418] [INFO ] > Starting: harmonize_adducts [n_rows=2112]
#> [2026-06-15 12:57:32.419] [INFO ] [OK] Completed: harmonize_adducts [n_unique_before=13, n_unique_after=13] (2ms)
#> [2026-06-15 12:57:32.420] [INFO ] Pre-assigned adducts kept as hypotheses alongside the [M+H]+ baseline: 2112
#> [2026-06-15 12:58:04.701] [INFO ] Built 46541 feature pairs in 0.63 seconds
#> [2026-06-15 12:58:04.954] [INFO ] Here are the top 16 observed m/z differences inside the RT windows:
#> [2026-06-15 12:58:04.956] [INFO ] 
#>                 bin   N    Pct
#>   (4.94258,4.96123] 372 17.11%
#>   (21.9709,21.9895] 292 13.43%
#>   (17.0097,17.0284] 211  9.71%
#>   (17.9982,18.0169] 182  8.37%
#>   (38.9991,39.0178] 154  7.08%
#>   (15.9653,15.9839] 131  6.03%
#>   (39.9876,40.0063] 120  5.52%
#>   (77.9982,78.0168] 109  5.01%
#>   (2.01438,2.03303]  95  4.37%
#>   (18.4831,18.5018]  85  3.91%
#>   (35.0265,35.0451]  83  3.82%
#>    (30.0094,30.028]  79  3.63%
#>  (0.988584,1.00724]  75  3.45%
#>   (13.9696,13.9883]  62  2.85%
#>   (162.039,162.058]  62  2.85%
#>   (15.9839,16.0026]  62  2.85%
#> [2026-06-15 12:58:05.940] [INFO ] Evidence engine: 5328 features x 208 adducts (prefilter=on, cap=208)
#> [2026-06-15 12:58:06.374] [INFO ] Evidence engine candidate materialization: 665175 rows
#> [2026-06-15 12:58:27.041] [INFO ] Evidence engine complete: 49486 rows, 23450 supported clusters
#> [2026-06-15 12:59:03.699] [INFO ] > Starting: harmonize_adducts [n_rows=5328]
#> [2026-06-15 12:59:03.701] [INFO ] [OK] Completed: harmonize_adducts [n_unique_before=8, n_unique_after=8] (2ms)
#> [2026-06-15 12:59:06.723] [INFO ] Pairwise-support filter removed 36771 modifier-bearing evidence hypothesis row(s) lacking direct adduct/cluster/loss support.
#> [2026-06-15 12:59:06.757] [INFO ] Evidence-based discovery added 1825 adduct edge(s)
#> [2026-06-15 12:59:06.758] [INFO ] Edge classification complete in 61.76 seconds: 1114 adduct edges, 489 cluster edges, 1651 loss edges
#> [2026-06-15 12:59:28.202] [INFO ] > Starting: harmonize_adducts [n_rows=1078]
#> [2026-06-15 12:59:28.210] [INFO ] [OK] Completed: harmonize_adducts [n_unique_before=15, n_unique_after=15] (8ms)
#> [2026-06-15 12:59:28.287] [INFO ] > Starting: harmonize_adducts [n_rows=19234]
#> [2026-06-15 12:59:28.312] [INFO ] [OK] Completed: harmonize_adducts [n_unique_before=41, n_unique_after=38] (25ms)
#> [2026-06-15 12:59:28.355] [INFO ] > Starting: harmonize_adducts [n_rows=2383]
#> [2026-06-15 12:59:28.504] [INFO ] [OK] Completed: harmonize_adducts [n_unique_before=90, n_unique_after=89] (149ms)
#> [2026-06-15 12:59:28.528] [INFO ] > Starting: harmonize_adducts [n_rows=6224]
#> [2026-06-15 12:59:30.306] [INFO ] [OK] Completed: harmonize_adducts [n_unique_before=908, n_unique_after=905] (1.8s)
#> [2026-06-15 12:59:45.950] [INFO ] Constrained multi-adduct expansion kept 3093 hypothesis row(s)
#> [2026-06-15 12:59:45.952] [INFO ] Generated node hypotheses in 39.19 seconds
#> [2026-06-15 12:59:46.748] [INFO ] Network-consensus pruning dropped 5102 (feature, adduct) candidate(s) with zero adduct-graph support when a supported alternative existed.
#> [2026-06-15 12:59:51.504] [INFO ] Annotation/edge adduct agreement removed 2827 unsupported (feature, adduct) assignment(s).
#> [2026-06-15 13:00:50.152] [INFO ] Conflict-resolution filter removed 6864 annotation row(s) with states incompatible with graph-consistent evidence.
#> [2026-06-15 13:00:50.153] [INFO ] Conflict-resolution pruning touched 1904 feature(s) and removed all annotations from 66 feature(s).
#> [2026-06-15 13:00:50.265] [INFO ] Library matching complete in 64.31 seconds: 455214 annotations
#> [2026-06-15 13:00:51.267] [INFO ] Coverage audit: kept 455214/544822 annotation rows across 5262/5328 features; 89608 annotation rows were pruned from 1904 feature(s).
#> [2026-06-15 13:00:51.269] [INFO ] > Starting: decorate_masses [n_annotations=455214]
#> [2026-06-15 13:00:51.405] [INFO ] MS1 annotations: 190304 unique structures across 5074 features
#> [2026-06-15 13:00:51.407] [INFO ] [OK] Completed: decorate_masses [n_structures=190304, n_features=5074] (137ms)
#> [2026-06-15 13:00:52.011] [INFO ] Breakdown of the annotated adduct species (library-matched):
#> [2026-06-15 13:00:52.017] [INFO ] 
#>                adduct N_features N_annotations Pct_features Pct_annotations
#>               [M+Na]+       2817        107073       24.37%          23.53%
#>                [M+H]+       2769        144873       23.96%          31.84%
#>              [M+H4N]+       2646         87716       22.89%          19.28%
#>                [M+K]+        331          8837        2.86%           1.94%
#>            [M+H3N+H]+        298         11654        2.58%           2.56%
#>               [2M+H]+        221         16460        1.91%           3.62%
#>                  [M]+        212          4840        1.83%           1.06%
#>            [M-H2O+H]+        170         10907        1.47%           2.40%
#>             [M-H+Fe]+        157          2584        1.36%           0.57%
#>            [M-H2+Fe]+        144          2781        1.25%           0.61%
#>              [2M+Na]+        135          7405        1.17%           1.63%
#>               [M+Cu]+        108          2189        0.93%           0.48%
#>               [2M+K]+         91          4235        0.79%           0.93%
#>              [M+Ca]2+         80          1542        0.69%           0.34%
#>             [2M+H4N]+         74          4028        0.64%           0.89%
#>             [2M+Ca]2+         70          2349        0.61%           0.52%
#>             [2M+Fe]2+         63          1484        0.55%           0.33%
#>             [2M+Mg]2+         42           616        0.36%           0.14%
#>            [M+H2O+H]+         35          1615        0.30%           0.35%
#>              [M+H2]2+         35           651        0.30%           0.14%
#>            [M-H+2Na]+         34          1486        0.29%           0.33%
#>           [M-H2O+Na]+         25           279        0.22%           0.06%
#>              [M+Fe]2+         24           454        0.21%           0.10%
#>          [M-H2O+H4N]+         23           259        0.20%           0.06%
#>           [M+H2O+Na]+         22           589        0.19%           0.13%
#>          [M+H2O+H4N]+         21           199        0.18%           0.04%
#>            [M+H2O+K]+         19           267        0.16%           0.06%
#>        [2M-C2H4+H4N]+         17          1040        0.15%           0.23%
#>            [M-H2O+K]+         17           129        0.15%           0.03%
#>           [2M-H2O+H]+         15          1349        0.13%           0.30%
#>              [M+H2O]+         15           160        0.13%           0.04%
#>              [M-H2O]+         15           143        0.13%           0.03%
#>         [M-H5ON+H4N]+         14           183        0.12%           0.04%
#>           [M-C2H4+H]+         11           344        0.10%           0.08%
#>           [M-2H2O+H]+         10           239        0.09%           0.05%
#>           [2M+H2O+H]+          8           808        0.07%           0.18%
#>           [2M+H3N+H]+          8           389        0.07%           0.09%
#>              [M+Mg]2+          8           133        0.07%           0.03%
#>    [M-C6H10O5-H2O+H]+          8           124        0.07%           0.03%
#>           [M+H2O+Cu]+          8            51        0.07%           0.01%
#>              [M+2H]2+          7           233        0.06%           0.05%
#>       [M-C2F4-H+2Na]+          7           199        0.06%           0.04%
#>           [M-H2O+Cu]+          7            45        0.06%           0.01%
#>         [M-CH2O+H4N]+          6           450        0.05%           0.10%
#>            [M-CH2+H]+          6           400        0.05%           0.09%
#>       [2M-C6H10O5+H]+          6           184        0.05%           0.04%
#>           [M-H4O2+H]+          6           181        0.05%           0.04%
#>          [M-C3H4O+H]+          6           163        0.05%           0.04%
#>         [M-H-H2O+Fe]+          6            79        0.05%           0.02%
#>        [M-C6H10O5+H]+          6            74        0.05%           0.02%
#>    [M-C6H12O6-H2O+H]+          6            65        0.05%           0.01%
#>            [M+CH2O2]+          6            41        0.05%           0.01%
#>             [M-CO+H]+          5           471        0.04%           0.10%
#>            [M-CO+Na]+          5           258        0.04%           0.06%
#>           [M-CH2+Na]+          5           247        0.04%           0.05%
#>         [M-H+H2O+Fe]+          5            54        0.04%           0.01%
#>        [M-C6H10O4+H]+          5            34        0.04%           0.01%
#>         [M+C2H3N+Na]+          4           640        0.03%           0.14%
#>           [M-CH2O+H]+          4           455        0.03%           0.10%
#>       [2M-C6H10O4+K]+          4           360        0.03%           0.08%
#>            [M-O+H4N]+          4           200        0.03%           0.04%
#>         [M-C2H4+H4N]+          4           187        0.03%           0.04%
#>          [2M+H2O+Na]+          4           170        0.03%           0.04%
#>       [2M-C6H10O5+K]+          4           161        0.03%           0.04%
#>          [2M-H2O+Na]+          4           134        0.03%           0.03%
#>       [2M-C6H10O4+H]+          4           131        0.03%           0.03%
#>               [M-CO]+          4           127        0.03%           0.03%
#>          [M+CH2O2+H]+          4           119        0.03%           0.03%
#>      [2M-C6H12O6+Na]+          4           108        0.03%           0.02%
#>            [M-CH2+K]+          4           108        0.03%           0.02%
#>         [M-C4H4O+Na]+          4           106        0.03%           0.02%
#>           [M-CO+H4N]+          4            99        0.03%           0.02%
#>      [2M-C6H10O5+Na]+          4            96        0.03%           0.02%
#>          [M-CH2+H4N]+          4            90        0.03%           0.02%
#>           [2M+H2O+K]+          4            84        0.03%           0.02%
#>        [M-C6H12O6+H]+          4            83        0.03%           0.02%
#>      [2M-C6H10O4+Na]+          4            68        0.03%           0.01%
#>       [2M-C6H12O6+H]+          4            68        0.03%           0.01%
#>        [M-H2-H2O+Fe]+          4            67        0.03%           0.01%
#>        [M-C6H10O4+K]+          4            58        0.03%           0.01%
#>    [M-C6H10O4-H2O+H]+          4            56        0.03%           0.01%
#>          [M-C2H4+Na]+          4            46        0.03%           0.01%
#>        [M-C3H4O+H4N]+          4            41        0.03%           0.01%
#>       [M-C6H10O4+Na]+          4            41        0.03%           0.01%
#>      [M-C6H10O5+H4N]+          4            34        0.03%           0.01%
#>              [M-O+K]+          4            28        0.03%           0.01%
#>    [2M-C12H20O10+Na]+          4            26        0.03%           0.01%
#>          [M-H4O2+Na]+          4            23        0.03%           0.01%
#>     [2M-C12H20O10+H]+          4            22        0.03%           0.00%
#>         [M-C2F4+H4N]+          4            21        0.03%           0.00%
#>     [2M-C12H20O10+K]+          4            20        0.03%           0.00%
#>          [M-C6H10O4]+          4            14        0.03%           0.00%
#>              [M+H3]3+          4            12        0.03%           0.00%
#>        [M-C12H20O10]+          4             9        0.03%           0.00%
#>     [M-C12H20O10+Na]+          4             7        0.03%           0.00%
#>          [2M-H4O2+H]+          3           726        0.03%           0.16%
#>          [M+C2H3N+H]+          3           515        0.03%           0.11%
#>          [M+C2H7N+H]+          3           264        0.03%           0.06%
#>          [M-C2O2+Na]+          3           198        0.03%           0.04%
#>       [M-CH2O-H2O+H]+          3           191        0.03%           0.04%
#>        [M-C4H4O+H4N]+          3           132        0.03%           0.03%
#>         [M-CH3N+H4N]+          3           115        0.03%           0.03%
#>         [2M-CH2+H4N]+          3            97        0.03%           0.02%
#>         [2M-CH2+Fe]2+          3            94        0.03%           0.02%
#>          [M-C4H4O+H]+          3            90        0.03%           0.02%
#>        [M+C2H7N+H4N]+          3            88        0.03%           0.02%
#>            [2M-CO+H]+          3            80        0.03%           0.02%
#>           [2M-H2O+K]+          3            76        0.03%           0.02%
#>           [M-CH2O+K]+          3            70        0.03%           0.02%
#>             [M-O+Na]+          3            70        0.03%           0.02%
#>        [M+CH2O2+H4N]+          3            67        0.03%           0.01%
#>          [M-CH2O+Na]+          3            67        0.03%           0.01%
#>      [M-C12H20O10+H]+          3            65        0.03%           0.01%
#>         [2M-C3H4O+H]+          3            60        0.03%           0.01%
#>              [M-CH2]+          3            44        0.03%           0.01%
#>           [M-C2H4+K]+          3            38        0.03%           0.01%
#>     [M-C6H8O6-H2O+H]+          3            32        0.03%           0.01%
#>             [M-C2H4]+          3            30        0.03%           0.01%
#>        [2M-CH2O+Ca]2+          3            28        0.03%           0.01%
#>            [M-CO+Cu]+          3            27        0.03%           0.01%
#>             [M-H4O2]+          3            25        0.03%           0.01%
#>      [M-C6H12O6+H4N]+          3            22        0.03%           0.00%
#>    [M-C6H12O6-H2+Fe]+          3            22        0.03%           0.00%
#>         [M-C5H8O4+H]+          3            21        0.03%           0.00%
#>        [M-H2+H2O+Fe]+          3            20        0.03%           0.00%
#>      [M-C12H20O10+K]+          3            14        0.03%           0.00%
#>    [M-C12H20O10+H4N]+          3            13        0.03%           0.00%
#>          [2M-CH2O+K]+          3             9        0.03%           0.00%
#>    [M-C6H10O4-H2+Fe]+          3             8        0.03%           0.00%
#>          [M-C6H12O6]+          3             8        0.03%           0.00%
#>     [M-C12H20O10+Cu]+          3             7        0.03%           0.00%
#>           [M-H4O2+K]+          3             7        0.03%           0.00%
#>         [2M-CH2+Ca]2+          2           832        0.02%           0.18%
#>        [M-CH2-H2O+H]+          2           312        0.02%           0.07%
#>             [2M-O+H]+          2           245        0.02%           0.05%
#>         [M+C2H7N+Na]+          2           245        0.02%           0.05%
#>              [M-O+H]+          2           243        0.02%           0.05%
#>        [2M-C5H8O4+H]+          2           220        0.02%           0.05%
#>         [M-CO-H2O+H]+          2           184        0.02%           0.04%
#>        [2M-C6H8O6+H]+          2           176        0.02%           0.04%
#>           [2M-CO2+K]+          2           175        0.02%           0.04%
#>         [2M+H2O+Mg]2+          2           137        0.02%           0.03%
#>         [M+CH2O2+Na]+          2           136        0.02%           0.03%
#>         [M-C3H4O+Na]+          2           129        0.02%           0.03%
#>        [2M-C2O2+Fe]2+          2           118        0.02%           0.03%
#>        [2M-C2H4+Ca]2+          2           106        0.02%           0.02%
#>         [M-C2O2+H4N]+          2           100        0.02%           0.02%
#>       [M-C6H6O3+H4N]+          2            91        0.02%           0.02%
#>          [2M-CO+Ca]2+          2            87        0.02%           0.02%
#>           [M-C2O2+K]+          2            85        0.02%           0.02%
#>         [M-C3H6O2+H]+          2            85        0.02%           0.02%
#>          [M-H2O-O+H]+          2            80        0.02%           0.02%
#>        [M-C6H14O7+H]+          2            78        0.02%           0.02%
#>        [2M-C6H8O4+H]+          2            71        0.02%           0.02%
#>        [2M-CH2O+H4N]+          2            65        0.02%           0.01%
#>       [2M-C6H14O7+H]+          2            64        0.02%           0.01%
#>      [M-C2H2O-H2O+H]+          2            63        0.02%           0.01%
#>             [M-CO+K]+          2            55        0.02%           0.01%
#>       [M-H2-H2O2+Fe]+          2            53        0.02%           0.01%
#>           [2M-O+Fe]2+          2            48        0.02%           0.01%
#>     [M-C6H6O3-H2O+H]+          2            48        0.02%           0.01%
#>         [2M+CH2O2+H]+          2            45        0.02%           0.01%
#>         [2M-CH2O+Na]+          2            45        0.02%           0.01%
#>          [M-C2F2+Na]+          2            44        0.02%           0.01%
#>       [M-H2O-H2O2+H]+          2            44        0.02%           0.01%
#>          [M-C3O3+Na]+          2            42        0.02%           0.01%
#>          [M-H2-O+Fe]+          2            41        0.02%           0.01%
#>        [2M-C2H4+Fe]2+          2            37        0.02%           0.01%
#>        [M-C7H4O4+Na]+          2            37        0.02%           0.01%
#>          [M-CH6O3+H]+          2            37        0.02%           0.01%
#>      [M-C2H2O3-H+Fe]+          2            34        0.02%           0.01%
#>     [M-C6H8O4-H2O+H]+          2            32        0.02%           0.01%
#>          [2M-2H2O+H]+          2            31        0.02%           0.01%
#>         [M-C6H8O6+H]+          2            28        0.02%           0.01%
#>        [M-C3H8O4+Na]+          2            27        0.02%           0.01%
#>             [M-CH2O]+          2            27        0.02%           0.01%
#>         [2M+H2O+Fe]2+          2            26        0.02%           0.01%
#>         [M-C6H6O3+H]+          2            25        0.02%           0.01%
#>      [2M-C6H14O7+Na]+          2            24        0.02%           0.01%
#>          [M+C2H7N+K]+          2            24        0.02%           0.01%
#>       [M-C8H12O6+Na]+          2            23        0.02%           0.01%
#>        [2M-CH2O+Fe]2+          2            22        0.02%           0.00%
#>        [M-C2H4O+H4N]+          2            22        0.02%           0.00%
#>     [M-C6H15NO6+H4N]+          2            17        0.02%           0.00%
#>          [M+CH2O2+K]+          2            16        0.02%           0.00%
#>          [M-C2H4O+H]+          2            16        0.02%           0.00%
#>      [M-C6H10O4+H4N]+          2            16        0.02%           0.00%
#>        [M-C6H8O4+Na]+          2            16        0.02%           0.00%
#>       [M-C6H10O5+Cu]+          2            14        0.02%           0.00%
#>       [M-CH2O-H2+Fe]+          2            14        0.02%           0.00%
#>           [M-H6O3+H]+          2            14        0.02%           0.00%
#>          [M-CH2O+Cu]+          2            13        0.02%           0.00%
#>         [2M-H2O+Fe]2+          2            12        0.02%           0.00%
#>    [M-C6H10O5-H2+Fe]+          2            12        0.02%           0.00%
#>          [M-C2H2O+K]+          2            10        0.02%           0.00%
#>        [M-C2H4-H+Fe]+          2             9        0.02%           0.00%
#>           [M-CH2+Cu]+          2             8        0.02%           0.00%
#>       [M-H+CH2O2+Fe]+          2             8        0.02%           0.00%
#>       [M-C6H12O6+Na]+          2             6        0.02%           0.00%
#>       [2M-C3H4O+Fe]2+          2             5        0.02%           0.00%
#>   [M-C12H20O10-H+Fe]+          2             5        0.02%           0.00%
#>       [M-C6H10O5+Na]+          2             5        0.02%           0.00%
#>      [M-C6H14O7+H4N]+          2             4        0.02%           0.00%
#>       [M-C6H14O7+Na]+          2             4        0.02%           0.00%
#>          [M+H2O+Mg]2+          1           178        0.01%           0.04%
#>       [2M-C8H8O2+Na]+          1           150        0.01%           0.03%
#>           [M-C2O2+H]+          1           147        0.01%           0.03%
#>        [2M-H4O2+H4N]+          1           146        0.01%           0.03%
#>            [M-CO2+H]+          1           128        0.01%           0.03%
#>         [M-H-HN+2Na]+          1           127        0.01%           0.03%
#>      [M-H2O+C2H7N+H]+          1           126        0.01%           0.03%
#>        [2M-C6H6O3+H]+          1           111        0.01%           0.02%
#>        [2M-C4H8O2+K]+          1           109        0.01%           0.02%
#>          [M-CHN+H4N]+          1            99        0.01%           0.02%
#>           [2M-CO+Na]+          1            98        0.01%           0.02%
#>       [2M-C2H2O+H4N]+          1            94        0.01%           0.02%
#>           [2M-H2S+H]+          1            92        0.01%           0.02%
#>        [2M-H8O4+H4N]+          1            92        0.01%           0.02%
#>         [2M-H2O+Mg]2+          1            87        0.01%           0.02%
#>      [2M-C3H6O2+Ca]2+          1            83        0.01%           0.02%
#>         [M+H4ClN+Cu]+          1            83        0.01%           0.02%
#>          [2M-CO+H4N]+          1            81        0.01%           0.02%
#>        [M-C2F2-H+Fe]+          1            77        0.01%           0.02%
#>          [2M-H4O2+K]+          1            75        0.01%           0.02%
#>         [M-H2O-HN+H]+          1            75        0.01%           0.02%
#>     [2M-C6H10O4+H4N]+          1            70        0.01%           0.02%
#>      [2M-C12H20O8+H]+          1            69        0.01%           0.02%
#>            [M+C2H7N]+          1            69        0.01%           0.02%
#>           [M-C3O3+H]+          1            69        0.01%           0.02%
#>            [M-NO+Na]+          1            68        0.01%           0.01%
#>              [M-CO2]+          1            65        0.01%           0.01%
#>        [M-H2-SO3+Fe]+          1            65        0.01%           0.01%
#>          [M-C3O3+Cu]+          1            64        0.01%           0.01%
#>     [2M-C13H14O6+Na]+          1            62        0.01%           0.01%
#>          [2M-C2O2+H]+          1            60        0.01%           0.01%
#>           [M-CH3N+H]+          1            59        0.01%           0.01%
#>        [M-C8H8O2+Na]+          1            58        0.01%           0.01%
#>         [2M-H2O+H4N]+          1            57        0.01%           0.01%
#>          [2M-H2S+Na]+          1            57        0.01%           0.01%
#>       [2M-C3H4O4+Na]+          1            55        0.01%           0.01%
#>           [2M-CH2+K]+          1            55        0.01%           0.01%
#>        [2M-C2H4O+Na]+          1            54        0.01%           0.01%
#>       [2M-C9H12O8+K]+          1            53        0.01%           0.01%
#>         [2M-H2O2+Na]+          1            50        0.01%           0.01%
#>       [2M-C6H12O6+K]+          1            49        0.01%           0.01%
#>         [2M-C2F2+Na]+          1            48        0.01%           0.01%
#>       [M-C2O2-H2O+H]+          1            48        0.01%           0.01%
#>    [M-C8H12O6-H2+Fe]+          1            47        0.01%           0.01%
#>       [2M-C4H4O+H4N]+          1            46        0.01%           0.01%
#>       [M-CH6O3-H+Fe]+          1            46        0.01%           0.01%
#>          [M-H2O2+Na]+          1            46        0.01%           0.01%
#>         [2M-CH6O3+H]+          1            45        0.01%           0.01%
#>         [2M-H4O2+Na]+          1            44        0.01%           0.01%
#>            [2M-O+Na]+          1            44        0.01%           0.01%
#>           [M-H2O2+H]+          1            44        0.01%           0.01%
#>          [M+C2H3N+K]+          1            43        0.01%           0.01%
#>           [M-CF3+Na]+          1            43        0.01%           0.01%
#>      [2M-C3H4O4+Fe]2+          1            41        0.01%           0.01%
#>      [M+CH2O2+H3N+H]+          1            41        0.01%           0.01%
#>            [M-H2S+H]+          1            41        0.01%           0.01%
#>       [2M+CH2O2+Fe]2+          1            40        0.01%           0.01%
#>         [M-C2H2O3+H]+          1            40        0.01%           0.01%
#>          [2M-C2F4+H]+          1            39        0.01%           0.01%
#>             [M-C2O2]+          1            39        0.01%           0.01%
#>             [M-CH3N]+          1            39        0.01%           0.01%
#>            [M-CHN+H]+          1            39        0.01%           0.01%
#>             [M-H2O2]+          1            39        0.01%           0.01%
#>          [M-C2H2O+H]+          1            38        0.01%           0.01%
#>        [2M-C6H6O3+K]+          1            37        0.01%           0.01%
#>                [M-O]+          1            37        0.01%           0.01%
#>       [2M-C10H8O3+K]+          1            34        0.01%           0.01%
#>       [2M-C5H8O4+Na]+          1            34        0.01%           0.01%
#>       [2M-C8H12O6+H]+          1            34        0.01%           0.01%
#>      [2M-C9H6O3+H4N]+          1            34        0.01%           0.01%
#>              [M-CHN]+          1            34        0.01%           0.01%
#>          [M-CO2+H4N]+          1            33        0.01%           0.01%
#>            [M-CO2+K]+          1            33        0.01%           0.01%
#>    [M-C8H10O4-H2O+H]+          1            32        0.01%           0.01%
#>     [2M-C6H10O4+Ca]2+          1            31        0.01%           0.01%
#>           [2M-H2S+K]+          1            30        0.01%           0.01%
#>         [M+C2H7N+Cu]+          1            30        0.01%           0.01%
#>      [M-C2H4O-H2+Fe]+          1            30        0.01%           0.01%
#>    [M-C6H14O7-H2+Fe]+          1            29        0.01%           0.01%
#>        [2M-C4O4H6+H]+          1            28        0.01%           0.01%
#>           [2M-H+2Na]+          1            28        0.01%           0.01%
#>            [2M-CO+K]+          1            27        0.01%           0.01%
#>      [M-C3H4O-H2O+H]+          1            27        0.01%           0.01%
#>        [M-C6H6O3+Cu]+          1            27        0.01%           0.01%
#>      [M-C5H8O4-H+Fe]+          1            26        0.01%           0.01%
#>      [2M-C4O4H6+Mg]2+          1            25        0.01%           0.01%
#>       [2M-C6H8O4+Na]+          1            25        0.01%           0.01%
#>      [2M-C9H12O8+Na]+          1            25        0.01%           0.01%
#>          [M-CF2+H4N]+          1            25        0.01%           0.01%
#>        [2M-C2F2+Mg]2+          1            24        0.01%           0.01%
#>       [M-C2H2O3+H4N]+          1            24        0.01%           0.01%
#>        [M-CF3-H2+Fe]+          1            24        0.01%           0.01%
#>         [M-H-SO3+Fe]+          1            24        0.01%           0.01%
#>             [M-HF+H]+          1            24        0.01%           0.01%
#>        [M-C3H4O4+Na]+          1            23        0.01%           0.01%
#>       [M-C4O4H6+H4N]+          1            23        0.01%           0.01%
#>      [2M-C8H12O6+Na]+          1            22        0.01%           0.00%
#>            [M-C3H4O]+          1            22        0.01%           0.00%
#>          [M-CO2+Fe]2+          1            22        0.01%           0.00%
#>         [2M+H2O+H4N]+          1            21        0.01%           0.00%
#>      [2M-C12H20O8+K]+          1            21        0.01%           0.00%
#>     [2M-C6H10O4+Mg]2+          1            21        0.01%           0.00%
#>          [2M-CH2O+H]+          1            21        0.01%           0.00%
#>       [M-C2H4-H2O+H]+          1            21        0.01%           0.00%
#>       [M-C2F4-H2+Fe]+          1            20        0.01%           0.00%
#>       [M-C2O2-H2+Fe]+          1            20        0.01%           0.00%
#>             [M-HN+H]+          1            20        0.01%           0.00%
#>            [M-CH6O3]+          1            19        0.01%           0.00%
#>            [M-C4H4O]+          1            18        0.01%           0.00%
#>      [M-C6H6O3-H+Fe]+          1            18        0.01%           0.00%
#>          [M-CH3N+Na]+          1            18        0.01%           0.00%
#>          [M-H2O+Mg]2+          1            18        0.01%           0.00%
#>         [2M-CO2+Ca]2+          1            17        0.01%           0.00%
#>        [M-C3O3-H+Fe]+          1            17        0.01%           0.00%
#>       [M-C4H8O4+H4N]+          1            17        0.01%           0.00%
#>        [M-CH6O3+H4N]+          1            17        0.01%           0.00%
#>     [2M-C6H10O5+Fe]2+          1            16        0.01%           0.00%
#>     [M-C4O4H6-H2O+H]+          1            16        0.01%           0.00%
#>       [M-H+C2H7N+Fe]+          1            16        0.01%           0.00%
#>        [M-H-H5ON+Fe]+          1            16        0.01%           0.00%
#>         [M-C2H4O+Cu]+          1            15        0.01%           0.00%
#>      [M-C4H4O-H2+Fe]+          1            15        0.01%           0.00%
#>       [M-C3H6O3+H4N]+          1            14        0.01%           0.00%
#>         [M-C4H8O2+H]+          1            14        0.01%           0.00%
#>     [M-C8H12O6-H+Fe]+          1            14        0.01%           0.00%
#>       [2M-C2H2O3+Na]+          1            13        0.01%           0.00%
#>         [M-C2H4O6+H]+          1            13        0.01%           0.00%
#>       [M-C6H12O6+Cu]+          1            13        0.01%           0.00%
#>        [M-C8H8O3+Na]+          1            13        0.01%           0.00%
#>      [M-CH6O3-H2O+H]+          1            13        0.01%           0.00%
#>          [M-H6O3+Na]+          1            13        0.01%           0.00%
#>        [M+C2H3N+H4N]+          1            12        0.01%           0.00%
#>         [M-C2H2O+Na]+          1            12        0.01%           0.00%
#>       [M-C3H4O3+H4N]+          1            12        0.01%           0.00%
#>             [M-C3O3]+          1            12        0.01%           0.00%
#>         [M-CH6O3+Na]+          1            12        0.01%           0.00%
#>        [2M-H2O2+Ca]2+          1            11        0.01%           0.00%
#>      [M-C3H7NO2+H4N]+          1            11        0.01%           0.00%
#>       [M-C4H4O-H+Fe]+          1            11        0.01%           0.00%
#>        [M-C4H8O2+Na]+          1            11        0.01%           0.00%
#>     [M-C5H8O4-H2+Fe]+          1            11        0.01%           0.00%
#>           [M-CHF2+H]+          1            11        0.01%           0.00%
#>            [M-HN+Na]+          1            11        0.01%           0.00%
#>           [M+H3N+Na]+          1            10        0.01%           0.00%
#>         [M-C2H4O+Na]+          1            10        0.01%           0.00%
#>           [M-CH3N+K]+          1            10        0.01%           0.00%
#>          [M-H2S+H4N]+          1            10        0.01%           0.00%
#>           [M-HN+H4N]+          1            10        0.01%           0.00%
#>         [2M-C2F4+Na]+          1             9        0.01%           0.00%
#>          [2M-C2H4+H]+          1             9        0.01%           0.00%
#>        [2M-C4H8O4+K]+          1             9        0.01%           0.00%
#>     [M-C6H10O4-H+Fe]+          1             9        0.01%           0.00%
#>          [M-CH2+Ca]2+          1             9        0.01%           0.00%
#>          [M-CO2+H2]2+          1             9        0.01%           0.00%
#>         [M-CO2-H+Fe]+          1             9        0.01%           0.00%
#>        [M-H2-H2S+Fe]+          1             9        0.01%           0.00%
#>          [M-H5ON+Cu]+          1             9        0.01%           0.00%
#>          [2M-C2H4+K]+          1             8        0.01%           0.00%
#>         [2M-C2O2+Na]+          1             8        0.01%           0.00%
#>       [2M-C6H14O7+K]+          1             8        0.01%           0.00%
#>     [M-C4H8O2-H2O+H]+          1             8        0.01%           0.00%
#>    [M-C7H12O6-H2+Fe]+          1             8        0.01%           0.00%
#>        [M-C8H12O6+K]+          1             8        0.01%           0.00%
#>        [M-C8H8O4+Na]+          1             8        0.01%           0.00%
#>          [M-CHO2+Cu]+          1             8        0.01%           0.00%
#>         [2M+H2O+Ca]2+          1             7        0.01%           0.00%
#>        [2M-C3H4O4+H]+          1             7        0.01%           0.00%
#>        [2M-C4H8O2+H]+          1             7        0.01%           0.00%
#>        [2M-H2O2+Fe]2+          1             7        0.01%           0.00%
#>         [M-C2H2O5+H]+          1             7        0.01%           0.00%
#>     [M-C3H6O2-H2O+H]+          1             7        0.01%           0.00%
#>    [M-C7H12O6-H2O+H]+          1             7        0.01%           0.00%
#>       [M-C9H12O8+Cu]+          1             7        0.01%           0.00%
#>          [M-CO-H+Fe]+          1             7        0.01%           0.00%
#>          [M-SO3+H4N]+          1             7        0.01%           0.00%
#>      [2M-C3H6O2+Fe]2+          1             6        0.01%           0.00%
#>        [2M-C5H8O4+K]+          1             6        0.01%           0.00%
#>     [2M-C6H12O6+Ca]2+          1             6        0.01%           0.00%
#>         [M-C4H4O+Cu]+          1             6        0.01%           0.00%
#>       [M-C8H12O6+Cu]+          1             6        0.01%           0.00%
#>          [2M-CO2+Na]+          1             5        0.01%           0.00%
#>        [M+C2H3N+H2]2+          1             5        0.01%           0.00%
#>       [M-C2F4-H2O+H]+          1             5        0.01%           0.00%
#>          [M-C2O2+Cu]+          1             5        0.01%           0.00%
#>     [M-C6H13NO5+H4N]+          1             5        0.01%           0.00%
#>      [M-C6H15NO6+Na]+          1             5        0.01%           0.00%
#>         [M-CH2-H+Fe]+          1             5        0.01%           0.00%
#>           [M-H3N+Na]+          1             5        0.01%           0.00%
#>             [M-H6O3]+          1             5        0.01%           0.00%
#>              [M-SO3]+          1             5        0.01%           0.00%
#>         [2M-C2H2O+K]+          1             4        0.01%           0.00%
#>          [M-C3H4O+K]+          1             4        0.01%           0.00%
#>     [M-C5H8O4-H2O+H]+          1             4        0.01%           0.00%
#>         [M-C6H8O4+H]+          1             4        0.01%           0.00%
#>       [M-C6H8O6+H4N]+          1             4        0.01%           0.00%
#>           [M-C6H8O6]+          1             4        0.01%           0.00%
#>        [M-C8H12O6+H]+          1             4        0.01%           0.00%
#>        [M-C9H12O8+K]+          1             4        0.01%           0.00%
#>         [M-CO-H2+Fe]+          1             4        0.01%           0.00%
#>           [M-H5ON+K]+          1             4        0.01%           0.00%
#>        [2M-H4O2+Ca]2+          1             3        0.01%           0.00%
#>             [2M-O+K]+          1             3        0.01%           0.00%
#>          [M-C4H4O+K]+          1             3        0.01%           0.00%
#>       [M-C6H10O4+Cu]+          1             3        0.01%           0.00%
#>           [M-SO3+Cu]+          1             3        0.01%           0.00%
#>        [2M-C3H4O+Na]+          1             2        0.01%           0.00%
#>           [2M-CH2+H]+          1             2        0.01%           0.00%
#>       [M-C12H20O8+H]+          1             2        0.01%           0.00%
#>   [M-C12H20O8-H2O+H]+          1             2        0.01%           0.00%
#>       [M-C13H14O6+H]+          1             2        0.01%           0.00%
#>  [M-C18H30O15-H2O+H]+          1             2        0.01%           0.00%
#>           [M-C2O4+H]+          1             2        0.01%           0.00%
#>      [M-C3H4O-H2+Fe]+          1             2        0.01%           0.00%
#>     [M-C4H8O4-H2O+H]+          1             2        0.01%           0.00%
#>     [M-C4H9NO-H2O+H]+          1             2        0.01%           0.00%
#>       [M-C4O4H6+Mg]2+          1             2        0.01%           0.00%
#>        [M-C6H10O5+K]+          1             2        0.01%           0.00%
#>        [M-C6H12O6+K]+          1             2        0.01%           0.00%
#>       [M-C6H14O7+Cu]+          1             2        0.01%           0.00%
#>       [M-C6H8O4+H4N]+          1             2        0.01%           0.00%
#>      [M-C8H12O6+H4N]+          1             2        0.01%           0.00%
#>        [M-CH2O-H+Fe]+          1             2        0.01%           0.00%
#>       [M-H2O-H4O2+H]+          1             2        0.01%           0.00%
#>         [2M-C3H4O+K]+          1             1        0.01%           0.00%
#>       [M-C5H10O2+Na]+          1             1        0.01%           0.00%
#>     [M-C6H10O5-H+Fe]+          1             1        0.01%           0.00%
#>          [M-C8H12O6]+          1             1        0.01%           0.00%
#>          [M-CH6O3+K]+          1             1        0.01%           0.00%
#>           [M-CO2+Na]+          1             1        0.01%           0.00%
#>      [M-H2+C2H7N+Fe]+          1             1        0.01%           0.00%
#>        [M-H3O4P+H4N]+          1             1        0.01%           0.00%
#>          [M-H4O2+Cu]+          1             1        0.01%           0.00%
#> [2026-06-15 13:00:52.029] [INFO ] Adduct hypotheses retained without library match (by source):
#> [2026-06-15 13:00:52.031] [INFO ] 
#>       source N_features N_adduct_types
#>     baseline        140              1
#>         pair         29              7
#>  preassigned         18              7
#>         loss          1              1
#> [2026-06-15 13:00:52.104] [INFO ] Exporting parameters to: data/interim/params/260615_130052_annotate_masses.yaml
#> [2026-06-15 13:00:52.107] [INFO ] > Starting: export_output [file=data/interim/features/example_edgesMasses.tsv, n_rows=5032]
#> [2026-06-15 13:00:52.109] [INFO ] [OK] Completed: export_output [size_bytes=83643] (2ms)
#> [2026-06-15 13:00:52.111] [INFO ] Exported edges: example_edgesMasses.tsv (5,032 rows)
#> [2026-06-15 13:00:56.926] [INFO ] > Starting: process_smiles [n_structures=190331]
#> [2026-06-15 13:00:56.927] [INFO ] Processing SMILES with RDKit
#> [2026-06-15 13:01:06.804] [INFO ] Processing 801 new SMILES with RDKit
#> [2026-06-15 13:01:06.806] [INFO ] Starting SMILES processing pipeline
#> [2026-06-15 13:01:06.806] [INFO ] Input: /tmp/RtmppBog3o/file270d2967a788.smi
#> [2026-06-15 13:01:06.806] [INFO ] Output: /tmp/RtmppBog3o/file270de4f8952.csv.gz
#> [2026-06-15 13:01:06.806] [INFO ] Input file validated: /tmp/RtmppBog3o/file270d2967a788.smi
#> [2026-06-15 13:01:06.806] [INFO ] Output file validated: /tmp/RtmppBog3o/file270de4f8952.csv.gz
#> [2026-06-15 13:01:06.806] [INFO ] Processing parameters: workers=8, batch_size=1000, progress_interval=10000
#> [2026-06-15 13:01:06.806] [INFO ] SMILES supplier initialized
#> [2026-06-15 13:01:07.970] [INFO ] Processing complete. Total molecules processed: 801
#> [2026-06-15 13:01:08.016] [INFO ] Successfully processed 801 SMILES
#> [2026-06-15 13:01:17.877] [INFO ] [OK] Completed: process_smiles [n_processed=190332] (21s)
#> [2026-06-15 13:01:22.540] [INFO ] > Starting: complement_metadata [n_input=455214]
#> [2026-06-15 13:02:00.720] [INFO ] [OK] Completed: complement_metadata [n_enriched=455214] (38.2s)
#> [2026-06-15 13:02:00.723] [INFO ] > Starting: export_output [file=data/interim/annotations/example_ms1Prepared.tsv.gz, n_rows=455214]
#> [2026-06-15 13:02:02.873] [INFO ] [OK] Completed: export_output [size_bytes=32096254] (2.1s)
#> [2026-06-15 13:02:02.874] [INFO ] Exported annotations: example_ms1Prepared.tsv.gz (455,214 rows)
#> [2026-06-15 13:02:02.876] [INFO ] > Starting: export_output [file=data/interim/annotations/example_ms1Prepared_coverage.tsv.gz, n_rows=10]
#> [2026-06-15 13:02:02.877] [INFO ] [OK] Completed: export_output [size_bytes=264] (2ms)
#> [2026-06-15 13:02:02.879] [INFO ] Exported coverage report: example_ms1Prepared_coverage.tsv.gz
#> [2026-06-15 13:02:02.880] [INFO ] All outputs exported in 70.85 seconds
#> [2026-06-15 13:02:02.881] [INFO ] [OK] Completed: annotate_masses [n_annotations=455214, n_edges=5032] (4m 31s)
#> ✔ ann_ms1_pre completed [4m 30.6s, 32.18 MB]
#> + ann_sir_pre_can dispatched
#> ✔ ann_sir_pre_can completed [0ms, 830 B]
#> + ann_sir_pre_for dispatched
#> ✔ ann_sir_pre_for completed [0ms, 521 B]
#> + ann_sir_pre_str dispatched
#> ✔ ann_sir_pre_str completed [0ms, 98.02 kB]
#> + ann_ms1_pre_ann dispatched
#> ✔ ann_ms1_pre_ann completed [1ms, 32.10 MB]
#> + ann_ms1_pre_edg dispatched
#> ✔ ann_ms1_pre_edg completed [1ms, 83.64 kB]
#> + ann_spe_neg dispatched
#> [2026-06-15 13:02:06.528] [INFO ] ============================================================
#> [2026-06-15 13:02:06.530] [INFO ] Data Sanitizing: Pre-flight Checks
#> [2026-06-15 13:02:06.531] [INFO ] ============================================================
#> [2026-06-15 13:02:06.531] [INFO ] Checking MGF file...
#> [2026-06-15 13:02:07.013] [INFO ] [OK] MGF file: 12195 MS2 spectra found
#> [2026-06-15 13:02:07.014] [INFO ] ============================================================
#> [2026-06-15 13:02:07.015] [INFO ] [OK] All pre-flight checks passed!
#> [2026-06-15 13:02:07.016] [INFO ] Data validation complete. Ready to proceed.
#> [2026-06-15 13:02:07.017] [INFO ] ============================================================
#> [2026-06-15 13:02:07.018] [INFO ] Starting spectral annotation in neg mode
#> [2026-06-15 13:02:07.019] [INFO ] Importing spectra from: data/source/example_spectra.mgf
#> [2026-06-15 13:02:07.020] [INFO ] Reading MGF file (7.41 MB) with optimized parser: data/source/example_spectra.mgf
#> [2026-06-15 13:02:09.010] [INFO ] Processed 10000 spectra...
#> [2026-06-15 13:02:10.294] [INFO ] Total spectra read: 16282
#> [2026-06-15 13:02:16.733] [INFO ] Loaded 16282 spectra from file
#> [2026-06-15 13:02:16.756] [INFO ] Combining replicate spectra by FEATURE_ID
#> [2026-06-15 13:02:16.761] [INFO ] Combined replicates: 0 -> 0 spectra
#> [2026-06-15 13:02:16.799] [WARN ] No spectra to sanitize
#> [2026-06-15 13:02:16.800] [INFO ] Import complete: 0 spectra ready for analysis
#> [2026-06-15 13:02:16.801] [WARN ] No query spectra loaded
#> [2026-06-15 13:02:16.804] [INFO ] Exporting parameters to: data/interim/params/260615_130216_annotate_spectra.yaml
#> [2026-06-15 13:02:16.805] [WARN ] Returning empty annotation template
#> [2026-06-15 13:02:16.808] [INFO ] > Starting: export_output [file=data/interim/annotations/example_spectralMatches_neg.tsv.gz, n_rows=1]
#> [2026-06-15 13:02:16.810] [INFO ] [OK] Completed: export_output [size_bytes=308] (2ms)
#> ✔ ann_spe_neg completed [10.3s, 308 B]
#> + ann_spe_pos dispatched
#> [2026-06-15 13:02:17.469] [INFO ] ============================================================
#> [2026-06-15 13:02:17.471] [INFO ] Data Sanitizing: Pre-flight Checks
#> [2026-06-15 13:02:17.472] [INFO ] ============================================================
#> [2026-06-15 13:02:17.472] [INFO ] Checking MGF file...
#> [2026-06-15 13:02:17.947] [INFO ] [OK] MGF file: 12195 MS2 spectra found
#> [2026-06-15 13:02:17.948] [INFO ] ============================================================
#> [2026-06-15 13:02:17.949] [INFO ] [OK] All pre-flight checks passed!
#> [2026-06-15 13:02:17.950] [INFO ] Data validation complete. Ready to proceed.
#> [2026-06-15 13:02:17.951] [INFO ] ============================================================
#> [2026-06-15 13:02:17.952] [INFO ] Starting spectral annotation in pos mode
#> [2026-06-15 13:02:17.953] [INFO ] Importing spectra from: data/source/example_spectra.mgf
#> [2026-06-15 13:02:17.954] [INFO ] Reading MGF file (7.41 MB) with optimized parser: data/source/example_spectra.mgf
#> [2026-06-15 13:02:19.979] [INFO ] Processed 10000 spectra...
#> [2026-06-15 13:02:21.278] [INFO ] Total spectra read: 16282
#> [2026-06-15 13:02:27.482] [INFO ] Loaded 16282 spectra from file
#> [2026-06-15 13:02:27.507] [INFO ] Combining replicate spectra by FEATURE_ID
#> [2026-06-15 13:02:28.369] [INFO ] Combined replicates: 12195 -> 4087 spectra
#> [2026-06-15 13:02:28.407] [INFO ] Sanitizing 4087 spectra (cutoff: 0)
#> [2026-06-15 13:02:29.725] [INFO ] Sanitization complete: 3999/4087 spectra retained (97.8%, 88 removed)
#> [2026-06-15 13:02:29.726] [INFO ] Import complete: 3999 spectra ready for analysis
#> [2026-06-15 13:02:29.728] [INFO ] > Starting: harmonize_adducts [n_rows=3999]
#> [2026-06-15 13:02:29.730] [INFO ] [OK] Completed: harmonize_adducts [n_unique_before=12, n_unique_after=12] (2ms)
#> [2026-06-15 13:02:32.609] [INFO ] > Starting: harmonize_adducts [n_rows=455214]
#> [2026-06-15 13:02:32.685] [INFO ] [OK] Completed: harmonize_adducts [n_unique_before=436, n_unique_after=434] (76ms)
#> [2026-06-15 13:02:33.888] [INFO ] Importing spectra from: data/interim/libraries/spectra/is/isdbnormansusdat_14854025_pos.rds
#> [2026-06-15 13:02:36.777] [INFO ] Loaded 210419 spectra from file
#> [2026-06-15 13:02:36.925] [INFO ] Import complete: 210419 spectra ready for analysis
#> [2026-06-15 13:02:36.927] [INFO ] Importing spectra from: data/interim/libraries/spectra/is/wikidata_5607185_pos.rds
#> [2026-06-15 13:03:06.256] [INFO ] Loaded 994408 spectra from file
#> [2026-06-15 13:03:09.337] [INFO ] Import complete: 994408 spectra ready for analysis
#> [2026-06-15 13:03:09.339] [INFO ] Importing spectra from: data/interim/libraries/spectra/exp/enveda180_pos.rds
#> [2026-06-15 13:03:32.423] [INFO ] Loaded 891903 spectra from file
#> [2026-06-15 13:03:32.989] [INFO ] Import complete: 891903 spectra ready for analysis
#> [2026-06-15 13:03:32.991] [INFO ] Importing spectra from: data/interim/libraries/spectra/exp/internal_pos.rds
#> [2026-06-15 13:03:32.992] [INFO ] Loaded 1 spectra from file
#> [2026-06-15 13:03:32.994] [INFO ] Import complete: 0 spectra ready for analysis
#> [2026-06-15 13:03:32.995] [INFO ] Importing spectra from: data/interim/libraries/spectra/exp/gnps_11566051_pos.rds
#> [2026-06-15 13:03:39.602] [INFO ] Loaded 272264 spectra from file
#> [2026-06-15 13:03:39.756] [INFO ] Import complete: 272263 spectra ready for analysis
#> [2026-06-15 13:03:39.757] [INFO ] Importing spectra from: data/interim/libraries/spectra/exp/massbank_202510_pos.rds
#> [2026-06-15 13:03:40.333] [INFO ] Loaded 62855 spectra from file
#> [2026-06-15 13:03:40.373] [INFO ] Import complete: 62855 spectra ready for analysis
#> [2026-06-15 13:03:40.375] [INFO ] Importing spectra from: data/interim/libraries/spectra/exp/merlin_16984129_pos.rds
#> [2026-06-15 13:03:43.865] [INFO ] Loaded 336677 spectra from file
#> [2026-06-15 13:03:44.098] [INFO ] Import complete: 336677 spectra ready for analysis
#> [2026-06-15 13:04:00.939] [INFO ] 
#>              library spectra unique_structures Pct_spectra
#>      ISDB - Wikidata  994408            994393      35.92%
#>            enveda180  891903            176005      32.22%
#>               merlin  336677             42534      12.16%
#>                 gnps  272263             22882       9.83%
#>  ISDB - NormanSusDat  210419             87502       7.60%
#>             massbank   62855              7140       2.27%
#> [2026-06-15 13:04:01.811] [INFO ] > Starting: harmonize_adducts [n_rows=2768525]
#> [2026-06-15 13:04:02.814] [INFO ] [OK] Completed: harmonize_adducts [n_unique_before=127, n_unique_after=120] (1s)
#> [2026-06-15 13:04:10.035] [INFO ] > Starting: calculate_entropy_similarity [n_library=1478359, n_query=3999, method=gnps]
#> [2026-06-15 13:04:10.036] [INFO ] Calculating entropy and similarity for 3999 spectra
#> [2026-06-15 13:04:25.535] [INFO ] Processed 500 / 3999 queries
#> [2026-06-15 13:04:38.838] [INFO ] Processed 1000 / 3999 queries
#> [2026-06-15 13:04:50.768] [INFO ] Processed 1500 / 3999 queries
#> [2026-06-15 13:05:01.992] [INFO ] Processed 2000 / 3999 queries
#> [2026-06-15 13:05:12.230] [INFO ] Processed 2500 / 3999 queries
#> [2026-06-15 13:05:21.901] [INFO ] Processed 3000 / 3999 queries
#> [2026-06-15 13:05:31.775] [INFO ] Processed 3500 / 3999 queries
#> [2026-06-15 13:05:40.459] [INFO ] Processed 3999 / 3999 queries
#> [2026-06-15 13:05:40.503] [INFO ] [OK] Completed: calculate_entropy_similarity [n_comparisons=1918012] (1m 30s)
#> [2026-06-15 13:05:40.523] [INFO ] > Starting: harmonize_adducts [n_rows=1478359]
#> [2026-06-15 13:05:40.614] [INFO ] [OK] Completed: harmonize_adducts [n_unique_before=82, n_unique_after=77] (91ms)
#> [2026-06-15 13:05:41.881] [INFO ] > Starting: calculate_entropy_similarity [n_library=1478073, n_query=3962, method=gnps]
#> [2026-06-15 13:05:41.951] [INFO ] Calculating entropy and similarity for 3962 spectra
#> [2026-06-15 13:05:59.994] [INFO ] Processed 500 / 3962 queries
#> [2026-06-15 13:06:14.074] [INFO ] Processed 1000 / 3962 queries
#> [2026-06-15 13:06:27.582] [INFO ] Processed 1500 / 3962 queries
#> [2026-06-15 13:06:39.780] [INFO ] Processed 2000 / 3962 queries
#> [2026-06-15 13:06:50.286] [INFO ] Processed 2500 / 3962 queries
#> [2026-06-15 13:07:00.282] [INFO ] Processed 3000 / 3962 queries
#> [2026-06-15 13:07:09.903] [INFO ] Processed 3500 / 3962 queries
#> [2026-06-15 13:07:18.387] [INFO ] Processed 3962 / 3962 queries
#> [2026-06-15 13:07:18.438] [INFO ] [OK] Completed: calculate_entropy_similarity [n_comparisons=2161686] (1m 37s)
#> [2026-06-15 13:07:18.517] [INFO ] Similarity computation complete in 188.48 seconds
#> [2026-06-15 13:07:18.526] [INFO ] > Starting: harmonize_adducts [n_rows=1478359]
#> [2026-06-15 13:07:18.619] [INFO ] [OK] Completed: harmonize_adducts [n_unique_before=82, n_unique_after=77] (93ms)
#> [2026-06-15 13:08:10.719] [INFO ] Here is the distribution of annotation similarity scores (0.1 bins):
#> [2026-06-15 13:08:10.725] [INFO ] 
#>        bin       N    Pct
#>    [0,0.1] 1335399 88.64%
#>  (0.1,0.2]  104150  6.91%
#>  (0.2,0.3]   35394  2.35%
#>  (0.3,0.4]   15546  1.03%
#>  (0.4,0.5]    7830  0.52%
#>  (0.5,0.6]    3849  0.26%
#>  (0.6,0.7]    1479  0.10%
#>  (0.7,0.8]    1023  0.07%
#>  (0.8,0.9]    1268  0.08%
#>    (0.9,1]     672  0.04%
#> [2026-06-15 13:08:10.882] [INFO ] 599379 Candidates annotated on 3866 features (threshold >= 0).
#> [2026-06-15 13:08:10.896] [INFO ] Exporting parameters to: data/interim/params/260615_130810_annotate_spectra.yaml
#> [2026-06-15 13:08:10.899] [INFO ] > Starting: export_output [file=data/interim/annotations/example_spectralMatches_pos.tsv.gz, n_rows=1506610]
#> [2026-06-15 13:08:16.456] [INFO ] [OK] Completed: export_output [size_bytes=86780668] (5.6s)
#> [2026-06-15 13:08:16.458] [INFO ] Exported annotations: example_spectralMatches_pos.tsv.gz (1,506,610 rows)
#> ✔ ann_spe_pos completed [5m 59s, 86.78 MB]
#> + fea_edg_pre dispatched
#> [2026-06-15 13:08:18.960] [INFO ] > Starting: prepare_features_edges [n_edge_types=2]
#> [2026-06-15 13:08:19.886] [INFO ] [OK] Completed: prepare_features_edges [n_edges=12911] (926ms)
#> [2026-06-15 13:08:19.915] [INFO ] Exporting parameters to: data/interim/params/260615_130819_prepare_features_edges.yaml
#> [2026-06-15 13:08:19.918] [INFO ] > Starting: export_output [file=data/interim/features/example_edges.tsv, n_rows=12911]
#> [2026-06-15 13:08:19.927] [INFO ] [OK] Completed: export_output [size_bytes=569417] (9ms)
#> ✔ fea_edg_pre completed [992ms, 569.42 kB]
#> + ann_spe_pre dispatched
#> [2026-06-15 13:08:20.490] [INFO ] Preparing spectral matching annotations from 2 file(s)
#> [2026-06-15 13:08:37.267] [INFO ] > Starting: process_smiles [n_structures=599379]
#> [2026-06-15 13:08:37.269] [INFO ] Processing SMILES with RDKit
#> [2026-06-15 13:08:50.679] [INFO ] Processing 208 new SMILES with RDKit
#> [2026-06-15 13:08:50.684] [INFO ] Starting SMILES processing pipeline
#> [2026-06-15 13:08:50.694] [INFO ] Input: /tmp/RtmppBog3o/file270d74a1811c.smi
#> [2026-06-15 13:08:50.694] [INFO ] Output: /tmp/RtmppBog3o/file270d4d36032c.csv.gz
#> [2026-06-15 13:08:50.699] [INFO ] Input file validated: /tmp/RtmppBog3o/file270d74a1811c.smi
#> [2026-06-15 13:08:50.700] [INFO ] Output file validated: /tmp/RtmppBog3o/file270d4d36032c.csv.gz
#> [2026-06-15 13:08:50.700] [INFO ] Processing parameters: workers=8, batch_size=1000, progress_interval=10000
#> [2026-06-15 13:08:50.708] [INFO ] SMILES supplier initialized
#> [2026-06-15 13:08:51.162] [INFO ] Processing complete. Total molecules processed: 208
#> [2026-06-15 13:08:51.198] [INFO ] Successfully processed 208 SMILES
#> [2026-06-15 13:08:56.824] [INFO ] [OK] Completed: process_smiles [n_processed=600652] (19.6s)
#> [2026-06-15 13:09:03.802] [INFO ] > Starting: complement_metadata [n_input=1506610]
#> [2026-06-15 13:09:53.773] [INFO ] [OK] Completed: complement_metadata [n_enriched=1506610] (50s)
#> [2026-06-15 13:09:53.792] [INFO ] Exporting parameters to: data/interim/params/260615_130953_prepare_annotations_spectra.yaml
#> [2026-06-15 13:09:53.793] [INFO ] > Starting: export_output [file=data/interim/annotations/example_spectralMatchesPrepared.tsv.gz, n_rows=1506610]
#> [2026-06-15 13:10:01.443] [INFO ] [OK] Completed: export_output [size_bytes=151641991] (7.6s)
#> ✔ ann_spe_pre completed [1m 41s, 151.64 MB]
#> + fea_com dispatched
#> [2026-06-15 13:10:04.567] [INFO ] > Starting: create_components [n_input_files=1]
#> [2026-06-15 13:10:04.568] [INFO ] Creating components from 1 edge file(s)
#> [2026-06-15 13:10:04.581] [INFO ] Loaded 12755 edges connecting 5328 unique features
#> [2026-06-15 13:10:04.599] [INFO ] Found 2867 components
#> [2026-06-15 13:10:04.613] [INFO ] Component sizes - Min: 1, Max: 1355, Mean: 1.9
#> [2026-06-15 13:10:04.629] [INFO ] Exporting parameters to: data/interim/params/260615_131004_create_components.yaml
#> [2026-06-15 13:10:04.631] [INFO ] > Starting: export_output [file=data/interim/features/example_components.tsv, n_rows=5328]
#> [2026-06-15 13:10:04.633] [INFO ] [OK] Completed: export_output [size_bytes=48861] (1ms)
#> [2026-06-15 13:10:04.633] [INFO ] Components written to: data/interim/features/example_components.tsv
#> [2026-06-15 13:10:04.635] [INFO ] [OK] Completed: create_components [n_components=2867, n_features=5328] (68ms)
#> ✔ fea_com completed [71ms, 48.86 kB]
#> + ann_fil dispatched
#> [2026-06-15 13:10:05.200] [INFO ] > Starting: filter_annotations [n_annotation_files=6, tolerance_rt=Inf]
#> [2026-06-15 13:10:05.201] [INFO ] Filtering annotations
#> [2026-06-15 13:10:05.240] [INFO ] Processing 5328 unique features for annotation filtering
#> [2026-06-15 13:10:13.852] [INFO ] Removing MS1 annotations superseded by quality spectral matches
#> [2026-06-15 13:10:15.091] [INFO ] Removed 13916 redundant MS1 annotations
#> [2026-06-15 13:10:15.093] [INFO ] Total annotations after MS1 deduplication: 1950482
#> [2026-06-15 13:10:24.329] [INFO ] Removed 620560 non-MS1 annotation row(s) incompatible with strong MS1 adduct assignments
#> [2026-06-15 13:10:27.024] [INFO ] Adduct-semantics filter: before=1950503, removed_total=620560, removed_spectral_mismatch=0, after=1329943
#> [2026-06-15 13:10:27.054] [INFO ] Joining RT library and computing RT deltas
#> [2026-06-15 13:10:28.627] [INFO ] Removed 18737 duplicate RT library matches (keeping best match per annotation)
#> [2026-06-15 13:10:28.632] [INFO ] RT deltas computed for 0 annotations (no hard cutoff applied; scoring handles RT penalty)
#> [2026-06-15 13:10:28.634] [INFO ] Removed 18737 duplicate RT library matches during join
#> [2026-06-15 13:10:29.064] [INFO ] Exporting parameters to: data/interim/params/260615_131029_filter_annotations.yaml
#> [2026-06-15 13:10:29.066] [INFO ] > Starting: export_output [file=data/interim/annotations/example_annotationsFiltered.tsv.gz, n_rows=1311206]
#> [2026-06-15 13:10:35.627] [INFO ] [OK] Completed: export_output [size_bytes=120537997] (6.6s)
#> [2026-06-15 13:10:35.628] [INFO ] [OK] Completed: filter_annotations [n_filtered=1311206] (30.4s)
#> ✔ ann_fil completed [30.4s, 120.54 MB]
#> + fea_com_pre dispatched
#> [2026-06-15 13:10:37.483] [INFO ] > Starting: prepare_features_components [n_files=1]
#> [2026-06-15 13:10:37.488] [INFO ] [OK] Completed: prepare_features_components [n_assignments=5328] (5ms)
#> [2026-06-15 13:10:37.505] [INFO ] Exporting parameters to: data/interim/params/260615_131037_prepare_features_components.yaml
#> [2026-06-15 13:10:37.507] [INFO ] > Starting: export_output [file=data/interim/features/example_componentsPrepared.tsv, n_rows=5328]
#> [2026-06-15 13:10:37.509] [INFO ] [OK] Completed: export_output [size_bytes=48856] (2ms)
#> ✔ fea_com_pre completed [30ms, 48.86 kB]
#> + ann_wei dispatched
#> [2026-06-15 13:10:37.880] [INFO ] Starting annotation weighting and scoring
#> [2026-06-15 13:10:37.881] [INFO ] > Starting: weight_annotations [n_candidates_neighbors=16, n_candidates_final=1]
#> [2026-06-15 13:10:57.849] [INFO ] 
#>    candidate_library      n    Pct
#>      ISDB - Wikidata 589247 44.95%
#>             TIMA MS1 422753 32.25%
#>            enveda180 190376 14.52%
#>  ISDB - NormanSusDat  41488  3.16%
#>               merlin  39637  3.02%
#>                 gnps  20558  1.57%
#>             massbank   4906  0.37%
#>               SIRIUS   2027  0.15%
#> [2026-06-15 13:11:08.394] [INFO ] > Starting: weight_bio [n_annotations=1143237, n_sop=1334044]
#> [2026-06-15 13:11:08.396] [INFO ] Weighting 1143237 annotations by biological source
#> [2026-06-15 13:11:19.483] [INFO ] [OK] Completed: weight_bio [n_weighted=1143237] (11.1s)
#> [2026-06-15 13:11:19.485] [INFO ] > Starting: decorate_bio [n_annotations=1143237]
#> [2026-06-15 13:11:19.926] [INFO ] Taxonomically informed metabolite annotation reranked:
#>     Kingdom  level: 172311 candidates (50795 unique)
#>     Phylum   level: 170798 candidates (50207 unique)
#>     Class    level: 136993 candidates (43407 unique)
#>     Order    level: 30550 candidates (10895 unique)
#>     Family   level: 24836 candidates (8617 unique)
#>     Tribe    level: 4873 candidates (1406 unique)
#>     Genus    level: 4060 candidates (1104 unique)
#>     Species  level: 2221 candidates (550 unique)
#>     Variety  level: 345 candidates (109 unique)
#>     Biota    level: 345 candidates (109 unique)
#> [2026-06-15 13:11:19.927] [INFO ] [OK] Completed: decorate_bio [n_processed=1143237] (442ms)
#> [2026-06-15 13:11:19.928] [INFO ] > Starting: clean_bio [n_annotations=1143237, minimal_consistency=0]
#> [2026-06-15 13:11:51.812] [INFO ] [OK] Completed: clean_bio [n_cleaned=1143237] (31.9s)
#> [2026-06-15 13:11:51.813] [INFO ] > Starting: weight_chemo [n_input=1143237]
#> [2026-06-15 13:11:51.815] [INFO ] Weighting 1143237 annotations by chemical consistency
#> [2026-06-15 13:11:55.828] [INFO ] [OK] Completed: weight_chemo [n_weighted=1143237] (4s)
#> [2026-06-15 13:11:55.830] [INFO ] > Starting: decorate_chemo [n_annotations=1143237]
#> [2026-06-15 13:11:59.899] [INFO ] Chemically informed metabolite annotation reranked:
#>   Classyfire:
#>     Kingdom level:    88881 candidates (62795 unique)
#>     Superclass level: 60898 candidates (40707 unique)
#>     Class level:      44680 candidates (27960 unique)
#>     Parent level:     24679 candidates (14739 unique)
#>   NPClassifier:
#>     Pathway level:    128689 candidates (90269 unique)
#>     Superclass level: 70376 candidates (49895 unique)
#>     Class level:      37360 candidates (26337 unique)
#> [2026-06-15 13:11:59.900] [INFO ] [OK] Completed: decorate_chemo [n_processed=1143237] (4.1s)
#> [2026-06-15 13:11:59.932] [INFO ] > Starting: clean_chemo [n_annotations=1143237, candidates_final=1, high_evidence=FALSE]
#> [2026-06-15 13:12:35.254] [INFO ] Sampling candidates for 3991 features with more than 7 candidates per score
#> [2026-06-15 13:12:35.256] [INFO ] > Starting: filter_high_evidence [n_input=728656, context=filtered]
#> [2026-06-15 13:12:35.304] [INFO ] [filtered]  Removed 726577 low-evidence candidates (99.7% of 728656 total)
#> [2026-06-15 13:12:35.305] [INFO ] [filtered]  2079 high-evidence candidates remaining (0.3%)
#> [2026-06-15 13:12:35.306] [INFO ] [OK] Completed: filter_high_evidence [n_filtered=2079, n_removed=726577] (51ms)
#> [2026-06-15 13:12:35.396] [INFO ] Summarizing annotation results
#> [2026-06-15 13:12:35.566] [INFO ] Annotated features: 792/5328 (14.9%)
#> [2026-06-15 13:12:40.331] [INFO ] Summarizing annotation results
#> [2026-06-15 13:12:52.477] [INFO ] Annotated features: 5131/5328 (96.3%)
#> [2026-06-15 13:12:55.650] [INFO ] [OK] Completed: clean_chemo [n_final_full=728667, n_final_filtered=9585, n_final_mini=9585, n_features=5328] (55.7s)
#> [2026-06-15 13:12:55.651] [INFO ] [OK] Completed: weight_annotations [n_annotations=NULL] (2m 18s)
#> [2026-06-15 13:12:55.676] [INFO ] Exporting parameters to: data/processed/20260615_131255_example/260615_131255_prepare_params.yaml
#> [2026-06-15 13:12:55.700] [INFO ] Exporting parameters to: data/processed/20260615_131255_example/260615_131255_prepare_params_advanced.yaml
#> [2026-06-15 13:12:55.703] [INFO ] > Starting: export_output [file=data/processed/20260615_131255_example/example_results_mini.tsv, n_rows=9585]
#> [2026-06-15 13:12:55.708] [INFO ] [OK] Completed: export_output [size_bytes=1737058] (5ms)
#> [2026-06-15 13:12:55.710] [INFO ] > Starting: export_output [file=data/processed/20260615_131255_example/example_results_filtered.tsv, n_rows=9585]
#> [2026-06-15 13:12:55.720] [INFO ] [OK] Completed: export_output [size_bytes=2501991] (10ms)
#> [2026-06-15 13:12:55.722] [INFO ] > Starting: export_output [file=data/processed/20260615_131255_example/example_results.tsv, n_rows=728667]
#> [2026-06-15 13:12:56.871] [INFO ] [OK] Completed: export_output [size_bytes=402200240] (1.1s)
#> [2026-06-15 13:12:56.872] [INFO ] Results exported: example_results.tsv
#> ✔ ann_wei completed [2m 19s, 404.70 MB]
#> + exp_mzt dispatched
#> [2026-06-15 13:12:58.675] [INFO ] > Starting: write_mztab [input=example_results_filtered.tsv, output=example_results.mztab]
#> [2026-06-15 13:13:04.593] [INFO ] [OK] Completed: write_mztab [n_sml=2297, n_smf=5328, n_sme=5052] (5.9s)
#> ✔ exp_mzt completed [6s, 4.02 MB]
#> ✔ ended pipeline [27m 28.9s, 154 completed, 0 skipped]
#> There were 50 or more warnings (use warnings() to see the first 50)

The final exported file is formatted for easy import into Cytoscape, where you can explore your data further!
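Before moving to Cytoscape, it can help to take a quick look at the exported table from R. The following is a minimal sketch, not part of TIMA itself: `find_latest_results()` is a hypothetical helper, and the default directory (`data/processed`) and file suffix (`_results_mini.tsv`) are assumptions taken from the file names in the log above. The smallest export is used purely for illustration; adjust the pattern to load the filtered or full table instead.

```r
# Hypothetical helper (not a TIMA function): return the most recently
# modified results file below `dir`, or NA if none is found.
# The default directory and pattern are assumptions based on the log above.
find_latest_results <- function(dir = "data/processed",
                                pattern = "_results_mini\\.tsv$") {
  files <- list.files(
    path = dir,
    pattern = pattern,
    recursive = TRUE,
    full.names = TRUE
  )
  if (length(files) == 0L) {
    return(NA_character_)
  }
  # Keep the file with the latest modification time
  files[which.max(as.numeric(file.mtime(files)))]
}

path <- find_latest_results()

if (!is.na(path)) {
  # Read the tab-separated export without altering the column names
  results <- utils::read.delim(
    file = path,
    quote = "",
    check.names = FALSE,
    stringsAsFactors = FALSE
  )
  # Inspect the size and the available columns before importing elsewhere
  print(dim(results))
  print(names(results))
} else {
  message("No results file found; run tima::run_tima() first.")
}
```

Printing the dimensions and column names first is a cheap way to confirm that the export matches the row counts reported in the log before loading it into another tool.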

We hope you enjoyed using TIMA and would be pleased to hear from you!

For any remarks or suggestions, please file an issue or feel free to contact us directly.

Citation

BibTeX citation:
@online{rutz2026,
  author = {Rutz, Adriano},
  title = {3 {Performing} {Taxonomically} {Informed} {Metabolite}
    {Annotation}},
  date = {2026-06-15},
  url = {https://taxonomicallyinformedannotation.github.io/tima/vignettes/articles/III-processing.html},
  langid = {en}
}
For attribution, please cite this work as:
Rutz, Adriano. 2026. “3 Performing Taxonomically Informed Metabolite Annotation.” June 15. https://taxonomicallyinformedannotation.github.io/tima/vignettes/articles/III-processing.html.