tima::tima_full()
#> + par_def_ann_spe dispatched
#> ✔ par_def_ann_spe completed [28ms, 2.14 kB]
#> + par_def_wei_ann dispatched
#> ✔ par_def_wei_ann completed [1ms, 5.27 kB]
#> + par_def_pre_ann_gnp dispatched
#> ✔ par_def_pre_ann_gnp completed [0ms, 1.42 kB]
#> + par_def_pre_lib_sop_mer dispatched
#> ✔ par_def_pre_lib_sop_mer completed [1ms, 3.40 kB]
#> + yaml_paths dispatched
#> ✔ yaml_paths completed [1ms, 11.52 kB]
#> + par_def_pre_lib_sop_lot dispatched
#> ✔ par_def_pre_lib_sop_lot completed [1ms, 494 B]
#> + par_def_ann_mas dispatched
#> ✔ par_def_ann_mas completed [1ms, 6.09 kB]
#> + par_def_pre_lib_sop_hmd dispatched
#> ✔ par_def_pre_lib_sop_hmd completed [0ms, 492 B]
#> + par_def_fil_ann dispatched
#> ✔ par_def_fil_ann completed [1ms, 1.34 kB]
#> + par_def_pre_lib_sop_clo dispatched
#> ✔ par_def_pre_lib_sop_clo completed [1ms, 523 B]
#> + par_def_pre_lib_spe dispatched
#> ✔ par_def_pre_lib_spe completed [1ms, 1.57 kB]
#> + par_def_pre_fea_com dispatched
#> ✔ par_def_pre_fea_com completed [0ms, 358 B]
#> + par_def_cre_com dispatched
#> ✔ par_def_cre_com completed [1ms, 375 B]
#> + par_def_cre_edg_spe dispatched
#> ✔ par_def_cre_edg_spe completed [1ms, 1.42 kB]
#> + par_def_pre_fea_edg dispatched
#> ✔ par_def_pre_fea_edg completed [1ms, 706 B]
#> + par_def_pre_fea_tab dispatched
#> ✔ par_def_pre_fea_tab completed [1ms, 860 B]
#> + par_def_pre_lib_rt dispatched
#> ✔ par_def_pre_lib_rt completed [1ms, 2.20 kB]
#> + par_def_pre_ann_spe dispatched
#> ✔ par_def_pre_ann_spe completed [0ms, 1.46 kB]
#> + par_def_pre_ann_sir dispatched
#> ✔ par_def_pre_ann_sir completed [1ms, 1.93 kB]
#> + par_def_pre_tax dispatched
#> ✔ par_def_pre_tax completed [1ms, 1.51 kB]
#> + par_def_pre_lib_sop_ecm dispatched
#> ✔ par_def_pre_lib_sop_ecm completed [1ms, 492 B]
#> + paths dispatched
#> ✔ paths completed [1ms, 2.52 kB]
#> + lib_spe_exp_gnp_pre_sop dispatched
#> [2025-11-20 12:11:00.373 ] [INFO ] Downloading file from: https://github.com/Adafede/SpectRalLibRaRies/raw/main/data/interim/libraries/sop/gnps_11566051_prepared.tsv.gz
#> [2025-11-20 12:11:00.889 ] [INFO ] Successfully downloaded 1.35 MB to: data/interim/libraries/sop/gnps_11566051_prepared.tsv.gz
#> ✔ lib_spe_exp_gnp_pre_sop completed [517ms, 1.42 MB]
#> + lib_spe_exp_mb_pre_sop dispatched
#> [2025-11-20 12:11:01.060 ] [INFO ] Downloading file from: https://github.com/Adafede/SpectRalLibRaRies/raw/main/data/interim/libraries/sop/massbank_2025051_prepared.tsv.gz
#> [2025-11-20 12:11:01.209 ] [INFO ] Successfully downloaded 0.46 MB to: data/interim/libraries/sop/massbank_2025051_prepared.tsv.gz
#> ✔ lib_spe_exp_mb_pre_sop completed [151ms, 480.97 kB]
#> + lib_spe_exp_mer_pre_sop dispatched
#> [2025-11-20 12:11:01.375 ] [INFO ] Downloading file from: https://github.com/Adafede/SpectRalLibRaRies/raw/main/data/interim/libraries/sop/merlin_13911806_prepared.tsv.gz
#> [2025-11-20 12:11:01.596 ] [INFO ] Successfully downloaded 1.14 MB to: data/interim/libraries/sop/merlin_13911806_prepared.tsv.gz
#> ✔ lib_spe_exp_mer_pre_sop completed [222ms, 1.19 MB]
#> + lib_spe_is_wik_pre_sop dispatched
#> [2025-11-20 12:11:01.766 ] [INFO ] Downloading file from: https://github.com/taxonomicallyinformedannotation/tima-example-files/raw/main/wikidata_spectral_5607185_prepared.tsv.gz
#> [2025-11-20 12:11:02.082 ] [INFO ] Successfully downloaded 36.15 MB to: data/interim/libraries/sop/wikidata_5607185_prepared.tsv.gz
#> ✔ lib_spe_is_wik_pre_sop completed [318ms, 37.90 MB]
#> + lib_spe_exp_mb_pre_pos dispatched
#> [2025-11-20 12:11:02.265 ] [INFO ] Downloading file from: https://github.com/adafede/SpectRalLibRaRies/raw/main/data/interim/libraries/spectra/exp/massbank_2025051_pos.rds
#> [2025-11-20 12:11:02.970 ] [INFO ] Successfully downloaded 18.51 MB to: data/interim/libraries/spectra/exp/massbank_2025051_pos.rds
#> ✔ lib_spe_exp_mb_pre_pos completed [707ms, 19.41 MB]
#> + par_pre_par dispatched
#> ✔ par_pre_par completed [0ms, 1.38 kB]
#> + lib_spe_exp_mer_pre_neg dispatched
#> [2025-11-20 12:11:03.314 ] [INFO ] Downloading file from: https://github.com/adafede/SpectRalLibRaRies/raw/main/data/interim/libraries/spectra/exp/merlin_13911806_neg.rds
#> [2025-11-20 12:11:04.150 ] [INFO ] Successfully downloaded 30.08 MB to: data/interim/libraries/spectra/exp/merlin_13911806_neg.rds
#> ✔ lib_spe_exp_mer_pre_neg completed [838ms, 31.54 MB]
#> + lib_spe_is_wik_pre_neg dispatched
#> [2025-11-20 12:11:04.336 ] [INFO ] Downloading file from: https://github.com/taxonomicallyinformedannotation/tima-isdb-neg/raw/main/wikidata_5607185_neg.rds
#> [2025-11-20 12:11:14.223 ] [INFO ] Successfully downloaded 655.49 MB to: data/interim/libraries/spectra/is/wikidata_5607185_neg.rds
#> ✔ lib_spe_is_wik_pre_neg completed [9.9s, 687.33 MB]
#> + par_pre_par2 dispatched
#> ✔ par_pre_par2 completed [0ms, 21.49 kB]
#> + lib_spe_is_wik_pre_pos dispatched
#> [2025-11-20 12:11:14.806 ] [INFO ] Downloading file from: https://github.com/taxonomicallyinformedannotation/tima-isdb-pos/raw/main/wikidata_5607185_pos.rds
#> [2025-11-20 12:11:27.034 ] [INFO ] Successfully downloaded 823.93 MB to: data/interim/libraries/spectra/is/wikidata_5607185_pos.rds
#> ✔ lib_spe_is_wik_pre_pos completed [12.2s, 863.95 MB]
#> + lib_sop_lot dispatched
#> [2025-11-20 12:11:27.514 ] [INFO ] Retrieving latest version from Zenodo: 10.5281/zenodo.5794106
#> [2025-11-20 12:11:29.905 ] [INFO ] Downloading 230106_frozen_metadata.csv.gz from https://doi.org/10.5281/zenodo.5794106 (The LOTUS Initiative for Open Natural Products Research: frozen dataset union wikidata (with metadata))
#> [2025-11-20 12:11:29.906 ] [INFO ] Downloading file from: https://zenodo.org/records/7534071/files/230106_frozen_metadata.csv.gz
#> [2025-11-20 12:12:06.386 ] [INFO ] Successfully downloaded 88.67 MB to: data/source/libraries/sop/lotus.csv.gz
#> ✔ lib_sop_lot completed [38.9s, 92.98 MB]
#> + lib_sop_hmd dispatched
#> [2025-11-20 12:12:06.595 ] [INFO ] Downloading file from: https://hmdb.ca/system/downloads/current/structures.zip
#> [2025-11-20 12:12:09.724 ] [INFO ] Successfully downloaded 92.01 MB to: data/source/libraries/sop/hmdb/structures.zip
#> ✔ lib_sop_hmd completed [3.1s, 96.48 MB]
#> + lib_spe_exp_gnp_pre_neg dispatched
#> [2025-11-20 12:12:09.932 ] [INFO ] Downloading file from: https://github.com/adafede/SpectRalLibRaRies/raw/main/data/interim/libraries/spectra/exp/gnps_11566051_neg.rds
#> [2025-11-20 12:12:12.073 ] [INFO ] Successfully downloaded 146.98 MB to: data/interim/libraries/spectra/exp/gnps_11566051_neg.rds
#> ✔ lib_spe_exp_gnp_pre_neg completed [2.1s, 154.12 MB]
#> + lib_spe_exp_mer_pre_pos dispatched
#> [2025-11-20 12:12:12.308 ] [INFO ] Downloading file from: https://github.com/adafede/SpectRalLibRaRies/raw/main/data/interim/libraries/spectra/exp/merlin_13911806_pos.rds
#> [2025-11-20 12:12:13.836 ] [INFO ] Successfully downloaded 81 MB to: data/interim/libraries/spectra/exp/merlin_13911806_pos.rds
#> ✔ lib_spe_exp_mer_pre_pos completed [1.5s, 84.94 MB]
#> + lib_sop_ecm dispatched
#> [2025-11-20 12:12:14.051 ] [INFO ] Downloading file from: https://ecmdb.ca/download/ecmdb.json.zip
#> [2025-11-20 12:12:15.270 ] [INFO ] Successfully downloaded 1.27 MB to: data/source/libraries/sop/ecmdb.json.zip
#> ✔ lib_sop_ecm completed [1.2s, 1.33 MB]
#> + lib_spe_exp_mb_pre_neg dispatched
#> [2025-11-20 12:12:15.442 ] [INFO ] Downloading file from: https://github.com/adafede/SpectRalLibRaRies/raw/main/data/interim/libraries/spectra/exp/massbank_2025051_neg.rds
#> [2025-11-20 12:12:15.984 ] [INFO ] Successfully downloaded 6.73 MB to: data/interim/libraries/spectra/exp/massbank_2025051_neg.rds
#> ✔ lib_spe_exp_mb_pre_neg completed [543ms, 7.06 MB]
#> + lib_spe_exp_gnp_pre_pos dispatched
#> [2025-11-20 12:12:16.162 ] [INFO ] Downloading file from: https://github.com/adafede/SpectRalLibRaRies/raw/main/data/interim/libraries/spectra/exp/gnps_11566051_pos.rds
#> [2025-11-20 12:12:22.360 ] [INFO ] Successfully downloaded 458.98 MB to: data/interim/libraries/spectra/exp/gnps_11566051_pos.rds
#> ✔ lib_spe_exp_gnp_pre_pos completed [6.2s, 481.27 MB]
#> + par_fin_par dispatched
#> ✔ par_fin_par completed [1ms, 307 B]
#> + par_fin_par2 dispatched
#> ✔ par_fin_par2 completed [2ms, 2.99 kB]
#> + par_usr_pre_lib_sop_mer dispatched
#> ✔ par_usr_pre_lib_sop_mer completed [1.4s, 1.55 kB]
#> + par_usr_pre_lib_sop_lot dispatched
#> ✔ par_usr_pre_lib_sop_lot completed [1.5s, 174 B]
#> + par_usr_pre_tax dispatched
#> ✔ par_usr_pre_tax completed [1.4s, 438 B]
#> + par_usr_pre_ann_gnp dispatched
#> ✔ par_usr_pre_ann_gnp completed [1.3s, 708 B]
#> + par_usr_pre_lib_sop_hmd dispatched
#> ✔ par_usr_pre_lib_sop_hmd completed [1.4s, 178 B]
#> + par_usr_cre_com dispatched
#> ✔ par_usr_cre_com completed [1.4s, 200 B]
#> + par_usr_pre_lib_sop_clo dispatched
#> ✔ par_usr_pre_lib_sop_clo completed [1.4s, 205 B]
#> + par_usr_cre_edg_spe dispatched
#> ✔ par_usr_cre_edg_spe completed [1.4s, 452 B]
#> + par_usr_pre_fea_com dispatched
#> ✔ par_usr_pre_fea_com completed [1.4s, 200 B]
#> + par_usr_pre_fea_edg dispatched
#> ✔ par_usr_pre_fea_edg completed [1.4s, 328 B]
#> + par_usr_pre_lib_sop_ecm dispatched
#> ✔ par_usr_pre_lib_sop_ecm completed [1.4s, 176 B]
#> + par_usr_fil_ann dispatched
#> ✔ par_usr_fil_ann completed [1.3s, 668 B]
#> + par_usr_pre_fea_tab dispatched
#> ✔ par_usr_pre_fea_tab completed [1.4s, 274 B]
#> + par_usr_pre_lib_rt dispatched
#> ✔ par_usr_pre_lib_rt completed [1.4s, 487 B]
#> + par_usr_ann_spe dispatched
#> ✔ par_usr_ann_spe completed [1.4s, 1.03 kB]
#> + par_usr_pre_ann_spe dispatched
#> ✔ par_usr_pre_ann_spe completed [1.4s, 731 B]
#> + par_usr_pre_lib_spe dispatched
#> ✔ par_usr_pre_lib_spe completed [1.3s, 298 B]
#> + par_usr_pre_ann_sir dispatched
#> ✔ par_usr_pre_ann_sir completed [1.4s, 900 B]
#> + par_usr_ann_mas dispatched
#> ✔ par_usr_ann_mas completed [1.4s, 2.68 kB]
#> + par_usr_wei_ann dispatched
#> ✔ par_usr_wei_ann completed [1.3s, 1.78 kB]
#> + par_pre_lib_sop_mer dispatched
#> ✔ par_pre_lib_sop_mer completed [1ms, 558 B]
#> + par_pre_lib_sop_lot dispatched
#> ✔ par_pre_lib_sop_lot completed [1ms, 186 B]
#> + par_pre_tax dispatched
#> ✔ par_pre_tax completed [1ms, 330 B]
#> + par_pre_ann_gnp dispatched
#> ✔ par_pre_ann_gnp completed [0ms, 336 B]
#> + par_pre_lib_sop_hmd dispatched
#> ✔ par_pre_lib_sop_hmd completed [1ms, 191 B]
#> + par_cre_com dispatched
#> ✔ par_cre_com completed [1ms, 191 B]
#> + par_pre_lib_sop_clo dispatched
#> ✔ par_pre_lib_sop_clo completed [0ms, 213 B]
#> + par_cre_edg_spe dispatched
#> ✔ par_cre_edg_spe completed [1ms, 389 B]
#> + par_pre_fea_com dispatched
#> ✔ par_pre_fea_com completed [1ms, 184 B]
#> + par_pre_fea_edg dispatched
#> ✔ par_pre_fea_edg completed [0ms, 244 B]
#> + par_pre_lib_sop_ecm dispatched
#> ✔ par_pre_lib_sop_ecm completed [1ms, 191 B]
#> + par_fil_ann dispatched
#> ✔ par_fil_ann completed [1ms, 346 B]
#> + par_pre_fea_tab dispatched
#> ✔ par_pre_fea_tab completed [1ms, 278 B]
#> + par_pre_lib_rt dispatched
#> ✔ par_pre_lib_rt completed [0ms, 375 B]
#> + par_ann_spe dispatched
#> ✔ par_ann_spe completed [1ms, 497 B]
#> + par_pre_ann_spe dispatched
#> ✔ par_pre_ann_spe completed [1ms, 334 B]
#> + par_pre_lib_spe dispatched
#> ✔ par_pre_lib_spe completed [0ms, 404 B]
#> + par_pre_ann_sir dispatched
#> ✔ par_pre_ann_sir completed [1ms, 405 B]
#> + par_ann_mas dispatched
#> ✔ par_ann_mas completed [1ms, 1.13 kB]
#> + par_wei_ann dispatched
#> ✔ par_wei_ann completed [1ms, 952 B]
#> + lib_sop_mer_str_pro dispatched
#> [2025-11-20 12:12:57.673 ] [INFO ] Downloading file from: https://github.com/taxonomicallyinformedannotation/tima-example-files/raw/main/processed.csv.gz
#> [2025-11-20 12:12:58.248 ] [INFO ] Successfully downloaded 86.32 MB to: data/interim/libraries/sop/merged/structures/processed.csv.gz
#> ✔ lib_sop_mer_str_pro completed [576ms, 90.51 MB]
#> + lib_sop_lot_pre dispatched
#> [2025-11-20 12:12:58.459 ] [INFO ] Loading LOTUS database from: data/source/libraries/sop/lotus.csv.gz
#> [2025-11-20 12:13:09.361 ] [INFO ] Prepared 791809 unique structure-organism pairs from LOTUS
#> [2025-11-20 12:13:09.363 ] [INFO ] Exporting data to: data/interim/libraries/sop/lotus_prepared.tsv.gz
#> [2025-11-20 12:13:12.840 ] [INFO ] Successfully exported 791809 rows to data/interim/libraries/sop/lotus_prepared.tsv.gz
#> ✔ lib_sop_lot_pre completed [14.4s, 46.52 MB]
#> + lib_sop_hmd_pre dispatched
#> [2025-11-20 12:13:13.217 ] [INFO ] Preparing HMDB structure-organism pairs
#> [2025-11-20 12:13:57.468 ] [INFO ] Exporting data to: data/interim/libraries/sop/hmdb_prepared.tsv.gz
#> [2025-11-20 12:13:58.071 ] [INFO ] Successfully exported 217776 rows to data/interim/libraries/sop/hmdb_prepared.tsv.gz
#> ✔ lib_sop_hmd_pre completed [44.9s, 8.06 MB]
#> + lib_sop_clo_pre dispatched
#> [2025-11-20 12:13:58.622 ] [INFO ] Preparing closed structure-organism pairs library
#> [2025-11-20 12:13:58.626 ] [WARN ] Closed resource not accessible at: ~/Git/lotus-processor/data/processed/240412_closed_metadata.csv.gz. Returning empty template instead.
#> [2025-11-20 12:13:58.642 ] [INFO ] Exporting parameters to: data/interim/params/251120_121358_prepare_libraries_sop_closed.yaml
#> [2025-11-20 12:13:58.643 ] [INFO ] Exporting data to: data/interim/libraries/sop/closed_prepared.tsv.gz
#> [2025-11-20 12:13:58.644 ] [INFO ] Successfully exported 1 rows to data/interim/libraries/sop/closed_prepared.tsv.gz
#> ✔ lib_sop_clo_pre completed [23ms, 273 B]
#> + lib_sop_ecm_pre dispatched
#> [2025-11-20 12:13:58.869 ] [INFO ] Preparing ECMDB structure-organism pairs
#> [2025-11-20 12:13:59.496 ] [INFO ] Exporting parameters to: data/interim/params/251120_121359_prepare_libraries_sop_ecmdb.yaml
#> [2025-11-20 12:13:59.497 ] [INFO ] Exporting data to: data/interim/libraries/sop/ecmdb_prepared.tsv.gz
#> [2025-11-20 12:13:59.512 ] [INFO ] Successfully exported 3760 rows to data/interim/libraries/sop/ecmdb_prepared.tsv.gz
#> ✔ lib_sop_ecm_pre completed [645ms, 177.47 kB]
#> + par_pre_fea_tab_fil_fea_raw dispatched
#> ✔ par_pre_fea_tab_fil_fea_raw completed [0ms, 451.55 kB]
#> + lib_rt dispatched
#> [2025-11-20 12:13:59.916 ] [INFO ] Preparing retention time libraries
#> [2025-11-20 12:13:59.927 ] [WARN ] No retention time library found, returning empty retention time and sop tables.
#> [2025-11-20 12:13:59.969 ] [INFO ] Exporting parameters to: data/interim/params/251120_121359_prepare_libraries_rt.yaml
#> [2025-11-20 12:13:59.970 ] [INFO ] Exporting data to: data/interim/libraries/rt/prepared.tsv.gz
#> [2025-11-20 12:13:59.971 ] [INFO ] Successfully exported 1 rows to data/interim/libraries/rt/prepared.tsv.gz
#> [2025-11-20 12:13:59.973 ] [INFO ] Exporting data to: data/interim/libraries/sop/rt_prepared.tsv.gz
#> [2025-11-20 12:13:59.974 ] [INFO ] Successfully exported 1 rows to data/interim/libraries/sop/rt_prepared.tsv.gz
#> ✔ lib_rt completed [60ms, 191 B]
#> + par_ann_spe_fil_spe_raw dispatched
#> ✔ par_ann_spe_fil_spe_raw completed [0ms, 7.77 MB]
#> + lib_spe_exp_int_pre dispatched
#> [2025-11-20 12:14:00.365 ] [INFO ] Preparing spectral libraries
#> [2025-11-20 12:14:00.370 ] [WARN ] Your input file does not exist, returning empty lib instead.
#> [2025-11-20 12:14:01.601 ] [INFO ] Exporting data to: data/interim/libraries/sop/internal_prepared.tsv.gz
#> [2025-11-20 12:14:01.602 ] [INFO ] Successfully exported 1 rows to data/interim/libraries/sop/internal_prepared.tsv.gz
#> [2025-11-20 12:14:01.702 ] [INFO ] Exporting parameters to: data/interim/params/251120_121401_prepare_libraries_spectra.yaml
#> ✔ lib_spe_exp_int_pre completed [1.3s, 155 B]
#> + input_features dispatched
#> ✔ input_features completed [1ms, 451.55 kB]
#> + lib_rt_sop dispatched
#> ✔ lib_rt_sop completed [1ms, 105 B]
#> + lib_rt_rts dispatched
#> ✔ lib_rt_rts completed [0ms, 86 B]
#> + input_spectra dispatched
#> ✔ input_spectra completed [0ms, 7.77 MB]
#> + lib_spe_exp_int_pre_sop dispatched
#> ✔ lib_spe_exp_int_pre_sop completed [0ms, 106 B]
#> + lib_spe_exp_int_pre_pos dispatched
#> ✔ lib_spe_exp_int_pre_pos completed [0ms, 599 B]
#> + lib_spe_exp_int_pre_neg dispatched
#> ✔ lib_spe_exp_int_pre_neg completed [0ms, 599 B]
#> + fea_pre dispatched
#> [2025-11-20 12:14:04.236 ] [INFO ] Preparing features table from: data/source/example_features.csv
#> [2025-11-20 12:14:04.330 ] [INFO ] Prepared 5328 feature-sample pairs
#> [2025-11-20 12:14:04.353 ] [INFO ] Exporting parameters to: data/interim/params/251120_121404_prepare_features_tables.yaml
#> [2025-11-20 12:14:04.354 ] [INFO ] Exporting data to: data/interim/features/example_features.tsv.gz
#> [2025-11-20 12:14:04.367 ] [INFO ] Successfully exported 5328 rows to data/interim/features/example_features.tsv.gz
#> ✔ fea_pre completed [154ms, 95.63 kB]
#> + fea_edg_spe dispatched
#> [2025-11-20 12:14:04.679 ] [INFO ] Creating spectral similarity network edges
#> [2025-11-20 12:14:04.680 ] [INFO ] Importing spectra from: data/source/example_spectra.mgf
#> [2025-11-20 12:14:04.706 ] [INFO ] Reading MGF file (7.41 MB) with optimized parser: data/source/example_spectra.mgf
#> [2025-11-20 12:14:06.513 ] [INFO ] Processed 10000 spectra...
#> [2025-11-20 12:14:07.779 ] [INFO ] Total spectra read: 16282
#> [2025-11-20 12:14:14.181 ] [INFO ] Loaded 16282 spectra from file
#> [2025-11-20 12:14:16.434 ] [INFO ] Sanitizing 4087 spectra (cutoff: 0)
#> [2025-11-20 12:14:16.452 ] [INFO ] Sanitization complete: 3840/4087 spectra retained (94%, 247 removed)
#> [2025-11-20 12:14:16.453 ] [INFO ] Import complete: 3840 spectra ready for analysis
#> [2025-11-20 12:14:16.455 ] [INFO ] =============================================
#> [2025-11-20 12:14:16.456 ] [INFO ] = Take yourself a break, you deserve it. =
#> [2025-11-20 12:14:16.457 ] [INFO ] =============================================
#> [2025-11-20 12:20:04.414 ] [INFO ] Created 9223 edges passing thresholds
#> [2025-11-20 12:20:04.484 ] [INFO ] Exporting parameters to: data/interim/params/251120_122004_create_edges_spectra.yaml
#> [2025-11-20 12:20:04.485 ] [INFO ] Exporting data to: data/interim/features/example_edgesSpectra.tsv
#> [2025-11-20 12:20:04.488 ] [INFO ] Successfully exported 11577 rows to data/interim/features/example_edgesSpectra.tsv
#> ✔ fea_edg_spe completed [5m 59.8s, 533.82 kB]
#> + lib_sop_mer dispatched
#> [2025-11-20 12:20:04.826 ] [INFO ] Preparing merged structure-organism pairs library
#> [2025-11-20 12:20:20.484 ] [INFO ] Splitting concatenated SOP library into standardized components
#> [2025-11-20 12:20:24.899 ] [INFO ] Processing SMILES strings with RDKit
#> Downloading uv...Done!
#> Downloading cpython-3.12.12-linux-x86_64-gnu (download) (30.9MiB)
#> Downloading cpython-3.12.12-linux-x86_64-gnu (download)
#> Downloading numpy (15.8MiB)
#> Downloading pillow (6.7MiB)
#> Downloading rdkit (34.5MiB)
#> Downloading pillow
#> Downloading numpy
#> Downloading rdkit
#> Installed 3 packages in 29ms
#> [2025-11-20 12:20:39.017 ] [INFO ] Processing 61 new SMILES with RDKit
#> 2025-11-20 12:20:39,019 - __main__ - INFO - Starting SMILES processing pipeline
#> 2025-11-20 12:20:39,019 - __main__ - INFO - Input: /tmp/Rtmp4ViNU1/file1eec3272663d.smi
#> 2025-11-20 12:20:39,019 - __main__ - INFO - Output: /tmp/Rtmp4ViNU1/file1eec764dccce.csv.gz
#> 2025-11-20 12:20:39,019 - __main__ - INFO - Input file validated: /tmp/Rtmp4ViNU1/file1eec3272663d.smi
#> 2025-11-20 12:20:39,019 - __main__ - INFO - Output file validated: /tmp/Rtmp4ViNU1/file1eec764dccce.csv.gz
#> 2025-11-20 12:20:39,019 - __main__ - INFO - Processing parameters: workers=8, batch_size=1000, progress_interval=10000
#> 2025-11-20 12:20:39,019 - __main__ - INFO - SMILES supplier initialized
#> [12:20:39] Explicit valence for atom # 1 N, 3, is greater than permitted
#> [12:20:39] ERROR: Could not sanitize molecule on line 1
#> [12:20:39] ERROR: Explicit valence for atom # 1 N, 3, is greater than permitted
#> [12:20:39] Explicit valence for atom # 0 P, 11, is greater than permitted
#> [12:20:39] ERROR: Could not sanitize molecule on line 2
#> [12:20:39] ERROR: Explicit valence for atom # 0 P, 11, is greater than permitted
#> [12:20:39] Explicit valence for atom # 21 N, 4, is greater than permitted
#> [12:20:39] ERROR: Could not sanitize molecule on line 3
#> [12:20:39] ERROR: Explicit valence for atom # 21 N, 4, is greater than permitted
#> [12:20:39] Explicit valence for atom # 1 Cl, 4, is greater than permitted
#> [12:20:39] ERROR: Could not sanitize molecule on line 4
#> [12:20:39] ERROR: Explicit valence for atom # 1 Cl, 4, is greater than permitted
#> [12:20:39] Explicit valence for atom # 6 C, 5, is greater than permitted
#> [12:20:39] ERROR: Could not sanitize molecule on line 5
#> [12:20:39] ERROR: Explicit valence for atom # 6 C, 5, is greater than permitted
#> [12:20:39] Explicit valence for atom # 18 S, 7, is greater than permitted
#> [12:20:39] ERROR: Could not sanitize molecule on line 6
#> [12:20:39] ERROR: Explicit valence for atom # 18 S, 7, is greater than permitted
#> [12:20:39] SMILES Parse Error: syntax error while parsing: OC1=CC=CC(=C1)C-1=C2\CCC(=N2)\C(=C2/N\C(\C=C2)=C(/C2=N/C(/C=C2)=C(\C2=CC=C\-1N2)C1=CC(O)=CC=C1)C1=CC(O)=CC=C1)\C1=CC(O)=CC=C1
#> [12:20:39] SMILES Parse Error: check for mistakes around position 76:
#> [12:20:39] C(/C=C2)=C(\C2=CC=C\-1N2)C1=CC(O)=CC=C1)C
#> [12:20:39] ~~~~~~~~~~~~~~~~~~~~^
#> [12:20:39] SMILES Parse Error: extra open parentheses while parsing: OC1=CC=CC(=C1)C-1=C2\CCC(=N2)\C(=C2/N\C(\C=C2)=C(/C2=N/C(/C=C2)=C(\C2=CC=C\-1N2)C1=CC(O)=CC=C1)C1=CC(O)=CC=C1)\C1=CC(O)=CC=C1
#> [12:20:39] SMILES Parse Error: check for mistakes around position 32:
#> [12:20:39] C1)C-1=C2\CCC(=N2)\C(=C2/N\C(\C=C2)=C(/C2
#> [12:20:39] ~~~~~~~~~~~~~~~~~~~~^
#> [12:20:39] SMILES Parse Error: extra open parentheses while parsing: OC1=CC=CC(=C1)C-1=C2\CCC(=N2)\C(=C2/N\C(\C=C2)=C(/C2=N/C(/C=C2)=C(\C2=CC=C\-1N2)C1=CC(O)=CC=C1)C1=CC(O)=CC=C1)\C1=CC(O)=CC=C1
#> [12:20:39] SMILES Parse Error: check for mistakes around position 49:
#> [12:20:39] )\C(=C2/N\C(\C=C2)=C(/C2=N/C(/C=C2)=C(\C2
#> [12:20:39] ~~~~~~~~~~~~~~~~~~~~^
#> [12:20:39] SMILES Parse Error: extra open parentheses while parsing: OC1=CC=CC(=C1)C-1=C2\CCC(=N2)\C(=C2/N\C(\C=C2)=C(/C2=N/C(/C=C2)=C(\C2=CC=C\-1N2)C1=CC(O)=CC=C1)C1=CC(O)=CC=C1)\C1=CC(O)=CC=C1
#> [12:20:39] SMILES Parse Error: check for mistakes around position 66:
#> [12:20:39] )=C(/C2=N/C(/C=C2)=C(\C2=CC=C\-1N2)C1=CC(
#> [12:20:39] ~~~~~~~~~~~~~~~~~~~~^
#> [12:20:39] SMILES Parse Error: Failed parsing SMILES 'OC1=CC=CC(=C1)C-1=C2\CCC(=N2)\C(=C2/N\C(\C=C2)=C(/C2=N/C(/C=C2)=C(\C2=CC=C\-1N2)C1=CC(O)=CC=C1)C1=CC(O)=CC=C1)\C1=CC(O)=CC=C1' for input: 'OC1=CC=CC(=C1)C-1=C2\CCC(=N2)\C(=C2/N\C(\C=C2)=C(/C2=N/C(/C=C2)=C(\C2=CC=C\-1N2)C1=CC(O)=CC=C1)C1=CC(O)=CC=C1)\C1=CC(O)=CC=C1'
#> [12:20:39] ERROR: Smiles parse error on line 7
#> [12:20:39] ERROR: Cannot create molecule from : 'OC1=CC=CC(=C1)C-1=C2\CCC(=N2)\C(=C2/N\C(\C=C2)=C(/C2=N/C(/C=C2)=C(\C2=CC=C\-1N2)C1=CC(O)=CC=C1)C1=CC(O)=CC=C1)\C1=CC(O)=CC=C1'
#> 2025-11-20 12:20:39,075 - __main__ - INFO - Processing complete. Total molecules processed: 54
#> [2025-11-20 12:20:39.105 ] [INFO ] Successfully processed 54 SMILES
#> [2025-11-20 12:21:01.832 ] [INFO ] Led to 877903 referenced structure-organism pairs
#> [2025-11-20 12:21:04.529 ] [INFO ] Corresponding to 393389 unique stereoisomers (excluding structures without stereochemistry)...
#> [2025-11-20 12:21:06.305 ] [INFO ] ... and 1007790 unique structures without stereochemistry...
#> [2025-11-20 12:21:06.394 ] [INFO ] ... or 1184311 unique constitutional isomers (ignoring stereochemistry)
#> [2025-11-20 12:21:26.730 ] [INFO ] ... among 36800 unique organisms
#> [2025-11-20 12:21:26.821 ] [INFO ] Processing 919 organism name(s) for OTT taxonomy lookup
#> [2025-11-20 12:21:27.244 ] [INFO ] Querying OTT API in 10 batches
#> [2025-11-20 12:21:30.668 ] [INFO ] Retrieving detailed taxonomy for 13 unique OTT IDs
#> [2025-11-20 12:21:32.227 ] [INFO ] Got OTTaxonomy!
#> [2025-11-20 12:21:32.261 ] [INFO ] Exporting parameters to: data/interim/params/251120_122132_prepare_libraries_sop_merged.yaml
#> [2025-11-20 12:21:32.262 ] [INFO ] Exporting data to: data/interim/libraries/sop/merged/keys.tsv.gz
#> [2025-11-20 12:21:33.443 ] [INFO ] Successfully exported 877903 rows to data/interim/libraries/sop/merged/keys.tsv.gz
#> [2025-11-20 12:21:33.444 ] [INFO ] Exporting data to: data/interim/libraries/sop/merged/organisms/taxonomies/ott.tsv.gz
#> [2025-11-20 12:21:33.534 ] [INFO ] Successfully exported 35894 rows to data/interim/libraries/sop/merged/organisms/taxonomies/ott.tsv.gz
#> [2025-11-20 12:21:33.535 ] [INFO ] Exporting data to: data/interim/libraries/sop/merged/structures/stereo.tsv.gz
#> [2025-11-20 12:21:37.017 ] [INFO ] Successfully exported 1404391 rows to data/interim/libraries/sop/merged/structures/stereo.tsv.gz
#> [2025-11-20 12:21:37.019 ] [INFO ] Exporting data to: data/interim/libraries/sop/merged/structures/metadata.tsv.gz
#> [2025-11-20 12:21:38.673 ] [INFO ] Successfully exported 1458341 rows to data/interim/libraries/sop/merged/structures/metadata.tsv.gz
#> [2025-11-20 12:21:38.675 ] [INFO ] Exporting data to: data/interim/libraries/sop/merged/structures/names.tsv.gz
#> [2025-11-20 12:21:39.272 ] [INFO ] Successfully exported 423199 rows to data/interim/libraries/sop/merged/structures/names.tsv.gz
#> [2025-11-20 12:21:39.273 ] [INFO ] Exporting data to: data/interim/libraries/sop/merged/structures/taxonomies/classyfire.tsv.gz
#> [2025-11-20 12:21:39.425 ] [INFO ] Successfully exported 146393 rows to data/interim/libraries/sop/merged/structures/taxonomies/classyfire.tsv.gz
#> [2025-11-20 12:21:39.427 ] [INFO ] Exporting data to: data/interim/libraries/sop/merged/structures/taxonomies/npc.tsv.gz
#> [2025-11-20 12:21:39.727 ] [INFO ] Successfully exported 141818 rows to data/interim/libraries/sop/merged/structures/taxonomies/npc.tsv.gz
#> ✔ lib_sop_mer completed [1m 34.9s, 250 B]
#> + ann_spe_pos dispatched
#> [2025-11-20 12:21:40.836 ] [INFO ] Starting spectral annotation in pos mode
#> [2025-11-20 12:21:40.840 ] [INFO ] Importing spectra from: data/source/example_spectra.mgf
#> [2025-11-20 12:21:40.841 ] [INFO ] Reading MGF file (7.41 MB) with optimized parser: data/source/example_spectra.mgf
#> [2025-11-20 12:21:42.542 ] [INFO ] Processed 10000 spectra...
#> [2025-11-20 12:21:43.783 ] [INFO ] Total spectra read: 16282
#> [2025-11-20 12:21:49.357 ] [INFO ] Loaded 16282 spectra from file
#> [2025-11-20 12:21:50.169 ] [INFO ] Sanitizing 4087 spectra (cutoff: 0)
#> [2025-11-20 12:21:50.281 ] [INFO ] Sanitization complete: 3840/4087 spectra retained (94%, 247 removed)
#> [2025-11-20 12:21:50.282 ] [INFO ] Import complete: 3840 spectra ready for analysis
#> [2025-11-20 12:21:50.284 ] [INFO ] Importing spectra from: data/interim/libraries/spectra/is/wikidata_5607185_pos.rds
#> [2025-11-20 12:22:08.875 ] [INFO ] Loaded 998198 spectra from file
#> [2025-11-20 12:22:09.540 ] [INFO ] Import complete: 998198 spectra ready for analysis
#> [2025-11-20 12:22:09.541 ] [INFO ] Importing spectra from: data/interim/libraries/spectra/exp/internal_pos.rds
#> [2025-11-20 12:22:09.542 ] [INFO ] Loaded 1 spectra from file
#> [2025-11-20 12:22:09.544 ] [INFO ] Import complete: 0 spectra ready for analysis
#> [2025-11-20 12:22:09.544 ] [INFO ] Importing spectra from: data/interim/libraries/spectra/exp/gnps_11566051_pos.rds
#> [2025-11-20 12:22:15.102 ] [INFO ] Loaded 354789 spectra from file
#> [2025-11-20 12:22:15.290 ] [INFO ] Import complete: 354788 spectra ready for analysis
#> [2025-11-20 12:22:15.291 ] [INFO ] Importing spectra from: data/interim/libraries/spectra/exp/massbank_2025051_pos.rds
#> [2025-11-20 12:22:15.987 ] [INFO ] Loaded 66388 spectra from file
#> [2025-11-20 12:22:16.043 ] [INFO ] Import complete: 66388 spectra ready for analysis
#> [2025-11-20 12:22:16.044 ] [INFO ] Importing spectra from: data/interim/libraries/spectra/exp/merlin_13911806_pos.rds
#> [2025-11-20 12:22:20.813 ] [INFO ] Loaded 208280 spectra from file
#> [2025-11-20 12:22:20.966 ] [INFO ] Import complete: 208273 spectra ready for analysis
#> [2025-11-20 12:22:36.130 ] [INFO ] library          spectra  unique_connectivities
#>                                    ISDB - Wikidata   998198                 998198
#>                                    gnps              354788                  22675
#>                                    merlin            208273                  26197
#>                                    massbank           66388                   5901
#> [2025-11-20 12:22:37.499 ] [INFO ] Calculating entropy and similarity for 3840 spectra
#> [2025-11-20 12:27:23.697 ] [INFO ] 321348 Candidates were annotated on 3679 features, with at least 0 similarity score.
#> [2025-11-20 12:27:23.730 ] [INFO ] Exporting parameters to: data/interim/params/251120_122723_annotate_spectra.yaml
#> [2025-11-20 12:27:23.732 ] [INFO ] Exporting data to: data/interim/annotations/example_spectralMatches_pos.tsv.gz
#> [2025-11-20 12:27:25.518 ] [INFO ] Successfully exported 629774 rows to data/interim/annotations/example_spectralMatches_pos.tsv.gz
#> ✔ ann_spe_pos completed [5m 44.7s, 37.54 MB]
#> + ann_spe_neg dispatched
#> [2025-11-20 12:27:26.889 ] [INFO ] Starting spectral annotation in neg mode
#> [2025-11-20 12:27:26.896 ] [INFO ] Importing spectra from: data/source/example_spectra.mgf
#> [2025-11-20 12:27:26.941 ] [INFO ] Reading MGF file (7.41 MB) with optimized parser: data/source/example_spectra.mgf
#> [2025-11-20 12:27:28.612 ] [INFO ] Processed 10000 spectra...
#> [2025-11-20 12:27:29.822 ] [INFO ] Total spectra read: 16282
#> [2025-11-20 12:27:34.968 ] [INFO ] Loaded 16282 spectra from file
#> [2025-11-20 12:27:34.982 ] [WARN ] No spectra to sanitize
#> [2025-11-20 12:27:34.983 ] [INFO ] Import complete: 0 spectra ready for analysis
#> [2025-11-20 12:27:34.984 ] [WARN ] No query spectra loaded; returning empty dataframe
#> [2025-11-20 12:27:35.017 ] [INFO ] Exporting parameters to: data/interim/params/251120_122735_annotate_spectra.yaml
#> [2025-11-20 12:27:35.018 ] [INFO ] Exporting data to: data/interim/annotations/example_spectralMatches_neg.tsv.gz
#> [2025-11-20 12:27:35.020 ] [INFO ] Successfully exported 1 rows to data/interim/annotations/example_spectralMatches_neg.tsv.gz
#> ✔ ann_spe_neg completed [8.1s, 187 B]
#> + edg_spe dispatched
#> ✔ edg_spe completed [0ms, 533.82 kB]
#> + lib_mer_key dispatched
#> ✔ lib_mer_key completed [1ms, 18.06 MB]
#> + lib_mer_str_met dispatched
#> ✔ lib_mer_str_met completed [0ms, 36.02 MB]
#> + lib_mer_str_nam dispatched
#> ✔ lib_mer_str_nam completed [0ms, 11.28 MB]
#> + lib_mer_str_stereo dispatched
#> ✔ lib_mer_str_stereo completed [0ms, 43.58 MB]
#> + lib_mer_str_tax_cla dispatched
#> ✔ lib_mer_str_tax_cla completed [0ms, 2.51 MB]
#> + lib_mer_str_tax_npc dispatched
#> ✔ lib_mer_str_tax_npc completed [0ms, 2.44 MB]
#> + lib_mer_org_tax_ott dispatched
#> ✔ lib_mer_org_tax_ott completed [0ms, 939.13 kB]
#> + ann_ms1_pre dispatched
#> [2025-11-20 12:27:38.497 ] [INFO ] Starting mass-based annotation in pos mode
#> [2025-11-20 12:27:38.533 ] [INFO ] Processing 5328 features for annotation
#> [2025-11-20 12:27:59.970 ] [INFO ] Already 2112 adducts previously detected
#> [2025-11-20 12:28:00.034 ] [INFO ] Here are the top 10 observed m/z differences inside the RT windows:
#> [2025-11-20 12:28:00.035 ] [INFO ] bin                N
#>                                    (4.8501,5.0366]  352
#>                                    (21.822,22.009]  283
#>                                    (16.973,17.16]   208
#>                                    (17.906,18.092]  192
#>                                    (15.854,16.041]  172
#>                                    (39.914,40.1]    143
#>                                    (38.981,39.168]  137
#>                                    (34.878,35.065]  115
#>                                    (77.962,78.148]  114
#>                                    (1.8659,2.0524]  108
#> [2025-11-20 12:28:00.036 ] [INFO ] These differences may help identify potential preprocessing issues
#> [2025-11-20 12:28:39.436 ] [INFO ] MS1 annotation results: 48099 unique structures annotated across 4224 features
#> [2025-11-20 12:28:39.484 ] [INFO ] Exporting parameters to: data/interim/params/251120_122839_annotate_masses.yaml
#> [2025-11-20 12:28:39.485 ] [INFO ] Exporting data to: data/interim/features/example_edgesMasses.tsv
#> [2025-11-20 12:28:39.486 ] [INFO ] Successfully exported 2653 rows to data/interim/features/example_edgesMasses.tsv
#> [2025-11-20 12:28:39.487 ] [INFO ] Exporting data to: data/interim/annotations/example_ms1Prepared.tsv.gz
#> [2025-11-20 12:28:40.218 ] [INFO ] Successfully exported 187762 rows to data/interim/annotations/example_ms1Prepared.tsv.gz
#> ✔ ann_ms1_pre completed [1m 1.7s, 157 B]
#> + ann_spe_exp_gnp_pre dispatched
#> [2025-11-20 12:28:41.010 ] [INFO ] Preparing GNPS annotations
#> [2025-11-20 12:28:41.013 ] [WARN ] No GNPS annotations found, returning an empty file instead
#> [2025-11-20 12:28:41.031 ] [INFO ] Exporting parameters to: data/interim/params/251120_122841_prepare_annotations_gnps.yaml
#> [2025-11-20 12:28:41.032 ] [INFO ] Exporting data to: data/interim/annotations/example_gnpsPrepared.tsv.gz
#> [2025-11-20 12:28:41.033 ] [INFO ] Successfully exported 1 rows to data/interim/annotations/example_gnpsPrepared.tsv.gz
#> ✔ ann_spe_exp_gnp_pre completed [23ms, 237 B]
#> + ann_spe_pre dispatched
#> [2025-11-20 12:28:41.410 ] [INFO ] Preparing spectral matching annotations from 2 file(s)
#> [2025-11-20 12:29:18.375 ] [INFO ] Exporting parameters to: data/interim/params/251120_122918_prepare_annotations_spectra.yaml
#> [2025-11-20 12:29:18.377 ] [INFO ] Exporting data to: data/interim/annotations/example_spectralMatchesPrepared.tsv.gz
#> [2025-11-20 12:29:21.220 ] [INFO ] Successfully exported 629774 rows to data/interim/annotations/example_spectralMatchesPrepared.tsv.gz
#> ✔ ann_spe_pre completed [39.8s, 61.25 MB]
#> + ann_sir_pre dispatched
#> [2025-11-20 12:29:22.221 ] [INFO ] Preparing SIRIUS 6 annotations
#> [2025-11-20 12:29:40.210 ] [INFO ] Exporting parameters to: data/interim/params/251120_122940_prepare_annotations_sirius.yaml
#> [2025-11-20 12:29:40.211 ] [INFO ] Exporting data to: data/interim/annotations/example_canopusPrepared.tsv.gz
#> [2025-11-20 12:29:40.212 ] [INFO ] Successfully exported 14 rows to data/interim/annotations/example_canopusPrepared.tsv.gz
#> [2025-11-20 12:29:40.213 ] [INFO ] Exporting data to: data/interim/annotations/example_formulaPrepared.tsv.gz
#> [2025-11-20 12:29:40.214 ] [INFO ] Successfully exported 16 rows to data/interim/annotations/example_formulaPrepared.tsv.gz
#> [2025-11-20 12:29:40.215 ] [INFO ] Exporting data to: data/interim/annotations/example_siriusPrepared.tsv.gz
#> [2025-11-20 12:29:40.219 ] [INFO ] Successfully exported 479 rows to data/interim/annotations/example_siriusPrepared.tsv.gz
#> ✔ ann_sir_pre completed [18s, 165 B]
#> + tax_pre dispatched
#> [2025-11-20 12:29:41.416 ] [INFO ] Preparing taxonomic assignments for features
#> [2025-11-20 12:29:41.420 ] [INFO ] Using metadata for organism assignments
#> [2025-11-20 12:29:41.573 ] [INFO ] Processing 2 organism name(s) for OTT taxonomy lookup
#> [2025-11-20 12:29:41.813 ] [INFO ] Querying OTT API in 1 batches
#> [2025-11-20 12:29:42.040 ] [INFO ] Retrying failed queries using genus names only
#> [2025-11-20 12:29:42.045 ] [INFO ] Retrying with 1 genus names: blk
#> [2025-11-20 12:29:42.244 ] [INFO ] Retrieving detailed taxonomy for 1 unique OTT IDs
#> [2025-11-20 12:29:42.364 ] [INFO ] Got OTTaxonomy!
#> [2025-11-20 12:29:42.791 ] [INFO ] Exporting parameters to: data/interim/params/251120_122942_prepare_taxa.yaml
#> [2025-11-20 12:29:42.792 ] [INFO ] Exporting data to: data/interim/taxa/example_taxed.tsv.gz
#> [2025-11-20 12:29:42.799 ] [INFO ] Successfully exported 5328 rows to data/interim/taxa/example_taxed.tsv.gz
#> ✔ tax_pre completed [1.4s, 19.70 kB]
#> + ann_ms1_pre_edg dispatched
#> ✔ ann_ms1_pre_edg completed [0ms, 81.71 kB]
#> + ann_ms1_pre_ann dispatched
#> ✔ ann_ms1_pre_ann completed [0ms, 10.81 MB]
#> + ann_sir_pre_can dispatched
#> ✔ ann_sir_pre_can completed [1ms, 784 B]
#> + ann_sir_pre_for dispatched
#> ✔ ann_sir_pre_for completed [0ms, 487 B]
#> + ann_sir_pre_str dispatched
#> ✔ ann_sir_pre_str completed [0ms, 24.42 kB]
#> + fea_edg_pre dispatched
#> [2025-11-20 12:29:44.810 ] [INFO ] Preparing molecular network edges
#> [2025-11-20 12:29:44.844 ] [INFO ] Prepared 17751 total edges
#> [2025-11-20 12:29:44.862 ] [INFO ] Exporting parameters to: data/interim/params/251120_122944_prepare_features_edges.yaml
#> [2025-11-20 12:29:44.863 ] [INFO ] Exporting data to: data/interim/features/example_edges.tsv
#> [2025-11-20 12:29:44.866 ] [INFO ] Successfully exported 17751 rows to data/interim/features/example_edges.tsv
#> ✔ fea_edg_pre completed [57ms, 758.28 kB]
#> + ann_fil dispatched
#> [2025-11-20 12:29:45.193 ] [INFO ] Filtering annotations
#> [2025-11-20 12:29:45.230 ] [INFO ] Processing 5328 unique features for annotation filtering
#> [2025-11-20 12:29:49.909 ] [INFO ] Removing MS1 annotations superseded by spectral matches
#> [2025-11-20 12:29:53.017 ] [INFO ] Removed 79186 redundant MS1 annotations
#> [2025-11-20 12:29:53.018 ] [INFO ] Total annotations before RT filtering: 738830
#> [2025-11-20 12:29:54.158 ] [INFO ] Filtering annotations outside Inf min RT tolerance
#> [2025-11-20 12:29:56.817 ] [INFO ] Removed 0 annotations based on retention time tolerance
#> [2025-11-20 12:29:57.066 ] [INFO ] Exporting parameters to: data/interim/params/251120_122957_filter_annotations.yaml
#> [2025-11-20 12:29:57.067 ] [INFO ] Exporting data to: data/interim/annotations/example_annotationsFiltered.tsv.gz
#> [2025-11-20 12:30:00.022 ] [INFO ] Successfully exported 739349 rows to data/interim/annotations/example_annotationsFiltered.tsv.gz
#> ✔ ann_fil completed [14.8s, 56.22 MB]
#> + fea_com dispatched
#> [2025-11-20 12:30:00.816 ] [INFO ] Creating components from 1 edge file(s)
#> [2025-11-20 12:30:00.830 ] [INFO ] Loaded 15234 edges connecting 5909 unique features
#> [2025-11-20 12:30:00.841 ] [INFO ] Found 2513 components
#> [2025-11-20 12:30:00.860 ] [INFO ] Component sizes - Min: 1, Max: 1586, Mean: 2.4
#> [2025-11-20 12:30:00.875 ] [INFO ] Exporting parameters to: data/interim/params/251120_123000_create_components.yaml
#> [2025-11-20 12:30:00.876 ] [INFO ] Exporting data to: data/interim/features/example_components.tsv
#> [2025-11-20 12:30:00.877 ] [INFO ] Successfully exported 5909 rows to data/interim/features/example_components.tsv
#> [2025-11-20 12:30:00.878 ] [INFO ] Components written to: data/interim/features/example_components.tsv
#> ✔ fea_com completed [63ms, 51.39 kB]
#> + int_com dispatched
#> ✔ int_com completed [0ms, 51.39 kB]
#> + fea_com_pre dispatched
#> [2025-11-20 12:30:01.553 ] [INFO ] Preparing molecular network components from 1 file(s)
#> [2025-11-20 12:30:01.557 ] [INFO ] Prepared 5909 unique feature-component assignments
#> [2025-11-20 12:30:01.572 ] [INFO ] Exporting parameters to: data/interim/params/251120_123001_prepare_features_components.yaml
#> [2025-11-20 12:30:01.573 ] [INFO ] Exporting data to: data/interim/features/example_componentsPrepared.tsv
#> [2025-11-20 12:30:01.574 ] [INFO ] Successfully exported 5909 rows to data/interim/features/example_componentsPrepared.tsv
#> ✔ fea_com_pre completed [22ms, 51.38 kB]
#> + ann_pre dispatched
#> [2025-11-20 12:30:01.913 ] [INFO ] Starting annotation weighting and scoring
#> [2025-11-20 12:30:19.472 ] [INFO ] candidate_library n
#> ISDB - Wikidata 577751
#> TIMA MS1 82324
#> gnps 25161
#> merlin 23150
#> massbank 3591
#> SIRIUS 479
#> [2025-11-20 12:30:26.600 ] [INFO ] Weighting 695299 annotations by biological source
#> [2025-11-20 12:30:32.058 ] [INFO ] Taxonomically informed metabolite annotation reranked:
#> Kingdom level: 41238 structures
#> Phylum level: 40784 structures
#> Class level: 35057 structures
#> Order level: 9353 structures
#> Family level: 7515 structures
#> Tribe level: 1184 structures
#> Genus level: 919 structures
#> Species level: 402 structures
#> Variety level: 0 structures
#> [2025-11-20 12:30:47.489 ] [INFO ] Weighting 695285 annotations by chemical consistency
#> [2025-11-20 12:30:51.377 ] [INFO ] Chemically informed metabolite annotation reranked:
#> Classyfire:
#> Kingdom level: 72980 structures
#> Superclass level: 72946 structures
#> Class level: 71293 structures
#> Parent level: 67979 structures
#> NPClassifier:
#> Pathway level: 76290 structures
#> Superclass level: 75535 structures
#> Class level: 71204 structures
#> [2025-11-20 12:30:51.381 ] [INFO ] Cleaning chemically weighted annotations
#> [2025-11-20 12:30:51.382 ] [INFO ] Filtering top 1 candidates and keeping only MS1 candidates with minimum 0 biological score OR 0 chemical score
#> [2025-11-20 12:30:58.518 ] [INFO ] [filtered] Removed 20161 low-confidence candidates (92.1% of 21884 total)
#> [2025-11-20 12:30:58.519 ] [INFO ] [filtered] 1723 high-confidence candidates remaining (7.9%)
#> [2025-11-20 12:30:58.522 ] [INFO ] Summarizing annotation results
#> [2025-11-20 12:31:07.964 ] [INFO ] Summarizing annotation results
#> [2025-11-20 12:32:00.646 ] [INFO ] Exporting parameters to: data/processed/20251120_123200_example/251120_123200_prepare_params.yaml
#> [2025-11-20 12:32:00.666 ] [INFO ] Exporting parameters to: data/processed/20251120_123200_example/251120_123200_prepare_params_advanced.yaml
#> [2025-11-20 12:32:00.668 ] [INFO ] Exporting data to: data/processed/20251120_123200_example/example_results_mini.tsv
#> [2025-11-20 12:32:00.671 ] [INFO ] Successfully exported 5824 rows to data/processed/20251120_123200_example/example_results_mini.tsv
#> [2025-11-20 12:32:00.672 ] [INFO ] Exporting data to: data/processed/20251120_123200_example/example_results_filtered.tsv
#> [2025-11-20 12:32:00.677 ] [INFO ] Successfully exported 5824 rows to data/processed/20251120_123200_example/example_results_filtered.tsv
#> [2025-11-20 12:32:00.678 ] [INFO ] Exporting data to: data/processed/20251120_123200_example/example_results.tsv
#> [2025-11-20 12:32:01.465 ] [INFO ] Successfully exported 695221 rows to data/processed/20251120_123200_example/example_results.tsv
#> ✔ ann_pre completed [1m 59.6s, 371.33 MB]
#> ✔ ended pipeline [21m 6.5s, 126 completed, 0 skipped]
#> There were 15 warnings (use warnings() to see them)
3 Performing Taxonomically Informed Metabolite Annotation
This vignette describes how Taxonomically Informed Metabolite Annotation is performed. If you followed all previous steps successfully, this should be a piece of cake. You deserve it!
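The run shown above ends with 15 warnings, which R only lists once `warnings()` is called afterwards. Here is a minimal sketch for collecting them while the workflow runs instead. It assumes nothing beyond base R, and the helper `collect_warnings()` is hypothetical, not part of TIMA.

```r
# Hypothetical helper (not part of tima): evaluates an expression and
# records every warning message instead of leaving them for warnings().
collect_warnings <- function(expr) {
  messages <- character(0)
  value <- withCallingHandlers(
    expr,
    warning = function(w) {
      # Store the message, then resume evaluation without printing it.
      messages <<- c(messages, conditionMessage(w))
      invokeRestart("muffleWarning")
    }
  )
  list(value = value, warnings = messages)
}

# Usage with the full workflow (commented out because the run above
# took about 20 minutes):
# run <- collect_warnings(tima::tima_full())
# print(run$warnings)
```

Keeping the messages in a character vector makes it easy to save them next to the exported parameters for later inspection.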
The final exported file is formatted for easy import into Cytoscape, where you can explore your data further!
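Before importing into Cytoscape, it can help to take a quick look at the exported table in R. The sketch below uses base R only. The output directory name contains a run timestamp (in the log above, `data/processed/20251120_123200_example/`), so the hypothetical helper `latest_results()` simply picks the most recently modified file ending in `_results_mini.tsv`, a suffix taken from the log. No column names are assumed.

```r
# Hypothetical helper (not part of tima): returns the path of the newest
# "*_results_mini.tsv" found under the processed-data directory.
latest_results <- function(dir = "data/processed",
                           pattern = "_results_mini\\.tsv$") {
  files <- list.files(dir, pattern = pattern, recursive = TRUE,
                      full.names = TRUE)
  if (length(files) == 0) {
    stop("No results file found under ", dir)
  }
  files[which.max(file.mtime(files))]
}

# Read a tab-separated results file, keeping column names as written
# and leaving strings as character vectors.
read_results <- function(path) {
  utils::read.delim(path, check.names = FALSE, stringsAsFactors = FALSE)
}

# Usage once a run has finished:
# results <- read_results(latest_results())
# dim(results)    # number of rows and columns
# names(results)  # column names available for mapping in Cytoscape
```

Checking the dimensions and column names first makes it easier to choose the key column when importing the table into Cytoscape.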
We hope you enjoyed using TIMA and would be pleased to hear from you!
For any remarks or suggestions, please open an issue or feel free to contact us directly.
Citation
BibTeX citation:
@online{rutz2025,
  author = {Rutz, Adriano},
  title = {3 {Performing} {Taxonomically} {Informed} {Metabolite} {Annotation}},
  date = {2025-11-20},
  url = {https://taxonomicallyinformedannotation.github.io/tima/vignettes/articles/III-processing.html},
  langid = {en}
}
For attribution, please cite this work as:
Rutz, Adriano. 2025. “3 Performing Taxonomically Informed
Metabolite Annotation.” November 20, 2025. https://taxonomicallyinformedannotation.github.io/tima/vignettes/articles/III-processing.html.