
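This vignette describes how Taxonomically Informed Metabolite Annotation (TIMA) is performed. If you followed all previous steps successfully, this should be a piece of cake; you deserve it! The call below builds the target whose name ends in ann_pre; upstream targets that are already up to date are reported as skipped.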

targets::tar_make(names = tidyselect::matches("ann_pre$"))
#> ✔ skipped target yaml_paths
#> ✔ skipped target paths
#> ✔ skipped target par_pre_par
#> ✔ skipped target par_def_ann_mas
#> ✔ skipped target paths_test_mode
#> ✔ skipped target par_def_pre_lib_sop_ecm
#> ✔ skipped target par_def_pre_fea_edg
#> ✔ skipped target paths_urls_massbank_url
#> ✔ skipped target paths_data_source_libraries_sop_lotus
#> ✔ skipped target par_def_pre_lib_spe
#> ✔ skipped target paths_urls_massbank_version
#> ✔ skipped target par_pre_par2
#> ✔ skipped target par_def_pre_lib_sop_mer
#> ✔ skipped target par_def_ann_spe
#> ✔ skipped target par_def_pre_ann_spe
#> ▶ dispatched target par_def_wei_ann
#> ● completed target par_def_wei_ann [0.001 seconds]
#> ✔ skipped target par_def_cre_edg_spe
#> ✔ skipped target paths_data_source_libraries_sop_ecmdb
#> ✔ skipped target paths_urls_ecmdb_metabolites
#> ✔ skipped target paths_urls_examples_spectral_lib_pos
#> ✔ skipped target paths_data_source_libraries_spectra_is_lotus_neg
#> ✔ skipped target paths_data_source_spectra
#> ✔ skipped target paths_urls_examples_spectral_lib_neg
#> ✔ skipped target par_def_pre_fea_tab
#> ✔ skipped target par_def_pre_lib_sop_lot
#> ✔ skipped target paths_urls_lotus_pattern
#> ✔ skipped target paths_data_source_libraries_spectra_is_lotus_pos
#> ✔ skipped target par_def_pre_fea_com
#> ✔ skipped target par_def_fil_ann
#> ✔ skipped target par_def_pre_ann_sir
#> ✔ skipped target paths_data_source_libraries_sop_hmdb
#> ✔ skipped target par_def_pre_lib_sop_clo
#> ✔ skipped target paths_urls_hmdb_structures
#> ✔ skipped target par_def_cre_com
#> ✔ skipped target par_def_pre_lib_sop_hmd
#> ✔ skipped target par_def_pre_ann_gnp
#> ✔ skipped target par_def_pre_lib_rt
#> ✔ skipped target par_def_pre_tax
#> ✔ skipped target paths_urls_examples_spectra_mini
#> ✔ skipped target paths_urls_massbank_file
#> ✔ skipped target paths_urls_lotus_doi
#> ✔ skipped target par_fin_par
#> ✔ skipped target par_fin_par2
#> ✔ skipped target lib_sop_ecm
#> ▶ dispatched target lib_spe_is_lot_neg
#> File already exists. Skipping.
#> ● completed target lib_spe_is_lot_neg [0 seconds]
#> ▶ dispatched target lib_spe_is_lot_pos
#> File already exists. Skipping.
#> ● completed target lib_spe_is_lot_pos [0 seconds]
#> ✔ skipped target lib_sop_hmd
#> ✔ skipped target lib_spe_exp_mb_raw
#> ▶ dispatched target lib_sop_lot
#> A file with the same size is already present. Skipping
#> ● completed target lib_sop_lot [2.455 seconds]
#> ✔ skipped target par_usr_ann_mas
#> ✔ skipped target par_usr_pre_ann_gnp
#> ✔ skipped target par_usr_pre_lib_sop_ecm
#> ✔ skipped target par_usr_ann_spe
#> ✔ skipped target par_usr_pre_ann_spe
#> ▶ dispatched target par_usr_wei_ann
#> 2024-08-23 21:50:02 Loading default params 
#> 2024-08-23 21:50:02 All params 
#> 2024-08-23 21:50:02 Small params 
#> 2024-08-23 21:50:02 Changing params 
#> 2024-08-23 21:50:02 Changing filenames 
#> 2024-08-23 21:50:03 Exporting params ... 
#> ● completed target par_usr_wei_ann [0.282 seconds]
#> ✔ skipped target par_usr_fil_ann
#> ✔ skipped target par_usr_pre_fea_com
#> ✔ skipped target par_usr_pre_lib_sop_clo
#> ✔ skipped target par_usr_pre_fea_edg
#> ✔ skipped target par_usr_pre_ann_sir
#> ✔ skipped target par_usr_pre_lib_sop_lot
#> ✔ skipped target par_usr_pre_lib_spe
#> ✔ skipped target par_usr_pre_lib_sop_hmd
#> ✔ skipped target par_usr_cre_edg_spe
#> ✔ skipped target par_usr_pre_lib_sop_mer
#> ✔ skipped target par_usr_pre_lib_rt
#> ✔ skipped target par_usr_pre_fea_tab
#> ✔ skipped target par_usr_cre_com
#> ✔ skipped target par_usr_pre_tax
#> ✔ skipped target lib_spe_is_lot_pre_neg
#> ✔ skipped target lib_spe_is_lot_pre_pos
#> ✔ skipped target lib_spe_exp_mb_pre
#> ✔ skipped target par_ann_mas
#> ✔ skipped target par_pre_ann_gnp
#> ✔ skipped target par_pre_lib_sop_ecm
#> ✔ skipped target par_ann_spe
#> ✔ skipped target par_pre_ann_spe
#> ▶ dispatched target par_wei_ann
#> ● completed target par_wei_ann [0.001 seconds]
#> ✔ skipped target par_fil_ann
#> ✔ skipped target par_pre_fea_com
#> ✔ skipped target par_pre_lib_sop_clo
#> ✔ skipped target par_pre_fea_edg
#> ✔ skipped target par_pre_ann_sir
#> ✔ skipped target par_pre_lib_sop_lot
#> ✔ skipped target par_pre_lib_spe
#> ✔ skipped target par_pre_lib_sop_hmd
#> ✔ skipped target par_cre_edg_spe
#> ✔ skipped target par_pre_lib_sop_mer
#> ✔ skipped target par_pre_lib_rt
#> ✔ skipped target par_pre_fea_tab
#> ✔ skipped target par_cre_com
#> ✔ skipped target par_pre_tax
#> ✔ skipped target lib_spe_exp_mb_pre_pos
#> ✔ skipped target lib_spe_exp_mb_pre_neg
#> ✔ skipped target lib_spe_exp_mb_pre_sop
#> ✔ skipped target lib_sop_ecm_pre
#> ✔ skipped target input_spectra
#> ✔ skipped target lib_sop_clo_pre
#> ✔ skipped target lib_sop_lot_pre
#> ✔ skipped target lib_spe_exp_int_pre
#> ✔ skipped target lib_sop_hmd_pre
#> ✔ skipped target lib_rt
#> ✔ skipped target input_features
#> ✔ skipped target fea_edg_spe
#> ✔ skipped target lib_spe_exp_int_pre_pos
#> ✔ skipped target lib_spe_exp_int_pre_neg
#> ✔ skipped target lib_spe_exp_int_pre_sop
#> ✔ skipped target lib_rt_sop
#> ✔ skipped target lib_rt_rts
#> ✔ skipped target fea_pre
#> ✔ skipped target edg_spe
#> ✔ skipped target ann_spe_pos
#> ✔ skipped target ann_spe_neg
#> ✔ skipped target lib_sop_mer
#> ✔ skipped target lib_mer_str_met
#> ✔ skipped target lib_mer_str_nam
#> ✔ skipped target lib_mer_str_stereo
#> ✔ skipped target lib_mer_str_tax_cla
#> ✔ skipped target lib_mer_str_tax_npc
#> ✔ skipped target lib_mer_key
#> ✔ skipped target lib_mer_org_tax_ott
#> ✔ skipped target ann_sir_pre
#> ✔ skipped target ann_spe_exp_gnp_pre
#> ✔ skipped target ann_spe_pre
#> ✔ skipped target ann_ms1_pre
#> ✔ skipped target tax_pre
#> ✔ skipped target ann_sir_pre_for
#> ✔ skipped target ann_sir_pre_can
#> ✔ skipped target ann_sir_pre_str
#> ✔ skipped target ann_ms1_pre_edg
#> ✔ skipped target ann_ms1_pre_ann
#> ✔ skipped target fea_edg_pre
#> ✔ skipped target ann_fil
#> ✔ skipped target fea_com
#> ✔ skipped target int_com
#> ✔ skipped target fea_com_pre
#> ▶ dispatched target ann_pre
#> 2024-08-23 21:50:03 Loading files ... 
#> 2024-08-23 21:50:03 ... components 
#> 2024-08-23 21:50:03 ... edges 
#> 2024-08-23 21:50:03 ... structure-organism pairs 
#> 2024-08-23 21:50:09 ... canopus 
#> 2024-08-23 21:50:09 ... formula 
#> 2024-08-23 21:50:09 ... annotations 
#> 2024-08-23 21:50:10 Got c("ISDB", "TIMA MS1") initial annotations 
#>  2024-08-23 21:50:10 Got c(976, 294052) initial annotations 
#> 2024-08-23 21:50:11 Re-arranging annotations 
#> 2024-08-23 21:50:12 adding biological organism metadata 
#> 2024-08-23 21:50:12 performing taxonomically informed scoring 
#> 2024-08-23 21:50:12 filtering top  3  candidates and keeping only MS1 candidates with minimum 
#>  0  biological score 
#>  OR 0 chemical score 
#>  
#> 2024-08-23 21:50:12 adding "notClassified" 
#>  
#> 2024-08-23 21:50:13 calculating biological score at all levels ... 
#>  
#> 2024-08-23 21:50:13 ... domain 
#>  
#> 2024-08-23 21:50:13 ... kingdom 
#>  
#> 2024-08-23 21:50:13 ... phylum 
#>  
#> 2024-08-23 21:50:13 ... class 
#>  
#> 2024-08-23 21:50:13 ... order 
#>  
#> 2024-08-23 21:50:13 ... family 
#>  
#> 2024-08-23 21:50:13 ... tribe 
#>  
#> 2024-08-23 21:50:13 ... genus 
#>  
#> 2024-08-23 21:50:13 ... species 
#>  
#> 2024-08-23 21:50:14 ... varietas 
#>  
#> 2024-08-23 21:50:14 ... keeping best biological score 
#>  
#> 2024-08-23 21:50:15 ... calculating weighted biological score 
#>  
#> 2024-08-23 21:50:16 taxonomically informed scoring led to 
#>  46866 annotations reranked at the kingdom level, 
#>  46430 annotations reranked at the phylum level, 
#>  38609 annotations reranked at the class level, 
#>  11231 annotations reranked at the order level, 
#>  9263 annotations reranked at the family level, 
#>  1533 annotations reranked at the tribe level, 
#>  1214 annotations reranked at the genus level, 
#>  459 annotations reranked at the species level, 
#>  and 0 annotations reranked at the variety level. 
#>  WITHOUT TAKING CONSISTENCY SCORE INTO ACCOUNT! (for later predictions) 
#> 2024-08-23 21:50:16 calculating chemical consistency
#>               features with at least 2 neighbors ... 
#>  
#> 2024-08-23 21:50:17 ... among all edges ... 
#>  
#> 2024-08-23 21:50:17 ... at the (classyfire) kingdom level 
#>  
#> 2024-08-23 21:50:17 ... at the (NPC) pathway level 
#>  
#> 2024-08-23 21:50:17 ... at the (classyfire) superclass level 
#>  
#> 2024-08-23 21:50:17 ... at the (NPC) superclass level 
#>  
#> 2024-08-23 21:50:18 ... at the (classyfire) class level 
#>  
#> 2024-08-23 21:50:18 ... at the (NPC) class level 
#>  
#> 2024-08-23 21:50:19 ... at the (classyfire) parent level 
#>  
#> 2024-08-23 21:50:20 splitting already computed predictions 
#>  
#> 2024-08-23 21:50:21 joining all except -1 together 
#>  
#> 2024-08-23 21:50:22 adding dummy consistency for features
#>               with less than 2 neighbors 
#>  
#> 2024-08-23 21:50:23 adding already computed predictions back 
#>  
#> 2024-08-23 21:50:24 calculating chemical score at all levels ... 
#>  
#> 2024-08-23 21:50:24 ... (classyfire) kingdom 
#>  
#> 2024-08-23 21:50:24 ... (NPC) pathway 
#>  
#> 2024-08-23 21:50:24 ... (classyfire) superclass 
#>  
#> 2024-08-23 21:50:24 ... (NPC) superclass 
#>  
#> 2024-08-23 21:50:24 ... (classyfire) class 
#>  
#> 2024-08-23 21:50:24 ... (NPC) class 
#>  
#> 2024-08-23 21:50:24 ... (classyfire) parent 
#>  
#> 2024-08-23 21:50:24 ... keeping best chemical score 
#>  
#> 2024-08-23 21:50:25 ... calculating weighted chemical score 
#>  
#> 2024-08-23 21:50:25 chemically informed scoring led to 
#>  34048 annotations reranked at the (classyfire) kingdom level, 
#>  22452 annotations reranked at the (NPC) pathway level, 
#>  17048 annotations reranked at the (classyfire) superclass level, 
#>  10044 annotations reranked at the (NPC) superclass level, 
#>  17015 annotations reranked at the (classyfire) class level, 
#>  9870 annotations reranked at the (NPC) class level, and 
#>  9454 annotations reranked at the (classyfire) parent level. 
#>  WITHOUT TAKING CONSISTENCY SCORE INTO ACCOUNT! 
#> 2024-08-23 21:50:26 Keeping high confidence candidates only... 
#> 2024-08-23 21:50:26 Removed 293743 low confidence candidates out of the 295814 total ones. 
#> 2024-08-23 21:50:26 2071 high confidence candidates remaining. 
#> 2024-08-23 21:50:27 adding initial metadata (RT, etc.) and simplifying columns 
#>  
#> 2024-08-23 21:50:27 adding references 
#>  
#> 2024-08-23 21:50:28 selecting columns to export 
#>  
#> 2024-08-23 21:50:28 adding consensus again to droped candidates 
#>  
#> 2024-08-23 21:50:31 Exporting ... 
#> 2024-08-23 21:50:31 Directory data/processed/240823_215031_example created. 
#> 2024-08-23 21:50:31 ... path to used parameters is data/processed/240823_215031_example 
#> 2024-08-23 21:50:31 ... path to used parameters is data/processed/240823_215031_example 
#> 2024-08-23 21:50:31 ... path to export is data/processed/240823_215031_example/example_results.tsv 
#> ● completed target ann_pre [27.786 seconds]
#> ▶ ended pipeline [32.989 seconds]
#> 
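
The log above reports a biological score computed at each taxonomic level, with the best one kept and then weighted. The following is a minimal sketch of that idea under stated assumptions, not TIMA's actual implementation: the function names, the four levels shown, and all score and weight values are hypothetical and chosen only for illustration.

```r
# Hypothetical per-level scores (illustrative values, not TIMA's defaults):
# the deeper the shared taxonomic level, the higher the score.
level_scores <- c(kingdom = 0.1, family = 0.5, genus = 0.8, species = 1.0)

# Score one candidate: test every level for a match and keep the best score.
biological_score <- function(sample_taxon, candidate_taxon, scores = level_scores) {
  matched <- vapply(
    names(scores),
    function(level) identical(sample_taxon[[level]], candidate_taxon[[level]]),
    logical(1)
  )
  if (!any(matched)) return(0)
  max(scores[matched])
}

# Combine a spectral score and a biological score as a weighted mean
# (hypothetical weights).
weighted_score <- function(spectral, biological, w_spectral = 1, w_biological = 1) {
  (w_spectral * spectral + w_biological * biological) / (w_spectral + w_biological)
}

# Taxonomy of the studied sample and of two candidate source organisms.
sample_taxon <- c(kingdom = "Plantae", family = "Gentianaceae",
                  genus = "Gentiana", species = "Gentiana lutea")
candidate_a <- c(kingdom = "Plantae", family = "Gentianaceae",
                 genus = "Swertia", species = "Swertia chirayita")
candidate_b <- c(kingdom = "Fungi", family = "Aspergillaceae",
                 genus = "Aspergillus", species = "Aspergillus niger")

bio_a <- biological_score(sample_taxon, candidate_a)  # best match: family
bio_b <- biological_score(sample_taxon, candidate_b)  # no shared level
weighted_score(spectral = 0.8, biological = bio_a)
```

In this toy setting, a candidate reported from an organism in the same family as the sample is ranked above one reported only from an unrelated kingdom, even when both have the same spectral score.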

The final exported file is formatted so that it can be easily imported into Cytoscape to explore your data further!
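
Before moving to Cytoscape, it can be useful to take a quick look at the exported table in R. The sketch below assumes that results are written as tab-separated files ending in _results.tsv inside a timestamped subdirectory of data/processed, as the log above suggests; latest_results() and read_results() are hypothetical helpers written for this example, not part of TIMA.

```r
# Hypothetical helper: return the most recently modified results file
# found below `dir`, or NA if there is none.
latest_results <- function(dir = "data/processed") {
  files <- list.files(
    dir,
    pattern = "_results\\.tsv$",
    recursive = TRUE,
    full.names = TRUE
  )
  if (length(files) == 0) return(NA_character_)
  files[which.max(file.info(files)$mtime)]
}

# Hypothetical helper: read a tab-separated results file as a data frame,
# keeping column names as they are and not treating quotes specially.
read_results <- function(path) {
  utils::read.delim(
    path,
    quote = "",
    check.names = FALSE,
    stringsAsFactors = FALSE
  )
}

path <- latest_results()
if (!is.na(path)) {
  results <- read_results(path)
  # Show the structure of the first few columns only.
  str(results, list.len = 10)
}
```

Because each run creates a new timestamped directory, looking up the newest file avoids hard-coding a path that changes from one run to the next.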

We hope you enjoyed using TIMA and would be pleased to hear from you!

For any remarks or suggestions, please file an issue or feel free to contact us directly.