4 Performing Taxonomically Informed Metabolite Annotation

This vignette describes how to perform Taxonomically Informed Metabolite Annotation (TIMA). If you completed all previous steps successfully, this one should be a piece of cake. You deserve it!

library("timaR")

Before running the scoring script (inst/scripts/weight_annotations.R), do not forget to adjust the parameters in inst/params/user/weight_annotations.yaml.
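To check which values are currently set before editing, the parameter file can simply be printed from R. The snippet below is a minimal sketch using base R only; `show_params()` is a hypothetical helper written for this example and is not part of timaR, and it assumes the working directory is the root of the repository.

```r
# Path taken from the text above; adjust it if your working directory differs.
params_path <- "inst/params/user/weight_annotations.yaml"

# Hypothetical helper (not part of timaR): print a parameter file as-is.
show_params <- function(path) {
  if (!file.exists(path)) {
    message("No file found at ", path, " (check the working directory).")
    return(invisible(character(0)))
  }
  lines <- readLines(path, warn = FALSE)
  writeLines(lines)
  invisible(lines)
}

show_params(params_path)
```

Printing the raw lines rather than parsing them avoids any assumption about the keys the file contains.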

source(file = "inst/scripts/weight_annotations.R")
#> 2023-08-29 14:12:53.448922 This script performs taxonomically informed scoring and followed by chemical consistency informed scoring 
#> 2023-08-29 14:12:53.454833 Authors:  AR , PMA 
#>  
#> 2023-08-29 14:12:53.455229 Contributors: ... 
#> ✔ skip target yaml_paths
#> ✔ skip target dic_add
#> ✔ skip target dic_neu_los
#> ✔ skip target paths
#> ✔ skip target par_def_ann_spe
#> ✔ skip target par_def_pre_fea_tab
#> ✔ skip target lib_sop_ecm
#> ✔ skip target par_def_pre_lib_sop_ecm
#> ✔ skip target par_def_cre_edg_spe
#> ✔ skip target par_def_pre_fea_com
#> ✔ skip target par_def_pre_lib_sop_mer
#> ✔ skip target par_def_pre_ann_spe
#> ✔ skip target par_def_pre_lib_rt
#> ▶ start target lib_spe_is_lot_pos
#> File already exists. Skipping.
#> ● built target lib_spe_is_lot_pos [0.001 seconds]
#> ▶ start target lib_spe_is_lot_neg
#> File already exists. Skipping.
#> ● built target lib_spe_is_lot_neg [0.007 seconds]
#> ✔ skip target par_def_pre_tax
#> ✔ skip target par_def_pre_lib_sop_clo
#> ▶ start target lib_sop_lot
#> A file with the same size is already present. Skipping
#> ● built target lib_sop_lot [1.83 seconds]
#> ✔ skip target par_def_fil_ann
#> ✔ skip target par_pre_par
#> ✔ skip target par_def_ann_mas
#> ✔ skip target par_def_pre_fea_edg
#> ✔ skip target par_def_cre_com
#> ✔ skip target par_def_wei_ann
#> ✔ skip target par_def_pre_lib_sop_lot
#> ✔ skip target par_def_pre_lib_add
#> ✔ skip target lib_spe_is_lot_pre_pos
#> ✔ skip target lib_spe_is_lot_pre_neg
#> ✔ skip target par_fin_par
#> ✔ skip target par_usr_cre_edg_spe
#> ✔ skip target par_usr_pre_ann_spe
#> ✔ skip target par_usr_fil_ann
#> ✔ skip target par_usr_pre_lib_rt
#> ✔ skip target par_usr_pre_fea_tab
#> ✔ skip target par_usr_ann_spe
#> ✔ skip target par_usr_pre_fea_com
#> ✔ skip target par_usr_pre_tax
#> ✔ skip target par_usr_pre_lib_sop_clo
#> ✔ skip target par_usr_wei_ann
#> ✔ skip target par_usr_ann_mas
#> ✔ skip target par_usr_pre_lib_sop_mer
#> ✔ skip target par_usr_cre_com
#> ✔ skip target par_usr_pre_lib_sop_ecm
#> ✔ skip target par_usr_pre_lib_sop_lot
#> ✔ skip target par_usr_pre_fea_edg
#> ✔ skip target par_usr_pre_lib_add
#> ✔ skip target par_cre_edg_spe
#> ✔ skip target par_pre_ann_spe
#> ✔ skip target par_fil_ann
#> ✔ skip target par_pre_lib_rt
#> ✔ skip target par_pre_fea_tab
#> ✔ skip target par_ann_spe
#> ✔ skip target par_pre_fea_com
#> ✔ skip target par_pre_tax
#> ✔ skip target par_pre_lib_sop_clo
#> ✔ skip target par_wei_ann
#> ✔ skip target par_ann_mas
#> ✔ skip target par_pre_lib_sop_mer
#> ✔ skip target par_cre_com
#> ✔ skip target par_pre_lib_sop_ecm
#> ✔ skip target par_pre_lib_sop_lot
#> ✔ skip target par_pre_fea_edg
#> ✔ skip target par_pre_lib_add
#> ✔ skip target lib_rt
#> ✔ skip target gnps_tables
#> ✔ skip target lib_sop_clo_pre
#> ✔ skip target lib_sop_ecm_pre
#> ✔ skip target lib_sop_lot_pre
#> ✔ skip target lib_rt_sop
#> ✔ skip target lib_rt_rts
#> ✔ skip target gnps_metadata
#> ✔ skip target gnps_edges
#> ✔ skip target gnps_features
#> ✔ skip target gnps_spectra
#> ✔ skip target gnps_components
#> ✔ skip target lib_sop_mer
#> ✔ skip target input_metadata
#> ✔ skip target input_features
#> ✔ skip target input_spectra
#> ✔ skip target lib_mer_str_tax_cla
#> ✔ skip target lib_mer_org_tax_ott
#> ✔ skip target lib_mer_str_2d_3d
#> ✔ skip target lib_mer_str_met
#> ✔ skip target lib_mer_str_nam
#> ✔ skip target lib_mer_str_tax_npc
#> ✔ skip target lib_mer_key
#> ✔ skip target fea_pre
#> ✔ skip target ann_spe_is_lot_neg
#> ✔ skip target fea_edg_spe
#> ✔ skip target ann_spe_is_lot_pos
#> ✔ skip target tax_pre
#> ✔ skip target lib_add
#> ✔ skip target edg_spe
#> ✔ skip target ann_spe_is_pre
#> ✔ skip target ann_ms1_pre
#> ✔ skip target ann_ms1_pre_ann
#> ✔ skip target ann_ms1_pre_edg
#> ✔ skip target ann_fil
#> ✔ skip target fea_edg_pre
#> ✔ skip target fea_com
#> ✔ skip target int_com
#> ✔ skip target fea_com_pre
#> ▶ start target ann_pre
#> 2023-08-29 14:13:00.510288 Loading files ... 
#> 2023-08-29 14:13:00.513001 ... components 
#> 2023-08-29 14:13:00.619872 ... edges 
#> 2023-08-29 14:13:00.657766 ... structure-organism pairs 
#> 2023-08-29 14:13:06.514085 ... annotations 
#> 2023-08-29 14:13:28.274829 adding biological organism metadata 
#> 2023-08-29 14:13:30.099232 performing taxonomically informed scoring 
#> 2023-08-29 14:13:30.161354 filtering top  3  candidates and keeping only MS1 candidates with minimum 
#>  0.5  biological score 
#>  OR 0.666 chemical score 
#>  
#> 2023-08-29 14:13:43.501027 selecting DB columns 
#>  
#> 2023-08-29 14:13:44.222037 keeping distinct candidates per taxonomical rank 
#>  
#> 2023-08-29 14:13:45.795574 calculating biological scores ... 
#>  
#> 2023-08-29 14:13:45.795786 ... domain 
#>  
#> 2023-08-29 14:13:47.77545 ... kingdom 
#>  
#> 2023-08-29 14:13:50.71623 ... phylum 
#>  
#> 2023-08-29 14:13:53.520924 ... class 
#>  
#> 2023-08-29 14:13:57.230276 ... order 
#>  
#> 2023-08-29 14:14:04.006272 ... family 
#>  
#> 2023-08-29 14:14:08.930156 ... tribe 
#>  
#> 2023-08-29 14:14:11.99058 ... genus 
#>  
#> 2023-08-29 14:14:18.009001 ... species 
#>  
#> 2023-08-29 14:14:26.167878 ... varietas 
#>  
#> 2023-08-29 14:14:27.572494 keeping best biological score only 
#>  
#> 2023-08-29 14:14:51.105946 taxonomically informed scoring led to 
#>  95702 annotations reranked at the kingdom level, 
#>  94484 annotations reranked at the phylum level, 
#>  88995 annotations reranked at the class level, 
#>  29942 annotations reranked at the order level, 
#>  22805 annotations reranked at the family level, 
#>  3196 annotations reranked at the tribe level, 
#>  2265 annotations reranked at the genus level, 
#>  826 annotations reranked at the species level, 
#>  and 0 annotations reranked at the variety level. 
#>  
#> 2023-08-29 14:14:53.025388 adding "notClassified" 
#>  
#> 2023-08-29 14:14:54.768526 calculating chemical consistency
#>               features with at least 2 neighbors ... 
#>  
#> 2023-08-29 14:14:54.768765 ... among edges ... 
#>  
#> 2023-08-29 14:14:59.818068 ... at the (classyfire) kingdom level 
#>  
#> 2023-08-29 14:15:03.815033 ... at the (NPC) pathway level 
#>  
#> 2023-08-29 14:15:08.538888 ... at the (classyfire) superclass level 
#>  
#> 2023-08-29 14:15:12.751976 ... at the (NPC) superclass level 
#>  
#> 2023-08-29 14:15:23.575866 ... at the (classyfire) class level 
#>  
#> 2023-08-29 14:15:36.717266 ... at the (NPC) class level 
#>  
#> 2023-08-29 14:16:07.747641 ... at the (classyfire) parent level 
#>  
#> 2023-08-29 14:16:43.215121 joining all except -1 together 
#>  
#> 2023-08-29 14:16:53.257391 adding dummy consistency for features
#>               with less than 2 neighbors 
#>  
#> 2023-08-29 14:16:56.649831 calculating chemical score at all levels ... 
#>  
#> 2023-08-29 14:16:56.650042 ... (classyfire) kingdom 
#>  
#> 2023-08-29 14:17:05.745146 ... (NPC) pathway 
#>  
#> 2023-08-29 14:17:14.410753 ... (classyfire) superclass 
#>  
#> 2023-08-29 14:17:19.668003 ... (NPC) superclass 
#>  
#> 2023-08-29 14:17:21.22134 ... (classyfire) class 
#>  
#> 2023-08-29 14:17:22.860509 ... (NPC) class 
#>  
#> 2023-08-29 14:17:23.353894 ... (classyfire) parent 
#>  
#> 2023-08-29 14:17:24.327558 ... joining best score 
#>  
#> 2023-08-29 14:17:27.831513 ... cleaning 
#>  
#> 2023-08-29 14:17:51.165036 chemically informed scoring led to 
#>  136002 annotations reranked at the (classyfire) kingdom level, 
#>  100850 annotations reranked at the (NPC) pathway level, 
#>  86961 annotations reranked at the (classyfire) superclass level, 
#>  61647 annotations reranked at the (NPC) superclass level, 
#>  52600 annotations reranked at the (classyfire) class level, 
#>  15047 annotations reranked at the (NPC) class level, and 
#>  15046 annotations reranked at the (classyfire) parent level. 
#>  
#> 2023-08-29 14:17:56.407413 adding initial metadata (RT, etc.) and simplifying columns 
#>  
#> 2023-08-29 14:17:56.818269 adding references 
#>  
#> 2023-08-29 14:18:06.974977 selecting columns to export 
#>  
#> 2023-08-29 14:18:09.745539 adding consensus again to droped candidates 
#>  
#> 2023-08-29 14:18:33.934627 Exporting ... 
#> 2023-08-29 14:18:33.942758 Directory data/processed/230829_141833_example created. 
#> 2023-08-29 14:18:33.943056 ... path to used parameters is data/processed/230829_141833_example 
#> 2023-08-29 14:18:33.952483 ... path to export is data/processed/230829_141833_example/annotations.tsv 
#> ● built target ann_pre [5.562 minutes]
#> ▶ end pipeline [5.642 minutes]
#> 2023-08-29 14:18:34.542438 Script finished in 5.684895 mins
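The log above mentions keeping the top 3 candidates and retaining MS1-only candidates only when they reach a biological score of 0.5 or a chemical score of 0.666. The toy example below sketches one possible reading of that step in base R. It is not the package's implementation: the column names, the data, the helper `filter_candidates()`, and the order of the two filters are all assumptions made for illustration.

```r
# Toy candidate table; every column name and value is an illustrative assumption.
candidates <- data.frame(
  feature_id  = c(1, 1, 1, 1, 2, 2),
  candidate   = c("A", "B", "C", "D", "E", "F"),
  ms1_only    = c(FALSE, TRUE, TRUE, FALSE, TRUE, TRUE),
  score_bio   = c(0.9, 0.6, 0.1, 0.2, 0.0, 0.3),
  score_chem  = c(0.5, 0.1, 0.2, 0.9, 0.7, 0.1),
  score_final = c(0.95, 0.80, 0.60, 0.40, 0.70, 0.50),
  stringsAsFactors = FALSE
)

# Hypothetical helper: thresholds default to the values printed in the log.
filter_candidates <- function(df, top_n = 3, min_bio = 0.5, min_chem = 0.666) {
  # Drop MS1-only candidates that reach neither threshold.
  keep <- !df$ms1_only | df$score_bio >= min_bio | df$score_chem >= min_chem
  df <- df[keep, , drop = FALSE]
  # Rank within each feature by decreasing final score, then keep the top n.
  rank_in_feature <- ave(
    -df$score_final, df$feature_id,
    FUN = function(x) rank(x, ties.method = "first")
  )
  df[rank_in_feature <= top_n, , drop = FALSE]
}

kept <- filter_candidates(candidates)
print(kept$candidate)
```

In this toy table, candidates C and F are MS1-only and fall below both thresholds, so they are dropped before ranking.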

The final exported file is formatted so that it can be easily imported into Cytoscape to further explore your data!
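Before moving to Cytoscape, it can be useful to take a quick look at the exported table in R. The sketch below assumes the layout shown in the log (a timestamped directory under data/processed containing annotations.tsv); `read_latest_annotations()` is a hypothetical helper written for this example, not a timaR function, and no column names are assumed.

```r
# Hypothetical helper: find the most recently modified annotations.tsv
# under an export root and read it as a tab-separated table.
read_latest_annotations <- function(root = "data/processed") {
  files <- list.files(
    root,
    pattern = "^annotations\\.tsv$",
    recursive = TRUE,
    full.names = TRUE
  )
  if (length(files) == 0) {
    message("No annotations.tsv found under ", root)
    return(NULL)
  }
  latest <- files[which.max(as.numeric(file.mtime(files)))]
  utils::read.delim(
    latest,
    sep = "\t",
    quote = "",
    check.names = FALSE,
    stringsAsFactors = FALSE
  )
}

annotations <- read_latest_annotations()
if (!is.null(annotations)) {
  # Show the structure of the first few columns only.
  str(annotations, list.len = 10)
}
```

Selecting by modification time rather than by directory name keeps the helper independent of the exact timestamp format used for the export folder.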

We hope you enjoyed using TIMA and would be pleased to hear from you!

For any remarks or suggestions, please open an issue or feel free to contact us directly.

Adriano Rutz

2023-08-29