Process SMILES strings
Examples
library("tima")
smiles <- "C=C[C@H]1[C@H](OC=C2C1=CCOC2=O)O[C@H]3[C@H]([C@H]([C@H]([C@H](O3)CO)O)O)O"
df <- data.frame(structure_smiles_initial = smiles)
process_smiles(df)
Description
Processes SMILES strings with RDKit (via Python) to standardize structures, generate InChIKeys, calculate molecular properties, and extract 2D representations. When a cache file is supplied, results are cached to avoid reprocessing.
Usage
process_smiles(df, smiles_colname = "structure_smiles_initial", cache = NULL)
Arguments
df
    Data frame containing SMILES strings.
smiles_colname
    Name of the column containing SMILES (default: "structure_smiles_initial").
cache
    Path to a file of cached processed SMILES, or NULL to skip caching.
Value
Data frame with processed SMILES including InChIKey, molecular formula, exact mass, 2D SMILES, xLogP, and connectivity layer
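
When the input column has a name other than the default, it can be passed through smiles_colname. The following is a minimal sketch using only the arguments listed under Usage. The column name "smiles" is a hypothetical choice for illustration, and the call is guarded so that nothing runs when tima is not installed. It also assumes a working Python/RDKit setup, which the function needs.

```r
# Hypothetical input: the SMILES column is named "smiles" instead of the default.
smiles <- "C=C[C@H]1[C@H](OC=C2C1=CCOC2=O)O[C@H]3[C@H]([C@H]([C@H]([C@H](O3)CO)O)O)O"
df <- data.frame(smiles = smiles)

# Guard: only call the function if the tima package is available.
if (requireNamespace("tima", quietly = TRUE)) {
  # cache = NULL skips caching, as described under Arguments.
  processed <- tima::process_smiles(df, smiles_colname = "smiles", cache = NULL)
  str(processed)
}
```

Passing a file path as cache instead of NULL should enable the caching described above. The format of that file is not specified here, so it is left out of the sketch.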