library("tima")
smiles <- "C=C[C@H]1[C@H](OC=C2C1=CCOC2=O)O[C@H]3[C@H]([C@H]([C@H]([C@H](O3)CO)O)O)O"
data.frame(
  "structure_smiles_initial" = smiles
) |>
  process_smiles()

Process SMILES
Description
This function processes SMILES strings with RDKit (called through Python) to standardize structures, generate InChIKeys, calculate molecular properties, and derive 2D representations. When a cache file is supplied, results are cached to avoid reprocessing.
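The caching idea mentioned above can be illustrated with a minimal base-R sketch. It is not tima's implementation: `process_with_cache()`, its CSV cache layout, and `toy_fn()` are hypothetical names introduced here, and `toy_fn()` merely stands in for the expensive RDKit step.

```r
# Hypothetical helper, NOT part of tima: it only illustrates caching.
# `process_fn` stands in for the expensive per-structure step.
process_with_cache <- function(smiles, cache_path, process_fn) {
  # Load previously processed rows, or start from an empty table.
  cached <- if (file.exists(cache_path)) {
    read.csv(cache_path, colClasses = "character")
  } else {
    data.frame(
      smiles = character(0),
      result = character(0),
      stringsAsFactors = FALSE
    )
  }

  # Only SMILES absent from the cache need to be processed.
  todo <- setdiff(unique(smiles), cached$smiles)
  if (length(todo) > 0) {
    fresh <- data.frame(
      smiles = todo,
      result = vapply(todo, process_fn, character(1), USE.NAMES = FALSE),
      stringsAsFactors = FALSE
    )
    cached <- rbind(cached, fresh)
    write.csv(cached, cache_path, row.names = FALSE)
  }

  # Return one row per input SMILES, in input order.
  cached[match(smiles, cached$smiles), , drop = FALSE]
}

# Toy stand-in for the real processing; it counts how often it is called.
calls <- 0
toy_fn <- function(s) {
  calls <<- calls + 1
  toupper(s)
}

cache_file <- tempfile(fileext = ".csv")
first <- process_with_cache(c("CCO", "c1ccccc1"), cache_file, toy_fn)
second <- process_with_cache(c("CCO", "CC(=O)O"), cache_file, toy_fn)
```

In the second call only `"CC(=O)O"` is new, so `toy_fn()` runs three times in total rather than four. That is the saving a cache provides when the same structures recur across runs.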
Usage
process_smiles(df, smiles_colname = "structure_smiles_initial", cache = NULL)
Arguments
df
    Data frame containing SMILES strings to process
smiles_colname
    Character string name of the column containing SMILES (default: "structure_smiles_initial")
cache
    Character string path to cached processed SMILES file, or NULL to skip caching (default: NULL)
Value
Data frame with processed SMILES including InChIKey, molecular formula, exact mass, 2D SMILES, xLogP, and connectivity layer