QSPR approach introduced at the NACRW conference (Fort Lauderdale, Florida)

The discovery of unexpected or emerging contaminants, their by-products and metabolites, and newly produced pesticides through suspect and non-targeted screening is attracting growing attention and opens new horizons in many fields, such as food safety, ecotoxicology, environmental science and health. Nevertheless, moving from annotation to identification can be time-consuming, complex and fraught with difficulty. Predicting liquid chromatographic retention times (RT) by different approaches can be a useful and practical way to discriminate efficiently between several candidate molecular formulas and between several candidate molecular structures.
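To illustrate how a predicted RT can help discriminate between candidates, here is a minimal Python sketch. It is not the method from the talk: the function name `rank_candidates`, the candidate names, the RT values and the 1-minute tolerance are all hypothetical and chosen only for illustration.

```python
def rank_candidates(predicted_rt, observed_rt, tolerance):
    """Keep candidates whose predicted RT lies within `tolerance` (minutes)
    of the observed RT, sorted from smallest to largest RT error."""
    kept = [
        (name, abs(rt - observed_rt))
        for name, rt in predicted_rt.items()
        if abs(rt - observed_rt) <= tolerance
    ]
    return sorted(kept, key=lambda item: item[1])


# Hypothetical predicted RTs (minutes) for four candidate structures
# that share the same molecular formula. These are not real data.
predicted = {
    "candidate_A": 4.2,
    "candidate_B": 6.9,
    "candidate_C": 7.4,
    "candidate_D": 11.0,
}

# Hypothetical observed RT of the unknown feature and tolerance window.
ranked = rank_candidates(predicted, observed_rt=7.0, tolerance=1.0)
for name, error in ranked:
    print(f"{name}: RT error {error:.2f} min")
```

In practice the tolerance would probably be derived from the prediction error of the RT model rather than fixed by hand.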

Developing quantitative structure-retention relationship (QSRR) models, which link a chemical structure to a property (here, the chromatographic retention time), requires an adequate selection of molecular descriptors, which can only be computed from a known chemical structure. It also requires selecting the best machine learning/AI algorithm and optimizing it.
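As a rough illustration of what a QSRR model does, the Python sketch below fits the simplest possible case: an ordinary least-squares line relating one descriptor to RT. The choice of logP as the single descriptor, the helper names `fit_line` and `predict_rt`, and the toy training values are assumptions made for this example. Real models typically combine many descriptors with more flexible algorithms.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Toy training set (not real measurements): one hypothetical descriptor
# value (logP) and one retention time in minutes per compound.
logp = [0.5, 1.0, 2.0, 3.0, 4.0]
rt_min = [2.5, 3.5, 5.5, 7.5, 9.5]

slope, intercept = fit_line(logp, rt_min)


def predict_rt(logp_value):
    """Predict RT (minutes) for a new structure from its descriptor value."""
    return slope * logp_value + intercept


print(f"RT = {slope:.2f} * logP + {intercept:.2f}")
print(f"Predicted RT for logP 2.5: {predict_rt(2.5):.2f} min")
```

The same structure (descriptors in, RT out) carries over when the line is replaced by a multivariate or non-linear learner. The descriptor selection and algorithm optimization mentioned above are what would then determine model quality.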

Here we present different strategies for selecting descriptors and different types of machine learning/AI algorithms suited to the situations we encounter. These strategies may vary with the level of information/annotation available. We conclude by proposing a methodology, based in part on QSRR, to improve and secure the annotation process as published by Schymanski and colleagues in 2014.

We published this article in connection with this presentation: https://doi.org/10.1016/j.talanta.2023.125214