Chemometric Analysis of Inter- and Intra-Molecular Diversification Factors by a Machine Learning Simplex Approach. A Review and Research on Astragalus saponins

[Vol. 17, Issue 25]


Abir Sarraj-Laabidi, Habib Messai, Asma Hammami-Semmar and Nabil Semmar. Pages 2820-2848 (29)


Metabolic systems are highly organized and strongly regulated, and they satisfy the mass conservation principle. As a result, a whole chemical resource is competitively shared between several pathways at both intra- and inter-molecular scales. This sharing can be expressed statistically as a constant unit-sum constraint, which is the basis of the simplex mixture rule. In this work, a new simplex-based simulation approach was developed to extract scaffold information on the metabolic processes controlling molecular diversity from a wide set of observed chemical structures. Starting from a dataset of chemical structures initially classified into p clusters, a machine learning process linearly combined the p clusters (indexed by j) through N samplings of a constant number n of molecules, with each sampling respecting the cluster weights (wj/w) given by Scheffé's mixture matrix. At the output of the mixture design, the N molecular linear combinations yielded N barycentric molecules integrating the characteristics of the differently weighted clusters. The mixture design was iterated by a bootstrap technique for extensive exploration of chemical variability between and within clusters. Finally, the K response matrices resulting from the K iterated mixture designs were averaged into a smoothed matrix containing scaffold information on the regulation processes responsible for molecular diversification at inter- and intra-molecular (atomic) scales. This matrix was used as a backbone for graphical analysis of multidirectional positive and negative trends between atomic characteristics (chemical substitutions) at both scales. The new simplex approach was illustrated on cycloartane-based saponins of the genus Astragalus by combining three desmosylation clusters characterized by the relative glycosylation levels of different aglycone carbons.
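The procedure summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a Scheffé {p, q} simplex-lattice as the mixture matrix, a feature matrix X with one row per molecule (for example, glycosylation indicators per aglycone carbon), and sampling with replacement inside each cluster. The function names simplex_lattice and smoothed_response, and the parameters q and seed, are hypothetical and do not come from the article.

```python
import itertools
import numpy as np


def simplex_lattice(p, q):
    """Integer rows of a Scheffe {p, q} simplex-lattice (each row sums to q)."""
    rows = [c for c in itertools.product(range(q + 1), repeat=p) if sum(c) == q]
    return np.array(rows, dtype=int)


def smoothed_response(X, labels, q=4, n=20, K=100, seed=0):
    """Average K iterated mixture designs into one smoothed response matrix.

    X      : (m, d) array, one row of atomic characteristics per molecule
    labels : (m,) array of cluster labels (p distinct values)
    q      : lattice degree (assumption: weights are multiples of 1/q)
    n      : constant number of molecules drawn per mixture
    K      : number of bootstrap iterations of the whole design
    """
    if n % q != 0:
        raise ValueError("n must be a multiple of q so that counts are integers")
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)
    clusters = np.unique(labels)

    lattice = simplex_lattice(len(clusters), q)  # N x p integer design
    design = lattice / q                         # cluster weights, rows sum to 1
    counts = lattice * (n // q)                  # molecules to draw per cluster

    total = np.zeros((len(lattice), X.shape[1]))
    for _ in range(K):                           # K iterated mixture designs
        for i, row in enumerate(counts):         # N mixtures per design
            picks = [
                # Assumption: molecules are drawn with replacement in a cluster.
                X[rng.choice(np.flatnonzero(labels == c), size=k, replace=True)]
                for c, k in zip(clusters, row)
                if k > 0
            ]
            # Barycentric molecule: mean of the n sampled molecules.
            total[i] += np.vstack(picks).mean(axis=0)
    return design, total / K
```

With p = 3 clusters and q = 4, this lattice has 15 mixtures, so the smoothed matrix has 15 rows and one column per atomic characteristic. Trends between columns of that matrix could then be inspected graphically, as the abstract describes.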


Computational chemistry, Simulation, Training, Molecular diversity, Cycloartane, Glycosylation, Desmosylation.


Institut Pasteur de Tunis, Laboratory of Bioinformatics, Biomathematics and Biostatistics (BIMS), Université de Tunis El Manar, 1002, Tunis; Institut Pasteur de Tunis, Laboratory of Biomedical Genomics and Oncogenetics, Université de Tunis El Manar, 1002, Tunis; National Institute of Applied Sciences and Technology (INSAT), Université de Carthage, 1080, Tunis; Université de Tunis El Manar, Institut Pasteur de Tunis, Laboratory of Bioinformatics, Biomathematics and Biostatistics (BIMS), 1002, Tunis
