Discrete Derivatives for Atom-Pairs as a Novel Graph Theoretical Invariant for Generating New Molecular Descriptors: Orthogonality, Interpretation and QSARs/QSPRs on Benchmark Databases

This report presents a new mathematical method for codifying chemical structure information, based on the concept of the derivative of a molecular graph (G) with respect to a given event (S). The derivative over each pair of atoms in the molecule is defined as ∂G/∂S(vi, vj) = (fi − 2fij + fj)/fij, where fi (or fj) is the individual frequency of atom i (or j) and fij is the reciprocal frequency of atoms i and j. These frequencies characterize the participation intensity of atom pairs in S. Here, the event space is composed of the molecular sub-graphs that participate in forming the skeleton of G; it may be complete (representing all possible connected sub-graphs) or restricted to sub-graphs of certain orders or types, or combinations of these. The atom-level graph derivative index, Δi, is expressed as a linear combination of all atom-pair derivatives that include the atomic nucleus i. Global [total or local (group or atom-type)] indices are obtained by applying so-called invariants over the vector of Δi values. The novel molecular descriptors (MDs) are validated using a data set of 28 alkyl-alcohols and other benchmark data sets proposed by the International Academy of Mathematical Chemistry. In addition, the boiling point of the alcohols, the adrenergic blocking activity of N,N-dimethyl-2-halo-phenethylamines, and physicochemical properties of polychlorinated biphenyls and octanes are modeled. These models exhibit satisfactory predictive power compared with other 0–3D indices implemented successfully by other researchers. The tendencies of the proposed indices are also investigated using examples of various types of molecular structures that differ in chain length, branching, heteroatom content, and multiple bonds. Furthermore, the relation of the atom-based derivative indices with 17O NMR of a series of ethers and carbonyls indicates that the new MDs encode electronic, topological and steric information.
Linear independence between the graph derivative indices and other 0–3D MDs is demonstrated by using principal component analysis on a data set of 41 heterogeneous molecules. It is concluded that the graph derivative indices are independent indices containing important structural information to be used in QSPR/QSAR and drug design studies, and that they yield simpler, more interpretable and more robust mathematical models than the majority of those reported in the literature.
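The atom-pair derivative defined in the abstract, (fi − 2fij + fj)/fij, can be illustrated with a minimal Python sketch. Several choices here are assumptions made only for illustration and are not taken from the paper: frequencies are counted as the number of sub-graphs in which an atom (or an atom pair) appears, pairs that never co-occur (fij = 0) are skipped, and Δi is taken as the plain sum of the pair derivatives involving atom i (a linear combination with unit coefficients). The function names and the hand-listed sub-graphs of a three-atom path are hypothetical.

```python
from itertools import combinations


def pair_derivative(f_i, f_j, f_ij):
    """Return (f_i - 2*f_ij + f_j) / f_ij, or None when f_ij is zero.

    Returning None for pairs that never co-occur is an assumption of
    this sketch; the abstract does not say how such pairs are treated.
    """
    if f_ij == 0:
        return None
    return (f_i - 2 * f_ij + f_j) / f_ij


def atom_indices(n_atoms, subgraphs):
    """Sum the pair derivatives that involve each atom.

    `subgraphs` is a list of sets of atom labels (the event space S).
    Counting frequencies as sub-graph memberships and using unit
    coefficients in the sum are both illustrative assumptions.
    """
    # Individual frequency: number of sub-graphs containing atom i.
    f = [sum(1 for s in subgraphs if i in s) for i in range(n_atoms)]
    delta = [0.0] * n_atoms
    for i, j in combinations(range(n_atoms), 2):
        # Reciprocal frequency: number of sub-graphs containing both atoms.
        f_ij = sum(1 for s in subgraphs if i in s and j in s)
        d = pair_derivative(f[i], f[j], f_ij)
        if d is not None:
            delta[i] += d
            delta[j] += d
    return delta


# Hypothetical event space: all connected sub-graphs of the path 0-1-2,
# listed by hand as sets of atom labels.
PATH3_SUBGRAPHS = [{0}, {1}, {2}, {0, 1}, {1, 2}, {0, 1, 2}]

if __name__ == "__main__":
    print(atom_indices(3, PATH3_SUBGRAPHS))
```

In this toy case the two terminal atoms receive the same value and the central atom a different one, which is the kind of atom-level discrimination the Δi vector is meant to carry before invariants are applied to it.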
Generalized incidence matrix, Sub-graph, Invariant, Molecular descriptors, DIVATI, Genetic algorithm, QS
Martínez‐Santiago, O., Millán‐Cabrera, R., Marrero‐Ponce, Y., Barigye, S. J., Martínez‐López, Y., Torrens, F., & Pérez‐Giménez, F. (2014). Discrete Derivatives for Atom‐Pairs as a Novel Graph‐Theoretical Invariant for Generating New Molecular Descriptors: Orthogonality, Interpretation and QSARs/QSPRs on Benchmark Databases. Molecular Informatics, 33(5), 343-368. DOI: 10.1002/minf.201300173