Article Highlight | 9-May-2026

Deep learning‑enhanced QSPR model improves prediction of supercritical properties for thousands of organic compounds

Higher Education Press

Supercritical fluids are essential in cleaning, extraction, and chromatography, but determining their critical temperature (Tc) and critical pressure (pc) experimentally is challenging, especially for polar or thermally unstable compounds. A study published in Frontiers of Chemical Science and Engineering introduces a deep learning-enhanced quantitative structure-property relationship (QSPR) framework that directly incorporates complete molecular structures to boost predictive accuracy.

The team compiled a diverse data set of 1359 organic compounds (alkanes, alkenes, alcohols, acids, esters, etc.). Using density functional theory (B3LYP/6-311G(d,p)), they optimized each molecular structure and extracted three-dimensional electron density grids. They calculated 400 molecular descriptors with RDKit and used the maximal information coefficient (MIC) to select the 20 most relevant descriptors for Tc and pc.
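
The descriptor-selection step can be illustrated with a minimal Python sketch. It is not the authors' code: MIC itself is not implemented here, and a simple binned mutual-information score stands in for it as an assumption. The function names, the bin count, and the toy descriptor table are all hypothetical.

```python
import math
from collections import Counter


def binned_mutual_information(x, y, bins=4):
    """Mutual information (in nats) of two sequences after equal-width binning.

    Assumption: this is a simple stand-in for MIC, not MIC itself.
    """
    def to_bins(values):
        lo, hi = min(values), max(values)
        width = (hi - lo) / bins or 1.0  # constant column: everything in bin 0
        return [min(int((v - lo) / width), bins - 1) for v in values]

    bx, by = to_bins(x), to_bins(y)
    n = len(x)
    joint = Counter(zip(bx, by))
    px, py = Counter(bx), Counter(by)
    return sum(
        (c / n) * math.log(c * n / (px[i] * py[j]))
        for (i, j), c in joint.items()
    )


def select_top_descriptors(descriptors, target, k):
    """Return the k descriptor names with the highest dependence score."""
    scores = {
        name: binned_mutual_information(column, target)
        for name, column in descriptors.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical toy table: one informative descriptor, one constant one.
toy_descriptors = {
    "informative": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "constant": [5.0] * 8,
}
toy_target = [v * v for v in toy_descriptors["informative"]]

selected = select_top_descriptors(toy_descriptors, toy_target, k=1)
```

In the study the same idea is applied at a larger scale: 400 descriptors are scored against Tc and pc, and the 20 highest-ranked are kept.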

Three models were built and compared:

  1. Traditional ANN (descriptor-only). Validation set: for Tc, R² = 0.865, MAPE = 4.14 %; for pc, R² = 0.913, MAPE = 4.77 %.
  2. 3D ResNet (CNN) (molecular structure only). The model suffered severe overfitting due to the high-dimensional input (a 108×98×44 grid, roughly 466,000 points per molecule) and limited data; validation R² dropped to 0.419 for Tc and 0.658 for pc.
  3. CNN-enhanced ANN: a pretrained (frozen) ANN predicts from descriptors, while a trainable ResNet learns the residual error. This hybrid strategy achieved the best performance. Validation results:
    • Tc: R² = 0.888, Pearson r = 0.947, MAPE = 5.03 %, MSE = 1682.
    • pc: R² = 0.919, Pearson r = 0.960, MAPE = 6.37 %, MSE = 11.7.
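
The residual-learning idea behind the third model can be sketched in a few lines of Python. This is a toy illustration under loud assumptions, not the published architecture: both the frozen ANN and the trainable ResNet are replaced here by one-dimensional least-squares lines, and the data, variable names, and the single "structure feature" are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares for y ~ a * x + b; returns (a, b)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy data: the target depends on a descriptor and on a
# structure-derived feature that the descriptor model never sees.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]
structure_feature = [0.5, -1.0, 1.5, -0.5, 1.0]
target = [2.0 * d + 3.0 * s for d, s in zip(descriptor, structure_feature)]

# Stage 1: fit the base model on descriptors, then freeze its parameters.
a_base, b_base = fit_line(descriptor, target)
base_pred = [a_base * d + b_base for d in descriptor]

# Stage 2: fit a second model to the residual error of the frozen base model.
residual = [t - p for t, p in zip(target, base_pred)]
a_res, b_res = fit_line(structure_feature, residual)

# Final prediction = frozen base prediction + learned correction.
hybrid_pred = [
    p + a_res * s + b_res for p, s in zip(base_pred, structure_feature)
]

base_error = mse(target, base_pred)
hybrid_error = mse(target, hybrid_pred)
```

Because the second model only has to explain what the first one missed, it starts from a reasonable baseline, which is one plausible reason the hybrid avoids the overfitting seen when the ResNet is trained on structure alone.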

Ten-fold cross-validation confirmed robustness (average R² = 0.875 for Tc, 0.924 for pc). All models were compared with the Joback group contribution method, which gave R² = 0.815 (MAPE = 4.92 %) for Tc and R² = 0.916 (MAPE = 5.64 %) for pc, and failed to predict 74 compounds. In terms of R², the CNN-enhanced ANN outperformed both the Joback method and the descriptor-only ANN for Tc and pc alike.
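
For readers who want to reproduce this kind of evaluation, the following Python sketch shows one common way to build ten-fold splits and compute R² and MAPE. It is an assumption-laden illustration: the release does not say how the folds were drawn, and the function names and the fixed random seed are hypothetical.

```python
import random


def kfold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and return k (train, validation) index pairs."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    all_indices = set(indices)
    return [
        (sorted(all_indices - set(fold)), sorted(fold)) for fold in folds
    ]


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(
        abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)
    ) / len(y_true)


# 1359 compounds split into 10 folds, as in the study's data set size.
splits = kfold_indices(1359, 10)
fold_sizes = sorted(len(validation) for _, validation in splits)
```

With 1359 compounds, nine folds hold 136 validation samples and one holds 135. The reported averages would then correspond to computing R² on each validation fold and taking the mean over the ten folds.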

The authors note that while their broad-compound model has slightly lower accuracy than models specialized for single classes, it provides reliable predictions across a very wide chemical space. A current limitation is the need for DFT optimization, which raises the barrier for non-specialists; future work will explore faster methods or transfer learning.

This study demonstrates that integrating complete molecular structure information via deep learning can significantly enhance QSPR models, offering a powerful tool for predicting supercritical properties of organic compounds in engineering applications.
