Wiley, one of the world’s largest publishers and a global leader in research and learning, today announced the release of the new Wiley Database of Predicted IR Spectra. The database combines more than 60 years of expertise in infrared (IR) spectroscopy and spectral data curation with current machine-learning techniques to significantly expand the volume of IR spectral data available for spectral analysis.
The new library contains over 250,000 predicted spectra. Wiley Science Solutions created it using an AI-powered spectrum prediction engine derived from its high-quality Fourier-transform infrared spectroscopy (FTIR) empirical spectral database collection, the largest commercially available. The predicted library can be used alongside Wiley’s empirical IR spectral reference databases in the spectral analysis of unknown samples, and it is especially useful for rarer compounds and materials for which no match can be found in any empirical database.
“Leveraging AI, we’re proud to have achieved such high levels of accuracy and performance levels approaching that of empirical libraries,” said Graeme Whitley, Director, New Business Development at Wiley. “We are committed to continuing our development and progress in this area to help scientists to better solve the most universal analysis problems.”
The Wiley Database of Predicted IR Spectra is a general IR library that covers a broad range of chemical compound classes, including general organics, flavors and fragrances, industrial compounds, androstanes, estrogens and steroids, metabolites, lipids, geochemicals, petrochemicals, biomarkers, drugs, pharmaceuticals, pesticides, toxicology, terpenes and other volatiles found in food, natural products, monomers, PFAS, and more.
Wiley conducted both external and internal studies by subject matter experts (SMEs) to validate results from the predicted database. From these two studies, the SMEs concluded that the new database characterizes the functional groups in unknown spectra and performs well when sample spectra are searched against it. They suggest that the predicted library is best used when a search of an empirical library yields low hit quality index (HQI) scores, poor matches, or no matches. In those cases, it can help users classify unknown compounds and determine their structural characteristics and possible identity.
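The workflow the SMEs suggest can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Wiley's implementation or the KnowItAll API: the `Hit` record, the two search callables, the 0-100 HQI scale, and the `min_hqi` cutoff of 80 are all hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Hit:
    name: str
    hqi: float   # hit quality index; a 0-100 scale is assumed here
    source: str  # "empirical" or "predicted"


# Hypothetical stand-in for a library search: spectrum in, hits out.
SearchFn = Callable[[Sequence[float]], List[Hit]]


def search_with_fallback(
    spectrum: Sequence[float],
    search_empirical: SearchFn,
    search_predicted: SearchFn,
    min_hqi: float = 80.0,  # assumed cutoff, not a published value
) -> List[Hit]:
    """Search the empirical library first; consult the predicted
    library only when the best empirical hit is weak or missing."""
    empirical = sorted(search_empirical(spectrum),
                       key=lambda h: h.hqi, reverse=True)
    # A strong empirical match is kept as-is.
    if empirical and empirical[0].hqi >= min_hqi:
        return empirical
    # Low HQI or no matches: add predicted hits and re-rank everything.
    predicted = search_predicted(spectrum)
    return sorted(empirical + predicted,
                  key=lambda h: h.hqi, reverse=True)
```

In this sketch the predicted hits supplement the weak empirical hits rather than replace them, so the analyst still sees both sources side by side and can weigh them accordingly.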
While Wiley has one of the largest, most extensive commercially available libraries of high-quality empirical infrared data, its total coverage is significantly smaller than the overall chemical space (i.e., the total number of possible molecules and compounds within a set of elements and rules) in use by chemists, life scientists, and materials scientists. Augmenting the empirical coverage within the bounds of a predictive model (the chemical space of the underlying training set) is a strategy to help improve the overall density of coverage within that space for identification of unknowns, especially for novel compounds.
The Wiley Database of Predicted IR Spectra is available for use exclusively with Wiley’s KnowItAll software, a comprehensive solution for spectral analysis and management.