Chemically Standardized Dataset of 512 Kinases for Statistical Modeling
Dataset posted on 13.02.2020 by Lindsey Burggraaff
Compound dataset consisting of chemical structures and bioactivity classes for 512 kinases. Structures are provided as InChIKeys, and bioactivity is given as a binary label: active (pChEMBL >= 6.5) or inactive (pChEMBL < 6.5). The meaning of the pChEMBL value is explained at https://www.ebi.ac.uk/chembl/. The compound structures were chemically standardised by neutralising charges, removing salts, and keeping the largest fragment. The dataset was used for training and validating statistical models: quantitative structure-activity relationship (QSAR) and proteochemometric (PCM) models.
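
The activity labelling described above can be illustrated with a minimal Python sketch. The function and constant names are hypothetical and are not part of the dataset. Only the 6.5 cut-off comes from the description. The conversion helper assumes the usual reading of pChEMBL as the negative base-10 logarithm of a molar potency, and it is included only to show what the threshold means in concentration terms.

```python
import math

# Cut-off stated in the dataset description.
ACTIVITY_THRESHOLD = 6.5


def pchembl_from_nanomolar(value_nm: float) -> float:
    """Convert a potency in nM to a -log10(molar) value.

    Assumption: pChEMBL is read as -log10 of a molar potency
    (for example an IC50 or Ki); this helper is illustrative only.
    """
    if value_nm <= 0:
        raise ValueError("potency must be positive")
    return -math.log10(value_nm * 1e-9)


def label_activity(pchembl: float) -> str:
    """Return the binary class used in the dataset description."""
    return "active" if pchembl >= ACTIVITY_THRESHOLD else "inactive"


if __name__ == "__main__":
    # Hypothetical potencies in nM, not taken from the dataset.
    for nm in (10.0, 100.0, 1000.0):
        p = pchembl_from_nanomolar(nm)
        print(f"{nm:>7.1f} nM -> pChEMBL {p:.2f} -> {label_activity(p)}")
```

Under that assumption, the 6.5 threshold corresponds to a potency of roughly 316 nM, so more potent compounds (lower concentrations) are labelled active.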