1. Development of a protein-ligand extended connectivity (PLEC) fingerprint and its application for binding affinity predictions
- Author
- Pawel Siedlecki, Michał Kukiełka, Marta M. Stepniewska-Dziubinska, and Maciej Wójcikowski
- Subjects
Statistics and Probability; Computer science; Protein Data Bank (RCSB PDB); Computational biology; ENCODE; Ligands; Biochemistry; Machine Learning; 03 medical and health sciences; Molecule; Databases, Protein; Molecular Biology; 030304 developmental biology; 0303 health sciences; Drug discovery; Ligand; 030302 biochemistry & molecular biology; Fingerprint (computing); A protein; Proteins; Construct (python library); Ligand (biochemistry); Small molecule; Original Papers; Structural Bioinformatics; Computer Science Applications; Computational Mathematics; Computational Theory and Mathematics; Cheminformatics; Protein Binding
- Abstract
Motivation: Fingerprints (FPs) are the most common small molecule representation in cheminformatics. There is a wide variety of FPs, and the Extended Connectivity Fingerprint (ECFP) is one of the best suited for general applications. Despite the overall abundance of FPs, only a few represent the 3D structure of the molecule, and hardly any encode protein–ligand interactions.
Results: Here, we present a Protein–Ligand Extended Connectivity (PLEC) FP that implicitly encodes protein–ligand interactions by pairing the ECFP environments from the ligand and the protein. PLEC FPs were used to construct different machine learning models tailored for predicting protein–ligand affinities (pKi/d). Even the simplest linear model built on the PLEC FP achieved Rp = 0.817 on the PDBbind v2016 'core set', demonstrating its descriptive power.
Availability and implementation: The PLEC FP has been implemented in the Open Drug Discovery Toolkit (https://github.com/oddt/oddt).
Supplementary information: Supplementary data are available at Bioinformatics online.
- Published
- 2018
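
The pairing idea described in the abstract can be illustrated with a toy sketch. This is not the ODDT implementation: the environment labels below are made-up strings standing in for ECFP-style identifiers, the contact list is given by hand rather than derived from 3D coordinates, and the function names (`hash_pair`, `toy_plec`), the SHA-1 hashing, the folding size of 4096, and the rule for pairing unequal depths are all assumptions chosen for illustration.

```python
import hashlib
from collections import Counter


def hash_pair(ligand_env, protein_env, size):
    """Map one (ligand environment, protein environment) pair to a bin index."""
    key = f"{ligand_env}|{protein_env}".encode("utf-8")
    digest = hashlib.sha1(key).digest()
    return int.from_bytes(digest[:8], "big") % size


def toy_plec(contacts, ligand_envs, protein_envs, size=4096):
    """Count hashed environment pairs over all contacting atom pairs.

    contacts     -- iterable of (ligand_atom, protein_atom) index pairs
    ligand_envs  -- dict: atom index -> environment labels, one per depth
    protein_envs -- dict: atom index -> environment labels, one per depth
    """
    counts = Counter()
    for lig_atom, prot_atom in contacts:
        lig = ligand_envs[lig_atom]
        prot = protein_envs[prot_atom]
        # Assumption: environments are paired depth by depth; once the
        # shallower side runs out, its deepest environment is reused.
        for depth in range(max(len(lig), len(prot))):
            lig_env = lig[min(depth, len(lig) - 1)]
            prot_env = prot[min(depth, len(prot) - 1)]
            counts[hash_pair(lig_env, prot_env, size)] += 1
    return counts


# Hypothetical environments: three depths for ligand atoms, five for protein atoms.
ligand_envs = {
    0: ["N", "N(C)", "N(C)(CC)"],
    1: ["O", "O(C)", "O(C)(CN)"],
}
protein_envs = {
    10: ["O", "O(C)", "O(C)(CO)", "O(C)(CO)(CC)", "O(C)(CO)(CC)(CN)"],
    11: ["N", "N(C)", "N(C)(CC)", "N(C)(CC)(CO)", "N(C)(CC)(CO)(CN)"],
}
contacts = [(0, 10), (1, 11)]

fp = toy_plec(contacts, ligand_envs, protein_envs)
```

The resulting sparse count vector is the kind of input a linear model could be fitted on; for real structures, the ODDT repository linked in the abstract is the reference implementation.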