1. MDSubSampler: a posteriori sampling of important protein conformations from biomolecular simulations
- Author
Oues, N, Dantu, SC, Patel, RJ, and Pandini, A
- Abstract
Data availability: A sample of the trajectory data is included in the GitHub repository. The data underpinning this publication can be accessed from Brunel University London's data repository under a CC BY license: https://doi.org/10.17633/rd.brunel.c.6620539 .
Supplementary information: Supplementary data are available at Bioinformatics online at https://academic-oup-com.ezproxytest.brunel.ac.uk/bioinformatics/advance-article/doi/10.1093/bioinformatics/btad427/7221036?searchresult=1#supplementary-data .
Copyright © The Author(s) 2023.
Motivation: Molecular dynamics (MD) simulations have become routine tools for the study of protein dynamics and function. Thanks to faster GPU-based algorithms, atomistic and coarse-grained simulations are being used to explore biological functions over the microsecond timescale, yielding terabytes of data spanning multiple trajectories. Extracting relevant protein conformations without losing important information is therefore often challenging.
Results: We present MDSubSampler, a Python library and toolkit for a posteriori subsampling of data from multiple trajectories. The toolkit provides uniform, random, stratified, weighted and bootstrapping sampling methods. Sampling can be performed under the constraint of preserving the original distribution of relevant geometrical properties. Possible applications include simulation post-processing, noise reduction and structure selection for ensemble docking.
Availability: MDSubSampler is freely available at https://github.com/alepandini/MDSubSampler, along with guidance on installation and tutorials on how it can be used.
Funding: NO is supported by a scholarship from Brunel University London EPSRC DTP (grant no. EP/T518116/1). This project made use of time on HPC granted via the UK High-End Computing Consortium for Biomolecular Simulation, HECBioSim (https://www.hecbiosim.ac.uk), supported by EPSRC (grant no. EP/X035603/1).
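Illustrative note: three of the sampling strategies named in the abstract (uniform, random and stratified) can be sketched in a few lines of standard-library Python. This is only a minimal sketch under stated assumptions and does not reproduce MDSubSampler's API. The function names, the `n_bins` and `seed` parameters, and the synthetic per-frame property standing in for a geometrical measure such as RMSD are all hypothetical.

```python
import random


def uniform_sample(n_frames, size):
    """Pick `size` frame indices at evenly spaced intervals."""
    step = n_frames / size
    return [int(i * step) for i in range(size)]


def random_sample(n_frames, size, seed=0):
    """Pick `size` distinct frame indices at random, without replacement."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(n_frames), size))


def stratified_sample(values, size, n_bins=10, seed=0):
    """Pick roughly `size` frame indices so that each bin of the property
    keeps approximately its original share of frames."""
    rng = random.Random(seed)
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if all values are equal

    # Group frame indices by the bin their property value falls into.
    bins = [[] for _ in range(n_bins)]
    for idx, value in enumerate(values):
        b = min(int((value - lo) / width), n_bins - 1)
        bins[b].append(idx)

    picked = []
    for members in bins:
        if not members:
            continue
        # Allocate frames in proportion to the bin population,
        # keeping at least one frame from every populated bin.
        k = max(1, round(size * len(members) / len(values)))
        picked.extend(rng.sample(members, min(k, len(members))))
    return sorted(picked)


# Hypothetical per-frame property (synthetic values, not real trajectory data).
rng = random.Random(42)
rmsd = [rng.gauss(2.0, 0.5) for _ in range(1000)]
subset = stratified_sample(rmsd, size=100)
```

Because of per-bin rounding and the one-frame minimum, the stratified subset may contain slightly more or fewer frames than requested; the trade-off is that sparsely populated regions of the property distribution stay represented.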
- Published
- 2023