1. Classification of particle trajectories in living cells: machine learning versus statistical testing hypothesis for fractional anomalous diffusion
- Author
Joanna Janczura, Patrycja Kowalek, Janusz Szwabiński, Aleksander Weron, and Hanna Loch-Olszewska
- Subjects
Cell Survival, Anomalous diffusion, Computer science, FOS: Physical sciences, Machine learning, Models, Biological, 01 natural sciences, Quantitative Biology - Quantitative Methods, Receptors, G-Protein-Coupled, 010305 fluids & plasmas, Diffusion, Machine Learning, Set (abstract data type), GTP-Binding Proteins, 0103 physical sciences, Physics - Biological Physics, 010306 general physics, Quantitative Methods (q-bio.QM), Statistical hypothesis testing, Biological Transport, Class (biology), Single Molecule Imaging, Random forest, Identification (information), Biological Physics (physics.bio-ph), FOS: Biological sciences, Trajectory, Artificial intelligence, Gradient boosting
- Abstract
Single-particle tracking (SPT) has become a popular tool to study the intracellular transport of molecules in living cells. Inferring the character of their dynamics is important, because it determines the organization and functions of the cells. For this reason, one of the first steps in the analysis of SPT data is the identification of the diffusion type of the observed particles. The most popular method to identify the class of a trajectory is based on the mean square displacement (MSD). However, due to its known limitations, several other approaches have already been proposed. With the recent advances in algorithms and the development of modern hardware, the classification attempts rooted in machine learning (ML) are of particular interest. In this work, we adapt two ML ensemble algorithms, i.e. random forest and gradient boosting, to the problem of trajectory classification. We present a new set of features used to transform the raw trajectory data into input vectors required by the classifiers. The resulting models are then applied to real data for G protein-coupled receptors and G proteins. The classification results are compared to recent statistical methods going beyond MSD.
32 pages, 5 figures
- Published
- 2020
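
The abstract names the mean square displacement (MSD) as the most popular baseline for identifying the diffusion type of a trajectory. As a rough illustration of that baseline only (not the authors' random forest or gradient boosting models, and not their feature set), the sketch below estimates the anomalous exponent alpha from the time-averaged MSD, assuming MSD ~ lag^alpha, and labels the trajectory accordingly. The function names, the choice of 10 lags, and the tolerance of 0.2 around alpha = 1 are all assumptions made for this example.

```python
import math
import random


def tamsd(xs, ys, lag):
    """Time-averaged mean square displacement of a 2D trajectory at one lag."""
    n = len(xs) - lag
    return sum(
        (xs[i + lag] - xs[i]) ** 2 + (ys[i + lag] - ys[i]) ** 2
        for i in range(n)
    ) / n


def anomalous_exponent(xs, ys, max_lag=10):
    """Least-squares slope of log(MSD) against log(lag), i.e. alpha in MSD ~ lag**alpha."""
    lags = range(1, max_lag + 1)
    log_lag = [math.log(k) for k in lags]
    log_msd = [math.log(tamsd(xs, ys, k)) for k in lags]
    mean_x = sum(log_lag) / len(log_lag)
    mean_y = sum(log_msd) / len(log_msd)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(log_lag, log_msd))
    sxx = sum((a - mean_x) ** 2 for a in log_lag)
    return sxy / sxx


def classify(alpha, tol=0.2):
    """Map alpha to a diffusion class; tol is an illustrative threshold, not from the paper."""
    if alpha < 1 - tol:
        return "subdiffusion"
    if alpha > 1 + tol:
        return "superdiffusion"
    return "normal diffusion"


def brownian(n, seed=0):
    """Simulate a 2D Brownian trajectory with unit-variance Gaussian steps."""
    rng = random.Random(seed)
    xs, ys = [0.0], [0.0]
    for _ in range(n):
        xs.append(xs[-1] + rng.gauss(0, 1))
        ys.append(ys[-1] + rng.gauss(0, 1))
    return xs, ys


if __name__ == "__main__":
    bx, by = brownian(5000)
    alpha = anomalous_exponent(bx, by)
    print(round(alpha, 2), classify(alpha))
```

A fixed threshold on a fitted exponent is sensitive to trajectory length and measurement noise, which is one reason alternatives such as the ML classifiers and statistical tests mentioned in the abstract are of interest.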