selectBoost: a general algorithm to enhance the performance of variable selection methods

Authors :: Nicolas Jung
Laurent Vallat
Ismail Aouadi
Myriam Maumy-Bertrand
Seiamak Bahram
Raphael Carapito
Frédéric Bertrand
Institut de Recherche Mathématique Avancée (IRMA)
Université de Strasbourg (UNISTRA)-Centre National de la Recherche Scientifique (CNRS)
Institut de Recherche en Mathématiques, Interactions et Applications (Labex_IRMIA)
Fédération Hospitalo-Universitaire (OMICARE)
Centre de Recherche d’Immunologie et d’Hématologie [Strasbourg]
Laboratoire Modélisation et Sûreté des Systèmes (LM2S)
Laboratoire Informatique et Société Numérique (LIST3N)
Université de Technologie de Troyes (UTT)
Immuno-Rhumatologie Moléculaire
Université de Strasbourg (UNISTRA)-Institut National de la Santé et de la Recherche Médicale (INSERM)
Laboratoire International Associé (LIA) INSERM, Strasbourg (France) - Nagano (Japan), Strasbourg, France
Source :: Bioinformatics, Oxford University Press (OUP), 2021, 37 (5), pp.659-668. ⟨10.1093/bioinformatics/btaa855⟩
Publication Year :: 2021
Publisher :: HAL CCSD, 2021.
Abstract:

Motivation: With the growth of big data, variable selection has become one of the critical challenges in statistics. Although many methods have been proposed in the literature, their performance in terms of recall (sensitivity) and precision (positive predictive value) is limited when the number of variables far exceeds the number of observations or when the variables are highly correlated.

Results: In this article, we propose a general algorithm that improves the precision of any existing variable selection method. The algorithm relies on computationally intensive simulations and takes into account the correlation structure of the data. It can either produce a confidence index for variable selection or be used for experimental design planning. We demonstrate the performance of our algorithm on both simulated and real data. We then apply it in two different ways to improve biological network reverse-engineering.

Availability and implementation: Code is available as the SelectBoost package on CRAN, https://cran.r-project.org/package=SelectBoost. Some network reverse-engineering functionalities are available in the Patterns CRAN package, https://cran.r-project.org/package=Patterns.

Supplementary information: Supplementary data are available at Bioinformatics online.
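
The abstract measures variable selection by recall and precision. As a small illustration, not taken from the article or from the SelectBoost package, the sketch below computes both quantities for a selected set of variables against a known true support. The function name `precision_recall` and the example indices are hypothetical, and returning 0.0 for an empty set is only a convention chosen here.

```python
def precision_recall(selected, true_support):
    """Return (precision, recall) of a selected variable set.

    precision = true positives / number of selected variables
    recall    = true positives / number of truly relevant variables
    Empty denominators yield 0.0 (a convention assumed for this sketch).
    """
    selected = set(selected)
    true_support = set(true_support)
    true_positives = len(selected & true_support)
    precision = true_positives / len(selected) if selected else 0.0
    recall = true_positives / len(true_support) if true_support else 0.0
    return precision, recall


# Hypothetical example: variables 0, 1, 2, 3 are truly relevant,
# and some selection method returned variables 0, 1 and 7.
precision, recall = precision_recall(selected=[0, 1, 7], true_support=[0, 1, 2, 3])
print(f"precision={precision:.3f}, recall={recall:.3f}")
```

In this toy case two of the three selected variables are relevant, and two of the four relevant variables are recovered. A method that raises precision aims to reduce false selections such as variable 7.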

Subjects :: Big Data
Statistics and Probability
AcademicSubjects/SCI01060
Computer science
Gene Expression
Value (computer science)
Context (language use)
Feature selection
01 natural sciences
Biochemistry
010104 statistics & probability
03 medical and health sciences
[STAT.AP] Statistics [stat]/Applications [stat.AP]
Code (cryptography)
Sensitivity (control systems)
0101 mathematics
Molecular Biology
[INFO.INFO-BI] Computer Science [cs]/Bioinformatics [q-bio.QM]
030304 developmental biology
Structure (mathematical logic)
0303 health sciences
[SDV.MHEP] Life Sciences [q-bio]/Human health and pathology
[STAT.ME] Statistics [stat]/Methodology [stat.ME]
Original Papers
Computer Science Applications
Computational Mathematics
model selection regression classification regularization prediction dimension cancer pls
Computational Theory and Mathematics
Research Design
Data mining
Algorithms
Software
Biological network

Details

Language :: English
ISSN :: 1367-4803 and 1367-4811
Database :: OpenAIRE
Journal :: Bioinformatics, Oxford University Press (OUP), 2021, 37 (5), pp.659-668. ⟨10.1093/bioinformatics/btaa855⟩
Accession number :: edsair.doi.dedup.....60379b5e9278843f6d90aaf92047b56f
