1. Real-Time Sampling Strategies for Regression with Irrelevant Features.
- Author
Cacciarelli, Davide; Tyssedal, John Sølve; Kulahci, Murat
- Subjects
FEATURE selection, LEARNING strategies, ONLINE education, PREDICTION models, ARTIFICIAL intelligence, BIG data
- Abstract
In the era of big data, companies are increasingly driven to amass vast amounts of data, particularly in process industries where advanced sensor technologies are prevalent. However, obtaining accurate labels or product information through quality inspections can be prohibitively expensive. Active learning emerges as a promising approach to optimize data sampling by prioritizing the most informative data points. Nevertheless, active learning strategies heavily rely on predictive models that are iteratively updated. Aligning with the principles of data-centric AI, this study highlights the detrimental effects of passively incorporating all available process variables into a predictive model for guiding data collection. Specifically, in real-time sampling strategies based on online active learning, the inclusion of irrelevant features significantly hampers the efficiency of the learning process. [ABSTRACT FROM AUTHOR]
- Published
2024