Towards Interactively Improving ML Data Preparation Code via 'Shadow Pipelines'
- Publication Year :
- 2024
Abstract
- Data scientists develop ML pipelines in an iterative manner: they repeatedly screen a pipeline for potential issues, debug it, and then revise and improve its code according to their findings. However, this manual process is tedious and error-prone. Therefore, we propose to support data scientists during this development cycle with automatically derived interactive suggestions for pipeline improvements. We discuss our vision to generate these suggestions with so-called shadow pipelines, hidden variants of the original pipeline that modify it to auto-detect potential issues, try out modifications for improvements, and suggest and explain these modifications to the user. We envision applying incremental view maintenance-based optimisations to ensure low-latency computation and maintenance of the shadow pipelines. We conduct preliminary experiments to showcase the feasibility of our envisioned approach and the potential benefits of our proposed optimisations.
Details
- Database :
- arXiv
- Publication Type :
- Report
- Accession number :
- edsarx.2404.19591
- Document Type :
- Working Paper
- Full Text :
- https://doi.org/10.1145/3650203.3663327