1. Design and Application of Meteorological Algorithm Scheduling Framework Based on Data Perception Technology
- Author
- Huo Qing, He Wenchun, Gao Feng, Chen Shiwang, Xu Yongjun, and Wang Qi
- Subjects
data perception technology, task scheduling, meteorological algorithm, data processing line, Meteorology. Climatology, QC851-999
- Abstract
The generation efficiency of data products depends on both the computational efficiency and the startup efficiency of algorithms. In meteorological operations, algorithms are typically data-driven: they start as soon as data arrive so that data products are generated sooner. Data-driven meteorological services therefore need an efficient task scheduling framework that starts and runs algorithms as soon as data arrive and improves the generation efficiency of meteorological data products. The CMA Big Data and Cloud Platform, known as Tianqing and led by the National Meteorological Information Center, began nationwide operational service in December 2021. The data processing line (DPL), the core function of Tianqing, provides unified management and centralized scheduling of meteorological algorithms. DPL supports several task scheduling modes: timer-triggered scheduling, sequential scheduling, data-arrival scheduling, and manual scheduling. Of these, data-arrival scheduling, which is based on data reporting status, starts an algorithm immediately after its data are reported, greatly shortening the startup time of meteorological algorithms and the generation time of meteorological data products.
Data-arrival scheduling consists of a data state perception component, a task scheduling execution component, a task scheduling post-processing component, and configuration management. The data state perception component analyzes in real time the reporting status of the various meteorological data in Tianqing and sends a scheduling message to the task scheduling execution component when the data reporting rate meets the scheduling requirement. The task scheduling execution component combines the resource information required by the algorithm with computing node optimization, generates task scheduling instructions, and starts the algorithm. Task scheduling post-processing collects and updates algorithm execution status and computing node status, and sends alarms when algorithm execution is abnormal. Configuration management supports configuring data-aware scheduling parameters on the front-end page.
Real-time analysis of data status, combined with scheduling, starts and runs each algorithm as soon as its data are reported. Compared with the original scheduling method, the scheduling delay falls from 3784 ms to 11 ms. Data-arrival scheduling, a core capability of the CMA Big Data and Cloud Platform, is deployed and operating in real time in provinces and regions. At the national level, the system currently schedules 19 core operational algorithms, with approximately 6.67×10⁵ scheduling tasks per day and an average scheduling delay of 31 ms. At the provincial level, it schedules 14 algorithms, with approximately 8×10⁴ scheduling tasks per day and an average scheduling delay of 156 ms. In addition, data-aware scheduling connects upstream and downstream algorithms in meteorological services seamlessly, offering a way to remove the disconnection between meteorological services and to improve collaboration among them.
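The threshold-based trigger described in the abstract can be illustrated with a minimal Python sketch. It is not the DPL implementation: the class name `ArrivalTrigger`, its fields, the message layout, and the example threshold are all hypothetical, chosen only to show how a scheduling message might be emitted once the data reporting rate meets a configured requirement.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ArrivalTrigger:
    """Hypothetical data state perception logic for one algorithm and one data batch."""
    algorithm: str            # name of the algorithm to start (illustrative)
    expected: int             # number of reports expected in the batch (assumed known)
    threshold: float          # required reporting rate in [0, 1] (assumed configurable)
    received: set = field(default_factory=set)
    fired: bool = False

    def report(self, item_id: str) -> Optional[dict]:
        """Record one arrival; return a scheduling message the first time
        the reporting rate reaches the threshold, otherwise None."""
        if self.fired:
            return None                      # schedule at most once per batch
        self.received.add(item_id)           # duplicate reports are counted once
        rate = len(self.received) / self.expected
        if rate >= self.threshold:
            self.fired = True
            return {"algorithm": self.algorithm, "reporting_rate": rate}
        return None


# Usage: four reports expected, schedule once 75% have arrived.
trigger = ArrivalTrigger("demo_algorithm", expected=4, threshold=0.75)
messages = [trigger.report(s) for s in ["s1", "s2", "s2", "s3", "s4"]]
```

In this sketch the third distinct report produces the only message, and later arrivals for the same batch are ignored. A real system would also need per-batch timeouts and persistence, which the abstract does not describe.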
- Published
- 2024