The construction of intelligent fully mechanized mining faces in China’s coal mines is still in its early stages, although electro-hydraulic control automation for hydraulic supports has been widely applied. At present, however, the single, fixed automation control logic of hydraulic supports struggles to adapt to complex and constantly changing production scenarios, so actual control at the mining face still mostly combines automation with manual intervention. To meet the demand for human-machine interaction and cooperation in these complex scenarios, this paper proposes the theoretical method and technical principle of a multimodal human-machine collaborative control system for hydraulic supports following the shearer in the middle section of the mining face (hereinafter referred to as hydraulic supports following). First, four human-machine collaborative modes were designed: manual, division of labor, approval, and veto. Based on factors including coal seam geology, gas and dust, shearer speed, hydraulic support intelligence level, system status, operator skill level, and task load, an AND-OR graph model for selecting the human-machine collaborative mode was constructed, enabling mode selection with either a manual or a machine preference. Then, a human-machine collaborative control decision-making mechanism for hydraulic supports following was designed. On this basis, an AI inference technique for the secondary control strategy of hydraulic supports following was proposed. Specifically, manual operation experience was learned from on-site data to construct a decision tree classification model that determines whether a hydraulic support requires secondary control, and a Bayesian regression model that estimates the secondary control time for pulling the hydraulic support.
Based on the above models, a human-machine collaborative control decision-making program for hydraulic supports following was developed, combining AND-OR graph inference for mode selection with AI inference for the secondary control strategy. Finally, using a cloud-edge-end software and hardware architecture, a multimodal human-machine collaborative control system for hydraulic supports following was developed, providing control functions such as model evolution, operational reasoning, and program execution. The system underwent industrial trial operation on the 3404 mining face of Shaqu No.2 Mine. The results show that the efficiency of hydraulic supports following increased by an average of 2% compared with the previous level. This work establishes an efficient and safe human-machine interactive decision-making mechanism for the fully mechanized mining equipment group and provides practical theoretical methods and feasible technical paths for the development of intelligent fully mechanized mining faces.
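
As an illustration of how mode selection over an AND-OR graph with a manual or machine preference might work, the following is a minimal Python sketch. It is not the model built in this work: the rule structure, the fact names (such as `geology_stable` and `operator_skilled`), the preference orderings, and the fallback to manual mode are all hypothetical assumptions chosen only to show the inference pattern.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """One node of an AND-OR graph: a leaf condition, an AND, or an OR."""
    kind: str                       # "leaf", "and", or "or"
    name: str = ""                  # fact name, used by leaves only
    children: List["Node"] = field(default_factory=list)


def leaf(name: str) -> Node:
    return Node("leaf", name)


def all_of(*children: Node) -> Node:
    return Node("and", children=list(children))


def any_of(*children: Node) -> Node:
    return Node("or", children=list(children))


def holds(node: Node, facts: Dict[str, bool]) -> bool:
    """Evaluate a node against observed facts; missing facts count as False."""
    if node.kind == "leaf":
        return facts.get(node.name, False)
    results = (holds(child, facts) for child in node.children)
    return all(results) if node.kind == "and" else any(results)


# Hypothetical rules: one AND-OR subgraph per collaborative mode.
MODE_RULES: Dict[str, Node] = {
    "veto": all_of(
        leaf("geology_stable"), leaf("gas_dust_normal"),
        leaf("system_healthy"), leaf("support_intelligence_high"),
    ),
    "approval": all_of(
        leaf("system_healthy"),
        any_of(leaf("geology_stable"), leaf("operator_skilled")),
    ),
    "division_of_labor": all_of(
        leaf("system_healthy"), leaf("operator_skilled"), leaf("task_load_low"),
    ),
    "manual": any_of(
        leaf("system_fault"), leaf("geology_complex"), leaf("operator_skilled"),
    ),
}

# Assumed preference orderings, from most to least preferred mode.
PREFERENCE_ORDER: Dict[str, List[str]] = {
    "machine": ["veto", "approval", "division_of_labor", "manual"],
    "manual": ["manual", "division_of_labor", "approval", "veto"],
}


def select_mode(facts: Dict[str, bool], preference: str = "machine") -> str:
    """Return the first feasible mode in preference order.

    Falls back to "manual" when no rule is satisfied (an assumption of
    this sketch, not a behavior stated in the source).
    """
    for mode in PREFERENCE_ORDER[preference]:
        if holds(MODE_RULES[mode], facts):
            return mode
    return "manual"


if __name__ == "__main__":
    observed = {"system_healthy": True, "operator_skilled": True}
    print(select_mode(observed, "machine"))
    print(select_mode(observed, "manual"))
```

In this sketch the same observed facts can yield different modes depending on the chosen preference, which mirrors the idea of selecting a mode with either a manual or a machine preference.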