Human sleep staging methods based on electroencephalogram (EEG) signals are trending towards single-channel acquisition and deep network models. However, single-channel acquisition discards the positional information of brain regions, and the EEG features that characterize sleep stages tend to be sparse and therefore difficult to extract. At the same time, deep networks share a common problem: the model and training hyperparameters are set manually, which makes the training process blind and inefficient. Together, these problems limit the accuracy of automatic sleep staging methods. This paper therefore proposes using the inter-layer feature reuse of DenseNet to mine the sleep state information hidden in EEG signals, and adapts the DenseNet model to the low-frequency characteristics of single-channel EEG signals in the frequency domain and to their long-range dependence in the time domain, so as to achieve fast and accurate human sleep staging. To further improve the performance of DenseNet, a deep deterministic policy gradient (DDPG) algorithm, a reinforcement learning method, is used to optimize and automatically adjust the key hyperparameters of DenseNet during network training. Experimental results show that the model reaches a staging accuracy of 89.23% on the Sleep-EDFx dataset, and that its overall performance is better than that of other advanced staging algorithms from recent years, demonstrating good application prospects. [ABSTRACT FROM AUTHOR]