1. Deep Q-network boosted with external knowledge for HVAC control
- Author
- Khoder Jneid, Pierre Jallon, Stéphane Ploix, Patrick Reignier, Algorithms, Principles and TheorIes for collaborative Knowledge acquisition And Learning (APTIKAL), Laboratoire d'Informatique de Grenoble (LIG), Centre National de la Recherche Scientifique (CNRS)-Université Grenoble Alpes (UGA)-Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP), Gestion et Conduite des Systèmes de Production (G-SCOP_GCSP), and Laboratoire des sciences pour la conception, l'optimisation et la production (G-SCOP)
- Subjects
business.industry, Computer science, 020209 energy, 0211 other engineering and technologies, Control engineering, Rule-based system, Robotics, 02 engineering and technology, Energy consumption, Optimal control, [INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI], [INFO.INFO-IU]Computer Science [cs]/Ubiquitous Computing, Model predictive control, [INFO.INFO-LG]Computer Science [cs]/Machine Learning [cs.LG], Air conditioning, 021105 building & construction, HVAC, 0202 electrical engineering, electronic engineering, information engineering, Reinforcement learning, Artificial intelligence, business, ComputingMilieux_MISCELLANEOUS
- Abstract
Heating, ventilation, and air conditioning (HVAC) systems account for nearly 40% of total energy consumption in developed countries. Traditional techniques such as rule-based control (RBC) fail to control these systems optimally. Model predictive control (MPC) has also been widely explored in the literature, but it is not a practical solution because it relies on a model of the building's dynamics, which are complex. Recently, deep reinforcement learning (DRL) has shown great success in optimal control domains such as robotics and gaming. In this paper, we develop two model-free DRL approaches to optimize the energy consumption of an office while maintaining thermal comfort and good indoor air quality by controlling the radiator and the opening and closing of the office's window and door. Both approaches are based on the deep Q-network (DQN): the first is a DQN agent with no knowledge of the environment, and the second is a DQN agent with initial knowledge of the environment, a hybrid DQN+RBC approach. The goal of giving the DQN agent external knowledge is to speed up convergence by exploiting the RBC rules. We evaluate the performance of these two approaches against an RBC approach through simulations using a physical model of the office's dynamics. Experiments show that the two DRL approaches maintain better thermal comfort and better indoor air quality than the RBC approach while consuming nearly the same energy. In addition, experiments demonstrate that the DQN with knowledge outperforms the DQN with no knowledge early in training and converges faster to the optimal value.
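The abstract does not say how the RBC rules are injected into the DQN agent. As a minimal sketch of one possible mechanism, assuming the rules replace the uniform random move during epsilon-greedy exploration, action selection could look like the following. The action encoding, the state variables (temperature and CO2 concentration), the thresholds, and the function names `rbc_action` and `select_action` are all hypothetical and are not taken from the paper.

```python
import random

# Hypothetical action encoding: (radiator_on, window_open, door_open) -> index 0..7
ACTIONS = [(r, w, d) for r in (0, 1) for w in (0, 1) for d in (0, 1)]


def rbc_action(temp_c, co2_ppm, temp_setpoint=21.0, co2_limit=1000.0):
    """Toy rule-based controller; setpoint and limit are illustrative assumptions."""
    radiator_on = 1 if temp_c < temp_setpoint else 0   # heat when too cold
    window_open = 1 if co2_ppm > co2_limit else 0      # ventilate when air is stale
    door_open = 0                                      # this toy rule keeps the door closed
    return ACTIONS.index((radiator_on, window_open, door_open))


def select_action(q_values, state, epsilon, rng=random):
    """Epsilon-greedy selection in which the exploratory move follows the
    rule-based controller instead of being drawn uniformly at random."""
    temp_c, co2_ppm = state
    if rng.random() < epsilon:
        return rbc_action(temp_c, co2_ppm)
    # Greedy move: index of the largest estimated Q-value.
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

In this sketch `q_values` stands in for the output of a Q-network for the current state. With `epsilon` close to 1 at the start of training the agent mostly imitates the rules, which is one plausible way to obtain the faster early progress the abstract reports; the authors' actual method may differ.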
- Published
- 2021