Rule-based control (RBC) is widely adopted in buildings due to its stability and
robustness. It resembles a behavior cloning methodology refined by human experts; however,
it is incapable of adapting to distribution drifts.
Reinforcement learning (RL) can adapt to changes but needs to
learn from scratch in the online setting. In offline settings, by contrast, its learning ability is limited
by extrapolation errors caused by selecting out-of-distribution actions.
In this paper, we explore how to combine RL with a rule-based control policy so that
their strengths together yield a scalable and robust policy that learns continuously in both
online and offline settings.
We start with representative online and offline RL methods, TD3 and TD3+BC,
respectively. Then, we develop a dynamically weighted actor loss function that
selects which policy the RL model learns from at each training iteration.
With extensive experiments across various weather conditions in both deterministic and
stochastic scenarios, we demonstrate that our algorithm, rule-based incorporated
control regularization (RUBICON), outperforms state-of-the-art methods in offline settings by
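
The dynamically weighted actor loss mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the formula, so everything here is an assumption made for illustration: the gating rule (regularize toward the rule-based action only when the critic rates it above the current policy's action on the batch), the TD3+BC-style form of the loss, and the names `dynamic_actor_loss`, `q_policy`, `q_rule`, and `lam`. It is plain Python rather than the repository's training code.

```python
from statistics import fmean


def dynamic_actor_loss(q_policy, q_rule, policy_actions, rule_actions, lam=1.0):
    """Illustrative actor loss with a batch-level switch (assumed form).

    q_policy:       critic values Q(s, pi(s)) for each sample in the batch
    q_rule:         critic values Q(s, pi_rbc(s)) for the rule-based actions
    policy_actions: actions proposed by the learned policy (list of vectors)
    rule_actions:   actions proposed by the rule-based controller
    lam:            weight on the Q-maximization term (hypothetical parameter)
    """
    # RL term: maximize the critic's value of the policy's own actions.
    rl_term = -lam * fmean(q_policy)

    # Behavior-cloning term: mean squared distance to the rule-based actions.
    per_sample = [
        fmean([(p - r) ** 2 for p, r in zip(pa, ra)])
        for pa, ra in zip(policy_actions, rule_actions)
    ]
    bc_term = fmean(per_sample)

    # Assumed gating rule: imitate the rule-based policy only while the critic
    # rates its actions higher, on average, than the learned policy's actions.
    use_rule = fmean(q_rule) > fmean(q_policy)
    weight = 1.0 if use_rule else 0.0

    return rl_term + weight * bc_term, use_rule


loss, use_rule = dynamic_actor_loss(
    q_policy=[1.0, 3.0],
    q_rule=[2.0, 4.0],
    policy_actions=[[0.0], [1.0]],
    rule_actions=[[1.0], [1.0]],
)
print(loss, use_rule)
```

In a real training loop the same expression would be built from differentiable tensors so that gradients flow into the actor; the switch itself only decides whether the regularization term is present in a given iteration.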
- Successfully install Sinergym
- Git clone our repository
    git clone https://github.com/HYDesmondLiu/RUBICON.git
    cd ./RUBICON/01_BRL/
  or
    cd ./RUBICON/02_OnlineRL/
- Modify Sinergym*.py to fit your GPU availability.
- Run
    python Sinergym_BRL.py
  or
    python Sinergym.py
- The dataset we learned from for the offline approach is at https://github.com/HYDesmondLiu/B2RL
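
For the GPU step above, one common way to control which devices a CUDA-based script sees is the `CUDA_VISIBLE_DEVICES` environment variable. The sketch below assumes the scripts rely on a CUDA-enabled framework; the helper name `select_gpus` is hypothetical and is not part of the repository, whose scripts may handle device selection differently.

```python
import os


def select_gpus(gpu_ids):
    """Expose only the listed GPU indices to CUDA-based libraries.

    Hypothetical helper for illustration. It must run before the deep
    learning framework initializes CUDA, otherwise it has no effect.
    """
    value = ",".join(str(i) for i in gpu_ids)
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value


# Example: make only the first GPU visible to this process.
select_gpus([0])
```

Passing an empty list sets the variable to an empty string, which hides all GPUs and forces CPU execution in frameworks that honor this variable.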
@inproceedings{liu2023rule,
title={Rule-based policy regularization for reinforcement learning-based building control},
author={Liu, Hsin-Yu and Balaji, Bharathan and Gupta, Rajesh and Hong, Dezhi},
booktitle={Proceedings of the 14th ACM International Conference on Future Energy Systems},
pages={242--265},
year={2023}
}