Sim-to-Lab-to-Real: Safe Reinforcement Learning with Shielding and Generalization Guarantees
Kai-Chieh Hsu1, Allen Z. Ren1, Duy Phuong Nguyen, Anirudha Majumdar2, Jaime F. Fisac2
1equal contribution in alphabetical order 2equal advising
Princeton University, Artificial Intelligence Journal (AIJ), January 2023
Please raise an issue or reach out at kaichieh or allen dot ren at princeton dot edu if you need help running the code.
Install the conda environment with dependencies (tested with Ubuntu 20.04):
conda env create -f environment_linux.yml
pip install -e .
- Generate the datasets with the following commands:
- vanilla-normal
python data/process/vanilla/gen_normal.py
- vanilla-task
python data/process/vanilla/gen_lab_task.py -sf <src_folder> -tf <tar_folder>
- vanilla-dynamics
python data/process/vanilla/gen_lab_task.py -sf <src_folder> -tf <tar_folder> -p -xpl 0.3 0.5 -xpu 0.8 1.0
- All sim and lab training configuration files are provided in config/vanilla. Note that you need to update the paths to the policy models and datasets in each config file.
- Run sim and lab training for the PAC-Perf method with the following commands:
python script/sim_prior.py -cf config/vanilla/sim_pac_perf.yaml
python script/sim_posterior.py -cf config/vanilla/lab_pac_perf.yaml
- Run sim and lab training for the Base method with the following commands:
python script/sim_naive_rl.py -cf config/vanilla/sim_base.yaml
python script/sim_naive_rl.py -cf config/vanilla/lab_base.yaml
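Since the config files need their model and dataset paths updated before training, a quick check for stale paths can save a failed run. The following is a minimal sketch, not part of this repository: `find_missing_paths` is a hypothetical helper, it assumes the config has already been loaded into a nested dict (for example with PyYAML's `yaml.safe_load`), and it makes no assumption about the actual key names, treating any string value containing `/` as a candidate path.

```python
from pathlib import Path


def find_missing_paths(config, root="."):
    """Return (key, value) pairs whose string value looks like a path but does not exist.

    `config` is a nested dict/list as loaded from a config file. Key names are
    not assumed: any string containing "/" is treated as a candidate path and
    resolved relative to `root` (absolute paths are used as-is).
    """
    missing = []

    def walk(node, prefix):
        if isinstance(node, dict):
            for key, value in node.items():
                walk(value, f"{prefix}.{key}" if prefix else str(key))
        elif isinstance(node, (list, tuple)):
            for i, value in enumerate(node):
                walk(value, f"{prefix}[{i}]")
        elif isinstance(node, str) and "/" in node:
            if not (Path(root) / node).exists():
                missing.append((prefix, node))

    walk(config, "")
    return missing
```

Calling `find_missing_paths(cfg)` from the repository root on a loaded config returns an empty list when every path-like entry exists, and otherwise lists the dotted key and the offending value. Strings that contain `/` but are not paths would be reported as false positives, so treat the output as a hint rather than a verdict.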
- Please follow the instructions here to generate the environments using the 3D-FRONT dataset.
- All sim and lab training configuration files are provided in config/advanced. Note that you need to update the paths to the policy models and datasets in each config file. For example, run sim and lab training for the PAC-Perf method with the following commands:
python script/sim_prior.py -cf config/advanced/dense/sim_pac_perf.yaml
python script/sim_posterior.py -cf config/advanced/realistic/lab_pac_perf.yaml
- We recommend using at least 80GB of RAM for Advanced-Dense and Advanced-Realistic training. The runtime also largely depends on (1) the number of CPU threads used for parallelized environments, and (2) whether a GPU is used for policy training.
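To see how a machine compares with the recommendation above before launching a long run, something like the following can be used. This is a rough sketch and not part of the repository: `system_resources` and `check_resources` are hypothetical helpers, the RAM query relies on `os.sysconf` and is therefore Linux-oriented, and GPU availability is not checked because that depends on the training framework in use.

```python
import os


def system_resources():
    """Return the CPU thread count and total physical RAM in GB (Linux-oriented)."""
    cpu_threads = os.cpu_count() or 1
    total_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    return {"cpu_threads": cpu_threads, "ram_gb": total_bytes / 1024**3}


def check_resources(resources, min_ram_gb=80):
    """Return a list of warning strings; the 80 GB default mirrors the recommendation above."""
    warnings = []
    if resources["ram_gb"] < min_ram_gb:
        warnings.append(
            f"Only {resources['ram_gb']:.0f} GB of RAM detected; "
            f"at least {min_ram_gb} GB is recommended for Advanced training."
        )
    return warnings


if __name__ == "__main__":
    res = system_resources()
    print(f"CPU threads: {res['cpu_threads']}, RAM: {res['ram_gb']:.0f} GB")
    for message in check_resources(res):
        print("WARNING:", message)
```

The reported thread count is an upper bound on how many parallel environments can run at once; the setting that actually controls it lives in the config files.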
We highly recommend trying out the visualization scripts to get a sense of the environments and policies before running any policy training.
You can visualize the Vanilla environments using pre-trained policies with:
python script/test_vanilla.py -cf config/vanilla/test_pac_perf.yaml
You can visualize the Advanced Dense and Realistic environments using pre-trained policies with:
python script/test_advanced_dense.py -cf config/advanced/test_dense.yaml
python script/test_advanced_realistic.py -cf config/advanced/test_reliastic.yaml
If you find our paper or code useful, please consider citing us with:
@article{hsuren2022slr,
title = {Sim-to-Lab-to-Real: Safe Reinforcement Learning with Shielding and Generalization Guarantees},
journal = {Artificial Intelligence},
pages = {103811},
year = {2022},
issn = {0004-3702},
doi = {https://doi.org/10.1016/j.artint.2022.103811},
author = {Kai-Chieh Hsu and Allen Z. Ren and Duy P. Nguyen and Anirudha Majumdar and Jaime F. Fisac},
}