dhyeythumar/PPO-algo-with-custom-Unity-environment

PPO Algorithm with a custom environment

This repo contains an implementation of the Proximal Policy Optimization (PPO) algorithm, written with the Keras library and trained on a custom environment built with the Unity 3D engine.
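For readers new to PPO, its core idea is the clipped surrogate objective, which limits how far a single update can move the policy away from the one that collected the data. The following is a minimal sketch in plain Python, not the Keras code from this repo; the function name `clipped_surrogate` and the default `clip_eps=0.2` are assumptions chosen for illustration.

```python
import math


def clipped_surrogate(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (a quantity to maximize).

    new_logp / old_logp: log-probabilities of the taken actions under the
    current policy and under the policy that collected the data.
    advantages: advantage estimates for the same steps.
    """
    total = 0.0
    for nl, ol, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio between the new and the old policy.
        ratio = math.exp(nl - ol)
        # Keep the ratio inside [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the more pessimistic of the two estimates.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

In a Keras implementation the negative of this value would typically serve as the actor loss, since optimizers minimize.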


Important details about this repository:

This repo uses the Windows environment binary. If you want to use the Linux environment binary instead, change ENV_NAME in the train.py and test.py scripts to the path of the Linux binaries stored here.
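One possible way to avoid editing ENV_NAME by hand is to choose it from the running platform. This is only a sketch: the two path constants and the helper `pick_env_name` are hypothetical placeholders, not names from this repo, and must be replaced with the real binary locations.

```python
import sys

# Hypothetical placeholder paths: point these at the actual binaries.
WINDOWS_ENV = "./path/to/windows/binary"
LINUX_ENV = "./path/to/linux/binary"


def pick_env_name(platform_name=sys.platform):
    """Return the environment binary path that matches the platform."""
    if platform_name.startswith("win"):
        return WINDOWS_ENV
    if platform_name.startswith("linux"):
        return LINUX_ENV
    raise RuntimeError(f"No environment binary for platform: {platform_name}")


# In train.py / test.py one could then write:
# ENV_NAME = pick_env_name()
```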

Introduction

  • Check out this video to see the trained agent using the learned navigation skills to find the flag in a closed environment, which is divided into nine different segments.
  • To see the training process of this agent, check out this video.

Environment Specific Details

These are some details you should know beforehand. Without them, parts of the code may be confusing, because some of the Keras implementation is environment-dependent.

Check this doc for detailed information.

A small overview of the environment:

  • Observation/State space: Vectorized (unlike Image)
  • Action space: Continuous (unlike Discrete)
  • Action shape: (number of agents, 2). Here exactly one agent is alive at every environment step, so the shape is (1, 2).
  • Reward System:
    • (1.0/MaxStep) per step, where MaxStep is the step limit after which the environment is reset whether or not the goal state was reached. The same reward is given if the agent crashes into the walls.
    • +2 if the agent reaches the goal state.
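The overview above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not code from the environment: `step_reward` and `is_valid_action_batch` are hypothetical helper names, the per-step value is taken exactly as written above, and the goal reward is assumed to replace the per-step reward rather than add to it.

```python
def step_reward(max_step, reached_goal):
    """Reward for one environment step, following the overview above."""
    if reached_goal:
        # Assumption: the goal reward replaces the per-step reward.
        return 2.0
    # Per-step value as written in the overview; also used on wall crashes.
    return 1.0 / max_step


def is_valid_action_batch(actions, num_agents=1, action_size=2):
    """Check that actions has shape (num_agents, action_size)."""
    return len(actions) == num_agents and all(
        len(row) == action_size for row in actions
    )


# One agent, two continuous action values: shape (1, 2).
example_action = [[0.25, -0.5]]
```

A shape check like this is useful before sending actions to the environment, because the policy network's output must match the (1, 2) shape.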

Setup Instructions

Install the release_1 branch of the ML-Agents GitHub repo. If you want to use a different branch, modify the Python API calls that interact with the environment accordingly.

  • Clone these repos:

    $ git clone --branch release_1 https://github.com/Unity-Technologies/ml-agents.git
    
    $ git clone https://github.com/Dhyeythumar/PPO-algo-with-custom-Unity-environment.git
  • Create and activate a Python virtual environment (Python version used: 3.8.x). The activation command below is for Windows; on Linux, run source myvenv/bin/activate instead.

    $ python -m venv myvenv
    $ myvenv\Scripts\activate
  • Install the dependencies (check the requirements.txt file for the exact versions):

    (myvenv) $ pip install -e ./ml-agents/ml-agents-envs
    (myvenv) $ pip install tensorflow
    (myvenv) $ pip install keras
    (myvenv) $ pip install tensorboardX

Getting Started

  • To start the training process, use the following commands:

    (myvenv) $ cd PPO-algo-with-custom-Unity-environment
    (myvenv) $ python train.py
  • Launch TensorBoard:

    $ tensorboard --logdir=./training_data/summaries --port 6006

Motivation and Learning

This video by OpenAI inspired me to develop something in the field of reinforcement learning. So for the first phase, I decided to create a simple RL agent that can learn navigation skills.

After completing the first phase, I gained a much deeper knowledge of the RL domain and got the following questions answered:

  • How to create custom 3D environments using the Unity engine?
  • How to use ML-Agents (Unity's toolkit for reinforcement learning) to train the RL agents?
  • And I also learned to implement the PPO algorithm using the Keras library. 😃

What's next? 🤔

I have started working on the next phase of this project, which will include a multi-agent environment setup, and I am also planning to increase the difficulty level. For more updates, stay tuned for the next video on my YouTube channel.

License

Licensed under the MIT License.

Acknowledgements

  1. Unity ML-Agents Python Low Level API
  2. rl-bot-football