Usage Guide
Quick Facts
Here are examples of how to use D4MARL.
You can train an offline MARL policy by running one of the python run_** scripts.
You can customize the algorithm configuration in the visual interface launched with streamlit run visualize.py.
You can download a dataset and train a policy in a single run by adding the --download_dataset parameter.
You can run an evaluation by clicking compare methods.
You can choose the training curve color of each method by clicking its color.
Train Policy
Example
You can train an offline MARL policy by running:
if [ "$mode" = "baseline" ]
then
    # --download_dataset downloads the demo dataset for a quick training start
    python -u run_baseline_sc2.py \
        --offline_data_dir "$path_to_data" \
        --download_dataset \
        --algorithm "$baseline_algorithm"
elif [ "$mode" = "madt" ]
then
    python -u run_madt_sc2.py \
        --offline_data_dir "$path_to_data" \
        --download_dataset
fi
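The same mode selection can be sketched in Python, which is convenient when launching several runs from one script. This is a minimal sketch, not part of D4MARL: build_command is a hypothetical helper, and only the script names and the three flags shown in the shell snippet are taken from this guide. The exact algorithm names the baseline script accepts are not specified here, so the value is passed through unchanged.

```python
import shlex

# Script names and flags come from the shell snippet in this guide.
# build_command is a hypothetical helper, not a D4MARL function.
SCRIPTS = {
    "baseline": "run_baseline_sc2.py",
    "madt": "run_madt_sc2.py",
}


def build_command(mode, data_dir, algorithm=None, download=False):
    """Return the argument list for one training run."""
    if mode not in SCRIPTS:
        raise ValueError(f"unknown mode: {mode!r}")
    cmd = ["python", "-u", SCRIPTS[mode], "--offline_data_dir", data_dir]
    if download:
        cmd.append("--download_dataset")
    if mode == "baseline":
        # Only the baseline script takes --algorithm in the snippet above.
        if algorithm is None:
            raise ValueError("baseline mode needs an algorithm name")
        cmd += ["--algorithm", algorithm]
    return cmd


if __name__ == "__main__":
    # Print the command instead of executing it, so the sketch has no side effects.
    print(shlex.join(build_command("madt", "./offline_data", download=True)))
```

The returned list can be passed directly to subprocess.run, which avoids shell quoting issues when the data path contains spaces.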
Here we provide an example of training MADT models using 2m_vs_1z data:
Hint
The command above trains a policy with a baseline algorithm such as ICQ, BCQ, or CQL, or with MADT, for a total of 1024 training steps. The number of vector environments is 1, and algo_cfgs:steps_per_epoch defaults to 500. If offline_data_dir contains no local offline dataset, the command downloads the dataset automatically from our online storage.
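The download fallback described in the hint depends on deciding whether a local dataset is present. How D4MARL makes that decision is not documented here; the sketch below assumes the simplest reading, that a dataset counts as missing when the directory does not exist or is empty. The function name dataset_missing and the file name in the demo are hypothetical.

```python
import tempfile
from pathlib import Path


def dataset_missing(offline_data_dir):
    """Return True when the directory is absent or contains no entries.

    Assumption: "no local offline dataset" means a missing or empty
    directory. The real D4MARL check may differ.
    """
    path = Path(offline_data_dir)
    return not path.is_dir() or not any(path.iterdir())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(dataset_missing(tmp))  # directory exists but is empty
        (Path(tmp) / "demo_episode.bin").touch()  # hypothetical file name
        print(dataset_missing(tmp))  # directory now has one entry
```

A check like this can be run before training to anticipate whether a download will be triggered, for example on a machine without network access.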
Customize Configuration
Example
You can also customize the configuration of the algorithm by running:
streamlit run visualize.py
This command opens a user interface in which you can choose the task and the approach to train offline:
Hint
We developed a visual training tool, built on the Streamlit platform, that integrates data preparation, hyperparameter configuration, model training, and evaluation of pre-trained models.