Usage Guide

Quick Facts

Here we provide examples of how to use D4MARL.
You can train an offline MARL policy by running one of the python run_** scripts.
You can customize the configuration of the algorithm on the visual platform by running streamlit run visualize.py.
You can download a dataset and train a policy in a single run by adding the --download_dataset parameter.
You can run an evaluation by clicking compare methods.
You can choose the training-curve color of each method by clicking the color.

Train Policy

Example

You can train an offline MARL policy by running:

# Set $mode to "baseline" or "madt" before running this snippet.
if [ "$mode" = "baseline" ]
then
    # --download_dataset downloads the demo dataset to start a quick training
    python -u run_baseline_sc2.py \
        --offline_data_dir "$path_to_data" \
        --download_dataset \
        --algorithm "$baseline_algorithm"
elif [ "$mode" = "madt" ]
then
    python -u run_madt_sc2.py \
        --offline_data_dir "$path_to_data" \
        --download_dataset
fi
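
The same branching can be expressed in Python when training runs are launched from another script. The following is a minimal sketch, not part of D4MARL: build_train_command is a hypothetical helper, and only the script names and the flags --offline_data_dir, --download_dataset, and --algorithm come from the shell example. It builds the argument list and prints it rather than starting a run.

```python
import shlex

# Script names and flags are copied from the shell example;
# build_train_command itself is a hypothetical helper, not a D4MARL API.
SCRIPTS = {
    "baseline": "run_baseline_sc2.py",
    "madt": "run_madt_sc2.py",
}


def build_train_command(mode, data_dir, algorithm=None, download=True):
    """Return the argv list for one training run."""
    if mode not in SCRIPTS:
        raise ValueError(f"unknown mode {mode!r}; expected one of {sorted(SCRIPTS)}")
    cmd = ["python", "-u", SCRIPTS[mode], "--offline_data_dir", str(data_dir)]
    if download:
        cmd.append("--download_dataset")
    if mode == "baseline":
        # Only the baseline script takes --algorithm in the shell example.
        if algorithm is None:
            raise ValueError("baseline mode needs an algorithm name")
        cmd += ["--algorithm", algorithm]
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so it can be inspected first.
    print(shlex.join(build_train_command("madt", "./offline_data")))
```

To actually start training, the returned list can be passed to subprocess.run(cmd, check=True) from the directory that contains the run scripts.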

Here we provide an example of training MADT models using 2m_vs_1z data:

Hint

The above command trains a policy with one of the baseline algorithms (ICQ, BCQ, or CQL) or with MADT. The total number of training steps is 1024, the number of vectorized environments is 1, and algo_cfgs:steps_per_epoch defaults to 500. If offline_data_dir contains no local offline dataset, the command downloads the dataset automatically from our online storage.
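
The "download only when no local dataset exists" behavior can be pictured with a small check like the one below. This is a sketch under an assumption: it treats a missing or empty directory as "no local dataset". How D4MARL actually detects a local dataset is not stated here, and needs_download is a hypothetical name.

```python
from pathlib import Path


def needs_download(offline_data_dir):
    """Return True when the directory is missing or contains no entries.

    Assumption: an empty or absent directory means no local dataset.
    """
    path = Path(offline_data_dir)
    if not path.is_dir():
        return True
    # any() stops at the first entry, so large datasets are not fully listed.
    return not any(path.iterdir())
```

A check of this kind can be run before launching training to predict whether a download will be triggered for a given offline_data_dir.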

Customize Configuration

Example

You can also customize the configuration of the algorithm by running:

streamlit run visualize.py

This command opens a user interface in which you can choose the specific task and approach to train offline:

Hint

We developed a visual training tool, built on the Streamlit platform, that integrates data preparation, hyperparameter configuration, model training, and evaluation of pre-trained models.