SOLVE:
RL Baseline
Learning to Solve Voxel Building Embodied Tasks from Pixels and Natural Language Instructions
Overview:
Adopting pre-trained language models to generate action plans for embodied agents is a promising research direction. However, executing instructions in real or simulated environments requires verifying that actions are both feasible and relevant to completing the goal. We propose a new method that combines a language model with reinforcement learning for the task of building objects in a Minecraft-like environment according to natural language instructions. Our method first generates a set of consistently achievable sub-goals from the instructions and then completes the associated sub-tasks with a pre-trained RL policy. The proposed method served as the RL baseline for the IGLU 2022 competition.
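The sketch below illustrates the two-stage structure described above: an instruction is first mapped to an ordered list of sub-goals, which a pre-trained policy then executes one by one. All names (SubGoal, generate_subgoals, PretrainedPolicy) are hypothetical placeholders, not the actual interfaces from the paper or the IGLU codebase; it only shows the shape of the pipeline, assuming a simple voxel-placement action space.

from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class SubGoal:
    # A target voxel derived from one fragment of the instruction.
    block_type: str
    position: Tuple[int, int, int]  # (x, y, z) in the building grid

def generate_subgoals(instruction: str) -> List[SubGoal]:
    # Stand-in for the language-model stage: in the actual method a
    # pre-trained language model produces achievable sub-goals; here a
    # trivial keyword parser only illustrates the interface.
    subgoals = []
    for i, token in enumerate(instruction.lower().split()):
        if token in ("red", "blue", "green"):
            subgoals.append(SubGoal(block_type=token, position=(i, 0, 0)))
    return subgoals

class PretrainedPolicy:
    # Hypothetical wrapper around the pre-trained RL policy that
    # completes the sub-task associated with one sub-goal.
    def act(self, observation, subgoal: SubGoal):
        # A real policy maps pixel observations plus the sub-goal to
        # low-level environment actions; here we return a symbolic action.
        return {"place": subgoal.block_type, "at": subgoal.position}

def solve(instruction: str, observation=None):
    # Two-stage pipeline: sub-goal generation, then sequential execution.
    policy = PretrainedPolicy()
    return [policy.act(observation, sg) for sg in generate_subgoals(instruction)]

if __name__ == "__main__":
    print(solve("place a red block then a blue block"))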
@article{skrynnik2022learning,
title={Learning to Solve Voxel Building Embodied Tasks from Pixels and Natural Language Instructions},
author={Skrynnik, Alexey and Volovikova, Zoya and C{\^o}t{\'e}, Marc-Alexandre and Voronov, Anton and Zholus, Artem and Arabzadeh, Negar and Mohanty, Shrestha and Teruel, Milagro and Awadallah, Ahmed and Panov, Aleksandr and Burtsev, Mikhail and Kiseleva, Julia},
journal={arXiv preprint arXiv:2211.00688},
year={2022}
}